Skip to content

[Feature] Add Dask integration #205

@lpilz

Description

@lpilz

I'm currently looking around for a python package which is able to extract some diagnostics from WRF output and do computations in a distributed way, as I will handle very large datasets in the future. As salem is built on pure python and uses xarray in the background, I prefer it over wrf-python (also because development on wrf-python pretty much ceased).

I already tried to get salem to run with dask in the most naive way possible, however I stumbled over the problem that workers die when executing the open_wrf_dataset function with the error distributed.protocol.core - CRITICAL - Failed to deserialize and TypeError: Could not serialize object of type ImplicitToExplicitIndexingAdapter. I had a quick look into the salem code and I think this might be due to the FakeVariable objects not having serializers associated with them. Does this sound sensible to you?

Since I know that you, @fmaussion, don't have a lot of time to spend on this project I thinking about adding dask compatibility myself. However, before jumping the gun I just wanted to ask if there was any prior work I could maybe build on or if you have any advice.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions