When loading from a single disk, issuing parallel data requests usually does not improve I/O speed. But on parallel hardware, or when the data need to be inflated (e.g. because ZFP compression was used), loading data chunks in parallel can improve I/O significantly (e.g. an 8x speedup has been observed).
Two Xdas mechanisms could be improved:
- The `__array__` method of the `VirtualArray` class could have an optional `parallel` argument, which could also be configured with `xdas.config.set("parallel_read", 8)`.
- The `DataArrayLoader` class could load several chunks in parallel, also through an optional `parallel` argument (right now it only loads chunks one by one, asynchronously).
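
To make the idea concrete, here is a minimal sketch of what a `parallel` argument could do. It is not the Xdas implementation: `load_chunks` and `read_chunk` are hypothetical names, and zlib is used only as a stand-in for ZFP inflation. Threads can only help here if the decompression library releases the GIL while it works.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def load_chunks(read_chunk, n_chunks, parallel=1):
    """Return the chunks produced by read_chunk(i) for i in range(n_chunks), in order.

    `parallel` is the number of worker threads; 1 (or less) reads sequentially.
    """
    if parallel <= 1:
        return [read_chunk(i) for i in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # Executor.map yields results in input order, so chunk order is preserved.
        return list(pool.map(read_chunk, range(n_chunks)))


# Hypothetical stand-in for compressed chunks on disk (zlib instead of ZFP).
compressed = [zlib.compress(bytes([i]) * 100_000) for i in range(8)]


def read_chunk(i):
    # Stand-in for "read one chunk and inflate it".
    return zlib.decompress(compressed[i])


sequential = b"".join(load_chunks(read_chunk, 8))
threaded = b"".join(load_chunks(read_chunk, 8, parallel=8))
```

Both calls return the same bytes; only the number of chunks being inflated at the same time differs.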
It would be nice to run some benchmarks to see which works best; implementing both is probably the best option.
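
A possible starting point for such a benchmark, as a sketch only: it times inflation of synthetic zlib-compressed chunks for several worker counts. The workload, `inflate_all` and `benchmark` are all assumptions made for illustration; a real benchmark would read actual ZFP-compressed files through Xdas.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical workload: 16 zlib-compressed chunks standing in for ZFP data.
chunks = [zlib.compress(bytes(range(256)) * 4_000) for _ in range(16)]


def inflate_all(workers):
    """Inflate every chunk using `workers` threads (1 means sequential)."""
    if workers <= 1:
        return [zlib.decompress(c) for c in chunks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, chunks))


def benchmark(worker_counts, repeats=3):
    """Return {workers: best wall-clock time in seconds over `repeats` runs}."""
    timings = {}
    for workers in worker_counts:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            inflate_all(workers)
            best = min(best, time.perf_counter() - start)
        timings[workers] = best
    return timings


timings = benchmark([1, 2, 4, 8])
for workers, seconds in timings.items():
    print(f"{workers} worker(s): {seconds * 1e3:.2f} ms")
```

Taking the best of several repeats reduces noise from disk caches and other processes; the numbers depend heavily on the machine and on the codec.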