streamed and all-at-once chromosome file saving
@simon-anders asked for an HDF5 interface to the matrices saved in the intermediate steps.
In this PR I'm switching the I/O from npz to HDF5.
Currently everything is saved in a single file, with one HDF5 group per chromosome.
I have not changed the function interfaces for now: they still take a directory as input, so the output file name is hardcoded.
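To make the single-file layout concrete, here is a minimal sketch using h5py and NumPy. The function names, the dataset names (`data`, `indices`, `indptr`), the `shape` attribute and the file name are all assumptions for illustration; the layout in this PR may differ.

```python
import h5py
import numpy as np


def save_chromosomes(path, matrices):
    """Write one HDF5 group per chromosome into a single file.

    `matrices` maps a chromosome name to a (data, indices, indptr, shape)
    tuple describing a CSR matrix. Dataset names are illustrative only.
    """
    with h5py.File(path, "w") as f:
        for chrom, (data, indices, indptr, shape) in matrices.items():
            grp = f.create_group(chrom)
            grp.create_dataset("data", data=np.asarray(data))
            grp.create_dataset("indices", data=np.asarray(indices))
            grp.create_dataset("indptr", data=np.asarray(indptr))
            grp.attrs["shape"] = shape


def list_chromosomes(path):
    """Return the chromosome names (top-level group names) in the file."""
    with h5py.File(path, "r") as f:
        return sorted(f.keys())
```

With this layout a reader can open the file once and touch only the group of the chromosome it needs.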
@simon-anders @LKremer please comment here if you have better alternatives, since so far this has only been discussed between Simon and me.
A duck-typed object is used to represent a sparse matrix stored in the HDF5 file.
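As a rough illustration of the duck-typing idea, the sketch below wraps any group-like mapping that exposes `data`, `indices` and `indptr` entries supporting indexing and slicing. The class name `CSRView` and its methods are hypothetical and are not the object added in this PR. A plain dict of lists is enough to try it out, and an h5py group holding 1-D datasets should work the same way because datasets support slicing.

```python
class CSRView:
    """Minimal read-only CSR view over a group-like mapping (hypothetical).

    `group` must provide "data", "indices" and "indptr" entries that
    support integer indexing and slicing.
    """

    def __init__(self, group, shape):
        self._data = group["data"]
        self._indices = group["indices"]
        self._indptr = group["indptr"]
        self.shape = tuple(shape)

    @property
    def nnz(self):
        """Number of stored entries."""
        return len(self._data)

    def row(self, i):
        """Return (column_indices, values) of row i, reading only its slice."""
        if not 0 <= i < self.shape[0]:
            raise IndexError(i)
        start, stop = int(self._indptr[i]), int(self._indptr[i + 1])
        return list(self._indices[start:stop]), list(self._data[start:stop])

    def iter_rows(self):
        """Yield (column_indices, values) for each row in order."""
        for i in range(self.shape[0]):
            yield self.row(i)
```

Only the slice belonging to the requested row is read, so the full matrix never has to be loaded.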
The `prepare` function got an additional argument to choose between an in-memory and a streamed COO->CSR conversion in HDF5.
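The difference between the two modes can be sketched in plain Python, with lists standing in for HDF5 datasets. The function names and the `chunk` parameter are assumptions, not the actual `prepare` signature. The streamed variant makes two chunked passes over the COO triplets: one to count entries per row and build `indptr`, and one to scatter each entry into its final position. Duplicate entries are not summed, and the streamed variant keeps columns within a row in input order rather than sorting them.

```python
def coo_to_csr_in_memory(rows, cols, vals, n_rows):
    """Sort all entries at once; simple, but needs everything in memory."""
    order = sorted(range(len(rows)), key=lambda k: (rows[k], cols[k]))
    indptr = [0] * (n_rows + 1)
    for r in rows:
        indptr[r + 1] += 1
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]
    return [vals[k] for k in order], [cols[k] for k in order], indptr


def coo_to_csr_streamed(rows, cols, vals, n_rows, chunk=2):
    """Two chunked passes: count entries per row, then scatter into place."""
    nnz = len(rows)
    # Pass 1: count entries per row, reading `chunk` triplets at a time.
    indptr = [0] * (n_rows + 1)
    for start in range(0, nnz, chunk):
        for r in rows[start:start + chunk]:
            indptr[r + 1] += 1
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]
    # Pass 2: write each entry to the next free slot of its row.
    data = [0] * nnz
    indices = [0] * nnz
    next_free = list(indptr[:-1])
    for start in range(0, nnz, chunk):
        stop = start + chunk
        for r, c, v in zip(rows[start:stop], cols[start:stop], vals[start:stop]):
            pos = next_free[r]
            data[pos] = v
            indices[pos] = c
            next_free[r] += 1
    return data, indices, indptr
```

In this sketch the output lists still live in memory. In an HDF5 setting they would be preallocated datasets, so that only one chunk of input plus the per-row counters needs to be held at a time.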