Hey everyone,
As reported initially by @nbonine, when preprocessing is performed during acquisition of MACSima data, an independent folder is created for each cycle.
The current reader handles this using the preprocessed_multiple_folders parsing style, which results in separate Image, Table and coordinate system elements for each cycle. This is not what a user typically expects, because the cycles belong together and are typically analyzed together. Currently there is no straightforward way for the user to specify this.
For example:
my_data
- 3_Scan2
--- some_images.tif
- 6_Cycle1
--- some_more_images.tif
- 7_Cycle2
--- even_more_images.tif
is parsed into:
SpatialData object
├── Images
│ ├── '3_Scan2_image': DataTree[cyx] (4, 15275, 27678), (4, 7637, 13839), (4, 3818, 6919), (4, 1909, 3459), (4, 954, 1729)
│ ├── '6_Cycle1_image': DataTree[cyx] (4, 15275, 27678), (4, 7637, 13839), (4, 3818, 6919), (4, 1909, 3459), (4, 954, 1729)
│ └── '7_Cycle2_image': DataTree[cyx] (4, 15275, 27678), (4, 7637, 13839), (4, 3818, 6919), (4, 1909, 3459), (4, 954, 1729)
└── Tables
├── '3_Scan2_table': AnnData (0, 4)
├── '6_Cycle1_table': AnnData (0, 4)
└── '7_Cycle2_table': AnnData (0, 4)
with coordinate systems:
▸ '3_Scan2', with elements:
3_Scan2_image (Images)
▸ '6_Cycle1', with elements:
6_Cycle1_image (Images)
▸ '7_Cycle2', with elements:
7_Cycle2_image (Images)
I propose the following:
- Deprecation of the autodiscovery of the parsing style to use (the current default!).
- Instead, preprocessed_single_folder becomes the new default. We change this in such a way that all tifs in the specified path and all of its subdirectories are parsed together into 1 Image element. This would handle the regular case (1 folder with all tifs) and the case that happens when preprocessing is run during acquisition (several subfolders, each with tifs of 1 cycle).
- We keep the preprocessed_multiple_folders option for the case that a user wants to do batch analysis of multiple ROIs. For example, a user could have multiple ROIs of a single well, which are saved into separate folders. In these cases it is desired that the images of the subfolders are kept separate, because they describe different image stacks.
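To make the difference between the two styles concrete, here is a rough sketch of the file grouping I have in mind. The helper names are hypothetical and not part of the current reader, and the sketch only shows which tifs end up together, not how they are parsed into elements.

```python
from pathlib import Path


def collect_tifs_single_folder(root: Path) -> list[Path]:
    """Hypothetical helper: gather every .tif under root, subfolders included.

    This mirrors the proposed default, where all files feed into one Image element.
    Only the lowercase ".tif" suffix is matched here; this is an assumption.
    """
    return sorted(p for p in root.rglob("*.tif") if p.is_file())


def collect_tifs_multiple_folders(root: Path) -> dict[str, list[Path]]:
    """Hypothetical helper: one group of .tif files per direct subfolder of root.

    This mirrors the batch case, where each subfolder is a separate image stack.
    """
    return {
        sub.name: sorted(p for p in sub.rglob("*.tif") if p.is_file())
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }
```

For the my_data example above, the first helper would return all three tifs as one list, while the second would return one entry each for 3_Scan2, 6_Cycle1 and 7_Cycle2.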
I will submit an example implementation of this. But since this touches on the default settings of the reader, and I am not sure what @berombau originally intended with the different parsing styles, I would love to have a discussion on this.