Conversation
alisterburt
left a comment
There was a problem hiding this comment.
Awesome work here as always @davidetorre99 - will follow up with another comment shortly
src/torch_tomogram/tomogram.py
Outdated
| corrected_shifts_yx = corrected_shifts_yx * pixel_spacing | ||
|
|
||
| tilt_stack_full = mrcfile.read(tilt_stack_path) | ||
| included_indices = df["sec"].values - 1 |
There was a problem hiding this comment.
I don't understand why this makes sense, maybe a comment would be useful
alnfile docs say sec corresponds to zero indexed indices into the tilt stack, this - 1 implies the values are one indexed?
also should be made clear that stack is expected to be aretomo output stack (avoid users supplying aretomo input stack and this code implicitly accounting for dark frames which haven't been removed from that stack)
| included_indices = df["sec"].values - 1 | |
| idx_valid = df["sec"].values - 1 |
There was a problem hiding this comment.
Apologies, my mistake. I incorrectly wrote '0 indexed' in the alnfile readme, but it is actually '1 indexed'.
There was a problem hiding this comment.
no big deal, glad we caught it in review! can hopefully avoid someone getting confused in the future 🙂
There was a problem hiding this comment.
Actually, it looks like AreTomo3 switched the SEC indexing from 0-based to 1-based starting in v2.0.10. This isn’t documented in their manual yet, but I noticed this inconsistency by testing alnfile on different versions (I think this is the commit when they introduced ToOneBased: czimaginginstitute/AreTomo3@048b161)
I’ll update alnfile so it automatically handles both conventions internally, while keeping the SEC output normalized to 0-based. For most downstream applications, 0-based indexing seems to be the more practical and consistent choice, but if you have a preference for keeping it 1-based for some workflows, I’m happy to discuss it!
src/torch_tomogram/tomogram.py
Outdated
| tilt_stack_full = mrcfile.read(tilt_stack_path) | ||
| included_indices = df["sec"].values - 1 | ||
| tilt_stack = tilt_stack_full[included_indices] | ||
| tilt_stack = tilt_stack.astype(np.float32) | ||
| tilt_stack -= np.mean(tilt_stack, axis=(-2, -1), keepdims=True) | ||
| tilt_stack /= np.std(tilt_stack, axis=(-2, -1), keepdims=True) |
There was a problem hiding this comment.
hardcoding normalization logic here feels icky...
also I'm used to normalizing on a central 25% crop these days to avoid most cases of hole edges coming into view
There was a problem hiding this comment.
Totally agree, not really sure where to include it though. Perhaps a separate function would make it a bit cleaner?
There was a problem hiding this comment.
spitballing: maybe the class doesn't hold a reference to the stack and instead any methods which require the image data take the image as input? This could be nice for serialization too... what do you think?
great q, love that you're thinking about this!
batching is a really great addition here! I hesitated initially because this locks all patches into reconstructing with identical rotations but... nobody has implemented spatially varying tilt angles so let's not get ahead of ourselves 😂 this is great. During tomogram reconstruction I imagine this blows up memory usage fairly quickly if tomograms are large and you reconstruct everything at once... might be worth taking a batch size parameter there for when people inevitably run into problems
I'm also not sure what's best here and again, love that you're thinking critically about these things. The purist in me says this package should operate on in memory data only, the pragmatist in me says that the use case for quickly enabling people with AT/Etomo alignments to get up and running is a clear win
Wonderful, suggested making it a required input to avoid later issues with CTF accuracy 🙂
I think this is incredible, really exciting to have an actual easy to use torch interface to reconstructing tomograms/local subvolumes |
|
@davidetorre99 thinking about this, now that you've got pixel spacing we should probably
With 2 we should be able to use torch-fourier-rescale, two notes there:
no pressure to add this now/as part of this transition but if you're working in the codebase these are things I'd be thinking about 🙂 |
… size for reconstruction, normalize on central 25%, normalization on the methods, update tests, update examples
|
@alisterburt just want to make sure we're on the same page about the current implementation before splitting it into torch-tilt-series and torch-reconstruct-tomogram.
In the next steps, I would split it into:
I'd love to know what you think! |
|
@davidetorre99 Really awesome work here!
This sounds perfect. Can I help you out with one of these things? |
|
Hey @McHaillet! Thanks a lot for the feedback! At this point it seems mainly about refactoring as specified in the previous comment, so I’ll go ahead and continue with that. If you can review the result afterward to ensure I’ve interpreted everything correctly and everything still makes sense (especially in the structure), that would be really really helpful! Thanks a lot! |
|
Of course, I'll review afterwards! Looking forward to see the updates :) |
@McHaillet @alisterburt with this commit I tried to address #12 #13 #15 #16. Sorry if it comes as a single commit, but I thought that they were largely interconnected. A short summary:
RE: #12 get DFT patches from torch-subpixel-crop:
Added support for Fourier space patch extraction using return_rfft=True in subpixel_crop_2d. Patches are now extracted directly in Fourier space and passed to insert_central_slices_rfft_3d_multichannel for reconstruction. That is the same as before, with backproject_2d_to_3d from torch-fourier-slice, but skipping one FFT step. I like the clarity here, but maybe it's too redundant. Do you think it would be better to directly add the option to backproject_2d_to_3d, or do you have a more streamlined way of reconstruction in mind?
RE: #13 batched reconstruction:
All patches at all positions are now reconstructed simultaneously using the multichannel from insert_central_slices_rfft_3d_multichannel. Added reconstruct_patches_batched() method that should processes everything in parallel. For tiling I swapped the for loop with einops.rearrange.
RE #15 class methods to load etomo and aretomo data directly:
Added Tomogram.from_etomo_directory() and Tomogram.from_aretomo_aln() class methods. I like the idea of only providing the path and then having everything handled within torch-tomogram (I updated the examples for both etomo and aretomo reconstruction to show what it would look like). However, perhaps providing the dataframe as input might offer more flexibility?
RE #16 define translations in Angstroms:
Added optional pixel_spacing parameter. When provided, sample_translations are stored in Angstroms and converted to pixels internally via sample_translations_px. Not sure if that was the way you were envisioning this, though
I tested it on my data and it all works fine.
I'd love to know what you guys think!