daf truncation bugs #677

Merged
ChristopherRabotin merged 2 commits into nyx-space:master from themouli:fix/daf-truncation-bugs
Mar 26, 2026

Conversation

@themouli
Contributor

Summary

  • Fix off-by-one error in num_records calculation after Chebyshev truncation (chebyshev.rs, chebyshev3.rs), which caused the last record to be unreachable in truncated segments
  • Fix incorrect byte offset in persist() (daf.rs), which wrote data starting 1024 bytes too late, corrupting the file structure and making truncated files unqueryable. Closes #262 ("Truncation and removal of DAF data corrupt data (summaries OK)")
  • Add bounds check on end_idx in truncate() to prevent panics when truncating segments with few records (e.g. Mercury/Venus relative to their barycenters, or any body truncated to its full time range)
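
The first and third fixes can be illustrated with a minimal sketch. It assumes the records to keep are identified by zero-based, inclusive `start_idx` and `end_idx` values and that every record holds `rsize` doubles. `kept_records` and `slice_records` are hypothetical helpers written for this illustration, not the functions in chebyshev.rs or chebyshev3.rs.

```rust
/// Hypothetical helper (not ANISE's actual code): clamp an inclusive,
/// zero-based record range to the records that exist and count them.
/// Returns (start_idx, clamped_end_idx, num_records).
fn kept_records(
    start_idx: usize,
    end_idx: usize,
    total_records: usize,
) -> Option<(usize, usize, usize)> {
    if total_records == 0 || start_idx >= total_records {
        return None;
    }
    // Bounds check: never index past the last record, even when the
    // requested range extends beyond a segment with very few records.
    let end_idx = end_idx.min(total_records - 1);
    if end_idx < start_idx {
        return None;
    }
    // Both ends are inclusive, hence the `+ 1`. Dropping it makes the
    // last record unreachable.
    let num_records = end_idx - start_idx + 1;
    Some((start_idx, end_idx, num_records))
}

/// Hypothetical helper: slice a flat coefficient array into the kept records,
/// assuming each record occupies `rsize` consecutive doubles.
fn slice_records(
    data: &[f64],
    rsize: usize,
    start_idx: usize,
    end_idx: usize,
) -> Option<&[f64]> {
    if rsize == 0 {
        return None;
    }
    let total_records = data.len() / rsize;
    let (start, end, _num_records) = kept_records(start_idx, end_idx, total_records)?;
    // `get` returns None instead of panicking if the range is out of bounds.
    data.get(start * rsize..(end + 1) * rsize)
}
```

With a single-record segment, `kept_records(0, 0, 1)` yields one record rather than zero, and a request that runs past the end, such as `kept_records(2, 10, 5)`, is clamped to the last available record instead of panicking.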

Improvements

  • Re-truncation is now possible, so a subset of the data can be used instead of loading the entire file.
  • The slice bounds checks now pass for small files (such as Mercury, where a single record covers the entire interval)

Bug Fixes

#262

Testing and validation

  • Truncate a single segment and verify it can be queried
  • Chain truncations of all 14 segments in 'de440s.bsp' (2024-2030)
  • Verify truncated files can be truncated again (re-truncation)
  • Inspect truncated files to confirm correct time ranges

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and correctness of DAF file handling, particularly concerning data truncation and persistence. It addresses critical bugs that led to data corruption and incorrect record counts after truncation, while also improving the stability of the truncation process for various data sizes. The changes enable more reliable and flexible manipulation of DAF files, allowing for operations like re-truncation and proper handling of small datasets.

Highlights

  • Data Persistence Fix: Resolved a critical bug in the persist() method that caused incorrect byte offsets when writing DAF files, leading to corrupted and unqueryable truncated files. This specifically addresses issue #262, "Truncation and removal of DAF data corrupt data (summaries OK)".
  • Chebyshev Truncation Accuracy: Corrected an off-by-one error in the num_records calculation within Chebyshev truncation (Type 2 and Type 3), ensuring that the last record in truncated segments is correctly accessible.
  • Truncation Robustness: Implemented a bounds check on end_idx during truncation to prevent panics when dealing with segments containing very few records, improving the stability of the truncation process.
  • Re-truncation Capability: Enabled the ability to re-truncate previously truncated DAF files, allowing for more flexible data subsetting without reloading the entire original file.
  • Small File Compatibility: Ensured that slice bound checks correctly handle small files, such as those with only a single record, preventing issues during truncation.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request fixes a data offset issue in the DAF::persist function and corrects indexing logic in the Type2ChebyshevSet and Type3ChebyshevSet data types. Specifically, the end_idx calculation and num_records assignment were adjusted to prevent potential panics and ensure correct data slicing. The review comments suggest improving the DAF::persist function by handling errors gracefully with IoResult instead of using .unwrap(), and point out a remaining off-by-one error in the end_idx calculation for Chebyshev data sets, where a 1-based index from spline_idx is not correctly converted to a 0-based index for slicing.
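
As a rough illustration of both points (the offset arithmetic and returning an I/O result instead of calling `.unwrap()`), here is a minimal sketch. DAF files are organized in 1024-byte records numbered from 1, so an offset computed from a record number without subtracting one lands exactly one record too late. Whether that is the precise cause of the bug fixed here is not stated in this conversation. `record_offset` and `write_at_record` are hypothetical names and do not reflect the actual signature of `DAF::persist`.

```rust
use std::io::{self, Seek, SeekFrom, Write};

/// DAF files are made of fixed 1024-byte records, numbered starting at 1.
const RECORD_LEN: u64 = 1024;

/// Hypothetical helper: byte offset of the first byte of a 1-based record.
/// Record 1 starts at byte 0, record 2 at byte 1024, and so on.
fn record_offset(record_number: u64) -> io::Result<u64> {
    record_number
        .checked_sub(1)
        .map(|zero_based| zero_based * RECORD_LEN)
        .ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidInput, "DAF record numbers start at 1")
        })
}

/// Hypothetical helper: write `bytes` starting at the given 1-based record,
/// propagating every I/O error to the caller with `?` instead of panicking.
fn write_at_record<W: Write + Seek>(
    out: &mut W,
    record_number: u64,
    bytes: &[u8],
) -> io::Result<()> {
    let offset = record_offset(record_number)?;
    out.seek(SeekFrom::Start(offset))?;
    out.write_all(bytes)?;
    Ok(())
}
```

Because the function is generic over `Write + Seek`, it can be exercised against an in-memory `std::io::Cursor<Vec<u8>>` in a unit test as well as against a real file.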

Comment thread anise/src/naif/daf/daf.rs
Comment thread anise/src/naif/daf/datatypes/chebyshev.rs Outdated
Comment thread anise/src/naif/daf/datatypes/chebyshev3.rs Outdated
@ChristopherRabotin
Member

Thanks for the bug fix, @themouli! I'm surprised that the off-by-one at the end of the Chebyshev segments wasn't caught in the validation test cases, since these should cover the full ephem definition.

I think there is (or was?) a test case that checked that the persist function worked correctly. Could you re-enable that test to check that your fix works correctly?

Thanks

@themouli
Contributor Author

themouli commented Mar 25, 2026

Hello @ChristopherRabotin, thanks for letting me know about the test case. The test (test_spk_truncate_cheby) is already enabled and runs in CI. The test was passing because it only checked metadata (the summary's start_epoch), not the actual data integrity of the persisted file.

The summary headers were written correctly, but the actual Chebyshev coefficient data was written at the wrong byte offset.

The new test loads the truncated file as an actual SPK, queries an epoch within the truncated range, and verifies that a non-zero result comes back. Before the fix, that query would fail with a decoding error or return garbage data, which the previous test did not check.

Thanks
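
The gap between a metadata-only check and a data-integrity check can be shown with a toy model. This is only a sketch: `Persisted`, `persist`, and `read_coeff` are hypothetical stand-ins invented for illustration, not the ANISE API or the actual test, and the byte layout is deliberately simplified to a declared data offset followed by little-endian doubles.

```rust
use std::convert::TryInto;

/// Hypothetical stand-in for a persisted file: the offset recorded in the
/// summary (metadata) plus the raw bytes that were written.
struct Persisted {
    data_offset: usize,
    bytes: Vec<u8>,
}

/// Toy writer: records `declared_offset` in the metadata but places the
/// coefficients at `actual_offset`, so a shifted write can be simulated.
fn persist(coeffs: &[f64], declared_offset: usize, actual_offset: usize) -> Persisted {
    let len = declared_offset.max(actual_offset) + coeffs.len() * 8;
    let mut bytes = vec![0u8; len];
    for (i, c) in coeffs.iter().enumerate() {
        let at = actual_offset + i * 8;
        bytes[at..at + 8].copy_from_slice(&c.to_le_bytes());
    }
    Persisted { data_offset: declared_offset, bytes }
}

/// Toy reader: decodes the `index`-th coefficient where the metadata says
/// the data lives, which is what a real query ends up doing.
fn read_coeff(file: &Persisted, index: usize) -> Option<f64> {
    let at = file.data_offset + index * 8;
    let raw: [u8; 8] = file.bytes.get(at..at + 8)?.try_into().ok()?;
    Some(f64::from_le_bytes(raw))
}
```

In this model, a file written with `persist(&coeffs, 1024, 2048)` has exactly the same metadata as a correct one, so a test that inspects only `data_offset` passes either way. Only a test that reads a coefficient back and compares it with the expected value notices the shifted data.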

@themouli themouli force-pushed the fix/daf-truncation-bugs branch from d219424 to 1b5f3a9 Compare March 25, 2026 23:43
@themouli
Contributor Author

Fixed the formatting issue and squashed the commits.

@ChristopherRabotin ChristopherRabotin merged commit fd119e7 into nyx-space:master Mar 26, 2026
18 checks passed

Development

Successfully merging this pull request may close these issues.

Truncation and removal of DAF data corrupt data (summaries OK)