feat: reduce JSON report size (#437) #440

Open
ssrhaso wants to merge 3 commits into main from 437-reduce-json-report-size

ssrhaso (Contributor) commented Apr 10, 2026

Summary

Reduce JSON report file size by removing large derived arrays and externalising individual record data.

  • Strip fpr, tpr, and roc_thresh arrays from JSON serialisation via a _json_exclude_keys class attribute on the Attack base class. PDF generation is unaffected as it reads from the in-memory dict before serialisation.
  • Externalise LiRA individual record scores to a compressed .npz file, storing a relative filename in the JSON for portability. The file also includes y_pred_proba and y_test so that the ROC curve can be recomputed from stored data.
  • Add a backwards-compatibility guard in LogLogROCModule for JSON files without ROC arrays.
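
The key-stripping and the guard for older or stripped reports can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only `_json_exclude_keys` and the key names `fpr`, `tpr`, and `roc_thresh` come from the summary above; the `to_json` and `_strip` methods, the recursive traversal, and the `roc_arrays_or_none` helper are assumptions made for the example.

```python
import json


class Attack:
    """Hypothetical stand-in for the base class, not the sacroml implementation."""

    # Keys omitted from the JSON output (names taken from the PR summary).
    _json_exclude_keys = frozenset({"fpr", "tpr", "roc_thresh"})

    def to_json(self, report: dict) -> str:
        """Serialise a filtered copy; the in-memory report is left untouched."""
        return json.dumps(self._strip(report))

    @classmethod
    def _strip(cls, obj):
        """Recursively copy obj, dropping excluded dict keys at any depth."""
        if isinstance(obj, dict):
            return {
                key: cls._strip(value)
                for key, value in obj.items()
                if key not in cls._json_exclude_keys
            }
        if isinstance(obj, list):
            return [cls._strip(value) for value in obj]
        return obj


def roc_arrays_or_none(metrics: dict):
    """Assumed guard: return (fpr, tpr), or None when the arrays are absent."""
    if "fpr" not in metrics or "tpr" not in metrics:
        return None
    return metrics["fpr"], metrics["tpr"]
```

Because the filter works on a copy, anything that reads the in-memory dict before serialisation (such as PDF generation, per the summary) would still see the full arrays, while a consumer of the JSON has to handle their absence.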

For a CIFAR10-scale dataset, this reduces a single attack JSON from approximately 1.2 MB to under 100 KB.
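
The externalisation of individual scores described in the summary could look something like the sketch below. The filename `lira_individual.npz` and the array names `y_pred_proba` and `y_test` appear in this PR; the function names, the `individual_file` JSON key, and the `scores` array name are hypothetical and chosen only for illustration.

```python
import json
import os

import numpy as np


def save_individual_scores(report_dir, metrics, scores, y_pred_proba, y_test,
                           npz_name="lira_individual.npz"):
    """Write per-record arrays to a compressed .npz next to the JSON report."""
    np.savez_compressed(
        os.path.join(report_dir, npz_name),
        scores=np.asarray(scores),              # hypothetical array name
        y_pred_proba=np.asarray(y_pred_proba),
        y_test=np.asarray(y_test),
    )
    updated = dict(metrics)
    # Store only the relative filename so the report directory can be moved.
    updated["individual_file"] = npz_name       # hypothetical JSON key
    return updated


def load_individual_scores(json_path):
    """Resolve the .npz relative to the JSON file and load all arrays."""
    with open(json_path, encoding="utf-8") as handle:
        metrics = json.load(handle)
    npz_path = os.path.join(os.path.dirname(json_path), metrics["individual_file"])
    with np.load(npz_path) as data:
        return {name: data[name] for name in data.files}
```

Resolving the path relative to the JSON file, rather than storing an absolute path, is what keeps the report portable when the output directory is copied elsewhere.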

Closes #437

@ssrhaso ssrhaso self-assigned this Apr 10, 2026
codecov bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.52%. Comparing base (1863c65) to head (2b70e0e).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #440   +/-   ##
=======================================
  Coverage   99.51%   99.52%           
=======================================
  Files          23       23           
  Lines        2692     2713   +21     
=======================================
+ Hits         2679     2700   +21     
  Misses         13       13           


@ssrhaso ssrhaso requested a review from rpreen April 10, 2026 11:46
ssrhaso (Contributor, Author) commented Apr 10, 2026

@rpreen Please let me know what you think of this when you get a chance!

jim-smith (Contributor) commented

@ssrhaso does this affect the meta-attack?

ssrhaso added 2 commits April 17, 2026 10:15
…t-size

# Conflicts:
#	sacroml/attacks/attack.py
#	sacroml/attacks/likelihood_attack.py
Avoids stray .npz files when callers request individual scores
in-memory but no report on disk (e.g. MetaAttack sub-runs).
ssrhaso (Contributor, Author) commented Apr 17, 2026

> @ssrhaso does this affect the meta-attack?

@jim-smith As far as I can tell after checking thoroughly, no, it does not.

MetaAttack reads per-record scores from the in-memory sub-attack object, not from the JSON, so the key-stripping doesn't affect it.

Spotted one related side-effect while checking: the new .npz write in LIRAAttack._save_attack_metrics was gated on report_individual only, so MetaAttack sub-runs (which set write_report=False) would have dropped stray lira_individual.npz files in each sub_dir. Pushed a fix to also gate on write_report.
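
For anyone skimming, the gating fix amounts to something like this simplified sketch. It is not the actual `LIRAAttack._save_attack_metrics` code: the class, its constructor, and the placeholder file write are assumptions, and only the flag names `report_individual` and `write_report` and the filename `lira_individual.npz` are taken from this thread.

```python
from pathlib import Path
from typing import Optional


class SubAttackSketch:
    """Hypothetical stand-in for a LiRA sub-attack run."""

    def __init__(self, output_dir, report_individual: bool, write_report: bool):
        self.output_dir = Path(output_dir)
        self.report_individual = report_individual
        self.write_report = write_report
        self.individual_scores = None

    def save_attack_metrics(self, scores) -> Optional[Path]:
        # Scores always stay available in memory for callers such as MetaAttack.
        self.individual_scores = list(scores)
        # Only touch the disk when BOTH flags are set (the fix described above).
        if not (self.report_individual and self.write_report):
            return None
        path = self.output_dir / "lira_individual.npz"
        path.write_bytes(b"")  # placeholder for the real compressed write
        return path
```

With `write_report=False` the method returns `None` and leaves the directory empty, while the scores remain readable from the object.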


Development

Successfully merging this pull request may close these issues.

[New Feature Request] Reduce JSON report size

2 participants