[TEST] Cross-language conformance test suite for fd5 format #155

@gerchowl

Parent

Prerequisite for #144 — Multi-language fd5 bindings

Description

Create a suite of canonical fd5 sample files and corresponding expected-result fixtures that any fd5 implementation (Python, Rust, Julia, C/C++, TypeScript) must pass to prove format conformance. The test suite is language-agnostic — it defines what to test, not how.

Motivation

Without a shared conformance suite, each language binding will be tested in isolation against its own understanding of the format. Interoperability bugs (subtle differences in hashing, attribute encoding, dtype mapping, provenance link format) will only surface when users try to exchange files across languages. A conformance suite catches these at development time.

Proposed Structure

tests/conformance/
├── README.md                  # How to use the suite, how to add cases
├── fixtures/
│   ├── minimal.fd5            # Smallest valid fd5 file
│   ├── with-provenance.fd5    # File with source links
│   ├── multiscale.fd5         # File with pyramid/multiscale datasets
│   ├── tabular.fd5            # Compound dataset (event table)
│   ├── complex-metadata.fd5   # Deeply nested metadata groups
│   └── sealed.fd5             # File with verified content hash
├── expected/
│   ├── minimal.json           # Expected root attributes, dataset shapes, dtypes
│   ├── with-provenance.json   # Expected provenance DAG
│   ├── multiscale.json        # Expected pyramid levels and shapes
│   ├── tabular.json           # Expected column names, dtypes, row count
│   ├── complex-metadata.json  # Expected metadata tree
│   └── sealed.json            # Expected hash value and verification result
└── invalid/
    ├── missing-id.fd5         # Missing required root attribute
    ├── bad-hash.fd5           # Content hash doesn't match
    ├── no-schema.fd5          # Missing _schema attribute
    └── expected-errors.json   # What error each invalid file should produce
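
To make the contract concrete, an expected-result file could look like the sketch below. The key names (`root_attrs`, `datasets`) and values are illustrative placeholders, not a finalized schema for this suite:

```json
{
  "root_attrs": {
    "_schema": "fd5/1",
    "id": "minimal-0001"
  },
  "datasets": {
    "/data": { "dtype": "float64", "shape": [4, 4] }
  }
}
```

Keeping the expected files as plain JSON means every binding can load them with its standard library, with no fd5-specific tooling required.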

Test Categories

  1. Structure tests — correct group hierarchy, required attributes present
  2. Data round-trip tests — write values, read them back, compare (dtype, shape, values)
  3. Hash verification tests — sealed files verify correctly, tampered files fail
  4. Provenance tests — DAG traversal returns expected source chain
  5. Schema validation tests — embedded schema validates the file's own structure
  6. Negative tests — invalid files are rejected with appropriate errors
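
For category 3, the check each binding performs is conceptually simple. The sketch below assumes the seal is a SHA-256 hex digest over the payload bytes; the actual fd5 hashing scheme (what bytes are covered, which algorithm) is defined by the format spec, not by this example:

```python
import hashlib

def verify_seal(payload: bytes, stored_hash: str) -> bool:
    """Recompute the content hash and compare it to the stored value.

    Illustrative only: the real fd5 seal may cover a different byte
    range or use a different algorithm.
    """
    actual = hashlib.sha256(payload).hexdigest()
    return actual == stored_hash

# A sealed fixture verifies; a tampered copy must fail.
good = hashlib.sha256(b"fd5 payload").hexdigest()
assert verify_seal(b"fd5 payload", good)
assert not verify_seal(b"fd5 payl0ad", good)
```

The `sealed.fd5` / `bad-hash.fd5` fixture pair exercises exactly these two branches: one file where verification must succeed, one where it must fail.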

How Bindings Use the Suite

Each language binding includes a test that:

  1. Opens each fixture file using its own reader
  2. Extracts the values specified in the corresponding expected JSON
  3. Asserts equality

This is a black-box test — it doesn't test internal APIs, only the format contract.
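
The comparison step can be a small recursive diff that any binding implements (or ports) in a few lines. This is a sketch, not part of the suite; the helper name and the sample keys are made up for illustration:

```python
import json

def diff_against_expected(actual: dict, expected: dict, path: str = "") -> list[str]:
    """Recursively compare reader output to an expected fixture.

    Returns a list of human-readable mismatch descriptions; an empty
    list means the fixture passes.
    """
    mismatches = []
    for key, exp in expected.items():
        here = f"{path}/{key}"
        if key not in actual:
            mismatches.append(f"{here}: missing")
        elif isinstance(exp, dict) and isinstance(actual[key], dict):
            mismatches.extend(diff_against_expected(actual[key], exp, here))
        elif actual[key] != exp:
            mismatches.append(f"{here}: expected {exp!r}, got {actual[key]!r}")
    return mismatches

# Hypothetical fixture check: 'actual' stands in for a reader's output.
expected = json.loads('{"attrs": {"_schema": "fd5/1"}, "shape": [4, 4]}')
actual = {"attrs": {"_schema": "fd5/1"}, "shape": [4, 4]}
assert diff_against_expected(actual, expected) == []
```

Reporting every mismatch (rather than failing on the first) makes cross-language debugging much faster when a binding drifts from the contract.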

Acceptance Criteria

  • Fixture files generated by the Python reference implementation
  • Expected-result JSON files cover all test categories above
  • Python test runner passes against all fixtures (proving the fixtures are correct)
  • README documents how to add new conformance cases
  • Invalid fixtures produce clear, documented rejection reasons

Metadata


Assignees

Labels

  • area:core (fd5 core library)
  • area:testing (Test infrastructure, BATS, pytest)
  • effort:medium (1-4 hours)
  • priority:high (Should be done in the current milestone)
