Bootstrap corpus for decy CITL oracle training.
This repository contains C source files from GNU coreutils for training decy's Compiler-in-the-Loop (CITL) oracle. The oracle learns fix patterns for rustc errors, enabling cost-free steady-state C→Rust transpilation.
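The idea of an oracle that learns fix patterns keyed by rustc errors can be sketched roughly as follows. This is a minimal illustration, not decy's actual implementation: the `Oracle` type, its fields, and the `learn`/`suggest` methods are hypothetical names introduced here, and the "fallback" counter stands in for whatever costly path the real tool uses when no pattern is known yet.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory oracle (not decy's real data structure):
/// maps a rustc error code to a fix hint learned during training.
struct Oracle {
    patterns: HashMap<String, String>,
    /// Counts lookups that found no learned pattern (the "costly" path).
    fallback_calls: usize,
}

impl Oracle {
    fn new() -> Self {
        Oracle { patterns: HashMap::new(), fallback_calls: 0 }
    }

    /// Record a fix that made the compiler accept the transpiled code.
    fn learn(&mut self, code: &str, fix: &str) {
        self.patterns.insert(code.to_string(), fix.to_string());
    }

    /// Return a learned fix if one exists; otherwise count a fallback.
    fn suggest(&mut self, code: &str) -> Option<String> {
        let hit = self.patterns.get(code).cloned();
        if hit.is_none() {
            self.fallback_calls += 1;
        }
        hit
    }
}

fn main() {
    let mut oracle = Oracle::new();

    // First encounter: nothing learned yet, so the lookup misses.
    assert_eq!(oracle.suggest("E0382"), None);

    // After training on an example, the same error is answered locally.
    oracle.learn("E0382", "borrow instead of moving");
    assert_eq!(
        oracle.suggest("E0382").as_deref(),
        Some("borrow instead of moving")
    );
    assert_eq!(oracle.fallback_calls, 1);
}
```

In this sketch, "steady state" corresponds to the point where lookups stop missing, so `fallback_calls` no longer grows as more code is transpiled.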
# Extract coreutils sources
make extract-p0 # Tier P0 (trivial/simple)
make extract-all # All tiers
# Run CITL training cycle
make citl-improve
# Check oracle statistics
make citl-stats

examples/
├── coreutils_yes/ # P0: trivial (50 LOC)
├── coreutils_cat/ # P0: simple (150 LOC)
├── coreutils_cp/ # P1: medium (800 LOC)
├── coreutils_sort/ # P2: complex (1500 LOC)
└── ...
Each example contains:
- original.c - Source from GNU coreutils 9.4
- metadata.yaml - Function annotations, expected errors
- transpiled.rs - Generated Rust (after training)
| Code | Description | Coverage |
|---|---|---|
| E0506 | Cannot assign to borrowed | 4 examples |
| E0499 | Multiple mutable borrows | 5 examples |
| E0382 | Use after move | 7 examples |
| E0308 | Type mismatch | 8 examples |
| E0133 | Unsafe required | 5 examples |
| E0597 | Does not live long enough | 3 examples |
| E0515 | Cannot return reference | 1 example |
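To make one row of the table concrete, here is a small hand-written illustration of E0382 (use after move) and one common fix. It is not taken from the corpus; the function name `line_len` and the sample string are made up for this example, and the rejected variant is shown only in comments so the block compiles.

```rust
/// Borrowing version of a helper that a naive translation of a C
/// `size_t line_len(char *line)` might instead declare as taking
/// `String` by value.
fn line_len(line: &str) -> usize {
    line.len()
}

fn main() {
    let line = String::from("y\n");

    // A by-value signature, `fn line_len(line: String) -> usize`, would
    // make the following pair of statements fail with E0382, because
    // `line` is moved into the call and then used again:
    //
    //     let n = line_len(line);
    //     print!("{line}");
    //
    // Borrowing keeps ownership at the call site, so both uses are valid.
    let n = line_len(&line);
    print!("{line}");
    assert_eq!(n, 2);
}
```

Changing a by-value parameter to a borrow is one plausible kind of fix pattern for this error code; other cases may call for cloning or restructuring instead.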
Import patterns from depyler for faster bootstrap:
make citl-seed

See docs/specifications/examples-spec.md for:
- Selection methodology with 10 peer-reviewed citations
- Metadata schema
- Training workflow
- Validation criteria

License:
- Corpus tooling: MIT
- Coreutils sources: GPL-3.0 (GNU coreutils)