Open
Labels: feature (implementation tracking for approved features)
Description
Feature Details
Design and implement a stateful feature store that (a) caches per-ticker feature artifacts (lags, masks, embedding indices, scaler states, selected lags, etc.), and (b) supports incremental updates as new observations arrive, without recomputing the full history. The store should be durable (disk-backed), versioned, and safe to read concurrently during training.
Core goals:
- $O(\Delta t)$ append-only updates for rolling windows and statistics.
- Deterministic, reproducible snapshots for training/checkpoint alignment.
- Clean invalidation when upstream raw data changes (cache busting via checksum/version).
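
To make the $O(\Delta t)$ goal concrete, here is a minimal sketch of a fixed-size rolling mean/variance that costs $O(1)$ per new observation, so ingesting $\Delta t$ new points costs $O(\Delta t)$. The class name `RollingStats` and its attributes are hypothetical, not part of any existing module, and population variance is assumed (the issue does not say which estimator is intended).

```python
from collections import deque


class RollingStats:
    """Rolling mean and population variance over the last `window` values.

    Hypothetical helper for illustration: each update is O(1), so appending
    dt new observations costs O(dt) with no pass over the full history.
    """

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be >= 1")
        self.window = window
        self.buf: deque = deque()
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        if len(self.buf) < self.window:
            # Window still filling: standard Welford "add one value" step.
            self.buf.append(x)
            n = len(self.buf)
            delta = x - self.mean
            self.mean += delta / n
            self.m2 += delta * (x - self.mean)
        else:
            # Window full: swap the oldest value for the new one in O(1).
            old = self.buf.popleft()
            self.buf.append(x)
            new_mean = self.mean + (x - old) / self.window
            self.m2 += (x - old) * (x - new_mean + old - self.mean)
            self.mean = new_mean

    @property
    def variance(self) -> float:
        n = len(self.buf)
        # Clamp tiny negative values caused by floating-point round-off.
        return max(self.m2, 0.0) / n if n else 0.0
```

The running `m2` term can accumulate round-off over very long streams, so a real implementation might periodically recompute it from the buffer; that refresh cadence is left open here.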
Affected Modules
As stated in the parent issue.
Implementation Checklist
- Define per-ticker state payload (e.g., latest window buffers, masks, selected lag set, scaler states ($P^2$ / IQR / $\bar{X}$ / $\sigma^2$), embedding ID maps, last-ingested timestamp).
- Add a version field and a content hash (e.g., SHA256 of raw source metadata & feature config) for invalidation.
- Implement disk-backed store
- Rolling window maintenance
- Incremental scalers
- Lag selection refresh cadence
- Detect upstream raw-data changes
- Unit tests:
• Cold start → fit state → update with Δt > 0 → outputs match full recompute on the same window.
• Concurrency: writer during reader; readers see consistent snapshots.
• Invalidation: modify raw data → state invalidates and recomputes.
• Serialization parity across platforms (endian, dtype, version).
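
The version field, content hash, and disk-backed store items above could fit together roughly as follows. This is a sketch under stated assumptions, not the intended design: JSON is used as the on-disk format only for brevity, and `STATE_VERSION`, `content_hash`, `save_state`, and `load_state` are hypothetical names. Writes go to a temporary file and are then swapped in with `os.replace`, so a concurrent reader sees either the old snapshot or the new one, never a partial file (this relies on the rename being atomic, which holds on POSIX filesystems).

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path
from typing import Optional

STATE_VERSION = 1  # hypothetical schema version; bump when the payload layout changes


def content_hash(raw_meta: dict, feature_config: dict) -> str:
    """SHA-256 over a canonical JSON encoding of raw-source metadata and config."""
    payload = json.dumps(
        {"raw": raw_meta, "config": feature_config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def save_state(root: Path, ticker: str, state: dict, digest: str) -> Path:
    """Write one ticker's state via temp file + rename so readers never see a partial file."""
    root.mkdir(parents=True, exist_ok=True)
    target = root / f"{ticker}.json"
    record = {"version": STATE_VERSION, "content_hash": digest, "state": state}
    fd, tmp = tempfile.mkstemp(dir=root, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(record, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, target)  # swap the finished file into place
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return target


def load_state(root: Path, ticker: str, digest: str) -> Optional[dict]:
    """Return cached state, or None if it is missing or stale (caller recomputes)."""
    target = root / f"{ticker}.json"
    try:
        record = json.loads(target.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None
    if record.get("version") != STATE_VERSION or record.get("content_hash") != digest:
        return None  # upstream data or config changed: invalidate
    return record["state"]
```

A caller would compute `content_hash` from the current raw-source metadata and feature config, call `load_state`, and fall back to a full recompute followed by `save_state` whenever it returns `None`. JSON also sidesteps the endianness concern in the serialization-parity test, at the cost of size; a binary format would need explicit dtype and byte-order handling.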
Limitations
As stated in the parent issue.
Status
Ready