fev-macro

Overview

fev-macro is a reproducible US real-GDP forecasting benchmark built on fev rolling-window evaluation. It combines historical FRED vintage panels, release-consistent GDP truth from ALFRED, and both classical and factor-style models. The repo is organized around one authoritative pipeline for panel building, evaluation, realtime OOS scoring, and latest-vintage 2025Q4 forecasting.

What this repo does

Build vintage panels for FRED-QD and FRED-MD historical vintages.
Build processed panels using FRED transform codes plus FRED-MD outlier/trimming semantics (following the FRED databases code zip and the fbi library).
Build a GDP release truth table from ALFRED and compute release-vintage q/q and q/q SAAR growth for the first, second, and third releases.
Run fev evaluation for both processed and unprocessed panels, with different target objectives (log level, LL, vs. SAAR growth, G) and release-truth mappings.
Produce a 2025Q4 one-shot forecast using latest FRED API pulls and processed-mode top models.
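
As a rough illustration of the transform-code step, the sketch below applies the conventional FRED-MD/FRED-QD codes 1 to 7 (level, first difference, second difference, log, log difference, second log difference, and difference of the simple growth rate) to a plain list of floats. The helper names apply_tcode and _diff are hypothetical and are not taken from this repo; the repo's own processing follows the FRED databases code zip and fbi and may differ in details such as outlier trimming and missing-value handling.

```python
import math

# Number of first differences applied by each conventional FRED-MD/QD code.
N_DIFFS = {1: 0, 2: 1, 3: 2, 4: 0, 5: 1, 6: 2, 7: 1}


def _diff(series):
    """First difference that keeps the length by padding with a leading None."""
    if not series:
        return []
    out = [None]
    for prev, cur in zip(series, series[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out


def apply_tcode(values, tcode):
    """Hypothetical helper: apply transform code 1-7 to a list of floats."""
    if tcode not in N_DIFFS:
        raise ValueError(f"unknown transform code: {tcode}")
    x = [float(v) for v in values]
    if not x:
        return []
    if tcode in (4, 5, 6):
        x = [math.log(v) for v in x]  # codes 4-6 operate on logs
    elif tcode == 7:
        # code 7 differences the simple growth rate x_t / x_{t-1} - 1
        x = [None] + [x[t] / x[t - 1] - 1.0 for t in range(1, len(x))]
    for _ in range(N_DIFFS[tcode]):
        x = _diff(x)
    return x
```

Observations lost to differencing come back as None, so the transformed series stays aligned with the original dates.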

Quickstart

make setup
make download-historical
make panel-qd panel-md
make panel-qd-processed panel-md-processed
make build-gdp-releases
make eval-unprocessed-standard
make eval-processed-standard

Modes: processed vs unprocessed

Mode	Covariates	Training objective	KPI for comparison
`unprocessed`	Raw vintage covariates	`log_level` (LL)	q/q SAAR real GDP growth vs ALFRED release truth
`processed`	Transform-code + MD outlier/trimming processed covariates	`saar_growth` (G)	q/q SAAR real GDP growth vs BEA-verified ALFRED first-vintage truth (`qoq_saar_growth_alfred_first_pct`)

Details: docs/data_processing.md
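
Both modes are compared on q/q SAAR growth, so a log-level (LL) forecast has to be mapped back to growth before scoring. The sketch below uses the standard definitions: q/q growth is 100 * (Y_t / Y_{t-1} - 1) and q/q SAAR growth compounds the quarterly ratio over four quarters. The function names are hypothetical; the definitions this repo actually applies are documented in docs/data_processing.md.

```python
import math


def qoq_growth_pct(level, prev_level):
    """Quarter-over-quarter growth in percent (not annualized)."""
    return 100.0 * (level / prev_level - 1.0)


def qoq_saar_growth_pct(level, prev_level):
    """Quarter-over-quarter growth at a seasonally adjusted annual rate, in percent."""
    return 100.0 * ((level / prev_level) ** 4 - 1.0)


def saar_from_log_level_forecast(log_level_forecast, prev_level):
    """Map a log-level forecast to q/q SAAR growth against the previous level."""
    return qoq_saar_growth_pct(math.exp(log_level_forecast), prev_level)
```

For example, a 1% quarterly increase corresponds to roughly 4.06% at an annual rate, because the quarterly ratio is compounded rather than multiplied by four.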

Models included

Core baselines and multivariate models include naive_last, mean, drift, ar4, auto_arima, auto_ets, theta, local_trend_ssm, random_forest, xgboost, factor_pca_qd, mixed_freq_dfm_md, bvar_minnesota_8, bvar_minnesota_20, bvar_minnesota_growth_8, bvar_minnesota_growth_20, chronos2, and LSTM variants lstm_univariate / lstm_multivariate (plus optional ensemble variants).

LSTM variants require PyTorch (torch>=2.2.0). They are included in --profile full, and remain opt-in for other model/profile selections.

python scripts/run_eval_processed.py --models lstm_univariate lstm_multivariate --num_windows 20

Full model catalog: docs/models.md

Real-time evaluation policy

By default, every rolling window trains on an as-of vintage (vintage-correct). Snapshot evaluation is blocked unless you explicitly pass --allow_snapshot_eval.

For processed run_eval, release truth defaults to BEA-verified ALFRED q/q SAAR first-vintage growth (qoq_saar_growth_alfred_first_pct) from data/panels/gdpc1_releases_first_second_third.csv, selected via --eval_release_metric alfred_qoq_saar --eval_release_stages first --target_transform saar_growth. Two alternatives remain available:

ALFRED q/q non-SAAR truth: --eval_release_metric alfred_qoq --target_transform qoq_growth
Realtime SAAR truth: --eval_release_metric realtime_qoq_saar --target_transform saar_growth
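
The vintage-correct rule can be pictured as follows: for each rolling-window forecast origin, training data come from the latest vintage published on or before that origin, never from a later one. This is a minimal sketch of that selection under the assumption that vintages are identified by their publication date; select_asof_vintage is a hypothetical name, not a function from this repo.

```python
from bisect import bisect_right
from datetime import date


def select_asof_vintage(vintage_dates, origin):
    """Return the latest vintage date that is on or before the forecast origin."""
    ordered = sorted(vintage_dates)
    idx = bisect_right(ordered, origin)  # count of vintages with date <= origin
    if idx == 0:
        raise ValueError(f"no vintage available on or before {origin}")
    return ordered[idx - 1]


vintages = [date(2019, 3, 1), date(2019, 4, 1), date(2019, 5, 1)]
chosen = select_asof_vintage(vintages, date(2019, 4, 15))
print(chosen)  # the 2019-04-01 vintage; the 2019-05-01 vintage is not yet visible
```

A snapshot evaluation, by contrast, would train every window on a single late vintage, which leaks revisions that were not available at the forecast origin.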

As-of database (ragged-edge realtime)

fev-macro now supports a bitemporal as-of store (DuckDB) for heterogeneous data arrivals/revisions.

Set API key:

export FRED_API_KEY="..."

First backfill + update:

python scripts/sync_alfred_asof_store.py \
  --db data/realtime/asof.duckdb \
  --universe both \
  --backfill_missing \
  --observation_start 1959-01-01

Incremental refresh (daily/hourly):

python scripts/sync_alfred_asof_store.py \
  --db data/realtime/asof.duckdb \
  --universe both \
  --no-backfill_missing \
  --lookback_days 7

Query an as-of snapshot:

python scripts/example_query_asof.py \
  --db data/realtime/asof.duckdb \
  --asof 2019-05-01 \
  --series GDPC1,CPIAUCSL,UNRATE \
  --obs_start 1990-01-01 \
  --out data/realtime/snapshot_2019-05-01.csv
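
Conceptually, an as-of query over a bitemporal store keeps, for each series and observation date, the most recent value whose publication date is on or before the as-of date. The sketch below shows that logic in plain Python over in-memory tuples. The record layout (series_id, obs_date, realtime_start, value) is an assumption modeled on ALFRED's realtime_start field, not the actual DuckDB schema, and the function name is hypothetical.

```python
from datetime import date


def asof_snapshot(records, asof):
    """Return {(series_id, obs_date): value} as known on the as-of date.

    records: iterable of (series_id, obs_date, realtime_start, value) tuples,
    where realtime_start is the date on which that value was published.
    """
    latest = {}
    for series_id, obs_date, realtime_start, value in records:
        if realtime_start > asof:
            continue  # not yet published at the as-of date
        key = (series_id, obs_date)
        if key not in latest or realtime_start > latest[key][0]:
            latest[key] = (realtime_start, value)
    return {key: value for key, (_, value) in latest.items()}
```

Because each series is filtered by its own publication dates, the snapshot has a ragged edge: series released early in the month extend further than series released late.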

Use as-of snapshots inside realtime OOS:

python scripts/run_realtime_oos.py \
  --mode processed \
  --asof_db data/realtime/asof.duckdb \
  --asof_universe both

Latest-vintage one-shot forecast + 2025Q4 comparison

make fetch-latest && make process-latest && make latest-forecast-processed
make plot-2025q4

Optional BoE evaluation workflow

Run BoE-style schema exports, Diebold-Mariano (DM) tests, rolling/fluctuation diagnostics, and plots:

pip install -r requirements.txt

python -m fev_macro.boe export \
  --predictions_csv results/realtime_oos/predictions.csv \
  --release_table_csv data/panels/gdpc1_releases_first_second_third.csv \
  --out_dir results/boe_export \
  --truth first \
  --variable GDPC1 \
  --metric levels \
  --forecast_value_col y_hat_level

python -m fev_macro.boe eval \
  --forecasts_csv results/boe_export/boe_forecasts.csv \
  --outturns_csv results/boe_export/boe_outturns.csv \
  --k 0 \
  --benchmark_model naive_last \
  --out_dir results/boe_results

python -m fev_macro.boe plots \
  --forecasts_csv results/boe_export/boe_forecasts.csv \
  --outturns_csv results/boe_export/boe_outturns.csv \
  --variable GDPC1 \
  --source naive_last \
  --metric levels \
  --frequency Q \
  --k 0 \
  --horizon 0 \
  --ma_window 4 \
  --out_dir results/boe_plots

k=0 corresponds to first-release truth under the default export conventions. More details: docs/boe_evaluation.md
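
For readers unfamiliar with DM tests, the sketch below computes a simplified Diebold-Mariano statistic under squared-error loss for one-step-ahead forecasts. It is not the fev_macro.boe implementation: it omits the autocovariance (HAC) terms needed for longer horizons and any small-sample correction, and the function name dm_test is hypothetical.

```python
import math
from statistics import mean


def dm_test(errors_model, errors_benchmark):
    """Simplified Diebold-Mariano test (squared-error loss, horizon 1).

    Returns (statistic, two-sided p-value from the standard normal).
    A negative statistic means the model has the lower average loss.
    """
    if len(errors_model) != len(errors_benchmark):
        raise ValueError("error series must have equal length")
    # Loss differential: model loss minus benchmark loss, period by period.
    d = [e1 ** 2 - e2 ** 2 for e1, e2 in zip(errors_model, errors_benchmark)]
    n = len(d)
    d_bar = mean(d)
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|stat|))
    return stat, p_value
```

With the benchmark set to naive_last, as in the eval command above, a significantly negative statistic would indicate that a model beats the naive forecast on squared error.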

Outputs + repo layout

data/historical/: downloaded FRED vintage archives and extracted CSVs
data/panels/: generated vintage panels and GDP release truth table
data/latest/, data/processed/: latest API pulls and processed latest snapshots
results/: evaluation outputs, leaderboards, realtime OOS metrics, and forecast plots
scripts/: core pipeline entrypoints
scripts/dev/: non-core development utilities
docs/: deeper protocol/model/data notes

References

FRED databases historical vintages: https://www.stlouisfed.org/research/economists/mccracken/fred-databases
FRED databases code zip (transform codes + trimming/outliers): https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/fred-databases_code.zip?sc_lang=en&hash=82A2EEE1EF3498C0820EB2212531D895
fbi library: https://github.com/cykbennie/fbi
ALFRED: https://alfred.stlouisfed.org
FRED API: https://api.stlouisfed.org/fred
fev (Forecast EValuation library): https://github.com/autogluon/fev
