Pipeline output compatibility #76

@jonathanstelman

Epic 6: Pipeline Integration

Ensure the pipeline integrates cleanly with the new monorepo structure, generates stable resort UUIDs, and exposes metadata to the backend.

Tasks

  • Generate `resort_id` UUIDs for all resorts on first pipeline run; persist mapping in `data/resort_id_map.csv`
  • On subsequent runs, look up existing UUIDs in the mapping and assign new UUIDs only to new resorts; never reassign an existing UUID
  • Ensure `data/resorts.csv` (with `resort_id` column) is committed to the repo and readable by the backend at startup
  • Write `data/pipeline_metadata.json` on each run (already implemented in `pipeline.py`); ensure the backend exposes `last_pipeline_run` from this file in `GET /meta`
  • Document how to run the pipeline from the new `/pipeline` directory in the monorepo
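
A minimal sketch of the UUID tasks above, not the actual `pipeline.py` code. It assumes each resort has a stable natural key, and that `data/resort_id_map.csv` has two columns named `resort_key` and `resort_id`. The issue does not specify the key or the column layout, so `resort_key`, `load_id_map`, and `assign_resort_ids` are all hypothetical names.

```python
import csv
import uuid
from pathlib import Path


def load_id_map(path: Path) -> dict[str, str]:
    """Read the persisted resort_key -> resort_id mapping; empty on first run."""
    if not path.exists():
        return {}
    with path.open(newline="", encoding="utf-8") as f:
        return {row["resort_key"]: row["resort_id"] for row in csv.DictReader(f)}


def assign_resort_ids(resort_keys, path: Path) -> dict[str, str]:
    """Return a resort_id for every known key, minting UUIDs only for unseen keys."""
    id_map = load_id_map(path)
    for key in resort_keys:
        # Existing entries are left untouched, so an ID is never reassigned.
        if key not in id_map:
            id_map[key] = str(uuid.uuid4())

    # Rewrite the full mapping, including resorts absent from this run,
    # so that a resort that reappears later gets its original ID back.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["resort_key", "resort_id"])
        writer.writeheader()
        for key, resort_id in id_map.items():
            writer.writerow({"resort_key": key, "resort_id": resort_id})
    return id_map
```

Calling `assign_resort_ids(keys, Path("data/resort_id_map.csv"))` twice with overlapping keys should return the same IDs for the overlap. The stability guarantee is only as good as the chosen key: if the key is derived from a resort name that can change, a rename would look like a new resort.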

Data delivery approach

`data/resorts.csv` is committed to the repo. The backend reads it at startup. The scheduled pipeline (Issue #77) commits the updated CSV back to the repo, and a new deploy picks it up. This keeps the backend stateless with no external storage dependency.
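
The backend side of this approach could look roughly like the sketch below, which uses only the standard library and makes no assumption about the web framework. The function names `load_resorts` and `build_meta` are hypothetical, and the sketch assumes `pipeline_metadata.json` stores the timestamp under a top-level `last_pipeline_run` key, which the issue implies but does not state.

```python
import csv
import json
from pathlib import Path
from typing import Optional


def load_resorts(csv_path: Path) -> list[dict[str, str]]:
    """Load all resort rows once at startup; fail fast if resort_id is missing."""
    with csv_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if "resort_id" not in (reader.fieldnames or []):
            raise ValueError(f"{csv_path} has no resort_id column")
        return list(reader)


def build_meta(metadata_path: Path) -> dict[str, Optional[str]]:
    """Build the GET /meta payload from pipeline_metadata.json."""
    if not metadata_path.exists():
        # No pipeline run recorded yet: report null rather than erroring.
        return {"last_pipeline_run": None}
    data = json.loads(metadata_path.read_text(encoding="utf-8"))
    # Assumed key name; adjust to whatever pipeline.py actually writes.
    return {"last_pipeline_run": data.get("last_pipeline_run")}
```

Because both files are read from the deployed checkout, the data only changes when a new deploy ships, which is what keeps the backend stateless.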

Acceptance Criteria

  • Pipeline generates UUIDs on first run and writes `data/resort_id_map.csv`
  • Re-running the pipeline preserves existing UUIDs for existing resorts
  • `data/resorts.csv` includes `resort_id` column after pipeline runs
  • `GET /meta` response includes `last_pipeline_run` timestamp from `pipeline_metadata.json`
  • README (or `/pipeline/README.md`) documents how to run the pipeline in the new structure
