SF Pools Schedule Viewer

Centralized, searchable schedules for San Francisco public swimming pools. This app scrapes official pool schedule PDFs from SF Rec & Park, uses an LLM to extract structured schedules, and provides a clean UI to browse by program, day, time, and pool.

Tech

  • Next.js (App Router), React 19, TypeScript
  • Tailwind CSS v4 (via @tailwindcss/postcss and @import "tailwindcss")
  • Vercel AI SDK (ai) with Google Generative AI provider (@ai-sdk/google)
  • Zod for strict schema validation
  • GitHub Actions for automated weekly schedule updates

Prerequisites

  • Node.js 22+
  • npm
  • A Google Generative AI API key

Setup

  1. Install dependencies:

    npm install
    
  2. Create .env.local in the project root and add:

    GOOGLE_GENERATIVE_AI_API_KEY=your_key_here
    
  3. Run the dev server:

    npm run dev
    
  4. Generate schedules:

    npm run build-schedules
    

This runs the full pipeline: it scrapes the pool pages for PDF URLs, downloads the PDFs, extracts schedules, and writes public/data/all_schedules.json. View the results at /schedules.

Architecture

Data Files

  • data/pools.json — Source of truth for static pool metadata (id, name, shortName, address, pageUrl); changes infrequently. See the sketch after this list.
  • public/data/discovered_pool_schedules.json — Scraped PDF URLs (poolId → pdfUrl mapping). Regenerated on each scrape.
  • data/pdf-manifest.json — Tracks downloaded PDFs by hash to detect changes.
  • data/extracted/<poolId>.json — Cached LLM extractions per PDF.
  • public/data/all_schedules.json — Aggregated schedule data for the UI.
  • data/changelog/ — Change history between schedule updates.
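
For illustration, an entry in data/pools.json has roughly the shape below. The field names come from the list above; the Pool type name and the sample values are placeholders, not real data.

    // Shape of one data/pools.json entry, per the fields listed above.
    // The sample values are hypothetical.
    interface Pool {
      id: string;
      name: string;
      shortName: string;
      address: string;
      pageUrl: string;
    }

    const example: Pool = {
      id: "example-pool",
      name: "Example Pool",
      shortName: "Example",
      address: "123 Example St, San Francisco, CA",
      pageUrl: "https://sfrecpark.org/...",
    };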

Pipeline Flow

scrape  →  download-pdfs  →  process-all-pdfs
   ↓             ↓                  ↓
validates    checks hash       preserves data
pool count   downloads if      fails on large
& URLs       content changed   changes

  1. Scrape: Discovers pool pages and PDF URLs from SF Rec & Park, then validates them against pools.json; the build fails if the pool count or page URLs change unexpectedly.
  2. Download: Fetches PDFs and checks each content hash against the manifest, downloading only when the content actually changed so that URL changes are handled gracefully (sketched below).
  3. Process: Extracts schedules via the LLM, preserves unchanged pool schedules, and detects large changes.
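
As a rough sketch of the download step's hash check, assuming a manifest shaped like { poolId: sha256 } (the function name and manifest shape here are illustrative, not the repo's actual code):

    import { createHash } from "node:crypto";
    import { readFile } from "node:fs/promises";

    // Returns true when a fetched PDF's content differs from the hash
    // recorded in data/pdf-manifest.json, so unchanged PDFs are skipped
    // even if their URL changed.
    async function pdfChanged(poolId: string, pdf: Buffer): Promise<boolean> {
      const manifest: Record<string, string> = JSON.parse(
        await readFile("data/pdf-manifest.json", "utf8")
      );
      const hash = createHash("sha256").update(pdf).digest("hex");
      return manifest[poolId] !== hash;
    }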

Change Detection

The pipeline computes a changelog comparing old vs new schedules:

  • none/minor: Normal updates, build succeeds
  • major/wholesale: Large changes detected, build fails in CI (requires manual review)

Set FAIL_ON_LARGE_CHANGES=false locally to bypass this check during development.
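
A minimal sketch of such a severity gate follows. Only the four severity labels and the FAIL_ON_LARGE_CHANGES check come from this README; the thresholds and function names are assumptions.

    type Severity = "none" | "minor" | "major" | "wholesale";

    // Classify a schedule diff by the fraction of entries that changed.
    // The 25%/75% cutoffs are illustrative, not the repo's actual values.
    function classify(changed: number, total: number): Severity {
      if (changed === 0) return "none";
      const ratio = changed / total;
      if (ratio < 0.25) return "minor";
      if (ratio < 0.75) return "major";
      return "wholesale";
    }

    const severity = classify(12, 40);
    if (
      (severity === "major" || severity === "wholesale") &&
      process.env.FAIL_ON_LARGE_CHANGES !== "false"
    ) {
      console.error(`Large schedule change (${severity}); failing build.`);
      process.exit(1);
    }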

Scripts

  • npm run dev — start Next.js dev server
  • npm run build — build for production
  • npm run start — start production build
  • npm run lint — run ESLint
  • npm run test — run tests
  • npm run scrape — scrape pool pages, validate against pools.json, discover PDF URLs
  • npm run download-pdfs — download changed PDFs into data/pdfs/
  • npm run process-all-pdfs — extract schedules from PDFs, preserve unchanged, write changelog
  • npm run build-schedules — full pipeline: scrape → download → process
  • npm run scrape-alerts — scrape pool alerts from SF Rec & Park
  • npm run analyze-programs — analyze raw vs canonical program names

How Extraction Works

  1. For each PDF, send its content to the LLM with a strict Zod schema (see the sketch after this list)
  2. The model extracts:
    • Pool metadata (name, season, date range)
    • Program entries with day, time, lanes, notes
  3. Pipeline normalizes program names to canonical labels (e.g., "LAP SWIM" → "Lap Swim")
  4. Enriches with static metadata from pools.json (address, URLs)
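
The core of the extraction call looks roughly like the sketch below. generateObject and the google provider are the Vercel AI SDK APIs listed under Tech; the schema fields, function name, and model id are assumptions for illustration.

    import { generateObject } from "ai";
    import { google } from "@ai-sdk/google";
    import { z } from "zod";

    // Hypothetical schema mirroring the fields described above.
    const ScheduleSchema = z.object({
      poolName: z.string(),
      season: z.string(),
      dateRange: z.string(),
      entries: z.array(
        z.object({
          program: z.string(),
          day: z.string(),
          startTime: z.string(),
          endTime: z.string(),
          lanes: z.string().optional(),
          notes: z.string().optional(),
        })
      ),
    });

    export async function extractSchedule(pdfText: string) {
      const { object } = await generateObject({
        model: google("gemini-2.0-flash"), // model id is an assumption
        schema: ScheduleSchema,
        prompt: `Extract the weekly pool schedule from this PDF text:\n\n${pdfText}`,
      });
      return object; // already validated against ScheduleSchema
    }

Normalization (step 3) can then be a simple lookup from raw labels to canonical ones, e.g. "LAP SWIM" → "Lap Swim".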

Automated Updates

GitHub Actions runs weekly to:

  1. Scrape and download new PDFs
  2. Extract schedules from changed PDFs
  3. Commit changes to public/data/ and data/changelog/
  4. Send push notifications via Pushover

If large changes are detected, the build fails and a notification is sent for manual review.
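
The Pushover step amounts to a single HTTPS POST to Pushover's documented messages endpoint, using the secrets listed under Environment Variables; the wrapper function itself is a hypothetical sketch.

    // Send a push notification via Pushover's REST API.
    export async function notify(message: string): Promise<void> {
      const res = await fetch("https://api.pushover.net/1/messages.json", {
        method: "POST",
        body: new URLSearchParams({
          token: process.env.PUSHOVER_API_TOKEN!,
          user: process.env.PUSHOVER_USER_KEY!,
          message,
        }),
      });
      if (!res.ok) throw new Error(`Pushover request failed: ${res.status}`);
    }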

Environment Variables

    # Required: Google AI API key for schedule extraction
    GOOGLE_GENERATIVE_AI_API_KEY=your_key_here

    # Optional: Disable build failure on large changes (for local dev)
    FAIL_ON_LARGE_CHANGES=false

    # Optional: Force re-extraction even if cache exists
    REFRESH_EXTRACT=1

    # For CI notifications (GitHub Actions secrets)
    PUSHOVER_USER_KEY=...
    PUSHOVER_API_TOKEN=...

License

MIT
