Refactor update.sh script flags and enhance documentation#132

Merged
chris-c-thomas merged 8 commits into main from chore/update-script-redesign
Apr 25, 2026

@chris-c-thomas chris-c-thomas commented Apr 25, 2026

This pull request unifies and simplifies the update script workflow for incremental content updates across all sources. The update scripts (update.sh, update-ecfr.sh, update-fr.sh, update-usc.sh) now share a consistent flag scheme, use per-source checkpoints for incremental updates, and improve error handling and usability. Documentation and changelogs have been updated to reflect these changes, and several new features and fixes have been introduced.

Unified Update Script Workflow and CLI Flags:

  • The main update script (update.sh) now updates all sources incrementally by default, using each source's last checkpoint. Source selection is via --source, and source-specific flags (--titles, --days, --from, --to) are now top-level options. Old per-source flag prefixes (e.g., --ecfr-titles, --fr-days, --usc-force) are removed; using them prints a migration hint and exits. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84L6-R31), [[8]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R53-R73), [[9]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))

  • All update scripts support new flags: --skip-search, --dry-run, and consistent --force semantics. --dry-run prints the planned actions and exits. --skip-search skips the search reindex step. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84L6-R31), [[8]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R53-R73), [[9]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))

Incremental Update Checkpoints and Bootstrapping:

  • Each source now maintains a checkpoint file in downloads/<source>/: eCFR uses .ecfr-titles-state.json, USC uses .usc-release-point, and FR now uses .fr-state.json (with { lastRun, lastDate }). Default runs resume from these checkpoints. If the checkpoint is missing, eCFR/USC bootstrap into a full first-run automatically; FR errors with a hint and requires --from or --days. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L22-R37), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182))

Improved Error Handling and Usability:

  • The update scripts now provide clearer error messages and migration hints for removed flags and invalid flag combinations. Help text extraction is now cross-platform (using awk instead of sed for macOS compatibility). ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7R296-R298), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))

  • The --verbose / -v flag now works as intended: it is accepted by all update scripts and correctly passed to CLI convert steps. Previously, sub-scripts would reject the flag and exit with "Unknown option". ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22))

Documentation and Changelog Updates:

  • All relevant documentation (README.md, CLAUDE.md, Astro docs) and the changelog are updated to reflect the new unified workflow, flag scheme, checkpoint details, and new features. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L22-R37), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7R296-R298), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[8]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182))

Script Internals and Refactoring:

  • update-ecfr.sh is refactored to use the new flag scheme, mode resolution, and dry-run planning. It now rejects old flags, detects mutually exclusive options, and prints a detailed plan when --dry-run is used. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84L6-R31), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R53-R73), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))

These changes make the update process more robust, user-friendly, and consistent across all sources.

Single grammar across update.sh and the three sub-scripts. `./scripts/update.sh`
(no args) now updates all sources incrementally from each source's last
checkpoint.

Flag groups: source selection (--source), mode (--force / --deploy-only),
phase control (--skip-deploy / --skip-highlights / --skip-search), source-
scoping (--titles for eCFR/USC; --from/--to/--days for FR), utility
(--dry-run, --verbose, --help).

Removed legacy prefixes:
  --ecfr-titles, --ecfr-all, --ecfr-skip-highlights
  --fr-days, --fr-from, --fr-to
  --usc-force, --usc-skip-highlights
  update-ecfr.sh --all
Each prints a migration hint and exits 1.
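
A minimal sketch of how a removed-flag arm might look. The helper name `removed_flag` comes from the review notes further down; its signature, the hint wording, and `parse_args` are assumptions for illustration. The sketch returns 1 so it can be exercised in isolation, where the real script exits 1.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: explain that a legacy flag is gone and what replaces it.
removed_flag() {
  local old="$1" replacement="$2"
  echo "Error: $old has been removed. Use: $replacement" >&2
}

# Hypothetical parser covering three of the removed prefixes.
parse_args() {
  while [ $# -gt 0 ]; do
    case "$1" in
      # ${2:-<list>} keeps `set -u` from aborting when the flag is passed bare.
      --ecfr-titles) removed_flag "$1" "--source ecfr --titles ${2:-<list>}"; return 1 ;;
      --fr-days)     removed_flag "$1" "--source fr --days ${2:-<n>}"; return 1 ;;
      --usc-force)   removed_flag "$1" "--source usc --force"; return 1 ;;
      *)             shift ;;
    esac
  done
}

# Demo: a bare legacy flag still reaches the hint instead of crashing.
parse_args --ecfr-titles || echo "exit status: $?"
```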

New FR checkpoint at downloads/fr/.fr-state.json ({ lastRun, lastDate }):
default invocation resumes from lastDate; bootstrap (no checkpoint) errors
with a hint requiring --from or --days, since FR has no inherent "all".
eCFR/USC bootstrap (missing checkpoint) now logs explicitly and runs a
full first-run automatically.
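
The FR bootstrap rule can be sketched as a small resolver. This is an illustration under assumptions: the function name, argument order, and error wording are hypothetical, --days handling is omitted, and whether the real script resumes on lastDate itself or the following day is not stated here.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical resolver: an explicit --from wins, then the checkpoint's lastDate;
# with neither, fail with a hint (FR has no inherent "all" to fall back to).
resolve_fr_start() {
  local from="$1" checkpoint_date="$2"
  if [ -n "$from" ]; then
    echo "$from"
  elif [ -n "$checkpoint_date" ]; then
    echo "$checkpoint_date"
  else
    echo "Error: no FR checkpoint found. Pass --from YYYY-MM-DD or --days N." >&2
    return 1
  fi
}

resolve_fr_start "" "2026-04-20"            # prints 2026-04-20
resolve_fr_start "2020-01-01" "2026-04-20"  # prints 2020-01-01
resolve_fr_start "" "" || echo "bootstrap error, exit status: $?"
```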

Search-index strategy under --force:
  Default / single-source --force: incremental (deploy.sh --search-docker --source X)
  --force on all three sources:   single full reindex at the end (no --source)
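
A sketch of that decision as a pure function. `search_plan` is a hypothetical name, and treating a two-source --force as incremental is an assumption, since only the single-source and all-three cases are spelled out above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: prints "full" only when --force covers ecfr, fr, and usc.
# Usage: search_plan <force:true|false> <source>...
search_plan() {
  local force="$1"; shift
  local seen_ecfr=false seen_fr=false seen_usc=false s
  for s in "$@"; do
    case "$s" in
      ecfr) seen_ecfr=true ;;
      fr)   seen_fr=true ;;
      usc)  seen_usc=true ;;
    esac
  done
  if [ "$force" = true ] && [ "$seen_ecfr$seen_fr$seen_usc" = truetruetrue ]; then
    echo full
  else
    echo incremental
  fi
}

search_plan true ecfr fr usc   # prints full
search_plan true ecfr          # prints incremental
search_plan false ecfr fr usc  # prints incremental
```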

Other:
  - --help now uses awk, which behaves the same on BSD (macOS) and GNU
    systems, instead of GNU-only sed.
  - --dry-run forwards to each sub-script and uses `|| true` so one source's
    bootstrap-error doesn't halt the multi-source preview.
  - eCFR title-granularity safety guard skipped in `titles` mode (single-
    title runs only refresh one file and legitimately leave the rest alone).
- Root CLAUDE.md: rewrite the script tree and Build & Dev Commands section
  to reflect the unified grammar; add checkpoint-file docs and bootstrap
  rules.
- Root README.md: rewrite the Incremental Updates section.
- Root CHANGELOG.md: add Unreleased entry describing the rename.
- apps/astro/src/content/docs/cli/commands.md: replace old flag examples
  in the Update Scripts subsection.
- apps/astro/src/content/docs/guides/bulk-download.md: rewrite the
  Update Scripts subsection with checkpoint + bootstrap details.
- apps/astro/src/content/docs/project/changelog.md: mirror Unreleased
  entry.
…ation

PR-review findings addressed:

CRITICAL
- update.sh: empty-array expansion crashed default ./scripts/update.sh under
  bash 3.2 (macOS) + set -u. `printf "%s\n" "${args[@]}"` and call sites
  expanding "${ECFR_ARGS[@]}" etc. all hit `args[@]: unbound variable`.
  Fixed via the bash 3.2-safe ${arr[@]+"${arr[@]}"} pattern; build_common_args
  now emits one flag per echo line instead of via an intermediate array.
- update.sh: removed_flag arms referenced $2 unconditionally, so bare
  --ecfr-titles/--fr-days/--fr-from/--fr-to crashed with `$2: unbound variable`
  before the migration hint could print. Fixed with ${2:-<placeholder>}.
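
The empty-array guard in isolation, as a small runnable sketch. `count_args` is a hypothetical stand-in for a sub-script invocation; the array name mirrors the one mentioned above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for a sub-script call: reports how many arguments it received.
count_args() { echo "$#"; }

ECFR_ARGS=()

# Under bash 3.2 with `set -u`, "${ECFR_ARGS[@]}" on an empty array aborts with
# "unbound variable". The guarded form expands to nothing instead.
count_args ${ECFR_ARGS[@]+"${ECFR_ARGS[@]}"}   # prints 0

ECFR_ARGS=(--titles "1,17" --dry-run)
count_args ${ECFR_ARGS[@]+"${ECFR_ARGS[@]}"}   # prints 3
```

The inner quotes matter: they keep each element a single word, so values containing spaces are not split.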

DATA-LOSS
- update-fr.sh: backfill (e.g. --from 2020-01-01 --to 2020-12-31) on a
  machine whose checkpoint already pointed at 2026-04-20 would overwrite
  the cursor with the older date and trigger a multi-year redownload on
  the next default run. Now writes max(CHECKPOINT_DATE, DATE_TO).
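
A sketch of the max(CHECKPOINT_DATE, DATE_TO) rule in plain bash. The function name is hypothetical, and the real script may compute this differently (for example inside its node snippet).

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: pick the later of two YYYY-MM-DD dates. ISO dates sort
# lexicographically, so a plain string comparison is enough.
max_date() {
  local checkpoint="$1" date_to="$2"
  if [ -z "$checkpoint" ] || [[ "$date_to" > "$checkpoint" ]]; then
    echo "$date_to"
  else
    echo "$checkpoint"
  fi
}

max_date "2026-04-20" "2020-12-31"  # backfill: prints 2026-04-20
max_date "2026-04-20" "2026-04-25"  # forward run: prints 2026-04-25
max_date "" "2026-04-25"            # bootstrap: prints 2026-04-25
```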

ACCURACY / VALIDATION
- update.sh: --titles validation now requires --source to include ecfr.
  USC sub-script doesn't accept --titles; previously --source usc --titles X
  passed through silently and ran the full USC pipeline with no filtering.
- update.sh: help text and docs updated from "eCFR/USC only" to "eCFR only".
- update-usc.sh: header step 9 now says "via local Docker (shipped to VPS)"
  matching update-ecfr.sh / update-fr.sh and the inline rationale.
- update-fr.sh: step numbering aligned with the 8-step header (was N/6).
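
The --titles scoping check could look roughly like this. `validate_titles` and its calling convention are hypothetical; the error text follows the message quoted in the verification notes.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical validator. Usage: validate_titles <titles-or-empty> <source>...
validate_titles() {
  local titles="$1"; shift
  local s
  if [ -z "$titles" ]; then
    return 0   # --titles not given, nothing to check
  fi
  for s in "$@"; do
    if [ "$s" = ecfr ]; then
      return 0
    fi
  done
  echo "Error: --titles requires --source to include ecfr." >&2
  return 1
}

validate_titles "1,17" ecfr fr && echo "ok: ecfr selected"
validate_titles "1,17" usc || echo "rejected, exit status: $?"
```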

ROBUSTNESS
- update-fr.sh: checkpoint write now atomic (tmp + rename) so a partial-
  write crash can't corrupt .fr-state.json into something read_checkpoint_date
  fails to parse.
- update-fr.sh: read_checkpoint_date now warns on stderr when the file
  exists but is malformed/missing-lastDate, instead of silently routing
  the user to a misleading "no checkpoint" bootstrap-error.
- update-fr.sh: node -e snippets now pass paths via env vars instead of
  string-substitution, eliminating a single-quote injection risk.
- update.sh: failure-summary line now goes to stderr; sub-script failure
  warning also routes to stderr.
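
A shell-only sketch of the tmp + rename write. The real script may do this inside its node snippet; `write_checkpoint` and the exact JSON spacing are assumptions, while the { lastRun, lastDate } shape comes from the PR description.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical writer: build the file next to its destination, then rename it
# into place so readers never observe a half-written checkpoint.
write_checkpoint() {
  local path="$1" last_date="$2" last_run tmp
  last_run="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  tmp="$(mktemp "${path}.tmp.XXXXXX")"   # same directory, so mv is a rename
  printf '{ "lastRun": "%s", "lastDate": "%s" }\n' "$last_run" "$last_date" > "$tmp"
  mv -f "$tmp" "$path"
}

demo_dir="$(mktemp -d)"
write_checkpoint "$demo_dir/.fr-state.json" "2026-04-20"
cat "$demo_dir/.fr-state.json"
rm -rf "$demo_dir"
```

Creating the temp file in the destination directory keeps both paths on one filesystem, which is what makes the final mv a single rename rather than a copy.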

VERIFICATION
- /bin/bash (3.2) ./scripts/update.sh --source ecfr --skip-deploy --skip-search
  → exit 0 (was: bash unbound-variable crash)
- ./scripts/update.sh --ecfr-titles → migration error, exit 1 (was: crash)
- ./scripts/update.sh --source usc --titles 1,17 → "--titles requires
  --source to include ecfr", exit 1 (was: silent full-USC run)
- max-date logic unit-tested: backfill preserves checkpoint, forward
  advances it, bootstrap adopts date_to.
- Malformed checkpoint produces explicit "Warning: ... is malformed"
  before bootstrap fallthrough.
Replace build_common_args() function — which printed flags as text on
stdout to be parsed back into arrays via `while IFS= read -r` — with a
plain top-level COMMON_ARGS=() array. Each per-source dispatch then
appends its own scoping flags plus COMMON_ARGS via standard array
concatenation.

Drops --skip-highlights from the common bag and appends it inline only
for eCFR/USC, which removes the post-build filtered_fr loop that
existed solely to strip --skip-highlights before invoking update-fr.sh
(FR has no highlights step).
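
Roughly, the array-based dispatch described above. Only COMMON_ARGS and ECFR_ARGS are names taken from these notes; the option variables, FR_ARGS, and the flag order are assumptions for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins for values the real option parser would set (hypothetical names).
SKIP_SEARCH=true; DRY_RUN=true; SKIP_HIGHLIGHTS=true
TITLES="1,17"; DAYS="7"

# Flags every sub-script understands.
COMMON_ARGS=()
if [ "$SKIP_SEARCH" = true ]; then COMMON_ARGS+=(--skip-search); fi
if [ "$DRY_RUN" = true ]; then COMMON_ARGS+=(--dry-run); fi

# eCFR: scoping flags, then --skip-highlights inline, then the common bag.
ECFR_ARGS=()
if [ -n "$TITLES" ]; then ECFR_ARGS+=(--titles "$TITLES"); fi
if [ "$SKIP_HIGHLIGHTS" = true ]; then ECFR_ARGS+=(--skip-highlights); fi
ECFR_ARGS+=(${COMMON_ARGS[@]+"${COMMON_ARGS[@]}"})

# FR: no highlights step, so --skip-highlights is never appended.
FR_ARGS=()
if [ -n "$DAYS" ]; then FR_ARGS+=(--days "$DAYS"); fi
FR_ARGS+=(${COMMON_ARGS[@]+"${COMMON_ARGS[@]}"})

echo "update-ecfr.sh ${ECFR_ARGS[*]}"
echo "update-fr.sh ${FR_ARGS[*]}"
```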

Pure refactor: dry-run output, migration errors, --titles validation,
and bash 3.2 compatibility verified identical to before.
Pre-existing bug: update.sh parsed -v/--verbose, set VERBOSE=true, and
forwarded --verbose to every sub-script. None of the three sub-scripts
implemented --verbose, so they rejected it as "Unknown option" and exited 1,
marking each source as FAILED in the orchestrator's summary.

Fix:
- update-ecfr.sh, update-fr.sh, update-usc.sh now accept -v/--verbose and
  pass --verbose through to their convert-{source} CLI invocation. The
  download-* commands don't accept --verbose so they're unchanged.
- Each sub-script's header docstring documents the new flag.

The --verbose flag thus does what the help text always claimed: shells out
to the verbose code path in convert-fr / convert-ecfr / convert-usc, which
all accepted -v/--verbose at the CLI layer all along.
Adds --verbose / -v to the update-script command tables in:
- CLAUDE.md (Build & Dev Commands section)
- README.md (Incremental Updates section)
- apps/astro/src/content/docs/cli/commands.md
- apps/astro/src/content/docs/guides/bulk-download.md

Plus a Fixed entry in the root CHANGELOG and the Astro project changelog
mirror, noting that --verbose was previously broken and now works.
Captured from this session's PR-review findings:
- macOS bash 3.2 + set -u crashes empty-array expansion (fix: ${arr[@]+...})
- set -u + $2 in value-taking case arms crashes before user-error handling
  (fix: ${2:-<placeholder>})
- BSD sed vs GNU sed brace blocks differ; use awk for cross-platform
  multi-line text extraction

All three apply to every script in scripts/ since they all use
set -euo pipefail.
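
For the third item, a sketch of awk-based help extraction. It assumes the help text is the comment header at the top of the script, which is not confirmed here; `print_help` is a hypothetical name.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the leading comment block of a script as help text.
# Uses only POSIX awk features, so BSD awk (macOS) and GNU awk behave the same.
print_help() {
  awk '
    NR == 1 && /^#!/ { next }               # skip the shebang
    /^#/ { sub(/^# ?/, ""); print; next }   # strip the comment marker
    { exit }                                # stop at the first non-comment line
  ' "$1"
}

demo="$(mktemp)"
printf '%s\n' '#!/usr/bin/env bash' '# Usage: demo.sh [--dry-run]' '#' \
  '#   --dry-run  Print the plan and exit' 'set -euo pipefail' > "$demo"
print_help "$demo"
rm -f "$demo"
```
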
@chris-c-thomas chris-c-thomas force-pushed the chore/update-script-redesign branch from 003d677 to ea3c777 Compare April 25, 2026 01:43

Copilot AI left a comment

Pull request overview

Refactors the content-pipeline update orchestration to use a unified flag grammar across update.sh and the per-source sub-scripts, adds planning/preview support, and updates documentation to match the new checkpoint/bootstrap behavior.

Changes:

  • Unified flag scheme across update.sh, update-ecfr.sh, update-fr.sh, and update-usc.sh (top-level scoping flags, standardized --force, --skip-search, --dry-run, --verbose).
  • Added/clarified checkpoint + bootstrap behavior (FR uses a new JSON checkpoint; eCFR/USC bootstrap to full first-run when missing).
  • Updated docs/changelogs to reflect the new workflow and examples.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| scripts/update.sh | Orchestrator refactor: unified flags, validation, dry-run plan output, and single post-run sitemap/search behavior. |
| scripts/update-ecfr.sh | Migrated to unified flags; adds modes, --dry-run, --skip-search, and verbose passthrough. |
| scripts/update-fr.sh | Adds FR checkpointing (.fr-state.json), mode resolution, --skip-search, --dry-run, and verbose passthrough. |
| scripts/update-usc.sh | Adds modes, --skip-search, --dry-run, and verbose passthrough. |
| apps/astro/src/content/docs/project/changelog.md | Documents the unified update-script behavior as “Unreleased”. |
| apps/astro/src/content/docs/guides/bulk-download.md | Updates guidance/examples for the new orchestrator and checkpoints. |
| apps/astro/src/content/docs/cli/commands.md | Updates update-script examples and links to checkpoint behavior. |
| README.md | Updates “Incremental Updates” section to reflect new flags and usage. |
| CLAUDE.md | Updates repo script documentation and adds notes about bash/macOS pitfalls. |
| CHANGELOG.md | Adds an Unreleased entry describing the unified flag scheme and checkpoint changes. |


Comment thread scripts/update-fr.sh
Comment thread scripts/update.sh Outdated
Comment thread scripts/update.sh Outdated
Comment thread scripts/update.sh
Three valid findings from Copilot's review of PR #132:

#2 — full search reindex no longer hard-fails on local-only runs
update.sh's RUN_FULL_SEARCH_AFTER path called `deploy.sh --search-docker`
unconditionally on `--force` across all three sources. deploy.sh hard-errors
when VPS_HOST is unset, while sub-scripts auto-fall-back to local-only in
that case. The orchestrator now mirrors that auto-fallback: prints
"Skipping full search reindex" and continues when VPS_HOST is unset.

#3 — `--source ecfr, fr` (with space after comma) now works
Previous parsing left a leading space on the second entry, which then
failed the source validator. Now strips surrounding whitespace per entry
before validation, matching the user-friendly comma-list shape.
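
A sketch of per-entry trimming that stays within bash 3.2 features. `parse_sources` and the error wording are hypothetical; the valid source names are the three used throughout this PR.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical parser: split a comma list, trim each entry, validate it, and
# print one source per line.
parse_sources() {
  local rest="$1," entry
  while [ -n "$rest" ]; do
    entry="${rest%%,*}"   # text before the first comma
    rest="${rest#*,}"     # everything after it
    # Trim leading, then trailing, whitespace with parameter expansion only.
    entry="${entry#"${entry%%[![:space:]]*}"}"
    entry="${entry%"${entry##*[![:space:]]}"}"
    case "$entry" in
      ecfr|fr|usc) echo "$entry" ;;
      *) echo "Error: unknown source '$entry' (expected ecfr, fr, or usc)." >&2
         return 1 ;;
    esac
  done
}

parse_sources "ecfr, fr"   # prints ecfr, then fr
```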

#4 — bare value-taking flags print a friendly error
`./scripts/update.sh --source` (or --titles, --from, --to, --days) under
set -u previously crashed with `bash: $2: unbound variable` before any
help could print. Added a `require_value` helper to update.sh,
update-fr.sh, and update-ecfr.sh; bare invocations now print
"Error: --foo requires a value." and exit 1.

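A sketch of what such a helper could look like. `require_value` is the name used above, but its signature here (flag name plus remaining argument count) is an assumption, and the sketch returns 1 where the real scripts exit 1.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper. Usage: require_value <flag> <remaining-arg-count>
require_value() {
  if [ "$2" -lt 2 ]; then
    echo "Error: $1 requires a value." >&2
    return 1
  fi
}

SOURCE=""; DAYS=""
parse_args() {
  while [ $# -gt 0 ]; do
    case "$1" in
      # Check the count before touching $2 so `set -u` never sees it unset.
      --source) require_value "$1" "$#" || return 1; SOURCE="$2"; shift 2 ;;
      --days)   require_value "$1" "$#" || return 1; DAYS="$2"; shift 2 ;;
      *)        shift ;;
    esac
  done
}

parse_args --source ecfr --days 7 && echo "source=$SOURCE days=$DAYS"
parse_args --days || echo "exit status: $?"
```
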
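False-positive (verified):
Copilot's first finding claimed read_checkpoint_date's `result=$(node -e ...)`
+ `rc=$?` pattern would be killed by set -e on Node failure before the
warning could run. Tested directly: the malformed-checkpoint warning runs
as written. This is context-dependent rather than general: a bare
assignment from a failing command substitution does trip set -e at top
level, but by default bash suspends errexit inside command substitutions
and conditional contexts, which is presumably how read_checkpoint_date is
invoked.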
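
The set -e interaction depends on where the assignment runs, which the following sketch reproduces with `false` standing in for the failing node call. How read_checkpoint_date is actually invoked is an assumption; the function `read_it` is hypothetical.

```bash
#!/usr/bin/env bash

# Top level: a bare assignment from a failing command substitution trips set -e,
# so "survived" is never printed.
top_level=$(bash -c 'set -e; result=$(false); echo "survived"' || true)
echo "top level: [${top_level}]"   # prints: top level: []

# Inside a command substitution, bash (outside POSIX mode, without
# inherit_errexit) clears -e, so the rc=$? line is reached.
nested=$(bash -c 'set -e
  read_it() { result=$(false); rc=$?; echo "rc=$rc"; }
  value=$(read_it)
  echo "$value"')
echo "nested: [${nested}]"         # prints: nested: [rc=1]
```
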

@chris-c-thomas chris-c-thomas merged commit b0212e3 into main Apr 25, 2026
9 checks passed

Labels

build, dev, enhancement (Enhancements made to existing features or codebase), packages, refactor
