Refactor update.sh script flags and enhance documentation #132
Merged
chris-c-thomas merged 8 commits into main on Apr 25, 2026
Single grammar across update.sh and the three sub-scripts. `./scripts/update.sh`
(no args) now updates all sources incrementally from each source's last
checkpoint.
Flag groups: source selection (--source), mode (--force / --deploy-only),
phase control (--skip-deploy / --skip-highlights / --skip-search), source-
scoping (--titles for eCFR/USC; --from/--to/--days for FR), utility
(--dry-run, --verbose, --help).
Removed legacy prefixes:
--ecfr-titles, --ecfr-all, --ecfr-skip-highlights
--fr-days, --fr-from, --fr-to
--usc-force, --usc-skip-highlights
update-ecfr.sh --all
Each prints a migration hint and exits 1.
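
A minimal sketch of how a removed flag could be turned into a migration hint plus exit 1. The name `removed_flag` appears later in this PR, but its signature, the `check_removed_flags` wrapper, and the exact message wording are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed helper: report a removed flag and its replacement, then exit 1.
removed_flag() {
  echo "Error: $1 was removed. Use instead: $2" >&2
  exit 1
}

# Hypothetical scan over the arguments for a few of the removed prefixes.
check_removed_flags() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --ecfr-titles) removed_flag "$arg" "--source ecfr --titles <list>" ;;
      --fr-days)     removed_flag "$arg" "--source fr --days <n>" ;;
      --usc-force)   removed_flag "$arg" "--source usc --force" ;;
    esac
  done
}

# Run in a subshell so the demo itself keeps going after the exit 1.
( check_removed_flags --fr-days 7 ) || echo "exit status: $?"
```

Exiting from the helper rather than returning keeps the behavior the same whether or not the caller sits in a context where `set -e` is suppressed.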
New FR checkpoint at downloads/fr/.fr-state.json ({ lastRun, lastDate }):
default invocation resumes from lastDate; bootstrap (no checkpoint) errors
with a hint requiring --from or --days, since FR has no inherent "all".
eCFR/USC bootstrap (missing checkpoint) now logs explicitly and runs a
full first-run automatically.
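
The FR bootstrap rule can be sketched as a small resolver. The function name `resolve_fr_start` and the message text are hypothetical, and `--days` handling is left out because date arithmetic differs between BSD and GNU `date`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical resolver: prints the start date for an FR run.
# $1 = explicit --from value (may be empty)
# $2 = lastDate read from .fr-state.json (may be empty)
resolve_fr_start() {
  local from="$1" last_date="$2"
  if [ -n "$from" ]; then
    echo "$from"            # an explicit range takes priority
  elif [ -n "$last_date" ]; then
    echo "$last_date"       # default run resumes from the checkpoint
  else
    # Bootstrap: FR has no inherent "all", so the caller must pick a range.
    echo "Error: no FR checkpoint found; pass --from or --days." >&2
    return 1
  fi
}

resolve_fr_start "" "2026-04-20"
```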
Search-index strategy under --force:
Default / single-source --force: incremental (deploy.sh --search-docker --source X)
--force on all three sources: single full reindex at the end (no --source)
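
As a rough illustration of the strategy above, the decision could look like the following. `plan_search` is a hypothetical name, and the sketch only prints the `deploy.sh` commands instead of running them; how a two-source `--force` run behaves is not stated above, so it falls into the incremental branch here by assumption.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical planner: prints the search-index commands a run would issue.
# $1 = "true" when --force was given; remaining args = selected sources.
plan_search() {
  local force="$1"; shift
  local src
  if [ "$force" = true ] && [ "$#" -eq 3 ]; then
    # --force on all three sources: one full reindex at the end, no --source
    echo "deploy.sh --search-docker"
  else
    # default or single-source --force: incremental, one command per source
    for src in "$@"; do
      echo "deploy.sh --search-docker --source $src"
    done
  fi
}

plan_search true ecfr fr usc
plan_search true ecfr
```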
Other:
- --help now uses awk (portable across BSD and GNU) instead of GNU-only sed syntax.
- --dry-run forwards to each sub-script and uses `|| true` so one source's
bootstrap-error doesn't halt the multi-source preview.
- eCFR title-granularity safety guard skipped in `titles` mode (single-
title runs only refresh one file and legitimately leave the rest alone).
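
For the `--help` item, a portable awk extraction might look like this. It assumes the help text lives in the script's leading comment block, which the PR does not spell out; `print_help` is an illustrative name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical: print a script's leading comment block as help text.
# Works with BSD awk, gawk, and mawk.
print_help() {
  awk '
    NR == 1 && /^#!/ { next }                 # skip the shebang
    /^#/ { sub(/^# ?/, ""); print; next }     # strip the comment marker
    { exit }                                  # stop at the first non-comment line
  ' "$1"
}

# Demo against a throwaway script header.
demo="$(mktemp)"
printf '%s\n' '#!/usr/bin/env bash' '# Usage: demo.sh [--dry-run]' \
  '#   --help  Show help' 'set -euo pipefail' > "$demo"
print_help "$demo"
rm -f "$demo"
```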
- Root CLAUDE.md: rewrite the script tree and Build & Dev Commands section to reflect the unified grammar; add checkpoint-file docs and bootstrap rules.
- Root README.md: rewrite the Incremental Updates section.
- Root CHANGELOG.md: add Unreleased entry describing the rename.
- apps/astro/src/content/docs/cli/commands.md: replace old flag examples in the Update Scripts subsection.
- apps/astro/src/content/docs/guides/bulk-download.md: rewrite the Update Scripts subsection with checkpoint + bootstrap details.
- apps/astro/src/content/docs/project/changelog.md: mirror Unreleased entry.
PR-review findings addressed:
CRITICAL
- update.sh: empty-array expansion crashed default ./scripts/update.sh under
bash 3.2 (macOS) + set -u. `printf "%s\n" "${args[@]}"` and call sites
expanding "${ECFR_ARGS[@]}" etc. all hit `args[@]: unbound variable`.
Fixed via the bash 3.2-safe ${arr[@]+"${arr[@]}"} pattern; build_common_args
now emits one flag per echo line instead of via an intermediate array.
- update.sh: removed_flag arms referenced $2 unconditionally, so bare
--ecfr-titles/--fr-days/--fr-from/--fr-to crashed with `$2: unbound variable`
before the migration hint could print. Fixed with ${2:-<placeholder>}.
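
Both CRITICAL fixes can be reproduced in a few lines. `count_args` and `hint_for` are hypothetical stand-ins, not functions from the scripts; the two expansion patterns are the ones named above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a sub-script: prints how many arguments it got.
count_args() { echo "$#"; }

ECFR_ARGS=()   # empty on a default run

# On bash 3.2 with set -u, a bare "${ECFR_ARGS[@]}" raises "unbound variable".
# The guarded form expands to nothing when the array is empty.
count_args ${ECFR_ARGS[@]+"${ECFR_ARGS[@]}"}        # prints 0

ECFR_ARGS=(--titles "1,17" --dry-run)
count_args ${ECFR_ARGS[@]+"${ECFR_ARGS[@]}"}        # prints 3

# A value-taking arm that must not trip set -u when the value is missing.
hint_for() {
  case "$1" in
    --fr-days) echo "use --days ${2:-<n>}" ;;
  esac
}

hint_for --fr-days        # prints: use --days <n>
hint_for --fr-days 7      # prints: use --days 7
```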
DATA-LOSS
- update-fr.sh: backfill (e.g. --from 2020-01-01 --to 2020-12-31) on a
machine whose checkpoint already pointed at 2026-04-20 would overwrite
the cursor with the older date and trigger a multi-year redownload on
the next default run. Now writes max(CHECKPOINT_DATE, DATE_TO).
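
The max(CHECKPOINT_DATE, DATE_TO) rule needs no date parsing, because ISO `YYYY-MM-DD` strings sort lexicographically. A sketch, with `max_date` as an assumed helper name:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the later of two YYYY-MM-DD dates.
# An empty first argument models a missing checkpoint.
max_date() {
  local checkpoint="$1" date_to="$2"
  if [ -z "$checkpoint" ]; then
    echo "$date_to"
  elif [[ "$checkpoint" > "$date_to" ]]; then
    echo "$checkpoint"
  else
    echo "$date_to"
  fi
}

max_date "2026-04-20" "2020-12-31"   # backfill: checkpoint is preserved
max_date "2026-04-20" "2026-04-25"   # forward run: checkpoint advances
max_date ""           "2020-12-31"   # bootstrap: adopts the run's end date
```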
ACCURACY / VALIDATION
- update.sh: --titles validation now requires --source to include ecfr.
USC sub-script doesn't accept --titles; previously --source usc --titles X
passed through silently and ran the full USC pipeline with no filtering.
- update.sh: help text and docs updated from "eCFR/USC only" to "eCFR only".
- update-usc.sh: header step 9 now says "via local Docker (shipped to VPS)"
matching update-ecfr.sh / update-fr.sh and the inline rationale.
- update-fr.sh: step numbering aligned with the 8-step header (was N/6).
ROBUSTNESS
- update-fr.sh: checkpoint write now atomic (tmp + rename) so a partial-
write crash can't corrupt .fr-state.json into something read_checkpoint_date
fails to parse.
- update-fr.sh: read_checkpoint_date now warns on stderr when the file
exists but is malformed/missing-lastDate, instead of silently routing
the user to a misleading "no checkpoint" bootstrap-error.
- update-fr.sh: node -e snippets now pass paths via env vars instead of
string-substitution, eliminating a single-quote injection risk.
- update.sh: failure-summary line now goes to stderr; sub-script failure
warning also routes to stderr.
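
The atomic checkpoint write (tmp + rename) can be sketched in plain shell. The actual script builds the JSON through `node -e`; this version uses `printf` only to stay self-contained, and `write_checkpoint` is an assumed name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical: write the checkpoint to a temp file, then rename it into place.
# A rename within one filesystem is atomic, so readers see either the old
# file or the complete new one, never a partial write.
write_checkpoint() {
  local file="$1" last_date="$2" tmp
  tmp="${file}.tmp.$$"
  printf '{ "lastRun": "%s", "lastDate": "%s" }\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$last_date" > "$tmp"
  mv -f "$tmp" "$file"
}

state_dir="$(mktemp -d)"
write_checkpoint "$state_dir/.fr-state.json" "2026-04-20"
cat "$state_dir/.fr-state.json"
rm -rf "$state_dir"
```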
VERIFICATION
- /bin/bash (3.2) ./scripts/update.sh --source ecfr --skip-deploy --skip-search
→ exit 0 (was: bash unbound-variable crash)
- ./scripts/update.sh --ecfr-titles → migration error, exit 1 (was: crash)
- ./scripts/update.sh --source usc --titles 1,17 → "--titles requires
--source to include ecfr", exit 1 (was: silent full-USC run)
- max-date logic unit-tested: backfill preserves checkpoint, forward
advances it, bootstrap adopts date_to.
- Malformed checkpoint produces explicit "Warning: ... is malformed"
before bootstrap fallthrough.
Replace build_common_args() function — which printed flags as text on stdout to be parsed back into arrays via `while IFS= read -r` — with a plain top-level COMMON_ARGS=() array. Each per-source dispatch then appends its own scoping flags plus COMMON_ARGS via standard array concatenation. Drops --skip-highlights from the common bag and appends it inline only for eCFR/USC, which removes the post-build filtered_fr loop that existed solely to strip --skip-highlights before invoking update-fr.sh (FR has no highlights step). Pure refactor: dry-run output, migration errors, --titles validation, and bash 3.2 compatibility verified identical to before.
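
A sketch of the array-based dispatch described above. Only `COMMON_ARGS` is named in the commit; the other variable names, the flag order, and the sample values are assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed state after flag parsing.
DRY_RUN=true
VERBOSE=false
SKIP_HIGHLIGHTS=true
TITLES="1,17"

COMMON_ARGS=()
if [ "$DRY_RUN" = true ]; then COMMON_ARGS+=(--dry-run); fi
if [ "$VERBOSE" = true ]; then COMMON_ARGS+=(--verbose); fi

# eCFR: scoping flags, then the common bag, then the highlights flag.
ECFR_ARGS=()
if [ -n "$TITLES" ]; then ECFR_ARGS+=(--titles "$TITLES"); fi
ECFR_ARGS+=(${COMMON_ARGS[@]+"${COMMON_ARGS[@]}"})
if [ "$SKIP_HIGHLIGHTS" = true ]; then ECFR_ARGS+=(--skip-highlights); fi

# FR: only the common bag, so nothing has to be stripped afterwards.
FR_ARGS=(${COMMON_ARGS[@]+"${COMMON_ARGS[@]}"})

echo "ecfr: ${ECFR_ARGS[*]}"
echo "fr:   ${FR_ARGS[*]}"
```

Appending `--skip-highlights` only where it applies is what makes a post-build filter loop unnecessary.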
Pre-existing bug: update.sh parsed -v/--verbose, set VERBOSE=true, and
forwarded --verbose to every sub-script. None of the three sub-scripts
implemented --verbose, so they rejected it as "Unknown option" and exited 1,
marking each source as FAILED in the orchestrator's summary.
Fix:
- update-ecfr.sh, update-fr.sh, update-usc.sh now accept -v/--verbose and
pass --verbose through to their convert-{source} CLI invocation. The
download-* commands don't accept --verbose so they're unchanged.
- Each sub-script's header docstring documents the new flag.
The --verbose flag thus does what the help text always claimed: shells out
to the verbose code path in convert-fr / convert-ecfr / convert-usc, which
all accepted -v/--verbose at the CLI layer all along.
Adds --verbose / -v to the update-script command tables in:
- CLAUDE.md (Build & Dev Commands section)
- README.md (Incremental Updates section)
- apps/astro/src/content/docs/cli/commands.md
- apps/astro/src/content/docs/guides/bulk-download.md
Plus a Fixed entry in the root CHANGELOG and the Astro project changelog mirror, noting that --verbose was previously broken and now works.
Captured from this session's PR-review findings:
- macOS bash 3.2 + set -u crashes empty-array expansion (fix: ${arr[@]+...})
- set -u + $2 in value-taking case arms crashes before user-error handling
(fix: ${2:-<placeholder>})
- BSD sed vs GNU sed brace blocks differ; use awk for cross-platform
multi-line text extraction
All three apply to every script in scripts/ since they all use
set -euo pipefail.
Pull request overview
Refactors the content-pipeline update orchestration to use a unified flag grammar across update.sh and the per-source sub-scripts, adds planning/preview support, and updates documentation to match the new checkpoint/bootstrap behavior.
Changes:
- Unified flag scheme across `update.sh`, `update-ecfr.sh`, `update-fr.sh`, and `update-usc.sh` (top-level scoping flags, standardized `--force`, `--skip-search`, `--dry-run`, `--verbose`).
- Added/clarified checkpoint + bootstrap behavior (FR uses a new JSON checkpoint; eCFR/USC bootstrap to full first-run when missing).
- Updated docs/changelogs to reflect the new workflow and examples.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| scripts/update.sh | Orchestrator refactor: unified flags, validation, dry-run plan output, and single post-run sitemap/search behavior. |
| scripts/update-ecfr.sh | Migrated to unified flags; adds modes, --dry-run, --skip-search, and verbose passthrough. |
| scripts/update-fr.sh | Adds FR checkpointing (.fr-state.json), mode resolution, --skip-search, --dry-run, and verbose passthrough. |
| scripts/update-usc.sh | Adds modes, --skip-search, --dry-run, and verbose passthrough. |
| apps/astro/src/content/docs/project/changelog.md | Documents the unified update-script behavior as “Unreleased”. |
| apps/astro/src/content/docs/guides/bulk-download.md | Updates guidance/examples for the new orchestrator and checkpoints. |
| apps/astro/src/content/docs/cli/commands.md | Updates update-script examples and links to checkpoint behavior. |
| README.md | Updates “Incremental Updates” section to reflect new flags and usage. |
| CLAUDE.md | Updates repo script documentation and adds notes about bash/macOS pitfalls. |
| CHANGELOG.md | Adds an Unreleased entry describing the unified flag scheme and checkpoint changes. |
Three valid findings from Copilot's review of PR #132:

#2: full search reindex no longer hard-fails on local-only runs
update.sh's RUN_FULL_SEARCH_AFTER path called `deploy.sh --search-docker` unconditionally on `--force` across all three sources. deploy.sh hard-errors when VPS_HOST is unset, while sub-scripts auto-fall-back to local-only in that case. The orchestrator now mirrors that auto-fallback: it prints "Skipping full search reindex" and continues when VPS_HOST is unset.

#3: `--source ecfr, fr` (with space after comma) now works
Previous parsing left a leading space on the second entry, which then failed the source validator. Parsing now strips surrounding whitespace per entry before validation, matching the user-friendly comma-list shape.

#4: bare value-taking flags print a friendly error
`./scripts/update.sh --source` (or --titles, --from, --to, --days) under set -u previously crashed with `bash: $2: unbound variable` before any help could print. Added a `require_value` helper to update.sh, update-fr.sh, and update-ecfr.sh; bare invocations now print "Error: --foo requires a value." and exit 1.

False-positive (verified): Copilot's first finding claimed read_checkpoint_date's `result=$(node -e ...)` + `rc=$?` pattern would be killed by set -e on Node failure before the warning could run. Tested directly: the assignment did not trip set -e in this call path, so the malformed-checkpoint warning works as written.
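
Findings #3 and #4 can be sketched together. `require_value` is the helper named in the commit, but its body here is a guess; `split_sources` is a hypothetical name, and the accepted source list (ecfr, fr, usc) is taken from the sources this PR mentions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed shape of require_value: call as `require_value "$@"` inside a
# value-taking case arm, before touching $2.
require_value() {
  if [ "$#" -lt 2 ] || [ -z "$2" ]; then
    echo "Error: $1 requires a value." >&2
    exit 1
  fi
}

# Hypothetical: split a comma list, trim each entry, validate, print one per line.
split_sources() {
  local IFS=',' entry
  for entry in $1; do
    entry="${entry#"${entry%%[![:space:]]*}"}"   # strip leading whitespace
    entry="${entry%"${entry##*[![:space:]]}"}"   # strip trailing whitespace
    case "$entry" in
      ecfr|fr|usc) echo "$entry" ;;
      *) echo "Error: unknown source '$entry'" >&2; return 1 ;;
    esac
  done
}

split_sources "ecfr, fr"
( require_value --source ) || echo "exit status: $?"
```

The short-circuit in `require_value` matters: `$2` is only read after the argument count has been checked, so `set -u` never sees an unset parameter.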



This pull request unifies and simplifies the update script workflow for incremental content updates across all sources. The update scripts (`update.sh`, `update-ecfr.sh`, `update-fr.sh`, `update-usc.sh`) now share a consistent flag scheme, use per-source checkpoints for incremental updates, and improve error handling and usability. Documentation and changelogs have been updated to reflect these changes, and several new features and fixes have been introduced.

Unified Update Script Workflow and CLI Flags:
- The main update script (`update.sh`) now updates all sources incrementally by default, using each source's last checkpoint. Source selection is via `--source`, and source-specific flags (`--titles`, `--days`, `--from`, `--to`) are now top-level options. Old per-source flag prefixes (e.g., `--ecfr-titles`, `--fr-days`, `--usc-force`) are removed; using them prints a migration hint and exits. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84L6-R31), [[8]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R53-R73), [[9]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))
- All update scripts support new flags: `--skip-search`, `--dry-run`, and consistent `--force` semantics. `--dry-run` prints the planned actions and exits. `--skip-search` skips the search reindex step. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84L6-R31), [[8]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R53-R73), [[9]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))

Incremental Update Checkpoints and Bootstrapping:
- Checkpoint files live under `downloads/<source>/`: eCFR uses `.ecfr-titles-state.json`, USC uses `.usc-release-point`, and FR now uses `.fr-state.json` (with `{ lastRun, lastDate }`). Default runs resume from these checkpoints. If the checkpoint is missing, eCFR/USC bootstrap into a full first-run automatically; FR errors with a hint and requires `--from` or `--days`. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L22-R37), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182))

Improved Error Handling and Usability:
- The update scripts now provide clearer error messages and migration hints for removed flags and invalid flag combinations. Help text extraction is now cross-platform (using `awk` instead of `sed` for macOS compatibility). ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7R296-R298), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))
- The `--verbose`/`-v` flag now works as intended: it is accepted by all update scripts and correctly passed to CLI convert steps. Previously, sub-scripts would reject the flag and exit with "Unknown option". ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22))

Documentation and Changelog Updates:
- The documentation (`README.md`, `CLAUDE.md`, Astro docs) and the changelog are updated to reflect the new unified workflow, flag scheme, checkpoint details, and new features. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR8-R21), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-e987f59775115a6fa2f7f0d216c4e49d281201a10ed090c24c0a9775ef2ece7fR15-R22), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L22-R37), [[4]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7L132-R164), [[5]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-6ebdb617a8104a7756d0cf36578ab01103dc9f07e4dc6feb751296b9c402faf7R296-R298), [[6]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L134-R163), [[7]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-d4231f0f6f1d86016afc16fb672e3afd4a76909f0ff26265150373b7504a8ee8L130-R144), [[8]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-810813477a09607cc3bf7b5686f5d6ae5a5e94bcb8e32d056f3bd9ad884da396L146-R182))

Script Internals and Refactoring:
- `update-ecfr.sh` is refactored to use the new flag scheme, mode resolution, and dry-run planning. It now rejects old flags, detects mutually exclusive options, and prints a detailed plan when `--dry-run` is used. ([[1]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84L6-R31), [[2]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R53-R73), [[3]](https://github.com/chris-c-thomas/LexBuild/pull/132/files#diff-357f3948f57d03c40f0aab87697a4f27c20eed54e2f3bf38ec539cc74c8c1c84R88-R172))

These changes make the update process more robust, user-friendly, and consistent across all sources.