Agent skills published by Lightfast. Compatible with Claude Code and the Agent Skills specification.
| Skill | Purpose |
|---|---|
| `foundation-creator` | Draft a top-level foundation document for a product or company primitive: thesis, mission, boundaries, actor model, surfaces, strategic bets, and open questions. |
| `spec-creator` | Write and update a top-level `SPEC.md` service specification following a strict template and language guide. |
Each skill is a subdirectory under `skills/`. To install one into a project:

```sh
npx skills add lightfastai/skills --skill foundation-creator
npx skills add lightfastai/skills --skill spec-creator
```

Or copy the directory directly into `.claude/skills/` in your project.
This repo now includes BAML-backed fixture evals for `foundation-creator` and `spec-creator`.
Install dependencies and run the deterministic checks:

```sh
bun install
bun run ci:check
bun run eval:check
bun run eval:typecheck
```
To run individual evals, the smoke suites, or every eval in a manifest:

```sh
bun run eval:foundation -- create-foundation-from-vercel-source-packet
bun run eval:foundation -- create-foundation-from-lightfast-founder-notes
bun run eval:foundation -- update-lightfast-foundation-boundary-surface-question
bun run eval:foundation -- update-lightfast-foundation-tighten-overreach
bun run eval:spec -- create-from-vercel-mcp-source-packet
bun run eval:foundation:smoke
bun run eval:spec:smoke
bun run eval:spec -- --all
```
The runner can also be invoked directly with explicit profiles, trials, and comparison variants:

```sh
bun run with-env -- bun ./scripts/run-baml-eval.ts foundation-creator create-foundation-from-cloudflare-source-packet --eval-profile gate --trials 3
bun run with-env -- bun ./scripts/run-baml-eval.ts foundation-creator update-lightfast-foundation-tighten-overreach --eval-profile fast --compare previous,profile:no-skill
bun run with-env -- bun ./scripts/run-baml-eval.ts foundation-creator create-foundation-from-lightfast-founder-notes --eval-profile cross
```

Each run writes packet, brief, candidate document, and evaluation report artifacts under `skills/<skill>/evals/runs/`.
`bun run eval:check` is the cheap, deterministic CI guard. It validates eval manifests, fixture paths, validation regexes, smoke membership, and BAML runner function wiring without calling any model.
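For intuition, a check of this kind might look like the following sketch. The real implementation lives in the repo's scripts; the manifest field names here are assumptions, not the actual schema.

```ts
// Sketch of an eval:check-style deterministic guard (illustrative only).
import { existsSync } from "node:fs";

interface ManifestEntry {
  name: string;
  fixtures: string[];           // assumed: paths to fixture files
  validation_regexes: string[]; // assumed: patterns applied to candidates
}

function checkEntry(entry: ManifestEntry): string[] {
  const issues: string[] = [];
  for (const fixture of entry.fixtures) {
    if (!existsSync(fixture)) issues.push(`${entry.name}: missing fixture ${fixture}`);
  }
  for (const pattern of entry.validation_regexes) {
    try {
      new RegExp(pattern); // fail fast on patterns that do not compile
    } catch {
      issues.push(`${entry.name}: invalid regex ${pattern}`);
    }
  }
  return issues;
}
```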
Current `foundation-creator` corpus includes:

- `create-foundation-from-vercel-source-packet`
- `create-foundation-from-cloudflare-source-packet`
- `create-foundation-from-lightfast-founder-notes`
- `create-foundation-from-harbor-care-source-packet`
- `update-lightfast-foundation-boundary-surface-question`
- `update-lightfast-foundation-tighten-overreach`
The runner now also writes:

- `deterministic_checks.json` — reference-driven checks derived from the skill's `template.md` and `language.md`
- `timing.json` — per-stage local timing
- `summary.json` — per-trial LLM status + combined status
- `benchmark.json` — aggregated status counts and timing summaries across all trials
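As a rough sketch of what the status files might contain (field names are assumptions, not the schema the runner actually emits):

```ts
// Hypothetical shapes for summary.json and benchmark.json; the real
// schema is whatever scripts/run-baml-eval.ts writes.
interface TrialSummary {
  llmStatus: "Pass" | "Fail";      // per-trial LLM judge status
  combinedStatus: "Pass" | "Fail"; // judge status combined with deterministic checks
}

interface Benchmark {
  statusCounts: Record<string, number>; // aggregated across all trials
  timing: { totalMs: number; perStage: Record<string, number> }; // summarized from timing.json
}
```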
When `--compare` is used, the run directory also includes:

- `comparison.json` — head-to-head summary across variants, all judged by the current skill's evaluator
- `variants/<label>/...` — per-variant packet/brief/candidate/report artifacts and `benchmark.json`
When `--all` is used, the runner executes every eval in the selected skill manifest and writes a suite directory under `skills/<skill>/evals/runs/` with:

- `suite.json` — aggregate status summary for every eval in the manifest
- `<eval-name>/...` — the normal per-eval artifacts for each manifest entry
Suite mode exits with a nonzero status if any eval has a non-`Pass` combined status, making it suitable for CI gates.
When `--smoke` is used, the runner executes only manifest entries marked with `"smoke": true`. The package scripts `eval:foundation:smoke` and `eval:spec:smoke` are the intended lightweight CI commands.
When `--deterministic-only <path>` is used, the runner validates an existing `candidate.md` artifact against deterministic reference checks without calling the candidate model or LLM judge. The path can point to a `candidate.md`, a run directory, or a suite directory:
```sh
bun run eval:spec -- update-add-single-nongoal-preserve-system-overview --deterministic-only skills/spec-creator/evals/runs/<run>/candidate.md
```

Update-mode validation contracts can use `skip_base_check_ids` when a packet explicitly asks to preserve legacy text that would fail a generic create-mode style rule. Keep these skips narrow and pair them with required/forbidden patterns for the actual requested edit.
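A narrowly scoped contract could look like this sketch; apart from `skip_base_check_ids`, the field names, check id, and patterns are illustrative assumptions:

```ts
// Hypothetical update-mode validation contract. skip_base_check_ids is the
// documented field; everything else here is made up for illustration.
const contract = {
  skip_base_check_ids: ["create-style-active-voice"], // skip one known-conflicting base rule
  required_patterns: ["single sign-on is a non-goal"], // the requested edit must land
  forbidden_patterns: ["\\bTODO\\b"],                  // and must not introduce placeholders
};
```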
Current comparison variants:

- `current` — working tree prompt stack
- `previous` — `HEAD~1` snapshot of the skill
- `profile:no-skill` — intentionally under-scaffolded baseline profile for measuring how much the foundation-specific prompt constraints matter
Current eval profiles:

- `fast` — candidate and judge both run on `openai/gpt-5.4-mini`
- `gate` — candidate runs on `openai/gpt-5.4-mini`, judge runs on `openai/gpt-5.4`
- `prod` — candidate uses the skill's default authoring model from `baml_src/clients.baml`, judge runs on `openai/gpt-5.4`
- `cross` — candidate runs on `openai/gpt-5.4-mini`, judge runs on `anthropic/claude-opus-4-7`
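The same split, restated as a hypothetical lookup table (the shape is assumed; the model ids come from the list above):

```ts
// Profile -> candidate/judge model split, per the list above.
const profiles: Record<string, { candidate: string; judge: string }> = {
  fast:  { candidate: "openai/gpt-5.4-mini", judge: "openai/gpt-5.4-mini" },
  gate:  { candidate: "openai/gpt-5.4-mini", judge: "openai/gpt-5.4" },
  prod:  { candidate: "<skill default from baml_src/clients.baml>", judge: "openai/gpt-5.4" },
  cross: { candidate: "openai/gpt-5.4-mini", judge: "anthropic/claude-opus-4-7" },
};
```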
The default authoring client in each skill's `baml_src/clients.baml` is `openai/gpt-5.4` for higher-quality foundation/spec generation. Eval profiles override that default so the tuning loop can stay on cheaper candidate models. `fast` is the default when `--eval-profile` is omitted.
Model profiles are applied as overlay fixtures, so prompt comparisons against `previous` or `profile:no-skill` stay on the same candidate/judge model split. The `cross` profile requires Anthropic model access through Vercel AI Gateway.
Local JSON artifacts remain the source of truth. Optional Braintrust export can be enabled with:

```sh
bun run eval:spec -- create-from-vercel-mcp-source-packet --reporter local,braintrust
```

Braintrust export requires `BRAINTRUST_API_KEY`. The default project is `lightfast-skills`, which can be overridden with `BRAINTRUST_PROJECT`.
Experiment names are generated as:

```
<capability-id>.<suite-mode>.<profile>.<run-kind>.<yyyymmdd-HHMM>.<git-sha>
```

Examples:

```
foundation-doc.smoke.fast.model.20260423-0423.6cbdaa4
service-spec.smoke.fast.deterministic.20260423-0422.6cbdaa4
service-spec.compare.gate.model.20260423-0530.6cbdaa4
```
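A sketch of how such a name could be assembled; the helper itself is hypothetical, only the format is documented:

```ts
// Hypothetical assembly of the documented experiment-name format.
function experimentName(
  capabilityId: string, // e.g. "foundation-doc" or "service-spec"
  suiteMode: string,    // e.g. "smoke" or "compare"
  profile: string,      // e.g. "fast" or "gate"
  runKind: string,      // e.g. "model" or "deterministic"
  gitSha: string,
): string {
  const now = new Date();
  const pad = (n: number) => String(n).padStart(2, "0");
  const stamp =
    `${now.getUTCFullYear()}${pad(now.getUTCMonth() + 1)}${pad(now.getUTCDate())}` +
    `-${pad(now.getUTCHours())}${pad(now.getUTCMinutes())}`;
  return `${capabilityId}.${suiteMode}.${profile}.${runKind}.${stamp}.${gitSha}`;
}
```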
Use stable `capability_id` values in manifests instead of relying on mutable skill package names. Current values are `foundation-doc` and `service-spec`.

Optional Braintrust environment variables are `BRAINTRUST_EXPERIMENT` for manual curated runs and `BRAINTRUST_ORG` for org selection.
Braintrust can also be inspected from the terminal without opening the UI:

```sh
bun run braintrust:list -- --limit 5
bun run braintrust:latest -- --capability foundation-doc
bun run braintrust:latest -- --capability service-spec
bun run braintrust:show -- foundation-doc.smoke.fast.model.20260423-1015.0a10e79
```

These commands use Braintrust's API and BTQL directly, summarize experiment rows, and print combined status counts, LLM status counts, deterministic failures, open issues, timing, and per-eval row status. They require `BRAINTRUST_API_KEY` and use `BRAINTRUST_PROJECT` when set.
Braintrust also provides an optional beta `bt` CLI for listing experiments, running BTQL, and syncing experiment data locally:

```sh
curl -fsSL https://bt.dev/cli/install.sh | bash
bt experiments list --project lightfast-skills --env-file .env --json --no-input
bt sql "SELECT id, input, scores FROM experiment('<experiment-id>') LIMIT 20" --env-file .env --json --no-input
bt sync pull experiment:<experiment-name> --project lightfast-skills --env-file .env
```

Eval manifests also carry lightweight taxonomy metadata (`scenario_type`, `input_shape`, `ambiguity_level`, `domain_profile`, `primary_risks`) so benchmark runs can be grouped by failure mode. Shared taxonomy guidance lives in `evals/TAXONOMY.md`.
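Putting the manifest-level fields together, a single entry might look like this sketch (the taxonomy values and exact schema are illustrative assumptions):

```ts
// Hypothetical manifest entry combining documented fields: capability_id,
// smoke membership, and taxonomy metadata. Values are illustrative.
const entry = {
  name: "create-foundation-from-vercel-source-packet",
  capability_id: "foundation-doc",
  smoke: true,
  scenario_type: "create",
  input_shape: "source-packet",
  ambiguity_level: "low",
  domain_profile: "developer-platform",
  primary_risks: ["scope-overreach"],
};
```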
When `--trials N` is used, the run directory contains `trial-1/`, `trial-2/`, ... plus a top-level `benchmark.json`.
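For example, a three-trial run is laid out like:

```
skills/<skill>/evals/runs/<run>/
├── trial-1/
├── trial-2/
├── trial-3/
└── benchmark.json
```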
`bun run eval:*` loads `.env` automatically through `dotenv-cli`, so `AI_GATEWAY_API_KEY` can live in the repo-local `.env` without manual `source` steps.
CI runs `bun run ci:check` on every pull request and push to `main`. Live smoke evals run only when `AI_GATEWAY_API_KEY` is configured in GitHub Actions secrets. Braintrust export remains optional: if `BRAINTRUST_API_KEY` is present, CI uses `local,braintrust`; otherwise local JSON artifacts remain the source of truth.
For other local commands that should inherit `.env`, use:

```sh
bun run with-env -- bun ./scripts/run-baml-eval.ts foundation-creator create-foundation-from-vercel-source-packet
```

MIT