Testing in this repo is about protecting behavioral invariants, not maximizing line coverage. Coverage can be a useful smell detector, but the contract we care about is “does runtime behavior stay correct under real message flow?”
Prefer tests that lock down cross-component behavior:
- deterministic state transitions,
- message ownership and scoping,
- concurrency safety under runtime switching,
- user-visible failure/recovery semantics.
If a change can break a user flow but has no regression test, the test suite is incomplete even if coverage looks good. The test suite should answer "will this feel correct to users?" not just "did lines execute?"
Use a layered approach:
- pure reducers/state transitions first,
- component update/message-flow tests second,
- rendering contract tests where layout/visibility behavior matters,
- integration/correctness suites for protocol-level guarantees.
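As an illustration of the first layer, a pure reducer can be locked down with direct assertions because it has no I/O and is fully deterministic. The `TurnState` reducer below is a hypothetical sketch with made-up names, not the repo's actual types:

```go
package main

import "fmt"

// Hypothetical chat-turn states and events; names are illustrative only.
type TurnState int

const (
	Idle TurnState = iota
	Streaming
	Done
	Cancelled
)

type Event int

const (
	Start Event = iota
	Chunk
	Finish
	Cancel
)

// reduce is a pure state transition: same inputs always produce the
// same output, which makes it trivially table-testable.
func reduce(s TurnState, e Event) TurnState {
	switch {
	case s == Idle && e == Start:
		return Streaming
	case s == Streaming && e == Chunk:
		return Streaming
	case s == Streaming && e == Finish:
		return Done
	case s == Streaming && e == Cancel:
		return Cancelled
	default:
		return s // terminal states absorb all further events
	}
}

func main() {
	// Invariant: once a turn reaches a terminal state, no event moves it.
	s := reduce(reduce(Idle, Start), Finish)
	fmt.Println(s == Done)                              // true
	fmt.Println(reduce(s, Chunk) == Done)               // true: Done is sticky
	fmt.Println(reduce(Cancelled, Finish) == Cancelled) // true
}
```

In a real suite this would be a table-driven `go test` case per (state, event) pair rather than a `main` function.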
Mock only at boundaries that are truly external or expensive. For local dependencies like SQLite, logging, or theming, real instances are often clearer and more reliable than deep mocks. The bias here is toward realistic behavior and less false confidence.
Chat and onboarding both rely on strict lifecycle semantics. The most important invariants include:
- turn scoping and stream ordering,
- single terminal outcome semantics,
- cancellation behavior and persistence rules,
- deterministic onboarding transitions and safe gate navigation.
When a bug is fixed in these areas, add a focused regression test in the same change. That practice has the highest long-term ROI for stability.
For normal local validation:
```
task do
```

For architecture/lifecycle guardrails only:

```
task lint:naming
task lint:architecture
```

For chat-focused work:

```
go test ./internal/core/chat ./internal/boundary/chat ./internal/app/chat/... -count=1
```

For explicit integration or correctness runs:

```
task test:integration
task test:integration:live
task test:correctness
task test:correctness:powersync-replay
task test:tags:compile
```

Use these targeted workflows when debugging protocol/sync behavior that unit tests alone cannot validate.
CI intentionally runs checks in parallel lanes instead of one monolithic `task do` step:

- lint: `task lint` (includes architecture and naming guards)
- unit: `task test`
- integration: hermetic integration tests (`task test:integration`)
- gen-check: regeneration drift check against committed snapshots
- powersync-replay: deterministic replay correctness check
- tagged-compile: compile guards for opt-in test tags (`task test:tags:compile`)
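As a sketch only (job names and layout here are assumptions, not the repo's actual CI config), parallel lanes typically map to independent jobs in one workflow, with a single aggregation job to mark as required:

```yaml
# Hypothetical GitHub Actions sketch of parallel CI lanes.
name: PR CI
on: pull_request
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: task lint
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: task test
  integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: task test:integration
  required-checks:
    # Aggregation job: the one status to mark "required" on master.
    needs: [lint, unit, integration]
    runs-on: ubuntu-latest
    steps:
      - run: echo "all lanes passed"
```

The aggregation-job pattern keeps branch protection stable even as individual lanes are added or renamed.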
Live integration tests are intentionally split into a separate lane:

- integration-live: `task test:integration:live` against non-production environments (nightly/manual), not required for every PR.
Security checks run in a dedicated workflow with open-source scanners
(govulncheck and osv-scanner) so they can evolve independently from the
main PR gate.
Recommended required checks on master:

- PR CI / required-checks

Recommended non-blocking (advisory) checks:

- Security / dependency-scans
- Workflow Lint / actionlint (path-scoped; do not require globally)