Testing in this repo is about protecting behavioral invariants, not maximizing line coverage. Coverage can be a useful smell detector, but the contract we care about is “does runtime behavior stay correct under real message flow?”
Prefer tests that lock down cross-component behavior:
- deterministic state transitions,
- message ownership and scoping,
- concurrency safety under runtime switching,
- user-visible failure/recovery semantics.
If a change can break a user flow but has no regression test, the test suite is incomplete even if coverage looks good. The test suite should answer "will this feel correct to users?" not just "did lines execute?"
Use a layered approach:
- pure reducers/state transitions first,
- component update/message-flow tests second,
- rendering contract tests where layout/visibility behavior matters,
- integration/correctness suites for protocol-level guarantees.
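As an illustration of the first layer, a pure reducer can be locked down with direct assertions because it has no I/O and is fully deterministic. The `TurnState` reducer below is a hypothetical sketch with made-up names, not the repo's actual types:

```go
package main

import "fmt"

// Hypothetical chat-turn states and events; names are illustrative only.
type TurnState int

const (
	Idle TurnState = iota
	Streaming
	Done
	Cancelled
)

type Event int

const (
	Start Event = iota
	Chunk
	Finish
	Cancel
)

// reduce is a pure state transition: same inputs always produce the
// same output, which makes it trivially table-testable.
func reduce(s TurnState, e Event) TurnState {
	switch {
	case s == Idle && e == Start:
		return Streaming
	case s == Streaming && e == Chunk:
		return Streaming
	case s == Streaming && e == Finish:
		return Done
	case s == Streaming && e == Cancel:
		return Cancelled
	default:
		return s // terminal states absorb all further events
	}
}

func main() {
	// Invariant: once a turn reaches a terminal state, no event moves it.
	s := reduce(reduce(Idle, Start), Finish)
	fmt.Println(s == Done)                              // true
	fmt.Println(reduce(s, Chunk) == Done)               // true: Done is sticky
	fmt.Println(reduce(Cancelled, Finish) == Cancelled) // true
}
```

In a real suite this would be a table-driven `go test` case per (state, event) pair rather than a `main` function.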
Mock only at boundaries that are truly external or expensive. For local dependencies like SQLite, logging, or theming, real instances are often clearer and more reliable than deep mocks. The bias here is toward realistic behavior and less false confidence.
Chat and onboarding both rely on strict lifecycle semantics. The most important invariants include:
- turn scoping and stream ordering,
- single terminal outcome semantics,
- cancellation behavior and persistence rules,
- deterministic onboarding transitions and safe gate navigation.
When a bug is fixed in these areas, add a focused regression test in the same change. That practice has the highest long-term ROI for stability.
For normal local validation:
```
task do
```

For architecture/lifecycle guardrails only:

```
task lint:naming
task lint:architecture
```

For chat-focused work:

```
go test ./internal/core/chat ./internal/boundary/chat ./internal/app/chat/... -count=1
```

For explicit integration or correctness runs:

```
task test:integration
task test:integration:live
task test:correctness
task test:correctness:powersync-replay
task test:tags:compile
```

Use these targeted workflows when debugging protocol/sync behavior that unit tests alone cannot validate.
CI intentionally runs checks in parallel lanes instead of one monolithic `task do` step:

- lint: `task lint` (includes architecture and naming guards)
- unit: `task test`
- integration: hermetic integration tests (`task test:integration`)
- gen-check: regeneration drift check against committed snapshots
- powersync-replay: deterministic replay correctness check
- tagged-compile: compile guards for opt-in test tags (`task test:tags:compile`)
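As a sketch only (job names and layout here are assumptions, not the repo's actual CI config), parallel lanes typically map to independent jobs in one workflow, with a single aggregation job to mark as required:

```yaml
# Hypothetical GitHub Actions sketch of parallel CI lanes.
name: PR CI
on: pull_request
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: task lint
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: task test
  integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: task test:integration
  required-checks:
    # Aggregation job: the one status to mark "required" on master.
    needs: [lint, unit, integration]
    runs-on: ubuntu-latest
    steps:
      - run: echo "all lanes passed"
```

The aggregation-job pattern keeps branch protection stable even as individual lanes are added or renamed.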
Live integration tests are intentionally split into a separate lane:

- integration-live: `task test:integration:live` against non-production environments (nightly/manual), not required for every PR.
Security checks run in a dedicated workflow with open-source scanners
(govulncheck and osv-scanner) so they can evolve independently from the
main PR gate.
Recommended required checks on master:

- PR CI / required-checks

Recommended non-blocking (advisory) checks:

- Security / dependency-scans
- Workflow Lint / actionlint (path-scoped; do not require globally)