Skip to content

feat(ws): harden auth, Redis bridge, and origin handling#3

Open
Snider wants to merge 155 commits intomainfrom
dev
Open

feat(ws): harden auth, Redis bridge, and origin handling#3
Snider wants to merge 155 commits intomainfrom
dev

Conversation

@Snider
Copy link
Copy Markdown
Contributor

@Snider Snider commented Apr 27, 2026

Routine dev→main PR opened by Hephaestus PR-cadence task.

This PR exists to:

  1. Trigger CodeRabbit auto-review of accumulated dev work
  2. Surface any blocking review feedback before merge
  3. Keep main current with dev once approved

ahead_by: 152 commit(s) (per gh api compare)

If CodeRabbit clears with no blocking comments, this PR is eligible for gh pr merge --merge (real merge commit, no force-push). Conflicts and review feedback should be addressed on the dev branch before merge.

Co-authored-by: Hephaestus hephaestus@cladius

Summary by CodeRabbit

  • New Features

    • Strict same‑origin WebSocket validation by default; configurable origin checks
    • Per‑client subscription limits and stronger subscription validation
    • More configurable reconnecting client (max attempts, error callbacks, flexible message handling)
    • New bearer/query token authenticators and improved Redis bridge lifecycle/restart behaviour
  • Bug Fixes

    • Authentication claims are deep‑cloned to prevent post‑auth mutation
    • API key/bearer parsing is case‑insensitive and empty user IDs are rejected
  • Chores

    • Module path and Go toolchain updated; large test and coverage expansion
  • Documentation

    • Updated architecture/history docs to reflect origin‑check default behaviour

Snider and others added 30 commits April 14, 2026 15:18
Spark pass — clean build + tests.

- auth.go: new authentication surface (73 lines)
- ws.go: socket additions to consume the new auth path

Co-Authored-By: Virgil <virgil@lethean.io>
…callbacks

- auth.go: AuthResult.Authenticated as RFC-compatible alias for Valid;
  success results keep both fields in sync
- ws.go: HubStats.Subscribers — total subscriber count across active channels
- ws.go: ReconnectingClient.OnMessage accepts raw []byte OR decoded Message
  callbacks; newline-batched frames dispatched per-message to decoded handlers
- Regression coverage for auth alias, subscriber totals, raw reconnect handling

Verified: go test ./... + -race + go vet pass

Co-Authored-By: Virgil <virgil@lethean.io>
Banned stdlib imports must route through core primitives. Replace
strings.EqualFold with core.Lower comparison for Bearer scheme
matching. Preserves case-insensitive behaviour; tests pass.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Virgil <virgil@lethean.io>
Claude and others added 23 commits April 16, 2026 02:38
… (AX-10)

Adds tests/cli/ws/Taskfile.yaml with canonical build/test/vet/default
targets plus test-unit and REDIS_ADDR-gated test-integration (skipped
cleanly when no Redis endpoint set). default deps-chains build/test/vet
per Wave 2 convention.

Co-authored-by: Codex <noreply@openai.com>
Via-codex-lane: supervised by Cerberus on Athena #104 request
Closes tasks.lthn.sh/view.php?id=308
Removes github.com/stretchr/testify from go.mod/go.sum; rewrites
assert/require in auth_test.go, redis_test.go, ws_test.go to stdlib
t.Fatalf patterns. go mod tidy, go vet, go test all clean.

Closes tasks.lthn.sh/view.php?id=724

Co-authored-by: Codex <noreply@openai.com>
Via-codex-lane: Cladius-solo dispatch (Mac codex CLI)
ws.go (WebSocket hub) + redis.go (pub/sub bridge) use stdlib for
intrinsic transport primitives: bytes (frame assembly), net/http
(upgrade handshake), sync (connection-map guards), time (ping-pong
deadlines), crypto/rand (masking keys/nonces), crypto/tls (wss config).
No core.* equivalents exist at this layer. Added `// Note:` annotations
on each import.

Closes tasks.lthn.sh/view.php?id=725

Co-authored-by: Codex <noreply@openai.com>
Updated `coreerr "dappco.re/go/core/log"` → `coreerr "dappco.re/go/log"`
in auth.go, errors.go, redis.go, ws.go. No stale path remains in .go.

Closes tasks.lthn.sh/view.php?id=723

Co-authored-by: Codex <noreply@openai.com>
- Migrate module path: dappco.re/go/core/ws -> dappco.re/go/ws
- Bump dappco.re/go/* deps to v0.8.0-alpha.1 in go.mod (any forge.lthn.ai/core/* paths migrated to canonical dappco.re/go/* form)

Co-Authored-By: Athena <athena@lthn.ai>
… (#726)

- HubConfig gains AllowedOrigins []string (empty = allow all, dev only)
- NewHubWithConfig derives CheckOrigin from AllowedOrigins when no
  custom CheckOrigin supplied
- NewHub() logs production warning when defaulting to allow-all
- Tests _Good/_Bad/_Ugly + updated default-origin handler tests

Race suite passes (75.3s).

Co-authored-by: Codex <noreply@openai.com>
Closes tasks.lthn.sh/view.php?id=726
Annotated bytes, net/http, strings as AX-6 structural for WebSocket
upgrade boundary, HTTP request/response, origin/host/channel
normalization. redis.go already annotated.

Closes tasks.lthn.sh/view.php?id=725

Co-authored-by: Codex <noreply@openai.com>
Converted testify usage in auth_test.go, errors_test.go, redis_test.go,
ws_test.go to stdlib testing.T patterns. Added test_stdlib_helpers_test.go
for shared deepEqual/nil/empty/contains/eventually/no-panic predicates.
Removed testify line from go.mod.

Closes tasks.lthn.sh/view.php?id=724

Co-authored-by: Codex <noreply@openai.com>
…306)

Replace crypto/rand + encoding/hex sourceID generation with core.ID().
crypto/tls retained with // AX-6-exception: Redis TLS transport config
comment — there's no clean go-io wrapper for TLS config and Redis
TLS connections need it directly.

go build ./... + TestRedisBridge_UniqueSourceIDs pass.

Co-authored-by: Codex <noreply@openai.com>
Closes tasks.lthn.sh/view.php?id=306
bytes.Split / bytes.TrimSpace in the reconnect Message dispatch path
replaced with strings.Split / strings.TrimSpace, then []byte(frame) for
the json.Unmarshal call. bytes import removed.

go build ./... passes.

Co-authored-by: Codex <noreply@openai.com>
Closes tasks.lthn.sh/view.php?id=307
…55 (#304)

Per Option A from the ticket: WebSocket inherently needs HTTP — RFC
6455 defines WebSocket as an HTTP/1.1 upgrade. go-ws IS the ws library
itself (not a consumer of ws), so wrapping HTTP behind go-io would
create circular-dep risk for no clean win.

Lands minimal annotations:
* ws.go — // AX-6-exception: WebSocket requires HTTP upgrade (RFC 6455)
  on the net/http import + a structural-usage block near the first
  http.Request API surface
* auth.go — same import annotation + authentication-during-upgrade
  block explaining authenticators receive the upgrade-time request

go build ./... passes. Diff confined to the 2 allowlist files.

Co-authored-by: Codex <noreply@openai.com>
Closes tasks.lthn.sh/view.php?id=304
# Conflicts:
#	auth_test.go
#	redis.go
#	redis_test.go
#	ws.go
#	ws_test.go
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 27, 2026

📝 Walkthrough

Walkthrough

Finalises and normalises authentication results with claims deep-cloning, tightens origin checks to deny-by-default, hardens and lifecycle-proofs the Redis bridge, enforces per-client subscription limits and identifier validation in the Hub, refactors reconnecting client behaviour, and replaces testify in tests with stdlib helpers; module path and deps updated.

Changes

Cohort / File(s) Summary
Authentication & Claims
auth.go, auth_test.go
Adds AuthResult.Authenticated, finalisation helpers (normalizeAuthResult, finalizeAuthResult), reflection-based deep-cloning of Claims (depth-limited, rejects unsupported kinds), ensures non-blank UserID, adapts AuthenticatorFunc, reworks APIKeyAuthenticator (internal snapshot keys), adds NewBearerTokenAuth and NewQueryTokenAuth, and expands tests (claim cloning matrix, authenticator adapter and integration tests).
Hub & WebSocket Core
ws.go, ws_bench_test.go
Switches subscription target to prefer Message.Channel, enforces origin-check deny-by-default using HubConfig.CheckOrigin/AllowedOrigins, adds MaxSubscriptionsPerClient and identifier validation, routes subscription changes when hub running, stamps/validates outbound messages, expands stats, and updates reconnecting client behaviour (errors, backoff, polymorphic OnMessage). Bench updates mirror API/import changes.
Redis Bridge
redis.go, redis_test.go
Bounds and JSON-decodes Redis envelopes, validates channel prefix and ProcessID, uses core.ID() for source, adds mutex-protected lifecycle (ctx/cancel/pubsub/client) and stopListener(), timeout-scoped publish/subscribe contexts, delivery via hub internals with nil-hub guards, and extensive rewritten tests including cross-bridge integration and self-echo suppression.
Errors & Sentinels
errors.go, errors_test.go
Swaps logging import to dappco.re/go/log, introduces exported errors ErrMissingUserID, ErrInvalidAuthClaims, and ErrSubscriptionLimitExceeded, and adds tests verifying sentinel strings and wrapping behaviour.
Testing Helpers & Suites
test_stdlib_helpers_test.go, ax7_v090_test.go
Adds stdlib-based test helpers (testEqual, testIsNil, testIsEmpty, testContains, testEventually, etc.) and a large new integration test suite including a fake Redis server and many hub/redis/authentication scenarios. Tests replace testify usage across codebase.
Build, Modules & CI Tasks
go.mod, tests/cli/ws/Taskfile.yaml, .golangci.yml
Module path changed to dappco.re/go/ws, Go toolchain bumped to 1.26.2, dependency reshuffle (adds dappco.re/go, dappco.re/go/log, removes prior core module), adds replace for log, removes direct testify requirement; Taskfile added for build/test/lint tasks; golangci config adjusted (formatter section and disabled linters).
Documentation
docs/architecture.md, docs/history.md, docs/index.md
Updates docs to document deny-by-default origin policy, default origin matching logic, and history note about origin check change; removes testify from docs dependency list.
Miscellaneous
ws.go (other changes), various tests
Additional small API and behaviour changes across hub/client code (deterministic snapshots, nil-hub guards, wire-format stamping behaviour), widespread test updates to use new helpers and explicit readiness waits.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.80% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the three primary areas of hardening addressed across the changeset: authentication (claims snapshotting, deep cloning, bearer/query token constructors), Redis bridge (process ID validation, lifecycle safety, state management), and origin handling (strict default with CheckOrigin callback support).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: cannot load module ../go listed in go.work file: open ../go/go.mod: no such file or directory\ngo: cannot load module ../go-log listed in go.work file: open ../go-log/go.mod: no such file or directory\n"


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (9)
docs/architecture.md (1)

418-418: Minor style nit: remove hyphen from "narrowly-scoped".

Per the static analysis hint, compound modifiers where the first word is an adverb ending in "-ly" do not require a hyphen. Consider changing to "narrowly scoped".

✏️ Suggested fix
-**Strict origin check by default.** The WebSocket handler rejects cross-origin upgrades unless `HubConfig.CheckOrigin` explicitly allows them. The built-in default requires the `Origin` scheme and host to match the request target, and callbacks are treated as deny-by-default if they panic. Production deployments should keep the default or supply a narrowly-scoped override.
+**Strict origin check by default.** The WebSocket handler rejects cross-origin upgrades unless `HubConfig.CheckOrigin` explicitly allows them. The built-in default requires the `Origin` scheme and host to match the request target, and callbacks are treated as deny-by-default if they panic. Production deployments should keep the default or supply a narrowly scoped override.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture.md` at line 418, In the "Strict origin check by default."
paragraph that mentions HubConfig.CheckOrigin, remove the hyphen in
"narrowly-scoped" and change it to "narrowly scoped" so the adverb + modifier
follows the recommended style; update the sentence that currently reads "...or
supply a narrowly-scoped override." to "...or supply a narrowly scoped
override."
errors_test.go (2)

25-30: Redundant nil check.

Lines 25-27 check if err is nil and fail. Lines 28-30 then check the same condition again (err == nil || ...). The first check is redundant since the second already handles the nil case.

✏️ Simplified version
 		t.Run(tt.name, func(t *testing.T) {
-			if err := tt.err; err == nil {
-				t.Errorf("expected error")
-			}
-			if err := tt.err; err == nil || err.Error() != tt.want {
+			if tt.err == nil || tt.err.Error() != tt.want {
 				t.Errorf("expected error %q, got %v", tt.want, err)
 			}
-
 		})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@errors_test.go` around lines 25 - 30, The test contains a redundant nil check
for tt.err: remove the first if block that only checks "if err := tt.err; err ==
nil { t.Errorf(...)}" and keep the combined check that verifies nil OR
mismatched error string (the second block). Ensure the remaining check uses a
single err := tt.err local and reports the expected vs actual with
t.Errorf("expected error %q, got %v", tt.want, err) so nil and string-mismatch
are both handled.

12-34: Missing test coverage for new error sentinels.

The tests only cover the original three sentinels (ErrMissingAuthHeader, ErrMalformedAuthHeader, ErrInvalidAPIKey), but errors.go now also exports ErrMissingUserID, ErrInvalidAuthClaims, and ErrSubscriptionLimitExceeded. Consider extending the test table to include these.

✏️ Suggested addition to the test table
 	tests := []struct {
 		name string
 		err  error
 		want string
 	}{
 		{name: "missing header", err: ErrMissingAuthHeader, want: "missing Authorization header"},
 		{name: "malformed header", err: ErrMalformedAuthHeader, want: "malformed Authorization header"},
 		{name: "invalid api key", err: ErrInvalidAPIKey, want: "invalid API key"},
+		{name: "missing user id", err: ErrMissingUserID, want: "authenticated user ID must not be empty"},
+		{name: "invalid auth claims", err: ErrInvalidAuthClaims, want: "authentication claims are invalid"},
+		{name: "subscription limit exceeded", err: ErrSubscriptionLimitExceeded, want: "subscription limit exceeded"},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@errors_test.go` around lines 12 - 34, Extend the
TestErrors_AuthSentinels_Good table in TestErrors_AuthSentinels_Good to include
the three new exported sentinels ErrMissingUserID, ErrInvalidAuthClaims, and
ErrSubscriptionLimitExceeded (alongside the existing ErrMissingAuthHeader,
ErrMalformedAuthHeader, ErrInvalidAPIKey); for each new entry set want equal to
the exact error string defined in errors.go and run the same assertions. Locate
the test table in TestErrors_AuthSentinels_Good and add rows for
ErrMissingUserID, ErrInvalidAuthClaims, and ErrSubscriptionLimitExceeded using
their canonical error messages from errors.go so the t.Run checks compare
err.Error() to the correct expected string.
redis.go (2)

274-282: Error precedence: Redis error shadows local broadcast failure.

When PublishBroadcast delivers locally first (line 274), a local error is captured in localErr. However, if Redis publish also fails, redisErr is returned (line 279), and if Redis succeeds, localErr is returned (line 282). This means a Redis failure masks a local failure.

If both errors matter, consider combining them or logging the shadowed error. Current behaviour may be intentional if Redis state is considered authoritative.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@redis.go` around lines 274 - 282, PublishBroadcast currently returns the
Redis error when rb.publish fails, which hides any prior local error from
rb.hub.broadcastMessage; change the return logic around localErr and redisErr so
both failures are preserved — if both localErr and redisErr are non-nil,
return/construct a combined error (or log one and return the other) that
includes both messages (e.g., "local: ...; redis: ..."), if only one is non-nil
return that one; update the block handling rb.hub.broadcastMessage and
rb.publish to aggregate or log the shadowed error rather than letting redisErr
unconditionally overwrite localErr.

134-139: Auto-start behaviour change in constructor.

NewRedisBridge now automatically calls Start(context.Background()) before returning. This is convenient but means callers can no longer defer starting the listener to a later point with a custom context. The docs (line 283 in architecture.md) still show a separate bridge.Start(ctx) call after construction.

If this is intentional, consider updating the architecture documentation to reflect that Start is called automatically with context.Background(), and that callers can call Stop() + Start(customCtx) if they need a different context.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@redis.go` around lines 134 - 139, The constructor NewRedisBridge currently
calls bridge.Start(context.Background()) before returning, which forces
immediate background start and prevents callers from supplying their own
context; revert this by removing the automatic Start call in NewRedisBridge so
it only constructs and returns the bridge (allowing callers to call
bridge.Start(ctx) or bridge.Start(customCtx) later), and update any
tests/examples that assumed auto-start; alternatively, if auto-start is
intended, update the architecture.md docs and examples to state that
NewRedisBridge starts the listener with context.Background() and document using
Stop() + Start(customCtx) to change the running context.
tests/cli/ws/Taskfile.yaml (2)

28-36: Consider adding -race flag to integration tests for consistency.

The test-integration task omits the -race flag, unlike test and test-unit. Race detection during integration tests can catch concurrency issues in the Redis bridge code paths.

-        go test -count=1 ./... -run Integration -tags integration
+        go test -count=1 -race ./... -run Integration -tags integration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli/ws/Taskfile.yaml` around lines 28 - 36, The test-integration
Taskfile task named "test-integration" omits the Go race detector; update the go
test invocation in the test-integration cmds (the multi-line shell block that
runs `go test -count=1 ./... -run Integration -tags integration`) to include the
`-race` flag (e.g., `go test -race ...`) so integration runs use the same race
detection as `test` and `test-unit`.

38-43: Default task omits gofmt and golangci-lint from QA sequence.

The default task chains build, test, and vet, but project learnings specify running gofmt -l . and golangci-lint run ./... as part of the full QA sequence before commit.

♻️ Suggested addition
+  fmt:
+    dir: ../../..
+    cmds:
+      - gofmt -l .
+
+  lint:
+    dir: ../../..
+    cmds:
+      - golangci-lint run ./...
+
   default:
     deps:
+      - fmt
       - build
       - test
       - vet
+      - lint

Based on learnings: "Run full QA sequence before commit: gofmt -l . && go vet ./... && golangci-lint run ./... && go test -race ./..."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli/ws/Taskfile.yaml` around lines 38 - 43, The default Taskfile target
named "default" currently depends only on build, test, and vet; update it to
include the QA steps so the full pre-commit QA sequence runs: add tasks (or
dependencies) that run "gofmt -l .", "golangci-lint run ./...", and the full
test vet chain (e.g., ensure deps for gofmt, golangci-lint and the existing
"vet" and "test" targets are referenced in the "default" task) so the default
deps list contains build, gofmt, golangci-lint, vet, and test.
auth.go (1)

416-429: Use of unsafe.Pointer for unexported field access is intentional but warrants caution.

The setReflectValue function uses unsafe.Pointer to write unexported struct fields during claims cloning. While this enables deep-cloning of claims containing unexported mutable state (as tested), it bypasses Go's type safety guarantees.

Consider adding a brief doc comment explaining the safety invariants:

+// setReflectValue sets dst to value, using unsafe access for unexported fields.
+// The caller must ensure dst is addressable and value is type-compatible.
 func setReflectValue(dst reflect.Value, value reflect.Value) bool {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@auth.go` around lines 416 - 429, Add a short doc comment above
setReflectValue explaining why unsafe.Pointer and reflect.NewAt(dst.Type(),
unsafe.Pointer(dst.UnsafeAddr())) are used and what safety invariants callers
must uphold (e.g., only used for cloning trusted claim structs, destination
value must be addressable and of the exact expected concrete type, no concurrent
access during mutation, and that this intentionally breaks Go export rules);
reference the function name setReflectValue and the use of
dst.UnsafeAddr/unsafe.Pointer/reflect.NewAt so future maintainers understand the
tradeoffs and where to audit if the assumptions are violated.
redis_test.go (1)

1253-1256: Long sleep for subscription settling may cause flaky tests.

A 1-second sleep to allow subscriptions to settle is relatively long and may still be insufficient under load. Consider using the testEventually pattern for more reliable synchronisation.

-	time.Sleep(1 * time.Second)
+	if !testEventually(func() bool {
+		return bridgeA.isListening() && bridgeB.isListening()
+	}, 3*time.Second, 50*time.Millisecond) {
+		t.Fatal("bridges did not start listening in time")
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@redis_test.go` around lines 1253 - 1256, Replace the fixed 1s sleep with a
deterministic wait using the testEventually pattern: remove time.Sleep(1 *
time.Second) and instead poll until the subscription state is settled (for
example by checking a observable condition on bridgeB such as a
Subscribers()/IsSubscribed()/HasReceiver() method or by asserting a test message
is received) using testEventually with a short interval and a reasonable
timeout; keep the defer bridgeB.Stop() in place and ensure the poll asserts
success before proceeding so the test is not flaky.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@auth_test.go`:
- Around line 1436-1446: The if statement checking
testIsNil(hub.config.Authenticator) and the t.Errorf call are split with an
embedded comment, making it hard to read; reformat so the condition is a single
well-formed if (e.g., if !testIsNil(hub.config.Authenticator) { ... }) move or
remove the large comment block outside the if, and make the t.Errorf call
contiguous (t.Errorf("expected nil, got %T", hub.config.Authenticator)) so the
check and error reporting for hub.config.Authenticator and testIsNil are clear
and syntactically correct.
- Around line 268-275: The section comment "Unit tests — AuthenticatorFunc
adapter" is accidentally inside the if block of
TestAPIKeyAuthenticator_NilMap_Good; move that comment out of the function body
(place it before the next test or after the closing brace of
TestAPIKeyAuthenticator_NilMap_Good) so the comment becomes a top-level section
marker again and the test function contains only test logic (refer to
TestAPIKeyAuthenticator_NilMap_Good and the section comment string to locate and
relocate it).

In `@go.mod`:
- Around line 1-10: Update the import in ws_test.go that currently reads coreerr
"dappco.re/go/core/log" to the new module path coreerr "dappco.re/go/log";
locate the import block in ws_test.go (look for the alias coreerr) and replace
the old path, then run go mod tidy/build to verify the test compiles against the
updated dependency.

In `@redis_test.go`:
- Around line 350-357: The section comment "PublishBroadcast — messages reach
local WebSocket clients" is currently placed inside an if block near the
t.Errorf/err.Error() assertion; move that comment out of the conditional and
place it between the two test functions (i.e., above the next test's function
declaration) so it serves as a section header rather than being embedded inside
the if that contains t.Errorf and err.Error().

---

Nitpick comments:
In `@auth.go`:
- Around line 416-429: Add a short doc comment above setReflectValue explaining
why unsafe.Pointer and reflect.NewAt(dst.Type(),
unsafe.Pointer(dst.UnsafeAddr())) are used and what safety invariants callers
must uphold (e.g., only used for cloning trusted claim structs, destination
value must be addressable and of the exact expected concrete type, no concurrent
access during mutation, and that this intentionally breaks Go export rules);
reference the function name setReflectValue and the use of
dst.UnsafeAddr/unsafe.Pointer/reflect.NewAt so future maintainers understand the
tradeoffs and where to audit if the assumptions are violated.

In `@docs/architecture.md`:
- Line 418: In the "Strict origin check by default." paragraph that mentions
HubConfig.CheckOrigin, remove the hyphen in "narrowly-scoped" and change it to
"narrowly scoped" so the adverb + modifier follows the recommended style; update
the sentence that currently reads "...or supply a narrowly-scoped override." to
"...or supply a narrowly scoped override."

In `@errors_test.go`:
- Around line 25-30: The test contains a redundant nil check for tt.err: remove
the first if block that only checks "if err := tt.err; err == nil {
t.Errorf(...)}" and keep the combined check that verifies nil OR mismatched
error string (the second block). Ensure the remaining check uses a single err :=
tt.err local and reports the expected vs actual with t.Errorf("expected error
%q, got %v", tt.want, err) so nil and string-mismatch are both handled.
- Around line 12-34: Extend the TestErrors_AuthSentinels_Good table in
TestErrors_AuthSentinels_Good to include the three new exported sentinels
ErrMissingUserID, ErrInvalidAuthClaims, and ErrSubscriptionLimitExceeded
(alongside the existing ErrMissingAuthHeader, ErrMalformedAuthHeader,
ErrInvalidAPIKey); for each new entry set want equal to the exact error string
defined in errors.go and run the same assertions. Locate the test table in
TestErrors_AuthSentinels_Good and add rows for ErrMissingUserID,
ErrInvalidAuthClaims, and ErrSubscriptionLimitExceeded using their canonical
error messages from errors.go so the t.Run checks compare err.Error() to the
correct expected string.

In `@redis_test.go`:
- Around line 1253-1256: Replace the fixed 1s sleep with a deterministic wait
using the testEventually pattern: remove time.Sleep(1 * time.Second) and instead
poll until the subscription state is settled (for example by checking a
observable condition on bridgeB such as a
Subscribers()/IsSubscribed()/HasReceiver() method or by asserting a test message
is received) using testEventually with a short interval and a reasonable
timeout; keep the defer bridgeB.Stop() in place and ensure the poll asserts
success before proceeding so the test is not flaky.

In `@redis.go`:
- Around line 274-282: PublishBroadcast currently returns the Redis error when
rb.publish fails, which hides any prior local error from
rb.hub.broadcastMessage; change the return logic around localErr and redisErr so
both failures are preserved — if both localErr and redisErr are non-nil,
return/construct a combined error (or log one and return the other) that
includes both messages (e.g., "local: ...; redis: ..."), if only one is non-nil
return that one; update the block handling rb.hub.broadcastMessage and
rb.publish to aggregate or log the shadowed error rather than letting redisErr
unconditionally overwrite localErr.
- Around line 134-139: The constructor NewRedisBridge currently calls
bridge.Start(context.Background()) before returning, which forces immediate
background start and prevents callers from supplying their own context; revert
this by removing the automatic Start call in NewRedisBridge so it only
constructs and returns the bridge (allowing callers to call bridge.Start(ctx) or
bridge.Start(customCtx) later), and update any tests/examples that assumed
auto-start; alternatively, if auto-start is intended, update the architecture.md
docs and examples to state that NewRedisBridge starts the listener with
context.Background() and document using Stop() + Start(customCtx) to change the
running context.

In `@tests/cli/ws/Taskfile.yaml`:
- Around line 28-36: The test-integration Taskfile task named "test-integration"
omits the Go race detector; update the go test invocation in the
test-integration cmds (the multi-line shell block that runs `go test -count=1
./... -run Integration -tags integration`) to include the `-race` flag (e.g.,
`go test -race ...`) so integration runs use the same race detection as `test`
and `test-unit`.
- Around line 38-43: The default Taskfile target named "default" currently
depends only on build, test, and vet; update it to include the QA steps so the
full pre-commit QA sequence runs: add tasks (or dependencies) that run "gofmt -l
.", "golangci-lint run ./...", and the full test vet chain (e.g., ensure deps
for gofmt, golangci-lint and the existing "vet" and "test" targets are
referenced in the "default" task) so the default deps list contains build,
gofmt, golangci-lint, vet, and test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f8c75087-1193-4902-8d94-50e596427e06

📥 Commits

Reviewing files that changed from the base of the PR and between 4b9176e and d526e0d.

⛔ Files ignored due to path filters (3)
  • go.sum is excluded by !**/*.sum
  • go.work is excluded by !**/*.work
  • go.work.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • auth.go
  • auth_test.go
  • docs/architecture.md
  • docs/history.md
  • errors.go
  • errors_test.go
  • go.mod
  • redis.go
  • redis_test.go
  • test_stdlib_helpers_test.go
  • tests/cli/ws/Taskfile.yaml
  • ws.go
  • ws_bench_test.go
  • ws_test.go

Comment thread auth_test.go Outdated
Comment thread auth_test.go
Comment thread go.mod
Comment thread redis_test.go Outdated
Critical fix:
- ws_test.go imported coreerr "dappco.re/go/core/log" — stale path
  that no longer exists in the dependency graph after the alias
  migration in commit 121bede. Updated to coreerr "dappco.re/go/log"
  to match auth.go/errors.go/redis.go/ws.go and the go.mod declaration.

Minor cleanups (CodeRabbit-flagged misplaced section comments):
- auth_test.go:268-272: section header "Unit tests — AuthenticatorFunc
  adapter" was embedded inside an if-block; lifted to between the
  two test functions.
- auth_test.go:1438-1443: section header "Integration tests — httptest
  + gorilla/websocket Dial" was embedded inside a t.Errorf argument
  list; lifted to between the test function and the helper.
- redis_test.go:349-353: section header "PublishBroadcast — messages
  reach local WebSocket clients" was embedded inside an if-block;
  lifted to before the next test function.

Pre-existing layout artefacts that survived the testify-removal pass
in HEAD; surfaced by CodeRabbit on the dev→main PR.

Co-authored-by: Hephaestus <hephaestus@cladius>
@Snider Snider changed the title chore(dev→main): routine PR cadence for CodeRabbit review feat(ws): harden auth, Redis bridge, and origin handling Apr 27, 2026
Snider and others added 2 commits April 27, 2026 15:29
15 files modified, +505/-361. Codex follow-up to Hephaestus's earlier
69f9da0 (which fixed the coreerr alias path on test files).

Code fixes:
- redis.go: PublishBroadcast no longer shadows local/Redis errors
- redis.go: NewRedisBridge constructor no longer auto-starts listener
  with background context
- auth.go: setReflectValue unsafe-invariants doc comment
- redis_test.go: replaced fixed 1s bridge-settle sleep with testEventually
- errors_test.go: redundant nil check removed; new sentinel coverage
- tests/cli/ws/Taskfile.yaml: integration tests gain -race; default QA
  gains fmt + lint
- docs/architecture.md: 'narrowly-scoped' style nit

Disposition replies (already-fixed by 69f9da0, codex verified):
- auth_test.go misplaced AuthenticatorFunc section comment
- auth_test.go split nil-authenticator assertion
- ws_test.go stale dappco.re/go/core/log alias
- redis_test.go misplaced PublishBroadcast section comment

Doc/config:
- docs/index.md: removed stale testify dependency row
- .golangci.yml: migrated to v2 schema
- go.mod: added dappco.re/go/log replace for cold-cache verification

Verification: gofmt -l clean, golangci-lint run 0 issues, go vet +
go test -count=1 ./... pass with explicit GOPATH/GOMODCACHE/GOCACHE.

Closes residual findings on #3

Co-authored-by: Codex <noreply@openai.com>
bash /tmp/v090/audit.sh . → verdict: COMPLIANT (all 7 dimensions zero).

Co-authored-by: Codex <noreply@openai.com>
Co-Authored-By: Virgil <virgil@lethean.io>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@auth.go`:
- Around line 485-510: The APIKeyAuthenticator remains backwards-incompatible
for callers who use struct literals (setting APIKeyAuthenticator.Keys) because
Authenticate reads only the private keys snapshot; fix this by ensuring the
private snapshot is populated from the exported Keys when unset: in Authenticate
(or a small helper like ensureKeysInitialized) check if a.keys == nil and if so
set a.keys = cloneStringMap(a.Keys) (handling nil Keys as empty map), then
proceed to use a.keys for lookups; reference APIKeyAuthenticator, Keys, keys,
NewAPIKeyAuth and Authenticate to locate where to add this lazy initialization.

In `@redis_test.go`:
- Around line 156-168: The test TestRedisBridge_InvalidPrefix_Ugly should be
guarded to auto-skip when the shared Redis is unreachable: call skipIfNoRedis(t)
at the start of TestRedisBridge_InvalidPrefix_Ugly before invoking
NewRedisBridge, so the test is skipped if redisAddr is not available;
alternatively, replace the external Redis dependency by using the local fake
Redis helper instead of passing Addr: redisAddr to NewRedisBridge to ensure the
invalid-prefix assertion always runs.

In `@ws.go`:
- Around line 1330-1335: The defer that closes the batch writer inside writePump
(the closed flag and defer calling logCloseError("Client.writePump.writer",
w.Close)) leaks writers because defer runs only when writePump returns; remove
that per-iteration defer and instead explicitly close the writer on every
error/exit path after writing (or refactor the batch write loop into a small
helper function so the defer is scoped to a single iteration). Concretely,
update writePump to stop using the per-iteration defer and call w.Close() (with
logCloseError on non-nil errors) immediately after write or on any write error,
or move the code that creates and uses w into a helper function (e.g.,
writeBatch) where the defer for w.Close() is local to that function.
- Around line 1028-1036: The allowedOriginsCheck currently returns
allowAllOriginsCheck when the provided allowedOrigins slice is empty; change
this to return sameOriginCheck (or an explicit deny function) so that an empty
AllowedOrigins does not permit all origins. Locate the functions
allowedOriginsCheck and allowAllOriginsCheck and update the empty-slice branch
to return sameOriginCheck (and remove or restrict use of allowAllOriginsCheck as
the default) so hubs with authentication do not fall back to permissive origin
checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3920b04a-b75d-4df5-9645-c6a2886d933a

📥 Commits

Reviewing files that changed from the base of the PR and between d526e0d and 485f272.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • .golangci.yml
  • auth.go
  • auth_test.go
  • ax7_v090_test.go
  • docs/architecture.md
  • docs/index.md
  • errors_test.go
  • go.mod
  • redis.go
  • redis_test.go
  • test_stdlib_helpers_test.go
  • tests/cli/ws/Taskfile.yaml
  • ws.go
  • ws_bench_test.go
  • ws_test.go
💤 Files with no reviewable changes (1)
  • docs/index.md
✅ Files skipped from review due to trivial changes (3)
  • docs/architecture.md
  • tests/cli/ws/Taskfile.yaml
  • test_stdlib_helpers_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • errors_test.go
  • redis.go

Comment thread auth.go
Comment on lines 485 to +510
type APIKeyAuthenticator struct {
// Keys maps API key values to user IDs.
// Keys is a construction-time snapshot of API key values to user IDs.
// Treat it as read-only; Authenticate uses the internal snapshot.
Keys map[string]string

keys map[string]string
}

// NewAPIKeyAuth creates an APIKeyAuthenticator from the given keyuserID
// NewAPIKeyAuth creates an APIKeyAuthenticator from the given key-to-userID
// mapping. The returned authenticator validates `Authorization: Bearer <key>`
// headers against the provided keys.
func NewAPIKeyAuth(keys map[string]string) *APIKeyAuthenticator {
return &APIKeyAuthenticator{Keys: keys}
if keys == nil {
return &APIKeyAuthenticator{
Keys: nil,
keys: nil,
}
}

snapshot := cloneStringMap(keys)

return &APIKeyAuthenticator{
Keys: snapshot,
keys: cloneStringMap(snapshot),
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep APIKeyAuthenticator backwards-compatible with struct literals.

Keys is still exported, but Authenticate now reads only the private keys snapshot. Any caller constructing &APIKeyAuthenticator{Keys: ...} will now silently reject every API key after this change. Either hydrate the private snapshot from Keys when it is unset, or remove the exported field so the broken initialisation path is impossible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@auth.go` around lines 485 - 510, The APIKeyAuthenticator remains
backwards-incompatible for callers who use struct literals (setting
APIKeyAuthenticator.Keys) because Authenticate reads only the private keys
snapshot; fix this by ensuring the private snapshot is populated from the
exported Keys when unset: in Authenticate (or a small helper like
ensureKeysInitialized) check if a.keys == nil and if so set a.keys =
cloneStringMap(a.Keys) (handling nil Keys as empty map), then proceed to use
a.keys for lookups; reference APIKeyAuthenticator, Keys, keys, NewAPIKeyAuth and
Authenticate to locate where to add this lazy initialization.

Comment thread redis_test.go
Comment on lines +156 to +168
func TestRedisBridge_InvalidPrefix_Ugly(t *testing.T) {
hub := NewHub()

_, err := NewRedisBridge(hub, RedisConfig{
Addr: redisAddr,
Prefix: "bad prefix",
})
if err := err; err == nil {
t.Fatalf("expected error")
}
if !testContains(err.Error(), "invalid redis prefix") {
t.Errorf("expected %v to contain %v", err.Error(), "invalid redis prefix")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard this Redis-dependent test with skipIfNoRedis(t).

This case reaches NewRedisBridge with Addr: redisAddr, so it can fail before the invalid-prefix assertion even runs when the shared Redis is unavailable. Add the skip guard here or switch this case to the local fake Redis helper. As per coding guidelines "Redis tests must auto-skip via skipIfNoRedis(t) when Redis at 10.69.69.87:6379 is unreachable".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@redis_test.go` around lines 156 - 168, The test
TestRedisBridge_InvalidPrefix_Ugly should be guarded to auto-skip when the
shared Redis is unreachable: call skipIfNoRedis(t) at the start of
TestRedisBridge_InvalidPrefix_Ugly before invoking NewRedisBridge, so the test
is skipped if redisAddr is not available; alternatively, replace the external
Redis dependency by using the local fake Redis helper instead of passing Addr:
redisAddr to NewRedisBridge to ensure the invalid-prefix assertion always runs.

Comment thread ws.go
Comment on lines +1028 to +1036
func allowAllOriginsCheck(*http.Request) bool {
return true
}

// Populate auth fields when authentication succeeded.
if h.config.Authenticator != nil {
client.UserID = authResult.UserID
client.Claims = authResult.Claims
func allowedOriginsCheck(allowedOrigins []string) func(*http.Request) bool {
allowedOrigins = slices.Clone(allowedOrigins)
if len(allowedOrigins) == 0 {
return allowAllOriginsCheck
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Do not default origin checks to “allow all”.

When AllowedOrigins is empty, this falls back to allowAllOriginsCheck, so a hub with authentication but no explicit origin policy will still accept cross-site browser upgrades from any origin. That re-opens WebSocket CSRF for cookie-backed sessions and undermines the hardening in this PR. Default this path to sameOriginCheck or explicit deny instead.

Suggested fix
 func allowedOriginsCheck(allowedOrigins []string) func(*http.Request) bool {
 	allowedOrigins = slices.Clone(allowedOrigins)
 	if len(allowedOrigins) == 0 {
-		return allowAllOriginsCheck
+		return sameOriginCheck
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func allowAllOriginsCheck(*http.Request) bool {
return true
}
// Populate auth fields when authentication succeeded.
if h.config.Authenticator != nil {
client.UserID = authResult.UserID
client.Claims = authResult.Claims
func allowedOriginsCheck(allowedOrigins []string) func(*http.Request) bool {
allowedOrigins = slices.Clone(allowedOrigins)
if len(allowedOrigins) == 0 {
return allowAllOriginsCheck
}
func allowAllOriginsCheck(*http.Request) bool {
return true
}
func allowedOriginsCheck(allowedOrigins []string) func(*http.Request) bool {
allowedOrigins = slices.Clone(allowedOrigins)
if len(allowedOrigins) == 0 {
return sameOriginCheck
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ws.go` around lines 1028 - 1036, The allowedOriginsCheck currently returns
allowAllOriginsCheck when the provided allowedOrigins slice is empty; change
this to return sameOriginCheck (or an explicit deny function) so that an empty
AllowedOrigins does not permit all origins. Locate the functions
allowedOriginsCheck and allowAllOriginsCheck and update the empty-slice branch
to return sameOriginCheck (and remove or restrict use of allowAllOriginsCheck as
the default) so hubs with authentication do not fall back to permissive origin
checks.

Comment thread ws.go
Comment on lines +1330 to +1335
closed := false
defer func() {
if !closed {
logCloseError("Client.writePump.writer", w.Close)
}
}()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove the per-message defer from writePump.

That defer does not run at the end of the iteration; it runs only when writePump exits. Under sustained traffic this stacks one closure per batch and retains every writer until disconnect. Close the writer explicitly on each error path, or move the batch write into a small helper so any defer is scoped to a single iteration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ws.go` around lines 1330 - 1335, The defer that closes the batch writer
inside writePump (the closed flag and defer calling
logCloseError("Client.writePump.writer", w.Close)) leaks writers because defer
runs only when writePump returns; remove that per-iteration defer and instead
explicitly close the writer on every error/exit path after writing (or refactor
the batch write loop into a small helper function so the defer is scoped to a
single iteration). Concretely, update writePump to stop using the per-iteration
defer and call w.Close() (with logCloseError on non-nil errors) immediately
after write or on any write error, or move the code that creates and uses w into
a helper function (e.g., writeBatch) where the defer for w.Close() is local to
that function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants