fix: harden certificate, dlq, ssrf, idempotency, and webhook paths#427
fix: harden certificate, dlq, ssrf, idempotency, and webhook paths#427fredcamaral merged 9 commits intodevelopfrom
Conversation
Prevent nil concrete signers from panicking during rotation and make full certificate chains available for hot-reload validation. This keeps TLS reload paths nil-safe while preserving chain data for callers that need it.
Default DLQ metric recording to a noop implementation so retry paths stay nil-safe, and keep panic-recovered retries from dropping messages. The key helpers move into a dedicated file to keep handler and consumer responsibilities narrower.
Ignore nil options safely, respect canceled contexts during validation, and reject additional metadata and special-purpose IPv6 targets. This keeps SSRF policy enforcement consistent even when callers pass malformed configuration.
Surface cached-response marshal failures in logs without breaking the request path, and cover PUT so all mutating methods keep the same replay semantics. This makes silent cache misses diagnosable while preserving fail-open behavior.
Keep signature helpers isolated from delivery flow, retry 429 responses correctly, and preserve the last transport failure when retries are exhausted. This makes webhook failures easier to debug without changing the public delivery API.
Document the updated certificate, DLQ, and idempotency contracts, and record security trade-offs around schema env-var exposure and the remaining proxy transport SSRF gap. This keeps the repo guidance aligned with the code that now ships.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThis PR makes cross-cutting changes: certificate Manager.Rotate accepts optional intermediates and adds a panic-safe public-key check; LoadFromFilesWithChain returns the DER chain; DLQ always uses a concrete noop metrics implementation and adds RecordLost while tenant-scoped Redis key helpers were moved/implemented in keys.go; idempotency middleware reads the idempotency key from X-Idempotency and logs JSON marshal failures when caching; webhook signature verification was introduced as a dedicated module (HMAC v0/v1) and deliverer retry/error handling was adjusted; SSRF checks, nil-safety, and context validation were extended; several tests were added. Sequence Diagram(s)mermaid Deliverer->>EndpointLister: Request endpoints 🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@commons/net/http/proxy_transport.go`:
- Around line 14-17: Update the misleading TODO: clarify that the transport
already performs DNS resolution and pins the connection by using the resolved IP
(see assignment addr = safeIP.String()), and that the remaining gap is the
TOCTOU window between validation and dialing; recommend migrating to
ssrf.ResolveAndValidate (commons/security/ssrf/validate.go:ResolveAndValidate)
to eliminate that window and centralize DNS resolution/validation rather than
saying DNS resolution/pinning is not done.
In `@commons/security/ssrf/validate.go`:
- Around line 42-44: The call to ctx.Err() in ValidateURL can panic if ctx is
nil; update ValidateURL to check for a nil context before calling ctx.Err()
(e.g., if ctx == nil { return fmt.Errorf("%w: %v", ErrInvalidURL,
errors.New("nil context")) } or equivalent) so the function is nil-safe while
preserving the ErrInvalidURL sentinel in the returned error; ensure the change
references the existing ErrInvalidURL symbol and replaces the direct ctx.Err()
invocation with a nil-guarded path.
In `@commons/webhook/deliverer_test.go`:
- Around line 1147-1169: Rename the test function
TestDeliver_ContextCancelledDuringRetry to a clearer name that reflects what it
actually verifies (e.g., TestDeliver_ListerError_ReturnsPromptly or
TestDeliver_ListerErrorWithContextCanceled); update the test declaration and any
references to it, leaving the test body (mockLister, NewDeliverer,
WithMaxRetries, d.Deliver, assertions) unchanged so the behavior being asserted
remains the same.
In `@commons/webhook/deliverer.go`:
- Around line 616-618: The code currently clones an http.Transport per request
via transport.Clone(), which prevents connection reuse for HTTPS pinned
endpoints; add a concurrent cache (use sync.Map) keyed by the SNI hostname
(e.g., TLS ServerName or derived from the pinned URL) to store and reuse cloned
transports so Clone() is only called once per SNI; modify the code around
transport.Clone() in deliverer.go (where pinned HTTPS deliveries are handled)
to: check cache for existing transport, if missing clone and store it atomically
in the sync.Map, and then use the cached transport for requests; ensure
thread-safety and that the cached transport preserves per-host
TLSClientConfig.ServerName and other pinning settings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 85938093-ffb8-4cd0-b3e5-886c110500d2
📒 Files selected for processing (19)
AGENTS.mdcommons/certificate/certificate.gocommons/certificate/certificate_test.gocommons/dlq/consumer.gocommons/dlq/consumer_test.gocommons/dlq/handler.gocommons/dlq/keys.gocommons/net/http/idempotency/idempotency.gocommons/net/http/idempotency/idempotency_test.gocommons/net/http/proxy_transport.gocommons/security/ssrf/hostname.gocommons/security/ssrf/options.gocommons/security/ssrf/ssrf.gocommons/security/ssrf/ssrf_test.gocommons/security/ssrf/validate.gocommons/systemplane/adapters/http/fiber/dto.gocommons/webhook/deliverer.gocommons/webhook/deliverer_test.gocommons/webhook/signature.go
Fail fast with ErrInvalidURL when callers pass a nil context so URL validation stays nil-safe instead of panicking. Clarify the reverse-proxy transport note to describe the real remaining TOCTOU gap in SSRF resolution.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@commons/security/ssrf/validate.go`:
- Around line 43-49: Add the same nil/canceled-context preflight checks used in
ValidateURL to ResolveAndValidate: at the start of ResolveAndValidate, check if
ctx == nil and return fmt.Errorf("%w: %w", ErrInvalidURL, errors.New("nil
context")), then check ctx.Err() and return fmt.Errorf("%w: %w", ErrInvalidURL,
err) if non-nil; reference the ResolveAndValidate function and reuse the same
ErrInvalidURL wrapping behavior so both exported validators are nil-safe and
fail-fast.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 94568938-d183-4df7-9d70-f93bf68dce0e
📒 Files selected for processing (3)
commons/net/http/proxy_transport.gocommons/security/ssrf/ssrf_test.gocommons/security/ssrf/validate.go
Mirror the ValidateURL preflight in ResolveAndValidate so nil and canceled contexts return ErrInvalidURL instead of reaching DNS resolution paths. Add regression coverage for both nil and canceled contexts.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@commons/security/ssrf/ssrf_test.go`:
- Around line 402-409: Add a new unit test mirroring the canceled-context
coverage present for ResolveAndValidate: create TestValidateURL_CanceledContext
that creates a canceled context (use context.WithCancel + cancel()), calls
ValidateURL(ctx, "https://example.com/hook"), and asserts an error is returned,
that the error is ErrInvalidURL and that the error wraps context.Canceled; place
it alongside TestValidateURL_NilContext to ensure parity with ResolveAndValidate
tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8be78d12-d8e0-4bf1-98da-2eb133bceed1
📒 Files selected for processing (2)
commons/security/ssrf/ssrf_test.gocommons/security/ssrf/validate.go
Reuse the same nil and canceled context guard across both SSRF validators so behavior stays aligned while keeping ResolveAndValidate below the repository complexity threshold. This preserves the existing ErrInvalidURL wrapping and restores a clean CI pipeline.
|
@CodeRabbit -- we need your approve ;) |
|
Tip For best results, initiate chat on the files or code changes.
Once you do that, I'll go ahead and approve the PR! 🚀 |
|
@CodeRabbit approve |
✅ Actions performedComments resolved and changes approved. |
Summary
Testing