NOISSUE - Post-handshake aTLS #582

Merged
drasko merged 6 commits into ultravioletrs:main from danko-miladinovic:seat-atls on Mar 26, 2026

@danko-miladinovic
Contributor

What type of PR is this?

This is a feature because it introduces post-handshake attestation with exported authenticators. The work is based on:
https://www.rfc-editor.org/rfc/rfc9261.html
https://datatracker.ietf.org/doc/html/draft-fossati-seat-expat

What does this do?

This PR replaces the legacy pkg/atls certificate-extension-based aTLS implementation with the new Exported Authenticator (EA) based aTLS transport.

The new design keeps crypto/tls as the base secure channel and performs attestation as the first post-handshake message exchange using Exported Authenticators. This aligns the cocos integration with the phase-3 EA/aTLS implementation and removes the old nonce/SNI + VerifyPeerCertificate flow.

For attested connections, the connection flow is now:

  • TLS handshake
  • EA AuthenticatorRequest sent by the client
  • Exported Authenticator returned by the server (with the attestation report)
  • Normal gRPC / HTTP traffic starts

This guarantees that the first post-handshake application data on the TLS channel is the EA exchange.
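
To make the flow above more concrete, the following Go sketch covers only the part the standard library provides: a TLS 1.3 handshake, followed by both peers deriving the server authenticator handshake context through the TLS exporter with the label defined in RFC 9261. It is not the code from this PR. The in-memory pipe, the throwaway self-signed certificate, and the helper names (`selfSigned`, `handshakeContexts`) are assumptions for illustration, and the 32-byte length assumes a SHA-256 cipher suite.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// Exporter label from RFC 9261 for authenticators sent by the server.
const serverCtxLabel = "EXPORTER-server authenticator handshake context"

// selfSigned builds a throwaway certificate (hypothetical helper, demo only).
func selfSigned() (tls.Certificate, error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &priv.PublicKey, priv)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: priv}, nil
}

// handshakeContexts completes a TLS 1.3 handshake over an in-memory pipe and
// returns the handshake context derived by the client and by the server.
func handshakeContexts() ([]byte, []byte, error) {
	cert, err := selfSigned()
	if err != nil {
		return nil, nil, err
	}
	cliRaw, srvRaw := net.Pipe()
	defer cliRaw.Close()
	defer srvRaw.Close()

	srv := tls.Server(srvRaw, &tls.Config{
		Certificates:           []tls.Certificate{cert},
		MinVersion:             tls.VersionTLS13,
		SessionTicketsDisabled: true, // avoid ticket writes blocking the unbuffered pipe
	})
	cli := tls.Client(cliRaw, &tls.Config{
		MinVersion:         tls.VersionTLS13,
		InsecureSkipVerify: true, // demo only: the certificate is self-signed
	})

	type result struct {
		ctx []byte
		err error
	}
	done := make(chan result, 1)
	go func() {
		if err := srv.Handshake(); err != nil {
			done <- result{err: err}
			return
		}
		cs := srv.ConnectionState()
		ctx, err := cs.ExportKeyingMaterial(serverCtxLabel, nil, 32)
		done <- result{ctx: ctx, err: err}
	}()

	if err := cli.Handshake(); err != nil {
		return nil, nil, err
	}
	cs := cli.ConnectionState()
	cliCtx, err := cs.ExportKeyingMaterial(serverCtxLabel, nil, 32)
	if err != nil {
		return nil, nil, err
	}
	r := <-done
	return cliCtx, r.ctx, r.err
}

func main() {
	c, s, err := handshakeContexts()
	fmt.Println("error:", err, "contexts match:", bytes.Equal(c, s))
}
```

In a real EA exchange, this exported value would feed the authenticator's CertificateVerify and Finished computations; the request and authenticator messages themselves are outside what this sketch shows.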

Which issue(s) does this PR fix/relate to?

No issue.

Have you included tests for your changes?

Yes, tests are included.

Did you document any new/modified feature?

No, the documentation will be updated in a separate PR.

Notes

danko-miladinovic self-assigned this on Mar 20, 2026
danko-miladinovic added the enhancement (New feature or request) label on Mar 20, 2026
danko-miladinovic marked this pull request as ready for review on March 20, 2026 15:49
jovan-djukic self-requested a review on March 23, 2026 11:11

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 43.75897% with 784 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.13%. Comparing base (42b0552) to head (1b780ec).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/atls/internal_transport/conn.go | 33.12% | 74 Missing and 31 partials ⚠️ |
| pkg/atls/ea/authenticator.go | 56.88% | 49 Missing and 45 partials ⚠️ |
| pkg/atls/ea/request.go | 41.86% | 58 Missing and 17 partials ⚠️ |
| pkg/atls/ea/certverify.go | 41.93% | 44 Missing and 10 partials ⚠️ |
| pkg/atls/tls_helpers.go | 0.00% | 52 Missing ⚠️ |
| pkg/atls/server_tls.go | 0.00% | 51 Missing ⚠️ |
| pkg/atls/evidence_verifier.go | 0.00% | 50 Missing ⚠️ |
| pkg/atls/provider.go | 0.00% | 38 Missing ⚠️ |
| pkg/atls/ea/certificate.go | 60.00% | 11 Missing and 11 partials ⚠️ |
| pkg/atls/ea/sigscheme.go | 43.58% | 21 Missing and 1 partial ⚠️ |
| ... and 18 more | | |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #582      +/-   ##
==========================================
- Coverage   73.57%   67.13%   -6.45%     
==========================================
  Files          96      116      +20     
  Lines        6123     7223    +1100     
==========================================
+ Hits         4505     4849     +344     
- Misses       1204     1794     +590     
- Partials      414      580     +166     

if req != nil && !bytes.Equal(certMsg.Context, reqCtx) {
return nil, ErrContextMismatch
}
if len(certMsg.Entries) == 0 {
Contributor

When the first message is Certificate (not Finished), the code parses all three messages (Certificate, CertificateVerify, Finished) on lines 242-265. However, if certMsg.Entries is empty, this block returns success without verifying the Finished MAC.

This differs from the proper empty authenticator path (lines 210-235) which correctly verifies the Finished message. An attacker could exploit this by sending Certificate(entries=[]) + CertificateVerify + Finished with arbitrary data in the Finished message.

Per the EA specification, an empty authenticator should be a Finished-only message, not a Certificate with empty entries followed by CertificateVerify and Finished.

Contributor Author

Thanks, this will be changed so that a Certificate message with zero entries followed by CertificateVerify/Finished is treated as malformed and is not accepted as an empty authenticator.
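
The rule agreed on in this thread can be sketched as a small classifier. The types and names below (`message`, `classify`, `ErrMalformed`) are hypothetical and are not taken from pkg/atls/ea; only the handshake message type codes are the standard TLS values. The sketch assumes the caller still verifies the Finished MAC for both accepted shapes.

```go
package main

import (
	"errors"
	"fmt"
)

// TLS handshake message type codes.
const (
	typeCertificate       = 11
	typeCertificateVerify = 15
	typeFinished          = 20
)

// message is a hypothetical, already-framed handshake message.
type message struct {
	Type    uint8
	Entries int // number of certificate entries; only meaningful for Certificate
}

type kind int

const (
	kindEmpty kind = iota + 1 // Finished only
	kindFull                  // Certificate + CertificateVerify + Finished
)

var ErrMalformed = errors.New("malformed authenticator")

// classify accepts exactly two shapes and rejects everything else, including
// a Certificate with zero entries followed by CertificateVerify and Finished.
func classify(msgs []message) (kind, error) {
	switch {
	case len(msgs) == 1 && msgs[0].Type == typeFinished:
		return kindEmpty, nil
	case len(msgs) == 3 &&
		msgs[0].Type == typeCertificate &&
		msgs[1].Type == typeCertificateVerify &&
		msgs[2].Type == typeFinished:
		if msgs[0].Entries == 0 {
			return 0, ErrMalformed
		}
		return kindFull, nil
	default:
		return 0, ErrMalformed
	}
}

func main() {
	_, err := classify([]message{
		{Type: typeCertificate, Entries: 0},
		{Type: typeCertificateVerify},
		{Type: typeFinished},
	})
	fmt.Println(err)
}
```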

if err != nil {
return nil, err
}
var verifierPolicy eaattestation.VerificationPolicy
Contributor

When attPolicy is nil, a zero-value VerificationPolicy is created with nil interface fields. However, VerifyPayload explicitly returns ErrEvidenceVerificationMissing if evidence is present but EvidenceVerifier is nil (line 40), and ErrResultsVerificationMissing if attestation results are present but ResultsVerifier is nil (line 48). Ensure the payload is guaranteed to have no evidence and no attestation results before passing a zero-value policy, or provide default verifiers when attPolicy is nil.

Contributor Author

I will add a comment for this situation.
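
The failure mode described in this thread can be illustrated with a small sketch. The types below are hypothetical stand-ins modelled on the reviewer's description, not the actual eaattestation API: a zero-value policy passes only when the payload carries neither evidence nor attestation results, and fails closed otherwise.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrEvidenceVerificationMissing = errors.New("evidence present but no evidence verifier configured")
	ErrResultsVerificationMissing  = errors.New("attestation results present but no results verifier configured")
)

// Hypothetical verifier interfaces.
type EvidenceVerifier interface{ VerifyEvidence(evidence []byte) error }
type ResultsVerifier interface{ VerifyResults(results []byte) error }

// VerificationPolicy's zero value has nil verifiers.
type VerificationPolicy struct {
	EvidenceVerifier EvidenceVerifier
	ResultsVerifier  ResultsVerifier
}

// Payload is a hypothetical attestation payload.
type Payload struct {
	Evidence []byte
	Results  []byte
}

// VerifyPayload rejects any content it has no verifier for.
func VerifyPayload(p Payload, pol VerificationPolicy) error {
	if len(p.Evidence) > 0 {
		if pol.EvidenceVerifier == nil {
			return ErrEvidenceVerificationMissing
		}
		if err := pol.EvidenceVerifier.VerifyEvidence(p.Evidence); err != nil {
			return err
		}
	}
	if len(p.Results) > 0 {
		if pol.ResultsVerifier == nil {
			return ErrResultsVerificationMissing
		}
		if err := pol.ResultsVerifier.VerifyResults(p.Results); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var zero VerificationPolicy
	fmt.Println(VerifyPayload(Payload{}, zero))
	fmt.Println(VerifyPayload(Payload{Evidence: []byte{0x01}}, zero))
}
```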

Comment thread pkg/atls/eaattestation/binding_test.go Outdated
return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: priv}, leaf
}

func tls13Client(t *testing.T, cert tls.Certificate) *tls.Conn {
Contributor

The tls13Client helper creates a TLS server (srv) that is never closed. While cli.Close() is called by test cleanup, the server side of the net.Pipe connection may not be properly cleaned up. Return and close both connections or defer cleanup within the helper.

Contributor Author

Thanks, this will be changed.

return nil
}

func equalBytes(a, b []byte) bool {
Contributor

equalBytes uses early-return comparison which leaks timing information about where mismatches occur. For cryptographic values like AIKPubHash and Binding, this could enable timing attacks. Use crypto/subtle.ConstantTimeCompare instead.

Contributor Author

Thanks, this will be fixed.
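
A minimal sketch of the suggested replacement, assuming `equalBytes` keeps its current signature. `subtle.ConstantTimeCompare` is the standard library call the reviewer refers to; the function body is an illustration, not the code merged in this PR.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// equalBytes reports whether a and b hold the same bytes.
// For equal-length inputs, the comparison time does not depend on where the
// first mismatch occurs. Inputs of different lengths return false right away,
// which reveals only the length difference.
func equalBytes(a, b []byte) bool {
	return subtle.ConstantTimeCompare(a, b) == 1
}

func main() {
	fmt.Println(equalBytes([]byte("abc"), []byte("abc")))
	fmt.Println(equalBytes([]byte("abc"), []byte("abd")))
}
```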

return &Conn{Conn: tlsConn, Request: req, ValidationResult: res}, nil
}

func Server(tlsConn *tls.Conn, cfg *ServerConfig) (*Conn, error) {
Contributor

Server requires cfg.TLSConfig != nil (line 99), but TLSConfig is not used in the function body—only cfg.Identity and cfg.BuildLeafExtensions are used. If identity is provided via cfg.Identity, TLSConfig shouldn't be mandatory.

Contributor Author

Thanks. This will be fixed.

Comment thread pkg/atls/server_tls.go
return tls.Certificate{}, err
}

return tls.Certificate{
Contributor

Setting Leaf: template uses the template *x509.Certificate which doesn't have fields like Raw, RawTBSCertificate, etc. populated. The Leaf field should contain the parsed certificate from the DER bytes for proper functionality when Go's TLS stack accesses it.

Contributor Author

Thanks, this will be changed.
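
A sketch of the suggested fix, under the assumption that the certificate is self-signed from a template as in the snippet under review. `selfSigned` is a hypothetical helper name; the point is that `Leaf` comes from `x509.ParseCertificate` on the DER bytes, so fields such as `Raw` are populated.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned creates a self-signed certificate and sets Leaf to the parsed
// certificate rather than to the template.
func selfSigned(cn string) (tls.Certificate, error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	template := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, template, template, &priv.PublicKey, priv)
	if err != nil {
		return tls.Certificate{}, err
	}
	// Parsing the DER output fills Raw, RawTBSCertificate, Signature, and so on.
	leaf, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{
		Certificate: [][]byte{der},
		PrivateKey:  priv,
		Leaf:        leaf,
	}, nil
}

func main() {
	cert, err := selfSigned("example")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("leaf has raw bytes:", len(cert.Leaf.Raw) > 0)
}
```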

Comment thread pkg/atls/tls_helpers.go
return nil, fmt.Errorf("failed to load auth certificates: %w", err)
}

tlsConfig := &tls.Config{
Contributor

The tls.Config lacks an explicit MinVersion, so it falls back to the crypto/tls default: TLS 1.2 since Go 1.22, and TLS 1.0 for servers in earlier releases. Since this ATLS implementation relies on TLS 1.3 keying material (exported authenticators), enforce the minimum version explicitly.

Contributor Author

Thanks, this will be changed.
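
For reference, pinning the version is a one-field change. The helper name `newTLS13Config` is hypothetical; `MinVersion` and `tls.VersionTLS13` are standard crypto/tls identifiers.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// newTLS13Config returns a config that refuses to negotiate anything below
// TLS 1.3, regardless of the Go release's default minimum.
func newTLS13Config(certs ...tls.Certificate) *tls.Config {
	return &tls.Config{
		Certificates: certs,
		MinVersion:   tls.VersionTLS13,
	}
}

func main() {
	cfg := newTLS13Config()
	fmt.Printf("minimum version: %#04x\n", cfg.MinVersion)
}
```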

Comment thread pkg/clients/grpc/grpc.go
AttestationPolicy: atls.VerificationPolicyFromEvidenceVerifier(atls.NewEvidenceVerifier(agcfg.AttestationPolicy)),
}

opts = append(opts,
Contributor

grpc.WithContextDialer is where gRPC enforces connection deadlines and cancellation. Discarding ctx here means a stalled TCP/TLS/ATLS handshake can outlive the caller's deadline and hold subchannel creation open until the underlying socket times out. The pkg/atls package currently exposes only Dial(network, address string, cfg *ClientConfig) with no context support, requiring either a context-aware variant in ATLS or an alternative approach to propagate deadlines.

Contributor Author

Implemented. The ctx from grpc.WithContextDialer was previously dropped because pkg/atls only exposed a non-context Dial(...) API. I added context-aware dial support in conn.go and transport.go, and updated grpc.go to call atls.DialContext(ctx, ...).
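
The general shape of a context-aware dial can be sketched with the standard library alone. This is not the `atls.DialContext` implementation from the PR: `dialTLSContext` and `demo` are hypothetical names, and the EA exchange that would follow the handshake is only noted in a comment.

```go
package main

import (
	"context"
	"crypto/tls"
	"fmt"
	"net"
)

// dialTLSContext propagates ctx through both the TCP connect and the TLS
// handshake, so cancellation or a deadline aborts either phase.
func dialTLSContext(ctx context.Context, network, addr string, cfg *tls.Config) (*tls.Conn, error) {
	var d net.Dialer
	raw, err := d.DialContext(ctx, network, addr)
	if err != nil {
		return nil, err
	}
	conn := tls.Client(raw, cfg)
	if err := conn.HandshakeContext(ctx); err != nil {
		_ = raw.Close()
		return nil, err
	}
	// A post-handshake exchange would also need to honour ctx, for example
	// by applying ctx's deadline with conn.SetDeadline before reading or writing.
	return conn, nil
}

// demo dials with an already-cancelled context and returns the resulting error.
func demo() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a caller that has already given up
	cfg := &tls.Config{MinVersion: tls.VersionTLS13, ServerName: "localhost"}
	_, err := dialTLSContext(ctx, "tcp", "127.0.0.1:443", cfg)
	return err
}

func main() {
	fmt.Println("dial with cancelled context:", demo())
}
```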

AttestationPolicy: atls.VerificationPolicyFromEvidenceVerifier(atls.NewEvidenceVerifier(agcfg.AttestationPolicy)),
}

transport.DialTLSContext = func(ctx context.Context, network, addr string) (net.Conn, error) {
Contributor

Once DialTLSContext is set, http.Transport stops applying TLSHandshakeTimeout itself. The ctx parameter carries request cancellation and timeout deadlines, but discarding it prevents the ATLS connection establishment from respecting them. This can cause the connection to hang past the configured timeout or ignore request cancellation.

Additionally, atls.Dial() does not accept a context parameter, and atls.DialWithDialer() internally calls d.Dial() rather than d.DialContext(ctx). For proper timeout and cancellation propagation, the atls package needs a context-aware variant (e.g., DialWithDialerContext) that passes context through the entire TCP/TLS/EA handshake.

The same check applies as for the gRPC dialer: confirm that the atls package gains DialContext support so both HTTP and gRPC clients can propagate cancellation and deadlines through the TCP/TLS/EA handshake.

Contributor Author

Done.

Comment thread pkg/ingress/proxy.go
}
tlsConfig.NextProtos = []string{"h2", "http/1.1"}

p.started = true
Contributor

p.started is set before listener creation completes, masking startup failures.

Setting p.started = true at line 151 before the goroutine creates the listener means Start() returns success even if listener creation fails. The caller has no indication that startup failed. Consider synchronizing listener creation before returning, or using a channel/sync mechanism to report errors.

Contributor Author

Implemented. Start() in proxy.go now creates the listener synchronously before setting p.started = true or returning success. This applies to plain HTTP, regular TLS, and attested TLS. The goroutine only starts Serve(listener) after listener creation succeeds, so bind/listener setup failures are now returned directly to the caller instead of being logged after a false-successful startup.
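
The pattern described in this reply can be sketched as follows. The `proxy` type here is a hypothetical, plain-HTTP reduction and not the code in pkg/ingress/proxy.go: the listener is created before `started` is set, and only `Serve` runs in the goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"sync"
)

// proxy is a hypothetical stand-in for the ingress proxy.
type proxy struct {
	mu      sync.Mutex
	started bool
	ln      net.Listener
	srv     *http.Server
}

// Start binds synchronously so that listener errors reach the caller.
func (p *proxy) Start(addr string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.started {
		return errors.New("proxy already started")
	}
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return fmt.Errorf("listen on %s: %w", addr, err)
	}
	p.ln = ln
	p.srv = &http.Server{Handler: http.NotFoundHandler()}
	p.started = true // set only after the bind succeeded
	go func() {
		_ = p.srv.Serve(ln) // returns http.ErrServerClosed after Stop
	}()
	return nil
}

// Stop closes the server and its listener.
func (p *proxy) Stop() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.started {
		return nil
	}
	p.started = false
	return p.srv.Close()
}

func main() {
	first := &proxy{}
	if err := first.Start("127.0.0.1:0"); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	defer first.Stop()

	// Binding the same address again fails, and the caller sees the error.
	second := &proxy{}
	err := second.Start(first.ln.Addr().String())
	fmt.Println("second start failed:", err != nil, "started:", second.started)
}
```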

@drasko drasko merged commit 80bf813 into ultravioletrs:main Mar 26, 2026
8 of 10 checks passed
