[client][management] feat: TPM/Secure Enclave certificate-based device authentication #5881
dexion wants to merge 11 commits into netbirdio:main from
Conversation
📝 Walkthrough

Implements device certificate authentication: adds client-side TPM/SE providers and an enrollment manager, server-side attestation, CA backends and a builtin CA, device enrollment and revocation flows, HTTP/gRPC device-auth APIs, inventory integrations, dev-stand infra and CI workflows, and related tests and docs.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant TPM as TPM/SE
    participant MGMT as ManagementServer
    participant CA as CABackend
    rect rgba(100,150,255,0.5)
        Note over Client,TPM: Enrollment (attestation path)
        Client->>TPM: GenerateKey / Build CSR (CN=WG pubkey)
        Client->>MGMT: BeginTPMAttestation(EK cert, AK pub, CSR)
        MGMT->>MGMT: Verify EK chain, create credential blob, store session
        MGMT-->>Client: SessionID + CredentialBlob
        Client->>TPM: ActivateCredential(blob) -> secret
        Client->>MGMT: CompleteTPMAttestation(SessionID, secret)
        MGMT->>CA: SignCSR(CSR)
        CA-->>MGMT: Device cert PEM
        MGMT-->>Client: Approved + device cert PEM
        TPM->>TPM: StoreCert(keyID, cert)
    end
    rect rgba(100,255,150,0.5)
        Note over Client,MGMT: mTLS connection using device cert
        Client->>Client: BuildTLSCertificate from TPM
        Client->>MGMT: TLS handshake presenting device cert
        MGMT->>MGMT: VerifyPeerCert() -> CheckDeviceAuth() -> checkCertRevocation()
        MGMT-->>Client: gRPC connection established
    end
    rect rgba(255,150,100,0.5)
        Note over Client: Renewal loop
        loop periodic (jittered)
            Client->>Client: EnsureCertificate() -> re-enroll if needed
            alt new cert issued
                Client->>Client: onRenewal -> cancel engine context -> reconnect
            end
        end
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs
Suggested reviewers
beLIEai seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have already signed the CLA but the status is still pending? Let us recheck it.
@dexion, it looks like there is an issue with your PR overall: over 400K lines of code, 2000 files, and almost 3000 commits. Can you check that? Also, please share more about the driver for this change, and whether anyone from the team was consulted.
Force-pushed from c89f90d to 48d16ff (Compare)
```yaml
# Password: Netbird@dev1
staticPasswords:
  - email: "admin@netbird.localhost"
    hash: "$2a$10$CQx4J9U6r/j2dmy/WzvwWuiVRTTAwqi3a7vIk1cCAlg1DCMOmDrWm"
```
Add BeginTPMAttestation/CompleteTPMAttestation (two-round TPM credential-activation), AttestAppleSE (single-round Secure Enclave), GetDeviceEnrollmentStatus, SubmitDeviceCertRenewal, and related message types to management.proto.
…interface

Add DeviceCertificate, DeviceEnrollment types with GORM-embedded DeviceAuthSettings in Account.Settings. Extend Store and AccountManager interfaces with device cert CRUD, enrollment approval/rejection, and CRL/revocation queries. Add cert_approver RBAC role.
Add KeyProvider interface with FileKeyProvider (AES-256-GCM, key derived from file), EnvKeyProvider, and NoOpProvider. Encrypt/decrypt helpers for CA credentials stored in the database.
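As an illustration of the seal/open pattern such a key provider could use (the helper names and the SHA-256 key derivation shown here are assumptions for the sketch, not the PR's actual API):

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
	"io"
)

// deriveKey turns arbitrary key material (e.g. the contents of a key file)
// into a 32-byte AES-256 key. The real FileKeyProvider may use a different KDF.
func deriveKey(material []byte) []byte {
	sum := sha256.Sum256(material)
	return sum[:]
}

// seal encrypts plaintext with AES-256-GCM, prepending the random nonce
// to the ciphertext so open can recover it.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal: splits off the nonce, then decrypts and authenticates.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, fmt.Errorf("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := deriveKey([]byte("file-backed key material"))
	ct, err := seal(key, []byte("vault-token-secret"))
	if err != nil {
		panic(err)
	}
	pt, err := open(key, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pt)) // round-tripped credential
}
```

GCM authenticates as well as encrypts, so a credential tampered with in the database fails to decrypt rather than yielding garbage.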
… chain verification

Add CertificateAuthority interface. BuiltinCA: in-process ECDSA P-256 CA with AES-256-GCM-encrypted private key, in-memory CRL, PKCS#12 export. External adapters: SCEP, Smallstep, HashiCorp Vault. CertificateFactory selects a backend based on account config. Apple SE and TPM manufacturer root CA pools for attestation chain verification.
…E attestation handler

- Server: load device CA on startup, require client cert in Login, check revocation via store Revoked flag (known limitation: active Sync streams not interrupted until reconnect)
- DeviceEnrollmentHandler: SubmitCSR, GetEnrollmentStatus, RenewCert RPCs
- AttestationHandler: BeginTPMAttestation/CompleteTPMAttestation (two-round credential-activation protocol, TOCTOU-safe via atomic GetAndDelete), AttestAppleSE (CBOR decode + chain verification)
- RevocationHandler: disconnect active Sync streams on revocation event
- DeviceInventory: Intune and Jamf MDM serial-number lookup adapters
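The revocation check described above runs during the TLS handshake. A minimal sketch of what a `VerifyPeerCertificate`-shaped callback could look like (the store type and function names are illustrative stand-ins, not the PR's actual code):

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// revocationStore is a stand-in for the management store's Revoked flag lookup.
type revocationStore struct {
	revoked map[string]bool
}

func (s *revocationStore) isRevoked(serial string) bool { return s.revoked[serial] }

// verifyPeerCert returns a callback with the same shape as
// tls.Config.VerifyPeerCertificate: it runs after chain verification
// and rejects revoked device certificates.
func verifyPeerCert(store *revocationStore) func([][]byte, [][]*x509.Certificate) error {
	return func(rawCerts [][]byte, chains [][]*x509.Certificate) error {
		if len(chains) == 0 || len(chains[0]) == 0 {
			return fmt.Errorf("device auth: no verified client certificate")
		}
		leaf := chains[0][0]
		serial := leaf.SerialNumber.String()
		if store.isRevoked(serial) {
			return fmt.Errorf("device auth: certificate serial %s is revoked", serial)
		}
		return nil
	}
}

func main() {
	store := &revocationStore{revoked: map[string]bool{"12345": true}}
	cb := verifyPeerCert(store)
	err := cb(nil, nil) // no verified chain presented
	fmt.Println(err != nil)
}
```

Because the check happens per handshake, already-established streams are unaffected, which matches the known limitation noted above: a dedicated revocation event is needed to disconnect active Sync streams.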
… certificates, inventory
POST/GET /api/accounts/{id}/device-ca — configure CA (built-in or external)
GET /api/accounts/{id}/device-ca/cacert — download CA certificate PEM
POST /api/accounts/{id}/device-auth/ca/test — verify external CA credentials
POST /api/accounts/{id}/device-enrollments/{id}/approve|reject
POST /api/accounts/{id}/device-certs/{id}/revoke
GET/PUT /api/accounts/{id}/device-auth/inventory — serial-number allow-list
Accessible by admin and the new cert_approver role.
TPMProvider wraps google/go-tpm: GenerateAK (create Attestation Key under EK hierarchy), ActivateCredential (decrypt management-server challenge to prove AK↔EK co-location on same TPM). Build-tagged implementations for Linux/darwin, stub for other platforms.
EnrollmentManager drives the full lifecycle: generate ECDSA key + CSR, submit to the management server, poll until an admin approves, store the certificate, build an mTLS tls.Config, and start the renewal loop (7-day threshold, 6-hour check interval). connect.go wraps enrollment in a tpmProv.Available() guard, skipping it on iOS/Android/WASM. On success it upgrades the gRPC connection to mTLS via NewClientWithCert.
…n tests

- device-auth-tests.yml: unit + integration tests with the swtpm simulator
- tpm-tests.yml: Linux TPM integration tests with a hardware simulator
- e2e-device-auth.yml: end-to-end tests against the dev stand
- scripts/e2e-device-auth.sh, e2e-lib.sh: E2E test helpers
- scripts/fetch-tpm-roots.go: tool to update embedded TPM root CAs
- go.mod: add google/go-tpm v0.9.8
- infrastructure_files/stand-dex/: Docker Compose overlay with swtpm, Dex OIDC, and Caddy TLS termination for local mTLS testing
- management/cmd/enroll-demo: CLI tool for manual enrollment testing
- management/cmd/mtls-demo: CLI tool for mTLS connection verification
- tools/dev-stand-init: bootstraps the dev stand with test accounts/peers
- management/Dockerfile.multistage: include the dev-stand-init binary
- Makefile: add device-auth targets
- .gitignore: exclude stand build artifacts
… limitations

Documents the mTLS-on-gRPC architecture (not the WireGuard data plane), all three enrollment modes (admin-issued, TPM 2.0, Apple SE), the issueDeviceCert flow, the revocation algorithm, and 6 known limitations (L-1 through L-6) deferred to future iterations.
Force-pushed from 48d16ff to c139c29 (Compare)
Yes, I realized I hadn't pulled the latest changes from the main repository before branching, which caused the PR to "explode" with unrelated diffs. That's been fixed now: I rebased onto current main, and the actual change set is now much smaller and focused. The driver for this change is the desire to use certificate-based authentication, so we can restrict VPN access to specific clients. With username/password alone, I can connect from any laptop, which is not ideal for our security posture. This same need has been voiced by many others in issue 4773. I did not consult anyone from the core development team beforehand. I wasn't sure who the right point of contact would be, and I didn't see a clear response from the team in the original issue (or I may have missed it). I studied the existing project structure and aimed for the least invasive change that still aligns with the project's overall architecture and design philosophy. Let me know if there's anything else you'd like adjusted. Thanks!
Actionable comments posted: 6
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (33)
management/server/http/handlers/device_auth/pem_helpers.go-49-58 (1)

49-58: ⚠️ Potential issue | 🟠 Major — Reject CA certs that are not yet valid (and guard nil input).

`validateCACert` currently checks expiration only. A CA with a future `NotBefore` is accepted even though it is not valid yet.

🛠️ Proposed fix

```diff
 func validateCACert(cert *x509.Certificate) error {
+	if cert == nil {
+		return fmt.Errorf("certificate is nil")
+	}
 	if !cert.IsCA || !cert.BasicConstraintsValid {
 		return fmt.Errorf("certificate is not a CA (missing BasicConstraints or IsCA flag)")
 	}
 	if cert.KeyUsage&x509.KeyUsageCertSign == 0 {
 		return fmt.Errorf("certificate lacks KeyUsageCertSign")
 	}
-	if cert.NotAfter.Before(time.Now()) {
+	now := time.Now()
+	if cert.NotBefore.After(now) {
+		return fmt.Errorf("CA certificate is not valid before %s", cert.NotBefore.Format(time.RFC3339))
+	}
+	if cert.NotAfter.Before(now) {
 		return fmt.Errorf("CA certificate has expired at %s", cert.NotAfter.Format(time.RFC3339))
 	}
 	return nil
 }
```

management/server/devicepki/ca.go-43-44 (1)
43-44: ⚠️ Potential issue | 🟠 Major — Return an error from `CACert`.

Backends that fetch or parse the CA certificate at runtime need a way to distinguish "no CA configured" from "temporary retrieval failure". Returning only `*x509.Certificate` forces callers into nil-based guesswork and makes trust-pool refresh paths fail silently. Please change this contract to `(*x509.Certificate, error)` and reserve `ErrCANotFound` for the true missing-CA case. Based on learnings, methods returning (T, error) should propagate failures instead of making callers infer them from nil values.

management/server/devicepki/appleroots/roots.go-177-179 (1)
177-179: ⚠️ Potential issue | 🟠 Major — Fall back to the cached pool when the refreshed body is malformed.

The package comment promises stale-cache fallback on refresh failures, but `parsePEMPool` errors currently bypass that path and fail closed even when `cachedPool` is still available. After cache expiry, a transient proxy/body corruption would become an avoidable outage instead of reusing the last good root set.

🛟 Suggested fix

```diff
 pool, err := parsePEMPool(pemData)
 if err != nil {
+	if cachedPool != nil {
+		log.Warnf("appleroots: parse response failed, using cached pool: %v", err)
+		return cachedPool, nil
+	}
 	return nil, err
 }
```

management/server/devicepki/vault_adapter.go-228-233 (1)
228-233: ⚠️ Potential issue | 🟠 Major — Convert the serial to hexadecimal format before calling Vault revoke.

The CA interface contract specifies that `RevokeCert` receives decimal serials, but Vault's `POST /pki/revoke` endpoint requires the `serial_number` parameter in colon- or hyphen-separated hexadecimal format. Passing the decimal string directly will cause Vault to reject revocation requests.

Suggested fix

```diff
 func (v *VaultCA) RevokeCert(ctx context.Context, serial string) error {
-	body := map[string]interface{}{"serial_number": serial}
+	vaultSerial, err := formatVaultSerial(serial)
+	if err != nil {
+		return fmt.Errorf("devicepki/vault: invalid decimal serial %q: %w", serial, err)
+	}
+	body := map[string]interface{}{"serial_number": vaultSerial}
```

Helper function to add below this method:

```go
func formatVaultSerial(decimal string) (string, error) {
	n, ok := new(big.Int).SetString(decimal, 10)
	if !ok {
		return "", fmt.Errorf("parse serial")
	}
	hex := strings.ToLower(n.Text(16))
	if len(hex)%2 == 1 {
		hex = "0" + hex
	}
	parts := make([]string, 0, len(hex)/2)
	for i := 0; i < len(hex); i += 2 {
		parts = append(parts, hex[i:i+2])
	}
	return strings.Join(parts, ":"), nil
}
```

Add `math/big` to imports if not already present.

management/internals/server/server.go-202-204 (1)
202-204: ⚠️ Potential issue | 🟠 Major — Potential shared TLS config mutation.

`s.certManager.TLSConfig()` may return a pointer to a shared or cached `*tls.Config`. Directly mutating `VerifyPeerCertificate` on it could affect other callers. Consider cloning the config before modification.

🛡️ Proposed fix to clone the TLS config

```diff
-	leTLSConfig := s.certManager.TLSConfig()
-	leTLSConfig.VerifyPeerCertificate = s.DeviceAuthHandler().VerifyPeerCert
+	baseTLSConfig := s.certManager.TLSConfig()
+	leTLSConfig := baseTLSConfig.Clone()
+	leTLSConfig.VerifyPeerCertificate = s.DeviceAuthHandler().VerifyPeerCert
```

.github/workflows/device-auth-tests.yml-8-15 (1)
8-15: ⚠️ Potential issue | 🟠 Major — Include `management/server/deviceinventory/**` in trigger paths and test execution.

The `management/server/deviceinventory` package exists and provides MDM inventory verification for TPM attestation-based certificate enrollment. Since changes to device inventory validation logic can affect the device auth enrollment flow, this path should be added to the trigger filters on lines 8-15 and included in the test run on line 38.

infrastructure_files/scripts/setup-device-auth.sh-32-35 (1)
32-35: ⚠️ Potential issue | 🟠 Major — Harden API calls with timeout/retry controls to prevent hanging automation.

Lines 32, 53, and 72 call remote endpoints without explicit timeouts/retries. In CI/dev stands, transient network issues can stall or abruptly fail this setup flow.

Suggested fix

```diff
-SETTINGS_RESPONSE=$(curl -s -w "\n%{http_code}" -X PUT "$MGMT_URL/api/device-auth/settings" \
+SETTINGS_RESPONSE=$(curl -sS --connect-timeout 5 --max-time 30 --retry 3 --retry-all-errors -w "\n%{http_code}" -X PUT "$MGMT_URL/api/device-auth/settings" \
@@
-ENROLLMENTS_RESPONSE=$(curl -s -w "\n%{http_code}" -X GET "$MGMT_URL/api/device-auth/enrollments" \
+ENROLLMENTS_RESPONSE=$(curl -sS --connect-timeout 5 --max-time 30 --retry 3 --retry-all-errors -w "\n%{http_code}" -X GET "$MGMT_URL/api/device-auth/enrollments" \
@@
-CAS_RESPONSE=$(curl -s -w "\n%{http_code}" -X GET "$MGMT_URL/api/device-auth/trusted-cas" \
+CAS_RESPONSE=$(curl -sS --connect-timeout 5 --max-time 30 --retry 3 --retry-all-errors -w "\n%{http_code}" -X GET "$MGMT_URL/api/device-auth/trusted-cas" \
```

Also applies to: 53-54, 72-73

infrastructure_files/stand-dex/management.json-2-14 (1)
2-14: ⚠️ Potential issue | 🟠 Major — Avoid committing fixed secrets in the dev stand management config.

Lines 2, 7, and 13 embed reusable cryptographic/shared-secret material directly in git-tracked config. Even in dev stands, these values are often copied forward and can weaken encryption/auth if reused. Use runtime-injected values (env/secrets/template rendering) so this file does not carry real or reusable secret material.

infrastructure_files/stand-dex/dex-device-auth.yaml-24-24 (1)
24-24: ⚠️ Potential issue | 🟠 Major — Replace static credentials with runtime-generated or injected values.

Line 24 contains a hardcoded client secret (`netbird-dashboard-secret`), and lines 40-46 contain static admin credentials in plaintext. These are also duplicated identically in dex.yaml. While this appears to be a dev stand (as documented in the Makefile), storing secrets in tracked configuration files is a security anti-pattern that should be replaced with environment variable injection or runtime generation.

.github/workflows/e2e-device-auth.yml-7-11 (1)
7-11: ⚠️ Potential issue | 🟠 Major — Update the paths filter to use the actual stand configuration directory.

The workflow references `netbird-stand/**`, but this directory does not exist. The device-auth stand configuration is located in `infrastructure_files/stand-dex/`. Changes to the actual stand files (docker-compose, dex config) won't trigger this workflow with the current filter. Replace `netbird-stand/**` with `infrastructure_files/stand-dex/**` to ensure the workflow runs when stand infrastructure changes.

management/internals/shared/grpc/server.go-845-852 (1)
845-852: ⚠️ Potential issue | 🟠 Major — Device-auth rejection happens after `LoginPeer` has already committed login state.

This post-login block returns an error, but it does not undo the peer registration/login work that already happened in `LoginPeer`. A rejected certificate or revoked serial can still leave behind peer creation, login bookkeeping, or setup-key consumption for a request that was ultimately denied. Either evaluate the device-auth policy before the mutating call or roll those side effects back on failure.

Also applies to: 861-879

management/server/http/handlers/device_auth/ca_config.go-255-257 (1)
255-257: ⚠️ Potential issue | 🟠 Major — Clear `CAConfig` when switching CA backends.

Updating `CAType` does not clear the old JSON payload. Switching to `builtin`, or to another backend without immediately supplying its sub-object, leaves the previous Vault/Smallstep/SCEP secrets stored in `s.CAConfig` and ready to resurface later. Reset `CAConfig` whenever the active backend changes.

🔧 Suggested fix

```diff
 func applyCAConfigRequest(s *types.DeviceAuthSettings, req caConfigRequest) error {
+	prevType := s.CAType
 	if req.CAType != "" {
 		s.CAType = req.CAType
 	}
+	if s.CAType != prevType {
+		s.CAConfig = ""
+	}
 	// Reject mismatched sub-objects to prevent silent misconfiguration.
```

Also applies to: 270-361

management/internals/shared/grpc/server.go-174-180 (1)
174-180: ⚠️ Potential issue | 🟠 Major — Advertise device-cert capabilities only when mTLS is actually available.

`deviceCertAuthEnabled` is hard-coded to `true`, so `GetServerKey` announces `DEVICE_CERT_AUTH`/`DEVICE_ATTESTATION` even when the gRPC server is running without client-cert TLS support. Clients can then enroll and try to reconnect with mTLS against a server that cannot accept it. Gate this on the real transport/policy configuration instead.

Also applies to: 226-233

client/internal/enrollment/manager.go-65-70 (1)
65-70: ⚠️ Potential issue | 🟠 Major — Stale pending state can block every future enrollment attempt.

The state file is resumed solely on `Status == pending`; `WGPublicKey` is stored but never checked, and the rejected path never clears or updates the state. After a WG key rotation or one rejection, each restart will keep polling the stale enrollment ID instead of submitting a fresh request.

Also applies to: 117-129, 409-410
management/internals/shared/grpc/server.go-853-856 (1)
853-856: ⚠️ Potential issue | 🟠 Major — Do not continue with empty device-auth settings on store errors.

If `GetAccountSettings` fails, the code only logs a warning and proceeds to `CheckDeviceAuth` with `devAuthSettings == nil`. That turns a transient store failure into a fail-open auth path. Return an internal error instead of continuing without policy data.

client/internal/enrollment/manager.go-102-107 (1)
102-107: ⚠️ Potential issue | 🟠 Major — Verify the TPM key before reusing a stored certificate.

This fast path returns a valid leaf cert even if `device-key` disappeared after a TPM reset. `BuildTLSCertificate` then fails forever, but `EnsureCertificate` keeps short-circuiting on the stale cert and never repairs itself. Treat "cert present but key missing" as not enrolled and re-enroll.

🔧 Suggested fix

```diff
 existing, err := m.tpmProvider.LoadCert(ctx, deviceKeyID)
 if err == nil && certIsValid(existing) {
-	log.Debug("enrollment: found valid device certificate, skipping enrollment")
-	return existing, nil
+	if _, keyErr := m.tpmProvider.LoadKey(ctx, deviceKeyID); keyErr == nil {
+		log.Debug("enrollment: found valid device certificate, skipping enrollment")
+		return existing, nil
+	} else if !errors.Is(keyErr, tpm.ErrKeyNotFound) {
+		return nil, fmt.Errorf("enrollment: load device key: %w", keyErr)
+	}
 }
```

client/internal/enrollment/manager.go-146-150 (1)
146-150: ⚠️ Potential issue | 🟠 Major — Only fall back to Mode A on `ErrAttestationNotSupported`.

Any unexpected TPM error here silently downgrades hardware-backed enrollment to the non-attested path. That hides real TPM failures and weakens the security model on machines that are supposed to attest. Only downgrade on the explicit "not supported" sentinel; propagate all other errors.

🔧 Suggested fix

```diff
 ap, apErr := m.tpmProvider.AttestationProof(ctx, deviceKeyID)
 if apErr == nil && len(ap.EKCert) > 0 {
 	log.Infof("enrollment: using TPM credential activation for peer %s", m.wgPubKey)
 	return m.enrollWithTPMAttestation(ctx)
 }
+if apErr != nil && !errors.Is(apErr, tpm.ErrAttestationNotSupported) {
+	return nil, fmt.Errorf("enrollment: get attestation proof: %w", apErr)
+}
 log.Infof("enrollment: submitting CSR via Mode A (no attestation) for peer %s", m.wgPubKey)
```

management/server/devicepki/attestation.go-30-39 (1)
30-39:⚠️ Potential issue | 🟠 MajorSet
KeyUsageswhen verifying EK certificate chains.
x509.Certificate.VerifyassumesTLS server authusage whenVerifyOptions.KeyUsagesis empty. EK certificates are endorsement keys, not server certificates. Once manufacturer roots are bundled, this default will cause valid TPM certificates to be rejected. SetKeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny}in both verification paths to fix this.🔧 Suggested fix
```diff
-opts := x509.VerifyOptions{Roots: pool}
+opts := x509.VerifyOptions{
+    Roots:     pool,
+    KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
+}
```

Also applies to: line 68
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/attestation.go` around lines 30 - 39, VerifyEKCertChain currently calls ekCert.Verify with VerifyOptions that omit KeyUsages, which defaults to TLS server auth and will reject valid EK certs; update the VerifyOptions in VerifyEKCertChain (the opts used with ekCert.Verify) to include KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny} and make the same change for the other EK certificate Verify call in this file (the second verification near line 68) so both verification paths accept endorsement-key usage.

management/internals/shared/grpc/device_enrollment.go-53-60 (1)
53-60: ⚠️ Potential issue | 🟠 Major

Do not short-circuit renewal when the CSR changed.

Lines 53-60 key idempotency only on `WGPublicKey`. After `renewDeviceCert` resets the old request to `pending`, the client's next renewal submission just reuses that record and discards the fresh CSR. That blocks key rotation and can make admin approval sign the stale CSR again.
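A sketch of the widened idempotency check; the struct and field names below are hypothetical stand-ins for the stored enrollment record, and the point is that the pending record is reused only when both the WG public key and a fingerprint of the CSR match.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// enrollmentRequest is a hypothetical reduction of the stored record;
// only the fields needed for the idempotency check are shown.
type enrollmentRequest struct {
	WGPublicKey string
	CSRPEM      string
	Status      string
}

// csrFingerprint reduces the CSR bytes to a short, comparable identity.
func csrFingerprint(csrPEM string) string {
	sum := sha256.Sum256([]byte(csrPEM))
	return hex.EncodeToString(sum[:])
}

// shouldReuse short-circuits only when the pending record covers the exact
// same key material: same WG public key AND same CSR bytes.
func shouldReuse(existing *enrollmentRequest, wgPubKey, csrPEM string) bool {
	if existing == nil || existing.Status != "pending" {
		return false
	}
	return existing.WGPublicKey == wgPubKey &&
		csrFingerprint(existing.CSRPEM) == csrFingerprint(csrPEM)
}

func main() {
	old := &enrollmentRequest{WGPublicKey: "wg-key-1", CSRPEM: "csr-v1", Status: "pending"}
	fmt.Println(shouldReuse(old, "wg-key-1", "csr-v1")) // same CSR: reuse
	fmt.Println(shouldReuse(old, "wg-key-1", "csr-v2")) // rotated CSR: treat as new
}
```

A CSR that differs in any byte then falls through to the normal submission path instead of reviving the stale record.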
Verify each finding against the current code and only fix it if needed. In `@management/internals/shared/grpc/device_enrollment.go` around lines 53 - 60, The current idempotency check in GetEnrollmentRequestByWGKey only keys off WG public key and returns an active request (existing.IsActive()), which allows a renewed CSR to be ignored; change the logic in the enrollment flow (around GetEnrollmentRequestByWGKey, existing.IsActive(), encryptEnrollResponse and the path used by renewDeviceCert) to also compare the stored CSR (or a CSR fingerprint/hash) against the incoming CSR, and only short-circuit and return the existing DeviceEnrollResponse when both the WGPublicKey and CSR match; if the CSR differs, treat it as a new renewal (do not return the old active request) so the fresh CSR is persisted/approved instead of reusing the stale record.

client/internal/tpm/tpm_windows.go-76-89 (1)
76-89: ⚠️ Potential issue | 🟠 Major

Separate key-storage availability from attestation availability.

These segments expose inconsistent capabilities: `Available()` reports success as soon as the platform crypto provider exists, but both attestation operations are still unsupported, and `ActivateCredential` does not even return `ErrAttestationNotSupported`. In attestation-only enrollment mode, that looks like a supported TPM until the flow fails halfway through.

Also applies to: 194-204
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/tpm_windows.go` around lines 76 - 89, Available() currently returns true when the crypto provider exists even though attestation functions are unsupported; update the logic so windowsProvider.Available() only reports key-storage availability (check/open/free provider as now) and do not claim attestation support. Add or implement a separate attestation check (e.g., AttestationAvailable or a boolean returned by an attestation-capability helper) and ensure attestation paths (notably ActivateCredential and the functions referenced around lines 194-204) explicitly return ErrAttestationNotSupported when attestation is unavailable. Adjust callers to use the new attestation check or handle ErrAttestationNotSupported accordingly.

management/server/http/handlers/device_auth/handler.go-243-271 (1)
243-271: ⚠️ Potential issue | 🟠 Major

Add a compensating rollback after certificate issuance.

Lines 243-271 call `SignCSR` before either store write is durable. If `SaveDeviceCertificate` or `SaveEnrollmentRequest` fails, the CA has already minted a valid cert, but the enrollment state is still inconsistent or pending. On a retry, this can issue a second cert instead of repairing the first failure.
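The compensating action can be sketched as below; the `ca` and `store` types are hypothetical stand-ins for the real CA backend and store, and the point is that a failed persist triggers a best-effort revoke of the just-minted cert so a retry starts from a clean state.

```go
package main

import (
	"errors"
	"fmt"
)

type ca struct{ revoked []string }

func (c *ca) SignCSR(csr string) (string, error) { return "serial-1", nil }

// Revoke is assumed idempotent: revoking an already-revoked serial is safe.
func (c *ca) Revoke(serial string) error {
	c.revoked = append(c.revoked, serial)
	return nil
}

type store struct{ failSave bool }

func (s *store) SaveDeviceCertificate(serial string) error {
	if s.failSave {
		return errors.New("db write failed")
	}
	return nil
}

// issueCert signs first, then persists; on a persistence failure it revokes
// the freshly minted cert and returns the original store error.
func issueCert(c *ca, s *store, csr string) error {
	serial, err := c.SignCSR(csr)
	if err != nil {
		return err
	}
	if err := s.SaveDeviceCertificate(serial); err != nil {
		// Compensating rollback: best-effort revoke before surfacing the error.
		if revErr := c.Revoke(serial); revErr != nil {
			return fmt.Errorf("save failed: %w (revoke also failed: %v)", err, revErr)
		}
		return fmt.Errorf("save failed, cert %s revoked: %w", serial, err)
	}
	return nil
}

func main() {
	c := &ca{}
	err := issueCert(c, &store{failSave: true}, "csr")
	fmt.Println(err != nil, len(c.revoked))
}
```

On the happy path nothing is revoked; only the failure branch compensates.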
Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/handler.go` around lines 243 - 271, The code signs the CSR via SignCSR and then writes to store with SaveDeviceCertificate and SaveEnrollmentRequest, but if either store write fails the issued cert remains valid; update the flow to perform a compensating rollback: after cert := from SignCSR and before returning on any error from SaveDeviceCertificate or SaveEnrollmentRequest, call the CA revocation API with the certificate identifier (e.g., ca.RevokeCert(ctx, cert.SerialNumber) or the appropriate revoke method on your CA) to revoke the just-issued cert, handle/log any revoke error, and ensure the handler returns the original store error (or a combined error) so retries don't result in duplicate valid certificates; implement this in the block around SignCSR / certToPEM / NewDeviceCertificate / SaveDeviceCertificate / SaveEnrollmentRequest and make the revocation call idempotent and safe to run if the cert was not fully recorded.

management/server/deviceinventory/intune.go-129-145 (1)
129-145: ⚠️ Potential issue | 🟠 Major

Allow real device serial formats in the Intune lookup.

Lines 130-135 only accept decimal digits, but this path is fed from `SystemSerialNumber`, and many real laptop/mac serials are alphanumeric. That makes inventory-gated enrollment fail before the Graph request for legitimate devices.
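A permissive validation along the suggested lines can look like this; the `^[A-Za-z0-9_-]+$` pattern and the `$filter` shape are illustrative, and escaping with `url.QueryEscape` still happens regardless of validation.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// serialPattern accepts letters, digits, and common separators seen in real
// laptop/Mac serial numbers instead of decimal digits only.
var serialPattern = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

func validSerial(s string) bool {
	return s != "" && serialPattern.MatchString(s)
}

func main() {
	for _, s := range []string{"C02XL0GWJGH5", "12345", "bad serial!"} {
		fmt.Println(s, validSerial(s))
	}
	// Escape before embedding the serial in the query, whatever it contains.
	filter := fmt.Sprintf("serialNumber eq '%s'", "C02XL0GWJGH5")
	fmt.Println("$filter=" + url.QueryEscape(filter))
}
```

The alphanumeric Mac-style serial now passes validation while whitespace and punctuation are still rejected.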
Verify each finding against the current code and only fix it if needed. In `@management/server/deviceinventory/intune.go` around lines 129 - 145, The current strict digit-only validation in IsRegistered rejects legitimate alphanumeric system serials (ekSerial); remove the for-loop that enforces decimal-only and instead validate/allow common serial formats (e.g., letters, digits and common separators like - and _) or drop the custom validation and rely on proper escaping before the Graph call. Concretely, update the IsRegistered function to stop returning an error for non-digit characters in ekSerial and either replace the check with a permissive regex such as /^[A-Za-z0-9_-]+$/ applied to ekSerial or remove it entirely and keep using url.QueryEscape when building apiURL so legitimate device serials are accepted.

management/server/http/handlers/device_auth/ca_test_handler.go-142-148 (1)
142-148: ⚠️ Potential issue | 🟠 Major

Keep the CA test endpoint read-only and time-bounded.

Line 143 reuses the live `caFactory`. With the default `devicepki.NewCA`, a builtin CA can be created/persisted on first use, so `POST /device-auth/ca/test` can mutate account state even though the handler says it won't. That factory call is also outside the 15s step timeout.
Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/ca_test_handler.go` around lines 142 - 148, The CA test handler currently calls the live caFactory (caFactory) with the request context, which may create/persist a builtin CA (devicepki.NewCA) and runs outside the per-step timeout; update the handler to use a read-only, time-bounded CA creation: create a new context with a 15s timeout (context.WithTimeout(ctx, 15*time.Second)), then call a non-mutating/read-only factory (e.g., a new h.caFactoryReadOnly or devicepki.NewCAReadOnly variant) that does not persist state or create resources on first use; if a read-only factory does not exist, call the existing factory with options/flags from settings to disable persistence, using the 15s context, and keep the same error handling and skipRemaining logic.

management/server/http/handlers/device_auth/handler.go-742-754 (1)
742-754: ⚠️ Potential issue | 🟠 Major

Do not publish a partially rebuilt global cert pool.

Lines 744-747 skip accounts that fail `ListTrustedCAs` and still replace the shared TLS pool. A transient read failure for one tenant will drop that tenant's trust anchors from the handshake layer until the next successful rebuild.

🛡️ Keep the old pool on any read failure
```diff
 pool := x509.NewCertPool()
+rebuildFailed := false
 for _, acct := range accounts {
     cas, err := h.store.ListTrustedCAs(r.Context(), store.LockingStrengthNone, acct.Id)
     if err != nil {
         log.Warnf("device_auth: rebuild cert pool: list CAs for account %s: %v", acct.Id, err)
-        continue
+        rebuildFailed = true
+        break
     }
     for _, ca := range cas {
         pool.AppendCertsFromPEM([]byte(ca.PEM))
     }
 }
+if rebuildFailed {
+    return
+}
 h.poolUpdater.UpdateCertPool(pool)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/handler.go` around lines 742 - 754, The current code builds a global x509.CertPool by iterating accounts and calling h.store.ListTrustedCAs but still calls h.poolUpdater.UpdateCertPool even if any ListTrustedCAs call fails, which can publish a partially rebuilt pool; change the logic in the block that creates "pool" so you build into a temporary pool, track if any ListTrustedCAs returns an error (use a boolean or error counter), log errors as now, and only call h.poolUpdater.UpdateCertPool(pool) if no errors occurred for any account; otherwise skip the update to keep the existing global pool intact (use the same symbols: pool, accounts, acct.Id, h.store.ListTrustedCAs, h.poolUpdater.UpdateCertPool).

management/server/http/handlers/device_auth/handler.go-409-444 (1)
409-444: ⚠️ Potential issue | 🟠 Major

Disconnect the peer after revoking during renewal.

Unlike `revokeDevice`, this path never closes the active Sync stream after setting `Revoked=true`. The device can keep using the management channel until it reconnects, so admin-initiated renewal is not enforced immediately.

🔧 Reuse the same disconnect path as `revokeDevice`
```diff
 if !cert.Revoked {
     // Copy before modifying to avoid mutating the store-returned value.
     now := time.Now().UTC()
     revokedCert.Revoked = true
     revokedCert.RevokedAt = &now
     if err := h.store.SaveDeviceCertificate(r.Context(), store.LockingStrengthUpdate, &revokedCert); err != nil {
         util.WriteError(r.Context(), err, w)
         return
     }
 }
+
+if h.networkMapController != nil {
+    peer, pErr := h.store.GetPeerByPeerPubKey(r.Context(), store.LockingStrengthShare, revokedCert.WGPublicKey)
+    switch {
+    case pErr == nil && peer != nil:
+        h.networkMapController.DisconnectPeers(r.Context(), userAuth.AccountId, []string{peer.ID})
+    case pErr != nil && !isNotFoundErr(pErr):
+        log.WithContext(r.Context()).WithError(pErr).Warn("renewDeviceCert: peer lookup failed; active Sync stream may persist until reconnect")
+    }
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/handler.go` around lines 409 - 444, The renew path revokes the cert (SaveDeviceCertificate) but never closes the active Sync stream, so replicate the same disconnect logic used by revokeDevice: after successfully saving the revoked certificate (the revokedCert flow around SaveDeviceCertificate) invoke the same helper or code revokeDevice uses to disconnect the peer associated with cert.WGPublicKey and userAuth.AccountId (i.e., call the same disconnect function or path used in revokeDevice), then continue to reset the enrollment and return toCertResponse(&revokedCert).

client/internal/tpm/tpm_darwin.go-267-270 (1)
267-270: ⚠️ Potential issue | 🟠 Major

Propagate non-`ErrKeyNotFound` errors from `loadKeyByLabel`.

`GenerateKey` falls through to key creation, and `LoadKey` rewrites the error, for every load failure. A Keychain access error or malformed stored key gets reported as "not found" and can send `GenerateKey` down the wrong path.

Suggested fix
```diff
-// Try to load an existing key first (idempotent).
-if signer, err := p.loadKeyByLabel(label); err == nil {
-    return signer, nil
-}
+// Try to load an existing key first (idempotent).
+signer, err := p.loadKeyByLabel(label)
+if err == nil {
+    return signer, nil
+}
+if !errors.Is(err, ErrKeyNotFound) {
+    return nil, err
+}
@@
-signer, err := p.loadKeyByLabel(keychainLabel(keyID))
-if err != nil {
-    return nil, ErrKeyNotFound
-}
+signer, err := p.loadKeyByLabel(keychainLabel(keyID))
+if err != nil {
+    return nil, err
+}
 return signer, nil
```

Based on learnings, methods returning `(T, error)` should propagate the original error and only special-case the not-found sentinel.

Also applies to: 300-303
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/tpm_darwin.go` around lines 267 - 270, The code in GenerateKey is swallowing all errors from p.loadKeyByLabel and treating them as "not found"; change the logic in GenerateKey (and similarly where LoadKey rewrites errors) to check the error returned from loadKeyByLabel and immediately return that error unless errors.Is(err, ErrKeyNotFound) (i.e., only fall through to key creation when the sentinel ErrKeyNotFound is returned); update any LoadKey path that currently replaces arbitrary errors with ErrKeyNotFound to instead propagate the original error except when the original is ErrKeyNotFound so callers see real keychain/malformed-key errors.

management/server/devicepki/scep_adapter.go-161-170 (1)
161-170: ⚠️ Potential issue | 🟠 Major

Don't expose SCEP as a selectable CA until issuance exists.

`SignCSR` is the core CA operation every enrollment path needs, and this adapter always returns `ErrNotImplemented`. With the factory wiring in this PR, selecting SCEP creates a backend that cannot issue a device certificate.
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/scep_adapter.go` around lines 161 - 170, The SCEP adapter (type SCEPCA and its SignCSR method) always returns ErrNotImplemented, so remove or disable SCEP as a selectable CA in the factory/registry until issuance is implemented: update the CA factory/registration code that wires adapters to omit SCEP from selectable backends (or make NewSCEPCA return an error when SignCSR is not implemented), and add a runtime check where CA backends are constructed to reject adapters whose SignCSR returns ErrNotImplemented (reference SCEPCA, SignCSR, and ErrNotImplemented) so SCEP cannot be chosen for enrollment until a full implementation exists.

management/server/devicepki/smallstep_adapter.go-24-29 (1)
24-29: ⚠️ Potential issue | 🟠 Major

`Fingerprint` is documented but never enforced.

A config that sets only `fingerprint` gets the same TLS behavior as if the field were absent. That silently weakens the trust model for Smallstep bootstrap/self-hosted deployments.

Also applies to: 86-108
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/smallstep_adapter.go` around lines 24 - 29, SmallstepConfig.Fingerprint is documented but never applied to TLS verification, so supplying only Fingerprint yields no effect; update the Smallstep bootstrap/adapter code that creates the TLS client (where SmallstepConfig is consumed, e.g. the Smallstep client/transport initialization functions) to enforce pinning: when SmallstepConfig.Fingerprint is non-empty, derive the expected SHA-256 hex fingerprint, configure the HTTP/TLS client to either (a) use a custom VerifyPeerCertificate callback that computes the presented leaf or root certificate's SHA-256 and rejects mismatches, or (b) build a tls.Config with a cert pool containing the pinned root cert and disable system verification accordingly; ensure this logic is used instead of no-op so the fingerprint actually changes trust behavior and is covered for both bootstrap and runtime code paths that load SmallstepConfig.

client/internal/tpm/tpm_linux.go-34-35 (1)
34-35: ⚠️ Potential issue | 🟠 Major

`keyID` is ignored for the TPM key, so every logical key aliases the same handle.

The provider validates `keyID` and stores cert files per `keyID`, but the private key always lives at the global persistent handle. A second key ID will reuse the first private key and can leave you with mismatched cert/key pairs.

Also applies to: 117-123, 191-197, 531-533
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/tpm_linux.go` around lines 34 - 35, The code always uses global persistentHandle and akPersistentHandle so the TPM private key is shared across all logical keyIDs; change the key creation/loading logic to derive or allocate a unique persistent TPM handle per keyID (e.g., compute a deterministic handle from keyID or maintain a mapping) and use that handle instead of the global persistentHandle/akPersistentHandle when creating, persisting, loading, and evicting keys; update every place that references persistentHandle or akPersistentHandle (the key creation function(s), load/save/evict paths, and certificate pairing code) to accept a per-key handle derived from keyID, handle collisions and errors, and ensure cert files remain tied to the same keyID -> handle mapping.

client/internal/tpm/tpm_linux.go-275-281 (1)
275-281: ⚠️ Potential issue | 🟠 Major

RSA-only TPMs can't finish credential activation with this client.

`AttestationProof` falls back to the RSA EK certificate, and the server-side challenge code already supports RSA EKs, but `ActivateCredential` always recreates the ECC EK primary. On boxes without an EC EK, Begin can succeed and Complete will fail against the wrong EK.

Also applies to: 354-359, 391-430
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/tpm_linux.go` around lines 275 - 281, The code always creates an ECC EK primary even when the client fell back to an RSA EK cert, causing ActivateCredential to use the wrong key; modify the logic around readNVCert, AttestationProof creation, and ActivateCredential so you detect and record the EK type (ECC vs RSA) when reading NV certs (readNVCert/ekCertNVIndexEC vs ekCertNVIndexRSA), propagate that ekType into the proof/Begin/Complete flow, and when creating the EK primary (where createPrimary / tpm2.Public template is constructed) choose the matching public type and parameters (tpm2.AlgECC vs tpm2.AlgRSA) instead of always using ECC; apply this same fix to the other affected code paths referenced around the ActivateCredential and Begin/Complete implementations so the primary key creation matches the cert you read.

management/server/devicepki/smallstep_adapter.go-169-185 (1)
169-185: ⚠️ Potential issue | 🟠 Major

Don't pass arbitrary WireGuard public keys in the `sans` array.

The Smallstep step-ca /sign API supports only standard X.509 SAN formats: DNS names, IP addresses, email addresses, and URIs. Base64-encoded WireGuard public keys cannot be passed directly in the "sans" array; step-ca will reject them or produce an identity that verifiers won't recognize. If you need to embed WireGuard keys in certificates, use certificate templates with custom ASN.1 extensions or OIDs instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/smallstep_adapter.go` around lines 169 - 185, The SignCSR function is currently adding the WireGuard public key (cn) into the "sans" field of the request body (the "sans" array) which is invalid for step-ca; remove the line that sets "sans": []string{cn} from the body in SignCSR and instead either (a) pass only valid SAN types (DNS/IP/email/URI) or (b) implement embedding the WireGuard key using a certificate template or custom ASN.1 extension/OID and include that via step-ca templates or the appropriate extension field; update any callers that expect the WireGuard key to be present in SAN to consume it from the chosen extension/template instead.

management/server/deviceinventory/inventory.go-74-90 (1)
74-90: ⚠️ Potential issue | 🟠 Major

The policy booleans `RequireCompliance` and `RequireManagement` are dropped during marshaling.

`MultiSourceIntuneConfig.RequireCompliance` and `MultiSourceJamfConfig.RequireManagement` are parsed from JSON but never included when marshaling to `IntuneConfig` and `JamfConfig` for downstream processing. The downstream config types do not even define these fields, meaning the policy enforcement is either unimplemented or these config knobs are misleading.

Fix by either:

- Adding `RequireCompliance`/`RequireManagement` fields to `IntuneConfig`/`JamfConfig` and forwarding them during marshaling, or
- Removing these fields from `MultiSourceIntuneConfig`/`MultiSourceJamfConfig` if unsupported

Also applies to lines 150-179.
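If the first option is taken, the fix is a plain field copy during conversion. The structs below are hypothetical reductions of the real config types, kept only to show the forwarding:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical reductions of the multi-source and downstream config types.
type MultiSourceIntuneConfig struct {
	TenantID          string `json:"tenant_id"`
	RequireCompliance bool   `json:"require_compliance"`
}

type IntuneConfig struct {
	TenantID          string `json:"tenant_id"`
	RequireCompliance bool   `json:"require_compliance"` // forwarded, no longer dropped
}

// toIntuneConfig copies every policy knob so downstream enforcement can
// actually see require_compliance.
func toIntuneConfig(m MultiSourceIntuneConfig) IntuneConfig {
	return IntuneConfig{
		TenantID:          m.TenantID,
		RequireCompliance: m.RequireCompliance,
	}
}

func main() {
	src := MultiSourceIntuneConfig{TenantID: "t1", RequireCompliance: true}
	out, _ := json.Marshal(toIntuneConfig(src))
	fmt.Println(string(out))
}
```

The Jamf pair (`RequireManagement`) would get the same one-line forward in its conversion.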
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/deviceinventory/inventory.go` around lines 74 - 90, MultiSourceIntuneConfig.RequireCompliance and MultiSourceJamfConfig.RequireManagement are parsed but never propagated into the downstream IntuneConfig and JamfConfig types, so either add those boolean fields to IntuneConfig and JamfConfig and copy the values during the conversion/marshaling step (i.e., when constructing IntuneConfig from MultiSourceIntuneConfig and JamfConfig from MultiSourceJamfConfig), or remove the RequireCompliance/RequireManagement fields from the MultiSource* structs if the policies are not supported; ensure the conversion code that currently maps MultiSourceIntuneConfig -> IntuneConfig and MultiSourceJamfConfig -> JamfConfig is updated to forward the RequireCompliance and RequireManagement fields when you choose the add-fields option (apply the same change to the other similar occurrence noted in the diff).

management/server/devicepki/scep_adapter.go-132-159 (1)
132-159: ⚠️ Potential issue | 🟠 Major

`fetchCACert` must parse PKCS#7 CA/RA certificate bundles per RFC 8894.

SCEP `GetCACert` returns either a single DER certificate (Content-Type `application/x-x509-ca-cert`) or a degenerate PKCS#7 CMS SignedData bundle (Content-Type `application/x-x509-ca-ra-cert`) when the server has an intermediate CA. This parser only handles single PEM/DER certificates. Against compliant servers with intermediate CAs, `CACert()` will return `nil` and `GenerateCRL()` will fail with "failed to fetch CA certificate for GetCRL". Implement PKCS#7 bundle extraction or integrate a library like `github.com/micromdm/scep`.
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/scep_adapter.go` around lines 132 - 159, fetchCACert currently only parses single PEM/DER certs but must handle degenerate PKCS#7 SignedData bundles returned by GetCACert; update SCEPCA.fetchCACert to, after reading body (and after PEM decode), attempt to parse PKCS#7 (e.g. using github.com/micromdm/scep or golang.org/x/crypto/pkcs7) and extract the certificates from the SignedData, then return the appropriate CA certificate (prefer a cert with IsCA==true, fallback to the first cert) instead of failing; ensure error messages still wrap underlying errors and update imports to include the chosen PKCS#7 parser so CACert()/GenerateCRL() can succeed against servers returning CA/RA bundles.
🟡 Minor comments (6)
client/internal/tpm/cert_pem.go-15-21 (1)
15-21: ⚠️ Potential issue | 🟡 Minor

`parsePEMCert` should scan for a CERTIFICATE block, not just the first PEM block.

Line 17 decodes only the first block, and Line 21 parses it unconditionally. If a non-certificate block appears first, parsing fails even when a valid certificate exists later.
Suggested fix
```diff
 func parsePEMCert(data []byte) (*x509.Certificate, error) {
-    block, _ := pem.Decode(data)
-    if block == nil {
-        return nil, fmt.Errorf("tpm: no PEM block found in certificate file")
-    }
-    return x509.ParseCertificate(block.Bytes)
+    for len(data) > 0 {
+        var block *pem.Block
+        block, data = pem.Decode(data)
+        if block == nil {
+            break
+        }
+        if block.Type != "CERTIFICATE" {
+            continue
+        }
+        return x509.ParseCertificate(block.Bytes)
+    }
+    return nil, fmt.Errorf("tpm: no CERTIFICATE PEM block found in certificate file")
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/cert_pem.go` around lines 15 - 21, The parsePEMCert function currently decodes only the first PEM block and tries to parse it unconditionally; change it to iterate over all PEM blocks (calling pem.Decode repeatedly) until you find one with block.Type == "CERTIFICATE", then call x509.ParseCertificate on that block.Bytes and return the parsed cert; if no CERTIFICATE block is found return the existing error ("tpm: no PEM block found in certificate file" or a clearer message). Ensure you keep using parsePEMCert, pem.Decode, and x509.ParseCertificate names so the change is localized.

management/server/devicepki/attestation_test.go-97-103 (1)
97-103: ⚠️ Potential issue | 🟡 Minor

Isolate the InvalidEKCert case to a single failure cause.

Line 97 and Line 98 generate different AK pairs, so this test can fail from AK mismatch instead of EK parsing/validation. Use one AK keypair for both signing and `akPubDER` to keep the assertion focused.

Suggested fix
```diff
 func TestVerifyAttestation_InvalidEKCert(t *testing.T) {
-    _, akPriv := generateAKKeyPair(t)
-    akPubDER, _ := generateAKKeyPair(t)
+    akPubDER, akPriv := generateAKKeyPair(t)
     certifyInfo := []byte("attest")
     sig := signCertifyInfo(t, akPriv, certifyInfo)
     err := devicepki.VerifyAttestation([]byte("not a valid DER cert"), akPubDER, certifyInfo, sig)
     require.Error(t, err)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/attestation_test.go` around lines 97 - 103, The test currently calls generateAKKeyPair twice producing different AKs so VerifyAttestation can fail due to an AK mismatch instead of an invalid EK certificate; fix by using the same AK keypair for signing and for akPubDER (use the akPriv/akPubDER from the single generateAKKeyPair call), keep certifyInfo and sig derived from that akPriv via signCertifyInfo, then call devicepki.VerifyAttestation with the intentionally invalid EK cert bytes to ensure the failure is solely from invalid EK parsing/validation.

infrastructure_files/stand-dex/Makefile-110-116 (1)
110-116: ⚠️ Potential issue | 🟡 Minor

`e2e-verbose` target may be missing the `NETBIRD_SETUP_KEY` environment variable.

The `e2e` target retrieves `SETUP_KEY` from the management container and passes it as `NETBIRD_SETUP_KEY`, but `e2e-verbose` does not. This could cause enrollment tests to be skipped when running verbose mode.

🔧 Proposed fix
```diff
 e2e-verbose: ## Run E2E tests with verbose curl output
 	@echo "Running E2E tests (verbose) against $(API_URL)..."
+	@SETUP_KEY=$$(docker compose -f docker-compose.yml -f docker-compose.device-auth.yml \
+		exec -T management sh -c 'grep NETBIRD_SETUP_KEY /var/lib/netbird/init.env 2>/dev/null | cut -d= -f2' 2>/dev/null); \
 	NETBIRD_TEST_TOKEN=$(DEV_TOKEN) \
 	NETBIRD_API_URL=$(API_URL) \
 	NETBIRD_GRPC_URL=$(GRPC_URL) \
 	NETBIRD_E2E_VERBOSE=1 \
+	NETBIRD_SETUP_KEY="$$SETUP_KEY" \
 	../../scripts/e2e-device-auth.sh
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure_files/stand-dex/Makefile` around lines 110 - 116, The e2e-verbose Makefile target is missing the NETBIRD_SETUP_KEY environment variable, so enrollment tests may be skipped; update the e2e-verbose target (alongside NETBIRD_TEST_TOKEN, NETBIRD_API_URL, NETBIRD_GRPC_URL, NETBIRD_E2E_VERBOSE) to also export NETBIRD_SETUP_KEY sourced the same way the e2e target does (i.e., mirror how SETUP_KEY / NETBIRD_SETUP_KEY is obtained in the e2e target) before invoking ../../scripts/e2e-device-auth.sh so verbose runs receive the same setup key.

client/internal/enrollment/manager_test.go-282-313 (1)
282-313: ⚠️ Potential issue | 🟡 Minor

Test doesn't verify loop termination — it only verifies elapsed time.

The `stopped` channel is closed by the test's own goroutine after a 200ms sleep, not by the renewal loop itself. This test passes regardless of whether `StartRenewalLoop` respects context cancellation.

To properly verify loop termination, `StartRenewalLoop` would need to signal completion (e.g., return a done channel or accept a `sync.WaitGroup`). As-is, this test provides false confidence.
Verify each finding against the current code and only fix it if needed. In `@client/internal/enrollment/manager_test.go` around lines 282 - 313, The test TestStartRenewalLoop_StopsOnContextCancel is flawed because it closes the stopped channel itself after a sleep instead of observing the renewal loop's termination; update the test and the Manager API so the loop signals completion: modify StartRenewalLoop (or add StartRenewalLoopWithDone) on Manager to return a done channel or accept a provided done channel/WaitGroup, then in the test call that API, cancel the context, and wait on the returned done channel (or WaitGroup) with a timeout to assert the loop actually exited; ensure references: TestStartRenewalLoop_StopsOnContextCancel, Manager.StartRenewalLoop (or new StartRenewalLoopWithDone) and use the returned done signal in the test instead of the current stopped goroutine sleep.

docs/DEVICE-SECURITY.md-162-162 (1)
162-162: ⚠️ Potential issue | 🟡 Minor

Broken link fragment — anchor doesn't match heading.

The link `[Known limitations](#known-limitations)` at line 162 won't resolve correctly. The actual section heading is "Known limitations (deferred to future phases)", which generates the anchor `#known-limitations-deferred-to-future-phases`.

🔧 Proposed fix
```diff
-- **Current limitation:** TPM manufacturer CA bundle not included in open-source build. Run `go run scripts/fetch-tpm-roots.go` before production deployment. Without CAs, EK chain verification is skipped with a WARNING log (see [Known limitations](#known-limitations))
+- **Current limitation:** TPM manufacturer CA bundle not included in open-source build. Run `go run scripts/fetch-tpm-roots.go` before production deployment. Without CAs, EK chain verification is skipped with a WARNING log (see [Known limitations](#known-limitations-deferred-to-future-phases))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/DEVICE-SECURITY.md` at line 162, Update the broken markdown anchor by either changing the link target "[Known limitations](#known-limitations)" to match the generated heading anchor ("#known-limitations-deferred-to-future-phases") or by normalizing the heading to "Known limitations" so the existing link works; locate the link text and the heading "Known limitations (deferred to future phases)" and make the anchor and heading consistent (or add an explicit HTML anchor) so the link resolves.

management/cmd/enroll-demo/main.go-154-158 (1)
154-158: ⚠️ Potential issue | 🟡 Minor

Send inventory-relevant system metadata in the demo enrollment request.

`registerPeer` already builds host metadata, but `submitEnrollment` throws that away and sends only `{"hostname":"enroll-demo"}`. The server's inventory gate reads `SystemSerialNumber` from `system_info`, so this demo cannot exercise `require_inventory_check=true` accounts.

Also applies to: 171-178
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/cmd/enroll-demo/main.go` around lines 154 - 158, submitEnrollment is discarding the rich host metadata built by registerPeer and sending only {"hostname":"enroll-demo"}, so demo enrollments never include fields like SystemSerialNumber required by the server's inventory gate; update the demo code that constructs sysInfo (the system.Info struct) and the call sites for submitEnrollment to pass the full sysInfo (including SystemSerialNumber, GoOS, NetbirdVersion, Hostname, etc.) instead of the hardcoded minimal JSON, ensuring the same metadata shape registerPeer uses is serialized and sent so require_inventory_check=true accounts can exercise inventory validation.
```go
if req.GetEkCertPem() == "" || req.GetAkPubPem() == "" || req.GetCsrPem() == "" {
    return nil, status.Error(codes.InvalidArgument, "ek_cert_pem, ak_pub_pem, and csr_pem are required")
}
```
The TPM flow never attests the CSR key.
Begin/Complete only prove that the caller controls the EK/AK pair, then issue the stored CSR unchanged. A client can satisfy activation with a real TPM and still obtain a certificate for an arbitrary software key, which defeats the hardware-backed guarantee. The new Linux provider already produces CertifyInfo / Signature, but this RPC never verifies them.
Also applies to: 71-84, 131-158, 193-197
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/internals/shared/grpc/attestation_handler.go` around lines 58 -
60, The handler currently skips verifying TPM CertifyInfo/Signature and
therefore can issue a cert for an arbitrary software key; add a verification
step in the attestation flow (the request-handling functions that read
req.GetEkCertPem(), req.GetAkPubPem(), req.GetCsrPem() and the Begin/Complete
attestation code paths referenced around the other ranges) that: parse
CertifyInfo/CertifySignature (req.GetCertifyInfoPem()/req.GetCertifySigPem()),
parse the AK public key (req.GetAkPubPem()), verify the CertifySignature over
CertifyInfo using the AK public key, extract the public key inside CertifyInfo
and ensure it exactly matches the CSR public key, verify any
nonce/qualifyingData matches the attestation challenge used earlier, and only
then proceed to store/issue the CSR/certificate; add/rename helper(s) such as
verifyCertifyInfoSignature(certifyInfo, certifySig, akPub) and ensure all code
paths that previously issued the stored CSR without verification now call this
helper and fail with InvalidArgument if verification fails.
```go
updatedAt, recheckErr := performInventoryRecheck(ctx, inv, serial, existingCert)
if recheckErr != nil {
	log.WithContext(ctx).Warnf("EnrollDevice/autoRenew: inventory re-check failed for peer %s: %v", wgPubKey, recheckErr)
	if devAuth.InventoryRecheckFailBehavior == inventoryRecheckFailAllow {
		log.WithContext(ctx).Warnf("EnrollDevice/autoRenew: proceeding despite inventory failure (fail-open configured for account %s)", accountID)
	} else {
		// Default: deny. Peer keeps its existing cert until expiry.
		return s.rejectEnroll(peerKey, "inventory check failed; retry after MDM recovers")
	}
```
Fail-open currently renews devices that are no longer in inventory.
performInventoryRecheck uses error for both backend failures and a definite "device not registered" result. Lines 283-291 then apply InventoryRecheckFailBehavior == allow to every error, so an offboarded device still gets a fresh cert whenever fail-open is enabled.
Also applies to: 451-458
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/internals/shared/grpc/device_enrollment.go` around lines 283 -
291, performInventoryRecheck conflates "backend failure" and "device not
registered" into a single error, and the EnrollDevice/autoRenew path (around
performInventoryRecheck usage) treats all errors as potentially allowable when
InventoryRecheckFailBehavior == inventoryRecheckFailAllow, which lets offboarded
devices renew; modify performInventoryRecheck to return a distinct sentinel
error or typed error (e.g., inventory.ErrNotFound or a NotRegisteredError) for
the definite "device not registered" case, and update the EnrollDevice/autoRenew
logic (the branch that handles recheckErr after calling performInventoryRecheck)
to explicitly deny and call rejectEnroll when the error is the
NotRegisteredError regardless of InventoryRecheckFailBehavior, while continuing
to apply the fail-open behavior only for backend/temporary errors.
```go
h.mu.RLock()
pool := h.certPool
h.mu.RUnlock()

_, err = leaf.Verify(x509.VerifyOptions{
	Roots:     pool,
	KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
})
```
🧩 Analysis chain

🏁 Script executed:

```shell
find . -name "handler.go" -path "*deviceauth*" | head -20
```

Repository: netbirdio/netbird

🏁 Script executed:

```shell
git ls-files | grep -E "(deviceauth|handler)" | head -20
```

🏁 Script executed:

```shell
cat -n ./management/server/deviceauth/handler.go | head -150
```
🌐 Web query:
Go crypto/x509 VerifyOptions Roots nil system trust store fallback
💡 Result:
In Go's crypto/x509 package, when using Certificate.Verify with VerifyOptions where Roots is nil, verification falls back to the system trust store via systemRootsPool.
- On Windows, macOS, and iOS: if Roots is nil and the system pool is available (not overridden), it first attempts platform-specific verification (e.g., Windows CryptoAPI, macOS Security framework); if that fails, it uses the Go verifier with system roots.
- On other platforms (Unix-like): it directly uses systemRootsPool, which loads roots from standard locations (e.g., /etc/ssl/certs) or the SSL_CERT_FILE/SSL_CERT_DIR env vars, or falls back to embedded roots if SetFallbackRoots was called and no system roots are available (e.g., containers).
- With GODEBUG=x509usefallbackroots=1, fallback roots are forced even if system roots exist (disabling platform verifiers on Windows/macOS).
- If systemRootsPool returns nil (no roots loaded and no fallback), Verify returns SystemRootsError{"x509: failed to load system roots and no roots provided"}.
To embed fallback roots (an NSS-based bundle), import _ "golang.org/x/crypto/x509roots/fallback", which calls SetFallbackRoots automatically.
Citations:
- 1: https://go.dev/src/crypto/x509/verify.go
- 2: https://pkg.go.dev/crypto/x509@go1.24.3
- 3: https://pkg.go.dev/crypto/x509
- 4: https://go.dev/src/crypto/x509/root.go
- 5: https://go.dev/pkg/golang.org/x/crypto/x509roots/fallback
🌐 Web query:
Go crypto/x509 Certificate Verify Intermediates parameter intermediate certificates
💡 Result:
In Go's crypto/x509 package, the VerifyOptions struct includes an Intermediates field, which is an optional *CertPool of intermediate certificates. These are not trust anchors but are used by Certificate.Verify to build a valid chain from the leaf certificate to a root in the Roots pool (or system roots if Roots is nil). Intermediates allows providing additional CA certificates needed to complete the chain when the peer only sends the leaf certificate. Name constraints from intermediates apply to all names in the chain. The verifier enforces path length constraints, key usages, and other RFC 5280 rules during chain building. To use it: 1. Create a CertPool with NewCertPool. 2. Add intermediates: pool.AddCert(cert). 3. Pass VerifyOptions{Intermediates: pool} to cert.Verify(opts). This is unchanged in recent Go versions (up to 1.25+). Do not put intermediates in Roots; that's for trusted anchors only.
Citations:
- 1: https://github.com/golang/go/blob/master/src/crypto/x509/verify.go
- 2: https://pkg.go.dev/crypto/x509
- 3: https://pkg.go.dev/crypto/x509@go1.25.6
- 4: https://go.dev/src/crypto/x509/verify.go
Configure VerifyOptions with an explicit root pool and intermediates from the peer's certificate chain.
Go's crypto/x509.Certificate.Verify falls back to the system trust store when VerifyOptions.Roots is nil, so allowing a nil h.certPool to reach this code path weakens the trust boundary to the host system's roots. Additionally, the verifier only builds chains through certificates supplied in VerifyOptions.Intermediates; ignoring the peer-supplied rawCerts[1:] means chain verification fails even when the peer sends a complete intermediate-signed chain.
🔒 Suggested fix

```diff
 	h.mu.RLock()
 	pool := h.certPool
 	h.mu.RUnlock()
+	if pool == nil {
+		pool = x509.NewCertPool()
+	}
+	intermediates := x509.NewCertPool()
+	for _, raw := range rawCerts[1:] {
+		cert, err := x509.ParseCertificate(raw)
+		if err != nil {
+			return fmt.Errorf("deviceauth: parse client intermediate certificate: %w", err)
+		}
+		intermediates.AddCert(cert)
+	}
 	_, err = leaf.Verify(x509.VerifyOptions{
-		Roots:     pool,
-		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
+		Roots:         pool,
+		Intermediates: intermediates,
+		KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
 	})
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
h.mu.RLock()
pool := h.certPool
h.mu.RUnlock()

if pool == nil {
	pool = x509.NewCertPool()
}
intermediates := x509.NewCertPool()
for _, raw := range rawCerts[1:] {
	cert, err := x509.ParseCertificate(raw)
	if err != nil {
		return fmt.Errorf("deviceauth: parse client intermediate certificate: %w", err)
	}
	intermediates.AddCert(cert)
}
_, err = leaf.Verify(x509.VerifyOptions{
	Roots:         pool,
	Intermediates: intermediates,
	KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
})
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/deviceauth/handler.go` around lines 69 - 76, The Verify
call currently passes h.certPool possibly nil and omits intermediates, causing
fallback to system roots and failed chain building; update the
x509.VerifyOptions passed to leaf.Verify to explicitly set Roots to h.certPool
(ensure h.certPool is non-nil or return an error if it is) and populate
Intermediates with the peer-supplied chain by creating a new x509.CertPool,
appending rawCerts[1:] into it, and assigning it to VerifyOptions.Intermediates
before calling leaf.Verify (refer to h.certPool, rawCerts, leaf.Verify, and
x509.VerifyOptions).
```go
func VerifyAttestation(ekCertDER, akPubDER, certifyInfo, certifySignature []byte) error {
	if len(ekCertDER) == 0 || len(akPubDER) == 0 || len(certifyInfo) == 0 || len(certifySignature) == 0 {
		return &AttestationError{Reason: "attestation proof fields must not be empty"}
	}

	// Step 1: parse EK certificate.
	ekCert, err := x509.ParseCertificate(ekCertDER)
	if err != nil {
		return fmt.Errorf("devicepki/attestation: parse EK cert: %w", err)
	}

	// Step 2: verify EK cert against the TPM manufacturer CA pool.
	pool := tpmroots.BuildTPMRootPool()
	if pool != nil && len(tpmroots.RootCerts()) > 0 {
		opts := x509.VerifyOptions{Roots: pool}
		if _, err := ekCert.Verify(opts); err != nil {
			return &AttestationError{Reason: fmt.Sprintf("EK certificate chain invalid: %v", err)}
		}
	}
	// When the manufacturer CA pool is empty (development / open-source build) we
	// skip EK chain verification but still verify the AK signature below.

	// Step 3: parse AK public key.
	akPub, err := x509.ParsePKIXPublicKey(akPubDER)
	if err != nil {
		return fmt.Errorf("devicepki/attestation: parse AK public key: %w", err)
	}

	// Step 4: verify AK signature over certifyInfo (SHA-256).
	digest := sha256.Sum256(certifyInfo)
	if err := verifySignature(akPub, digest[:], certifySignature); err != nil {
		return &AttestationError{Reason: fmt.Sprintf("AK signature verification failed: %v", err)}
	}

	return nil
}
```
The AK signature check does not prove the AK is TPM-backed.
This only proves that certifySignature matches certifyInfo under akPubDER. Because certifyInfo is never parsed or compared against an expected AK name / verifier challenge, a software-generated keypair plus any trusted EK cert can satisfy this helper. Parse the TPM attestation structure and validate its attested contents before treating this as hardware proof.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/devicepki/attestation.go` around lines 54 - 88, The current
VerifyAttestation only checks that certifySignature matches certifyInfo but
never validates that certifyInfo actually attests to the AK being TPM-backed;
parse certifyInfo as a TPM attestation structure (e.g., using
tpm2.DecodeAttestation / attestation parsing in your tpm library), ensure
attestation.STATUSTYPE == TPM2_ST_ATTEST_CERTIFY, verify the
attested.Name/unique field equals the AK public key name computed from akPubDER
(compute the AK name per TPM spec from the parsed akPub), and validate any
qualifyingData/extraData in the attestation matches the expected challenge/nonce
(add an expectedNonce parameter to VerifyAttestation if not available). After
these checks, continue to verify the signature with verifySignature; update
error returns in VerifyAttestation, and reference ekCert, akPub, certifyInfo,
certifySignature, verifySignature and the new expectedNonce parameter in your
changes.
```go
func NewVaultCA(cfg VaultConfig) (*VaultCA, error) {
	if cfg.Role == "" {
		return nil, fmt.Errorf("devicepki/vault: role is required")
	}

	// Apply default mount path if not specified.
	if cfg.Mount == "" {
		cfg.Mount = "pki"
	}

	// Validate that Mount and Role contain only safe path characters to prevent
	// path traversal attacks when they are interpolated into Vault API URLs.
	if !validPathComponent.MatchString(cfg.Mount) {
		return nil, fmt.Errorf("devicepki/vault: mount contains invalid characters")
	}
	if !validPathComponent.MatchString(cfg.Role) {
		return nil, fmt.Errorf("devicepki/vault: role contains invalid characters")
	}

	timeout := 30 * time.Second
	if cfg.TimeoutSeconds > 0 {
		timeout = time.Duration(cfg.TimeoutSeconds) * time.Second
	}

	transport := &http.Transport{}
	if cfg.CACertPEM != "" {
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM([]byte(cfg.CACertPEM)) {
			return nil, fmt.Errorf("devicepki/vault: failed to parse CACertPEM")
		}
		transport.TLSClientConfig = &tls.Config{RootCAs: pool}
	}

	return &VaultCA{
		cfg:    cfg,
		client: &http.Client{Timeout: timeout, Transport: transport},
	}, nil
}
```
Reject plaintext Vault endpoints by default.
doRequest always sends X-Vault-Token, but NewVaultCA accepts any scheme. A misconfigured http:// address would send the token and certificate-management traffic in cleartext. Please require https:// here, or gate plaintext access behind an explicit insecure-testing flag.
🔐 Suggested guard

```diff
+	u, err := url.Parse(cfg.Address)
+	if err != nil || u.Scheme == "" || u.Host == "" {
+		return nil, fmt.Errorf("devicepki/vault: invalid address")
+	}
+	if u.Scheme != "https" {
+		return nil, fmt.Errorf("devicepki/vault: Vault address must use https")
+	}
```
+ }Also add a net/url import.
🧰 Tools
🪛 ast-grep (0.42.1)
[warning] 123-123: `MinVersion` is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General-purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported and insecure browsers (such as Internet Explorer 10) may it be necessary to enable TLS 1.0. Add `MinVersion: tls.VersionTLS13` to the TLS configuration to bump the minimum version to TLS 1.3.
Context: `tls.Config{RootCAs: pool}`
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures
References: https://owasp.org/Top10/A02_2021-Cryptographic_Failures (missing-ssl-minversion-go)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/devicepki/vault_adapter.go` around lines 94 - 130, The
NewVaultCA constructor currently accepts any URL scheme which can leak
X-Vault-Token over plaintext; update NewVaultCA to parse cfg.Address with
net/url, require the scheme to be "https" (return an error if missing/invalid),
or alternatively allow "http" only when an explicit insecure-testing flag (e.g.,
cfg.AllowInsecureHTTP or cfg.InsecureSkipVerify boolean) is set; validate and
error on insecure schemes by default, and add the net/url import to support
parsing.
Actionable comments posted: 12
♻️ Duplicate comments (2)
management/internals/shared/grpc/device_enrollment.go (1)
283-291: ⚠️ Potential issue | 🟠 Major — Fail-open behavior conflates "device not registered" with transient API errors.
`performInventoryRecheck` returns an error for both backend failures and definitive "device not registered" results (lines 451-458). The handler then applies `InventoryRecheckFailBehavior == allow` to every error (lines 286-291), allowing offboarded devices to renew certificates when fail-open is enabled.
A device explicitly removed from the MDM inventory should be denied renewal regardless of fail-open configuration. Only transient/backend errors should honor fail-open.
🔒 Proposed fix: distinguish "not registered" from transient errors
```diff
+// ErrDeviceNotRegistered is returned when a device is definitively absent from the inventory.
+var ErrDeviceNotRegistered = errors.New("device not registered in inventory")
+
 func performInventoryRecheck(ctx context.Context, inv deviceinventory.Inventory, serial string, cert *types.DeviceCertificate) (*time.Time, error) {
 	registered, err := inv.IsRegistered(ctx, serial)
 	if err != nil {
 		return nil, fmt.Errorf("inventory check failed for peer %s: %w", cert.WGPublicKey, err)
 	}
 	if !registered {
-		return nil, fmt.Errorf("device no longer registered in inventory (peer %s)", cert.WGPublicKey)
+		return nil, fmt.Errorf("%w (peer %s, serial %s)", ErrDeviceNotRegistered, cert.WGPublicKey, serial)
 	}
 	now := time.Now().UTC()
 	return &now, nil
 }
```

Then in `tryAutoRenew`:

```diff
 updatedAt, recheckErr := performInventoryRecheck(ctx, inv, serial, existingCert)
 if recheckErr != nil {
 	log.WithContext(ctx).Warnf("EnrollDevice/autoRenew: inventory re-check failed for peer %s: %v", wgPubKey, recheckErr)
+	// Definitive "not registered" must always deny, regardless of fail-open setting.
+	if errors.Is(recheckErr, ErrDeviceNotRegistered) {
+		return s.rejectEnroll(peerKey, "device removed from inventory")
+	}
 	if devAuth.InventoryRecheckFailBehavior == inventoryRecheckFailAllow {
```

Also applies to: 451-458
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/internals/shared/grpc/device_enrollment.go` around lines 283 - 291, performInventoryRecheck currently returns a generic error for both transient API failures and the definitive "device not registered" case, and tryAutoRenew/EnrollDevice treats all errors the same; change performInventoryRecheck to return a distinct sentinel or typed error (e.g., ErrDeviceNotRegistered or a custom DeviceNotRegisteredError) for the explicit not-registered result and keep other errors as transient failures, then update callers (the EnrollDevice/autoRenew path in tryAutoRenew and any other call sites of performInventoryRecheck) to check errors.Is(err, ErrDeviceNotRegistered) (or type-assert the custom error) and immediately reject/deny renewal in that case regardless of devAuth.InventoryRecheckFailBehavior, while only applying the inventoryRecheckFailAllow behavior for transient/backend errors.

management/server/devicepki/vault_adapter.go (1)
93-131: ⚠️ Potential issue | 🟠 Major — Require HTTPS and set TLS MinVersion.
Two security improvements needed:

- The constructor accepts any URL scheme, but `doRequest` sends `X-Vault-Token` in cleartext over HTTP. Reject non-HTTPS addresses by default.
- The TLS config should set `MinVersion` to enforce modern TLS (at minimum TLS 1.2).

🔐 Proposed fix

```diff
+import "net/url"
+
 func NewVaultCA(cfg VaultConfig) (*VaultCA, error) {
 	if cfg.Role == "" {
 		return nil, fmt.Errorf("devicepki/vault: role is required")
 	}
+
+	// Validate URL scheme to prevent token leakage over plaintext HTTP.
+	u, err := url.Parse(cfg.Address)
+	if err != nil || u.Scheme == "" || u.Host == "" {
+		return nil, fmt.Errorf("devicepki/vault: invalid address")
+	}
+	if u.Scheme != "https" {
+		return nil, fmt.Errorf("devicepki/vault: address must use https://")
+	}
 
 	// Apply default mount path if not specified.
 	if cfg.Mount == "" {
```

```diff
 	if cfg.CACertPEM != "" {
 		pool := x509.NewCertPool()
 		if !pool.AppendCertsFromPEM([]byte(cfg.CACertPEM)) {
 			return nil, fmt.Errorf("devicepki/vault: failed to parse CACertPEM")
 		}
-		transport.TLSClientConfig = &tls.Config{RootCAs: pool}
+		transport.TLSClientConfig = &tls.Config{
+			RootCAs:    pool,
+			MinVersion: tls.VersionTLS12,
+		}
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/vault_adapter.go` around lines 93 - 131, NewVaultCA currently accepts non-HTTPS addresses and doesn't enforce a TLS minimum; update NewVaultCA to reject non-HTTPS Vault endpoints by validating the VaultConfig address (e.g., ensure it has scheme "https") and return an error if not, and set the HTTP transport's TLSClientConfig.MinVersion = tls.VersionTLS12 (create TLSClientConfig if nil) so that connections use at least TLS 1.2; reference NewVaultCA, VaultConfig, cfg.CACertPEM, and the created transport/client when making these changes.
🧹 Nitpick comments (27)
management/server/http/handlers/device_auth/pem_helpers.go (1)
49-59: Consider checking `NotBefore` for CA certificates not yet valid.
`validateCACert` checks for expired certificates but not for certificates that haven't become valid yet. A CA cert with `NotBefore` in the future could pass validation.

🔧 Proposed fix to also check NotBefore
```diff
 func validateCACert(cert *x509.Certificate) error {
 	if !cert.IsCA || !cert.BasicConstraintsValid {
 		return fmt.Errorf("certificate is not a CA (missing BasicConstraints or IsCA flag)")
 	}
 	if cert.KeyUsage&x509.KeyUsageCertSign == 0 {
 		return fmt.Errorf("certificate lacks KeyUsageCertSign")
 	}
+	now := time.Now()
+	if now.Before(cert.NotBefore) {
+		return fmt.Errorf("CA certificate is not yet valid (NotBefore: %s)", cert.NotBefore.Format(time.RFC3339))
+	}
-	if cert.NotAfter.Before(time.Now()) {
+	if cert.NotAfter.Before(now) {
 		return fmt.Errorf("CA certificate has expired at %s", cert.NotAfter.Format(time.RFC3339))
 	}
 	return nil
 }
```
Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/pem_helpers.go` around lines 49 - 59, the validateCACert function currently checks expiry but not whether the CA certificate is not yet valid; update validateCACert (the function handling *x509.Certificate checks) to also verify cert.NotBefore is not after the current time (e.g., compare cert.NotBefore.After(time.Now()) or !cert.NotBefore.Before(time.Now())) and return an error like "CA certificate not yet valid until <time>" when the certificate's NotBefore is in the future; keep existing checks (IsCA, BasicConstraintsValid, KeyUsageCertSign, NotAfter) and add this NotBefore validation before returning nil.

management/server/devicepki/appleroots/roots.go (1)
128-186: Mutex held during potentially slow network operation.
The `cacheMu` lock is held for the entire duration of the HTTP download (up to 15 seconds). This means concurrent calls to `BuildAppleSERootPool` will block waiting for the lock, even though they could safely return the cached value.
Consider using a read-lock for the cache check and only taking a write-lock for the update, or using a `sync.Once`/`singleflight` pattern to coalesce concurrent requests.

♻️ Alternative pattern using RWMutex
```go
var (
	cachedPool  *x509.CertPool
	cacheExpiry time.Time
	cacheMu     sync.RWMutex
)

func loadFromApple(ctx context.Context) (*x509.CertPool, error) {
	// Fast path: check cache with read lock
	cacheMu.RLock()
	if cachedPool != nil && time.Now().Before(cacheExpiry) {
		pool := cachedPool
		cacheMu.RUnlock()
		return pool, nil
	}
	cacheMu.RUnlock()

	// Slow path: download and update cache
	cacheMu.Lock()
	defer cacheMu.Unlock()

	// Double-check after acquiring write lock
	if cachedPool != nil && time.Now().Before(cacheExpiry) {
		return cachedPool, nil
	}
	// ... rest of download logic
}
```
Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/appleroots/roots.go` around lines 128 - 186, the function loadFromApple currently holds cacheMu for the entire download which blocks concurrent callers; change the locking to use a read-then-write pattern (or singleflight) so the slow HTTP call is done without the mutex held: use cacheMu.RLock() to check cachedPool and cacheExpiry and return if valid, RUnlock(), then acquire cacheMu.Lock() only for the update, double-check cachedPool/cacheExpiry after acquiring the write lock, release the lock before performing the HTTP download (or perform download outside the write lock) and only re-acquire the write lock to set cachedPool and cacheExpiry; reference loadFromApple, cacheMu, cachedPool, cacheExpiry (or use singleflight to coalesce concurrent downloads).

infrastructure_files/README-stand.md (1)
15-28: Add language specifiers to fenced code blocks.
Adding language hints improves readability with syntax highlighting.
📝 Suggested improvements

````diff
-```
+```bash
 cd netbird-dashboard && npm install && npm run dev
 ```
````

Then open http://localhost:3000

Custom dashboard Docker image (serves our build at http://localhost):

````diff
-```
+```bash
 cd netbird-dashboard && docker build -f docker/Dockerfile -t netbird/dashboard:tpm-dev .
 ```
````

Then re-run up-device-auth-stand.sh — it detects the image automatically

Stop:

````diff
-```
+```bash
 docker compose -f infrastructure_files/stand/docker-compose.yml down
 ```
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
@infrastructure_files/README-stand.mdaround lines 15 - 28, Update the fenced
code blocks in README-stand.md to include a language specifier (use "bash") for
each shell snippet so they render with syntax highlighting: the block containing
"cd netbird-dashboard && npm install && npm run dev", the block with "cd
netbird-dashboard && docker build -f docker/Dockerfile -t
netbird/dashboard:tpm-dev .", and the block with "docker compose -f
infrastructure_files/stand/docker-compose.yml down". Ensure each opening triple
backtick is changed to ```bash and leave the block contents unchanged.

docs/DEVICE-SECURITY.md (1)
3-4: Update branch reference after merge.
The hardcoded branch name `feature/tpm-cert-auth` should be updated to `main` (or removed) once the PR is merged, otherwise it will become stale.
📝 Suggested update

```diff
-**Branch:** `feature/tpm-cert-auth`
-**Last updated:** 2026-04-14
+**Last updated:** 2026-04-14
```

Or reference the stable branch:

```diff
-**Branch:** `feature/tpm-cert-auth`
+**Branch:** `main`
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/DEVICE-SECURITY.md` around lines 3 - 4, update the hardcoded branch reference `feature/tpm-cert-auth` so it no longer becomes stale after merge: replace that string with `main` (or remove the branch line entirely) or change it to a stable reference like `main`/`stable`; ensure the "Branch:" metadata line is updated accordingly so the document points to the correct post-merge branch.

management/server/deviceinventory/jamf_test.go (1)
231-240: `CustomTimeout_Applied` never checks the timeout.
This is currently just another construction-success test. A regression that ignores `timeout_seconds` would still pass. Make the mock server sleep past the configured threshold and assert the request times out, or inspect the configured client timeout if the type exposes it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/deviceinventory/jamf_test.go` around lines 231 - 240, the test TestNewJamfInventory_CustomTimeout_Applied currently only checks construction; change it to verify the timeout behavior by either (A) spinning up a local HTTP test server that sleeps longer than the configured "timeout_seconds" (e.g., >5s) and asserting that requests from the inventory client returned by deviceinventory.NewJamfInventory(string(cfg)) fail with a timeout/error, or (B) if the returned inventory type exposes its http.Client or timeout field, inspecting that the client's Timeout equals the configured value; locate NewJamfInventory and TestNewJamfInventory_CustomTimeout_Applied to implement the appropriate assertion against "timeout_seconds".

management/server/devicepki/attestation_test.go (1)
59-67: This happy-path test depends on a global environment assumption.
`VerifyAttestation` verifies the EK chain whenever manufacturer roots are present, but this test only passes because the open-source build currently leaves that pool empty. If roots get bundled later, the self-signed EK here becomes invalid and the test starts failing for the wrong reason. Prefer a verifier seam or a test-controlled root pool.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/devicepki/attestation_test.go` around lines 59 - 67, TestVerifyAttestation_ValidProof relies on a global manufacturer root pool being empty so devicepki.VerifyAttestation succeeds; to make the test robust, change the test to control the verifier state instead of relying on global assumptions by either (a) using a test seam/API on VerifyAttestation to inject a cert pool or options (e.g., a parameter or functional option) so you can pass an explicit empty x509.CertPool, or (b) if injection isn't available, temporarily setting and restoring the global manufacturer root pool used by VerifyAttestation inside the test so the chain check is deterministic; update TestVerifyAttestation_ValidProof to pass the controlled pool/seam or perform the temporary replace/restore around the call to VerifyAttestation.

management/server/http/handler.go (1)
84-84: `NewAPIHandler` needs an options struct now.
This signature is already easy to mis-wire, and adding `deviceCertPoolUpdater` plus `managementURL` makes the call sites even more brittle — especially ones like `management/server/http/testing/testing_tools/channel/channel.go:134-140`, which are already a long run of positional `nil`s. Consider bundling optional dependencies into a config struct before this grows further.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/http/handler.go` at line 84, the NewAPIHandler signature has grown brittle and should accept a single options/config struct instead of many positional optional parameters; create a NewAPIHandlerOptions (or APIHandlerConfig) struct that includes deviceCertPoolUpdater, managementURL, and the other optional dependency-like parameters, change the function signature to take it as its final argument, update all call sites that construct NewAPIHandler to build and pass the options struct (instead of long positional nils), and adjust internal usage to read from options.deviceCertPoolUpdater, options.managementURL, etc., ensuring backward-compatible behavior where fields may be nil.

management/server/deviceauth/handler_test.go (1)
150-163: `cert-and-sso` is only testing the cert half.
Both positive cases pass `true` for the other auth factor, so this file never checks the important negative branch where a valid certificate is present but the SSO/login condition is false. Add that case to lock down the `cert-and-sso` policy.
<details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/server/deviceauth/handler_test.go` around lines 150 - 163, Add a negative test for the "cert-and-sso" policy that covers the missing SSO/login case: create a handler via newHandlerWithCA(t), build a valid cert with buildTestCert(t, ca, key, "wg-key", ...), call h.CheckDeviceAuth(ctx, "wg-key", false, cert, settings) where settings.Mode == "cert-and-sso", and assert that an error is returned (and optionally that the error message references SSO/login). Name the test e.g. TestCheckDeviceAuth_ModeCertAndSSO_NoSSO_Denied and place it alongside the existing TestCheckDeviceAuth_* tests so the branch where a valid certificate is present but the SSO flag is false is exercised. ``` </details> </blockquote></details> <details> <summary>.github/workflows/tpm-tests.yml (1)</summary><blockquote> `57-63`: **Minor shell script improvement — use underscore for unused loop variable.** Shellcheck flags `i` as unused. Using `_` as the loop variable is idiomatic for "don't care" counters and silences the warning. <details> <summary>🔧 Proposed fix</summary> ```diff # Wait for swtpm to be ready on the command port. - for i in $(seq 1 15); do + for _ in $(seq 1 15); do nc -z localhost 2321 2>/dev/null && break sleep 1 done ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In @.github/workflows/tpm-tests.yml around lines 57 - 63, Replace the unused loop variable `i` in the swtpm readiness loop with the idiomatic underscore to silence ShellCheck; locate the loop `for i in $(seq 1 15); do` and change it to use `_` (i.e., `for _ in $(seq 1 15); do`) so the counter is marked "don't care" while keeping the nc check, sleep, and subsequent environment export (`NETBIRD_TPM_SIMULATOR=localhost:2321`) unchanged. 
``` </details> </blockquote></details> <details> <summary>management/cmd/mtls-demo/main.go (1)</summary><blockquote> `91-93`: **Consider setting TLS MinVersion for defense in depth.** While this is a demo tool with an intentional `InsecureSkipVerify` option, setting `MinVersion: tls.VersionTLS12` (or TLS 1.3) is a low-cost hardening that prevents downgrade attacks when `-insecure` is not used. <details> <summary>🔒 Proposed fix</summary> ```diff // Build TLS config. tlsCfg := &tls.Config{ InsecureSkipVerify: *insecure, // `#nosec` G402 — intentional for test stand with self-signed cert + MinVersion: tls.VersionTLS12, } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/cmd/mtls-demo/main.go` around lines 91 - 93, The tls.Config created in tlsCfg currently only sets InsecureSkipVerify based on the insecure flag; update tlsCfg (tls.Config) to also set MinVersion to tls.VersionTLS12 (or tls.VersionTLS13) when the insecure flag is false so that when not using -insecure the connection enforces a minimum TLS version; locate where tlsCfg is constructed and add the MinVersion field conditioned on *insecure (or simply set MinVersion unconditionally to tls.VersionTLS12 if you prefer stricter defaults). ``` </details> </blockquote></details> <details> <summary>client/internal/tpm/tpm_linux_integration_test.go (2)</summary><blockquote> `126-159`: **Test name is misleading — signature verification is not performed.** The test `AttestationProof_SignatureVerifies` explicitly states in comments (lines 152-153) that it doesn't perform actual signature verification. Consider either implementing the verification or renaming to `AttestationProof_FieldsValid` to accurately reflect the test's scope. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. 
In `@client/internal/tpm/tpm_linux_integration_test.go` around lines 126 - 159, Rename or implement: the test function TestLinuxTPM_AttestationProof_SignatureVerifies misleads because it never verifies the AK signature; either rename it to TestLinuxTPM_AttestationProof_FieldsValid (or similar) or implement real signature verification using proof.AKPublic, proof.CertifyInfo and proof.Signature. If you implement verification, parse AKPublic into the correct key type (e.g., *rsa.PublicKey or *ecdsa.PublicKey), compute sha256.Sum256(proof.CertifyInfo) and call rsa.VerifyPKCS1v15 or ecdsa.VerifyASN1 as appropriate, asserting no error; otherwise update the test name and comments to reflect it only checks presence/format of CertifyInfo, AKPublic and Signature. ``` </details> --- `57-85`: **Test does not actually verify the ECDSA signature.** The test is named `Sign_ECDSARoundtrip` suggesting a full roundtrip verification, but it only checks that the signature is non-empty and that the public key can be marshaled/unmarshaled. Consider completing the verification or renaming the test to reflect what it actually tests. <details> <summary>🔧 Proposed fix to add actual signature verification</summary> ```diff func TestLinuxTPM_Sign_ECDSARoundtrip(t *testing.T) { requireTPMDevice(t) ctx := context.Background() p := tpm.NewPlatformProvider(t.TempDir()) require.True(t, p.Available()) signer, err := p.GenerateKey(ctx, "sign-key") require.NoError(t, err) digest := sha256.Sum256([]byte("hello netbird tpm")) sig, err := signer.Sign(rand.Reader, digest[:], crypto.SHA256) require.NoError(t, err) require.NotEmpty(t, sig) - // Verify the signature with the public key. 
- pub := signer.Public() - ecPub, err := x509.MarshalPKIXPublicKey(pub) - require.NoError(t, err) - parsedPub, err := x509.ParsePKIXPublicKey(ecPub) - require.NoError(t, err) - - verifier, ok := parsedPub.(interface { - Equal(crypto.PublicKey) bool - }) - require.True(t, ok) - _ = verifier - // Actual ECDSA verification is done via the signer's public key directly. + // Verify the signature with the public key. + ecdsaPub, ok := signer.Public().(*ecdsa.PublicKey) + require.True(t, ok, "expected ECDSA public key") + valid := ecdsa.VerifyASN1(ecdsaPub, digest[:], sig) + assert.True(t, valid, "ECDSA signature must verify") } ``` Note: You'll need to add `"crypto/ecdsa"` to the imports. </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/tpm_linux_integration_test.go` around lines 57 - 85, The test TestLinuxTPM_Sign_ECDSARoundtrip currently never actually verifies the ECDSA signature; update it to assert the unmarshaled public key is an *ecdsa.PublicKey and call ecdsa.VerifyASN1 (or ecdsa.Verify after parsing r,s) with the digest and sig, then require.True on the verification result; add the crypto/ecdsa import. If you prefer not to perform verification, rename the test to reflect it only checks signature presence and key marshal/unmarshal. ``` </details> </blockquote></details> <details> <summary>management/Dockerfile.multistage (2)</summary><blockquote> `43-46`: **Silent failure of `dev-stand-init` may mask bootstrap issues.** The `|| true` causes any `dev-stand-init` failure to be silently ignored. While this allows the container to start regardless, it could mask configuration or database issues during development. Consider logging a warning when the command fails. 
<details> <summary>🔧 Proposed fix to log failures</summary> ```diff /usr/local/bin/dev-stand-init \\\n\ --db "${NETBIRD_DB_DATADIR:-/var/lib/netbird}" \\\n\ --token "${NETBIRD_DEV_INIT_TOKEN}" \\\n\ - > /var/lib/netbird/init.env 2>&1 || true\n\ + > /var/lib/netbird/init.env 2>&1 || echo "[entrypoint] WARN: dev-stand-init failed (non-fatal)"\n\ ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/Dockerfile.multistage` around lines 43 - 46, The Dockerfile currently runs dev-stand-init > /var/lib/netbird/init.env 2>&1 || true which swallows failures; change it to run dev-stand-init redirecting output to /var/lib/netbird/init.env, capture its exit status ($?), and if non-zero emit a warning (e.g., echo "[entrypoint] dev-stand-init failed with exit $EXIT_CODE; see /var/lib/netbird/init.env" >&2) instead of silently ignoring, while still allowing container startup if desired; reference the dev-stand-init invocation and /var/lib/netbird/init.env in your edits. ``` </details> --- `17-18`: **Add `--no-install-recommends` to reduce image size.** The `apt-get install` command installs recommended packages by default, which increases image size unnecessarily. <details> <summary>📦 Proposed fix</summary> ```diff # openssl for self-signed cert generation; python3 for management.json patching -RUN apt-get update && apt-get install -y ca-certificates openssl python3-minimal && rm -rf /var/lib/apt/lists/* +RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates openssl python3-minimal && rm -rf /var/lib/apt/lists/* ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. 
In `@management/Dockerfile.multistage` around lines 17 - 18, The RUN line in Dockerfile.multistage installs packages with apt-get install which pulls recommended packages; update that RUN command (the apt-get update && apt-get install -y ... && rm -rf /var/lib/apt/lists/* invocation) to pass --no-install-recommends to apt-get install so only the listed packages (ca-certificates, openssl, python3-minimal) are installed and image size is reduced. ``` </details> </blockquote></details> <details> <summary>infrastructure_files/stand-dex/Makefile (1)</summary><blockquote> `97-108`: **The `e2e` target proceeds with empty `SETUP_KEY` despite warning.** When `NETBIRD_SETUP_KEY` is not found, the target logs a warning but still proceeds to run the E2E script. If enrollment tests fail without a setup key, consider exiting early or making the warning more prominent. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@infrastructure_files/stand-dex/Makefile` around lines 97 - 108, The e2e Makefile target currently warns when SETUP_KEY is empty but still invokes ../../scripts/e2e-device-auth.sh which can cause failing tests; update the e2e target to check the captured SETUP_KEY (variable SETUP_KEY) and if empty either exit non‑zero immediately or skip running the script (do not call ../../scripts/e2e-device-auth.sh) and print a clear error message stating NETBIRD_SETUP_KEY is missing; modify the conditional around SETUP_KEY retrieval so that the subsequent export of NETBIRD_SETUP_KEY (used by the script) only happens when SETUP_KEY is non‑empty to prevent proceeding with an empty value. ``` </details> </blockquote></details> <details> <summary>infrastructure_files/scripts/up-device-auth-stand.sh (1)</summary><blockquote> `153-155`: **Quote `$COMPOSE_FILES` to prevent word splitting issues.** Shellcheck SC2086 flags this correctly. 
While the current values don't contain spaces, quoting prevents issues if paths ever contain spaces. <details> <summary>🔧 Proposed fix</summary> ```diff +# Use array for compose files to handle paths with spaces safely +COMPOSE_FILES_ARRAY=(-f "$STAND_DIR/docker-compose.yml" -f "$STAND_DIR/docker-compose.device-auth.yml") + +# ... later when adding dashboard overlay: +COMPOSE_FILES_ARRAY+=(-f "$STAND_DIR/docker-compose.dashboard.yml") + +# ... then use: docker compose \ - $COMPOSE_FILES \ + "${COMPOSE_FILES_ARRAY[@]}" \ up -d --no-deps management ``` Note: rewriting `$COMPOSE_FILES` as `${COMPOSE_FILES}` does not change word-splitting behavior, so braces alone are not a fix. In this specific case, word splitting is actually intentional to pass multiple `-f` flags, so disabling the warning with a comment is also acceptable: ```bash # shellcheck disable=SC2086 docker compose $COMPOSE_FILES up -d --no-deps management ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@infrastructure_files/scripts/up-device-auth-stand.sh` around lines 153 - 155, The docker compose invocation uses the unquoted variable COMPOSE_FILES which triggers SC2086; because the word-splitting here is intentional (it passes multiple -f flags), either convert COMPOSE_FILES to a bash array expanded as "${COMPOSE_FILES_ARRAY[@]}", or keep the string and disable SC2086 with a comment (# shellcheck disable=SC2086) above the docker compose line; note that plain quoting of "$COMPOSE_FILES" would collapse the flags into a single argument and break the command.
``` </details> </blockquote></details> <details> <summary>management/internals/server/boot.go (1)</summary><blockquote> `239-256`: **Consider adding empty-accounts guard for consistency with `rebuildCertPool`.** The HTTP handler's `rebuildCertPool` (in `management/server/http/handlers/device_auth/handler.go:720-755`) explicitly checks for empty accounts and skips the pool update to avoid clearing active certificates. This method doesn't have the same guard. At server startup this may be acceptable (no existing connections), but consider adding a log warning when no accounts are found to aid debugging: <details> <summary>💡 Suggested improvement</summary> ```diff func (s *BaseServer) buildInitialCertPool() *x509.CertPool { pool := x509.NewCertPool() accounts := s.Store().GetAllAccounts(context.Background()) + if len(accounts) == 0 { + log.Warn("deviceauth: buildInitialCertPool: no accounts found, cert pool will be empty") + } for _, acct := range accounts { ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/internals/server/boot.go` around lines 239 - 256, In buildInitialCertPool, add the same empty-accounts guard used by rebuildCertPool: after calling s.Store().GetAllAccounts(ctx) check if the returned slice is empty, log a warning like "deviceauth: build initial cert pool: no accounts found, skipping cert pool initialization" and return the (unchanged) pool immediately; this preserves existing certs and provides a clear debug message. Reference: function buildInitialCertPool and compare behavior to rebuildCertPool. ``` </details> </blockquote></details> <details> <summary>management/server/http/handlers/device_auth/ca_config.go (1)</summary><blockquote> `270-361`: **Consider extracting per-CA-type merge logic into helper functions.** SonarCloud flags the case clauses as exceeding 10 lines. 
While the current structure is readable due to its consistent pattern, extracting helpers would reduce duplication and make each case more concise: <details> <summary>💡 Optional refactor approach</summary> ```go func mergeVaultConfig(existing string, req *vaultCfgReq) (string, error) { var cfg struct { /* fields */ } if existing != "" { if err := json.Unmarshal([]byte(existing), &cfg); err != nil { return "", err } } // Apply non-credential fields, preserve credential if empty cfg.Address = req.Address // ... if req.Token != "" { cfg.Token = req.Token } raw, err := json.Marshal(cfg) return string(raw), err } ``` </details> This is optional since the current implementation is functionally correct and the repetition makes the pattern clear. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/ca_config.go` around lines 270 - 361, The switch on s.CAType in the handler contains long, repetitive merge blocks for Vault, Smallstep, and SCEP (the code manipulating s.CAConfig when CAType is DeviceAuthCATypeVault, DeviceAuthCATypeSmallstep, DeviceAuthCATypeSCEP); extract each block into a small helper that takes the existing CAConfig string and the respective request struct and returns the new JSON string (e.g., mergeVaultConfig(existing string, req *VaultReq) (string, error), mergeSmallstepConfig(...), mergeSCEPConfig(...)); inside each helper unmarshal the existing JSON if present, apply non-empty credential-preserving field updates exactly as the current case does, marshal and return the result, then replace the long case bodies to call the helper and assign s.CAConfig (and return any error) so each switch case is concise and SonarCloud-compliant. 
``` </details> </blockquote></details> <details> <summary>management/server/http/handlers/device_auth/handler.go (2)</summary><blockquote> `409-412`: **Document the CA/store revocation duality more prominently.** The comment explains that CA-backend revocation is skipped and `CheckDeviceAuth` uses the store's `Revoked` flag. This architectural decision is important — consider adding a brief note in the package documentation or a constant comment explaining this dual-state design. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/handler.go` around lines 409 - 412, Add a prominent explanation of the dual revocation model (CA/backend revocation vs store-level Revoked flag) to package-level docs or a new top-of-file constant comment so future readers understand why CA revocation (e.g., Vault/Smallstep RevokeCert) is intentionally omitted and CheckDeviceAuth relies on the store's Revoked boolean; reference the CheckDeviceAuth function and the store's Revoked field in that doc comment, describe the expected semantics for external CRL consumers, and note where CA-backend revocation should be implemented if needed (e.g., integration points) to make the design decision explicit. ``` </details> --- `590-592`: **CRL Cache-Control header may be too aggressive for security-sensitive revocation data.** The 1-hour cache (`max-age=3600`) for CRL responses could delay revocation propagation. Depending on security requirements, consider a shorter TTL (e.g., 5-15 minutes) or adding `must-revalidate`. 
<details> <summary>💡 Consider shorter cache TTL for revocation data</summary> ```diff w.Header().Set("Content-Type", "application/pkix-crl") -w.Header().Set("Cache-Control", "public, max-age=3600") +w.Header().Set("Cache-Control", "public, max-age=300, must-revalidate") ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/handler.go` around lines 590 - 592, The CRL response currently sets a 1-hour Cache-Control ("public, max-age=3600") which may be too long for revocation data; in the CRL handler where the response headers are set (the code that calls w.Header().Set("Content-Type", \"application/pkix-crl\") and writes crlDER), change the Cache-Control to a shorter TTL and stricter revalidation (for example "public, max-age=300, must-revalidate" or "no-cache, max-age=0, must-revalidate" depending on your security policy) so revocations propagate faster and clients revalidate frequently. ``` </details> </blockquote></details> <details> <summary>management/server/http/handlers/device_auth/ca_test_handler.go (1)</summary><blockquote> `97-219`: **Consider extracting step logic into helper methods to reduce complexity.** SonarCloud flags this function at 108 lines (above 100 threshold). While the sequential test flow is logical, extracting each step into a small helper (e.g., `runGenerateCSRStep`, `runSignStep`, etc.) would improve readability and testability without changing behavior. Additionally, lines 211-214 have a redundant conditional: ```go detail := "CRL generated" if len(crlBytes) > 0 { detail = "CRL generated successfully" } ``` If `GenerateCRL` succeeds (no error), `crlBytes` should always contain data. The distinction between "CRL generated" and "CRL generated successfully" is unclear to the user. 
<details> <summary>♻️ Suggested simplification for CRL detail</summary> ```diff - detail := "CRL generated" - if len(crlBytes) > 0 { - detail = "CRL generated successfully" - } - addStep("verify_crl", caTestStepOK, detail, "", time.Since(start)) + addStep("verify_crl", caTestStepOK, fmt.Sprintf("CRL generated (%d bytes)", len(crlBytes)), "", time.Since(start)) ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/server/http/handlers/device_auth/ca_test_handler.go` around lines 97 - 219, The runCATestCycle function is too long and contains a redundant CRL detail conditional; extract each major step into small helper methods (e.g., runGenerateCSRStep, runSignStep, runVerifyCertStep, runRevokeStep, runGenerateCRLStep) that accept context, settings, accountID, the caFactory/ca instance and use the existing addStep/skipRemaining behavior to append results, and then refactor runCATestCycle to orchestrate those helpers in sequence; also simplify the CRL detail logic in the GenerateCRL handling (inside the new runGenerateCRLStep) to use a single clear message like "CRL generated" on success and remove the len(crlBytes) conditional. ``` </details> </blockquote></details> <details> <summary>management/internals/shared/grpc/device_enrollment.go (1)</summary><blockquote> `33-129`: **EnrollDevice cognitive complexity exceeds threshold.** SonarCloud flags cognitive complexity at 25 (allowed: 20). The function handles multiple enrollment paths (idempotency, auto-renewal, attestation, manual). Consider extracting the auto-renewal block (lines 69-93) into a separate helper method to reduce complexity. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. 
In `@management/internals/shared/grpc/device_enrollment.go` around lines 33 - 129, The EnrollDevice function is over-complex; extract the auto-renewal logic (the block that fetches existingCert via st.GetDeviceCertificateByWGKey, checks Revoked/NotAfter, inspects prevEnroll via st.GetEnrollmentRequestByWGKey, logs the rejection case, and calls s.tryAutoRenew) into a new helper method (e.g., s.handleAutoRenew(ctx, peerKey, accountID, peerID, wgPubKey, enrollReq)) that returns (handled bool, resp *proto.EncryptedMessage, err error); replace the original inline block with a call to that helper and handle its returned values exactly as before (invoke s.tryAutoRenew inside the helper and preserve the log and control flow). Ensure the helper uses the same store methods (GetDeviceCertificateByWGKey, GetEnrollmentRequestByWGKey) and status checks so behavior is unchanged. ``` </details> </blockquote></details> <details> <summary>management/server/devicepki/smallstep_adapter.go (1)</summary><blockquote> `119-166`: **CACert caching holds mutex during HTTP request.** The mutex is held from line 120 until line 165, including the HTTP request and response parsing. This could cause contention if multiple goroutines call `CACert` concurrently while the cache is cold. Consider a double-check locking pattern or accepting the occasional duplicate fetch. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/server/devicepki/smallstep_adapter.go` around lines 119 - 166, CACert currently holds s.mu across the entire HTTP fetch and parsing which can block other callers; change to a double-checked locking pattern: under s.mu check s.caCert and return if set, otherwise release the lock, perform the HTTP request and parse the certificate (using the same logic in CACert but outside the lock), then reacquire s.mu and set s.caCert only if still nil (or return the existing cached value). 
Reference SmallstepCA.CACert, s.mu and s.caCert — alternatively extract the fetch-and-parse logic to a helper (e.g., fetchCACert) called outside the lock and assign the result under lock to avoid holding the mutex during network I/O. ``` </details> </blockquote></details> <details> <summary>client/internal/enrollment/manager.go (2)</summary><blockquote> `141-150`: **Type assertion for SE attestation could mask other attestation-capable providers.** The check `if _, ok := m.tpmProvider.(seAttestationProvider); ok` at line 141 prioritizes Apple SE attestation. However, if a future provider implements both `seAttestationProvider` and TPM attestation, this ordering assumption may not be correct. Consider making the attestation path selection more explicit based on platform or configuration. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@client/internal/enrollment/manager.go` around lines 141 - 150, The current type assertion on m.tpmProvider using seAttestationProvider can incorrectly prioritize Apple SE when a provider implements both SE and TPM attestation; change the attestation-path selection to be explicit (e.g., consult platform info or an explicit config/flag) rather than relying on a cast order: update the logic around m.tpmProvider, seAttestationProvider, AttestationProof, enrollWithAppleSEAttestation and enrollWithTPMAttestation so you first determine the desired attestation mode from configuration or platform (or a capability method like SupportsSEAttestation/SupportsTPMAttestation) and then call the appropriate enrollWithAppleSEAttestation or enrollWithTPMAttestation path accordingly. ``` </details> --- `382-416`: **Polling loop lacks maximum retry/timeout for rejected enrollments.** The `pollUntilApproved` loop continues indefinitely with backoff until approved, rejected, or context cancelled. 
If an enrollment remains pending for days (admin doesn't act), the client will poll indefinitely. Consider adding a configurable maximum poll duration or attempt count. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@client/internal/enrollment/manager.go` around lines 382 - 416, The polling loop in the method that polls enrollment status (the loop in pollUntilApproved / the manager.go method using m.grpcClient.GetEnrollmentStatus) can run forever; add a configurable maximum (either a maxPollDuration or maxAttempts field on the manager/config) and enforce it by creating a deadline (deadline := time.Now().Add(m.cfg.MaxPollDuration) or an attempts counter) and checking it each iteration (or before sleeping) to return a clear timeout error (e.g., fmt.Errorf("enrollment: polling timed out after %v", m.cfg.MaxPollDuration) or similar) when exceeded; update the method signature or manager struct to accept the config, increment attempts or compare time.Now() to deadline inside the loop (alongside the ctx.Done() check), and ensure interval/backoff logic still runs until the configured limit is reached. ``` </details> </blockquote></details> <details> <summary>client/internal/tpm/tpm_windows.go (1)</summary><blockquote> `192-204`: **Consider using a shared sentinel error for ActivateCredential.** `AttestationProof` returns `ErrAttestationNotSupported` (line 197), but `ActivateCredential` returns a new `errors.New(...)` (line 203). For consistency and to allow callers to check error types uniformly, consider using a shared sentinel error like `ErrActivateCredentialNotSupported`. 
<details> <summary>♻️ Use sentinel error for consistency</summary> In `provider.go` or similar: ```go var ErrActivateCredentialNotSupported = errors.New("tpm: ActivateCredential not supported on this platform") ``` Then in `tpm_windows.go`: ```diff func (p *windowsProvider) ActivateCredential(_ context.Context, _ []byte) ([]byte, error) { - return nil, errors.New("tpm: ActivateCredential not yet supported on Windows; CNG/TBS implementation pending") + return nil, ErrActivateCredentialNotSupported } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@client/internal/tpm/tpm_windows.go` around lines 192 - 204, AttestationProof uses the shared sentinel ErrAttestationNotSupported but ActivateCredential returns an ad-hoc error string; declare a package-level sentinel (e.g., ErrActivateCredentialNotSupported) alongside ErrAttestationNotSupported (in provider.go or the shared provider file) and change windowsProvider.ActivateCredential to return that sentinel instead of errors.New(...), so callers can compare errors consistently; ensure the sentinel name (ErrActivateCredentialNotSupported) is exported/placed where other platform providers can reuse it. ``` </details> </blockquote></details> <details> <summary>management/cmd/enroll-demo/main.go (1)</summary><blockquote> `106-133`: **Consider setting `MinVersion` even for insecure test mode.** While `InsecureSkipVerify` is intentionally enabled for self-signed certs in test environments, setting `MinVersion: tls.VersionTLS12` is still worthwhile to ensure modern TLS negotiation. This is a minor improvement for a demo tool. 
<details> <summary>♻️ Suggested improvement</summary> ```diff tlsCfg := &tls.Config{ InsecureSkipVerify: true, // `#nosec` G402 — intentional for test stand with self-signed cert + MinVersion: tls.VersionTLS12, } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@management/cmd/enroll-demo/main.go` around lines 106 - 133, The tls.Config used in the insecure test branch (tlsCfg in the block that builds a connection and calls mgmclient.NewClientFromConn) should explicitly set MinVersion: tls.VersionTLS12 while keeping InsecureSkipVerify true; update tlsCfg to include MinVersion: tls.VersionTLS12 (preserving the existing `#nosec` G402 comment) so the demo still accepts self-signed certs but forces TLS1.2+ negotiation. ``` </details> </blockquote></details> </blockquote></details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Organization UI **Review profile**: CHILL **Plan**: Pro **Run ID**: `3641294f-574f-4cd6-813f-1848ee03bfe6` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 48d16ff2054b854fa76370794d79a43736c10d8f and c139c29fc7c7e13e736c59342dbf772c7aba2ebd. 
</details> <details> <summary>⛔ Files ignored due to path filters (3)</summary> * `go.sum` is excluded by `!**/*.sum` * `shared/management/proto/management.pb.go` is excluded by `!**/*.pb.go` * `shared/management/proto/management_grpc.pb.go` is excluded by `!**/*.pb.go` </details> <details> <summary>📒 Files selected for processing (127)</summary> * `.devcontainer/Dockerfile` * `.github/workflows/device-auth-tests.yml` * `.github/workflows/e2e-device-auth.yml` * `.github/workflows/tpm-tests.yml` * `.gitignore` * `Makefile` * `client/internal/connect.go` * `client/internal/enrollment/manager.go` * `client/internal/enrollment/manager_grpc_test.go` * `client/internal/enrollment/manager_test.go` * `client/internal/tpm/cert_pem.go` * `client/internal/tpm/keyid.go` * `client/internal/tpm/mock_provider.go` * `client/internal/tpm/provider.go` * `client/internal/tpm/tpm_darwin.go` * `client/internal/tpm/tpm_darwin_test.go` * `client/internal/tpm/tpm_linux.go` * `client/internal/tpm/tpm_linux_integration_test.go` * `client/internal/tpm/tpm_stub.go` * `client/internal/tpm/tpm_test.go` * `client/internal/tpm/tpm_windows.go` * `docs/DEVICE-SECURITY.md` * `go.mod` * `infrastructure_files/README-stand.md` * `infrastructure_files/docker-compose.device-auth.yml` * `infrastructure_files/scripts/approve-enrollment.sh` * `infrastructure_files/scripts/setup-device-auth.sh` * `infrastructure_files/scripts/up-device-auth-stand.sh` * `infrastructure_files/stand-dex/Caddyfile` * `infrastructure_files/stand-dex/Caddyfile.device-auth` * `infrastructure_files/stand-dex/Makefile` * `infrastructure_files/stand-dex/create-pending-enrollment.sh` * `infrastructure_files/stand-dex/dashboard.env` * `infrastructure_files/stand-dex/dex-device-auth.yaml` * `infrastructure_files/stand-dex/dex.yaml` * `infrastructure_files/stand-dex/docker-compose.device-auth.yml` * `infrastructure_files/stand-dex/docker-compose.yml` * `infrastructure_files/stand-dex/management.json` * 
`infrastructure_files/stand-dex/relay.env` * `management/Dockerfile.multistage` * `management/cmd/enroll-demo/main.go` * `management/cmd/mtls-demo/main.go` * `management/internals/server/boot.go` * `management/internals/server/boot_test.go` * `management/internals/server/config/config.go` * `management/internals/server/server.go` * `management/internals/shared/grpc/attestation_handler.go` * `management/internals/shared/grpc/attestation_handler_test.go` * `management/internals/shared/grpc/attestation_sessions.go` * `management/internals/shared/grpc/attestation_sessions_test.go` * `management/internals/shared/grpc/device_cert_revocation.go` * `management/internals/shared/grpc/device_cert_revocation_test.go` * `management/internals/shared/grpc/device_enrollment.go` * `management/internals/shared/grpc/device_enrollment_test.go` * `management/internals/shared/grpc/server.go` * `management/internals/shared/grpc/server_test.go` * `management/internals/shared/grpc/testhelpers_test.go` * `management/server/account/manager.go` * `management/server/account/manager_mock.go` * `management/server/deviceauth/handler.go` * `management/server/deviceauth/handler_test.go` * `management/server/deviceinventory/intune.go` * `management/server/deviceinventory/intune_test.go` * `management/server/deviceinventory/inventory.go` * `management/server/deviceinventory/inventory_test.go` * `management/server/deviceinventory/jamf.go` * `management/server/deviceinventory/jamf_test.go` * `management/server/devicepki/appleroots/roots.go` * `management/server/devicepki/appleroots/roots_test.go` * `management/server/devicepki/attestation.go` * `management/server/devicepki/attestation_test.go` * `management/server/devicepki/builtin_ca.go` * `management/server/devicepki/builtin_ca_test.go` * `management/server/devicepki/ca.go` * `management/server/devicepki/factory.go` * `management/server/devicepki/factory_test.go` * `management/server/devicepki/scep_adapter.go` * 
`management/server/devicepki/scep_adapter_test.go` * `management/server/devicepki/smallstep_adapter.go` * `management/server/devicepki/smallstep_adapter_test.go` * `management/server/devicepki/tpmroots/certs/README.md` * `management/server/devicepki/tpmroots/roots.go` * `management/server/devicepki/vault_adapter.go` * `management/server/devicepki/vault_adapter_test.go` * `management/server/http/handler.go` * `management/server/http/handlers/device_auth/ca_config.go` * `management/server/http/handlers/device_auth/ca_config_test.go` * `management/server/http/handlers/device_auth/ca_test_handler.go` * `management/server/http/handlers/device_auth/ca_test_handler_test.go` * `management/server/http/handlers/device_auth/handler.go` * `management/server/http/handlers/device_auth/handler_http_test.go` * `management/server/http/handlers/device_auth/handler_test.go` * `management/server/http/handlers/device_auth/inventory.go` * `management/server/http/handlers/device_auth/inventory_test.go` * `management/server/http/handlers/device_auth/pem_helpers.go` * `management/server/http/testing/integration/device_auth_handler_integration_test.go` * `management/server/http/testing/testdata/device_auth_integration.sql` * `management/server/http/testing/testing_tools/channel/channel.go` * `management/server/http/testing/testing_tools/tools.go` * `management/server/mock_server/account_mock.go` * `management/server/peer.go` * `management/server/secretenc/secretenc.go` * `management/server/secretenc/secretenc_test.go` * `management/server/store/device_auth_migration_test.go` * `management/server/store/device_enrollment_store_test.go` * `management/server/store/secretenc_store_test.go` * `management/server/store/sql_store.go` * `management/server/store/store.go` * `management/server/store/store_mock.go` * `management/server/types/capabilities.go` * `management/server/types/device_auth.go` * `management/server/types/device_auth_settings.go` * `management/server/types/device_auth_test.go` * 
`management/server/types/settings.go` * `management/server/types/settings_test.go` * `management/server/types/testdata/comparison/components.json` * `management/server/types/testdata/comparison/components_networkmap.json` * `management/server/types/testdata/comparison/legacy_networkmap.json` * `management/server/types/testdata/networkmap_golden_new_with_onpeeradded_router.json` * `management/server/types/user.go` * `scripts/e2e-device-auth.sh` * `scripts/e2e-lib.sh` * `scripts/fetch-tpm-roots.go` * `shared/management/client/grpc.go` * `shared/management/client/grpc_cert_test.go` * `shared/management/proto/management.proto` * `tools/dev-stand-init/main.go` </details> <details> <summary>✅ Files skipped from review due to trivial changes (29)</summary> * go.mod * .gitignore * .devcontainer/Dockerfile * infrastructure_files/docker-compose.device-auth.yml * management/server/http/handlers/device_auth/handler_test.go * infrastructure_files/stand-dex/Caddyfile * management/internals/shared/grpc/server_test.go * .github/workflows/device-auth-tests.yml * client/internal/tpm/keyid.go * infrastructure_files/stand-dex/relay.env * infrastructure_files/stand-dex/dashboard.env * client/internal/tpm/cert_pem.go * management/server/devicepki/tpmroots/certs/README.md * infrastructure_files/stand-dex/docker-compose.yml * .github/workflows/e2e-device-auth.yml * management/server/devicepki/appleroots/roots_test.go * management/server/http/handlers/device_auth/inventory.go * management/internals/shared/grpc/device_cert_revocation.go * infrastructure_files/scripts/approve-enrollment.sh * infrastructure_files/stand-dex/docker-compose.device-auth.yml * client/internal/tpm/tpm_stub.go * management/server/devicepki/ca.go * client/internal/tpm/tpm_test.go * management/server/http/handlers/device_auth/ca_test_handler_test.go * management/server/http/handlers/device_auth/inventory_test.go * management/server/devicepki/smallstep_adapter_test.go * management/server/devicepki/scep_adapter_test.go 
* management/server/deviceinventory/inventory_test.go * client/internal/tpm/tpm_linux.go </details> <details> <summary>🚧 Files skipped from review as they are similar to previous changes (19)</summary> * management/internals/server/server.go * management/internals/server/config/config.go * infrastructure_files/stand-dex/Caddyfile.device-auth * management/server/devicepki/tpmroots/roots.go * management/server/account/manager.go * management/internals/server/boot_test.go * management/internals/shared/grpc/testhelpers_test.go * infrastructure_files/stand-dex/create-pending-enrollment.sh * management/server/deviceauth/handler.go * management/internals/shared/grpc/attestation_sessions.go * management/server/devicepki/attestation.go * management/internals/shared/grpc/device_cert_revocation_test.go * management/internals/shared/grpc/server.go * management/server/deviceinventory/intune_test.go * management/internals/shared/grpc/attestation_handler.go * client/internal/tpm/mock_provider.go * client/internal/tpm/tpm_darwin.go * management/server/deviceinventory/intune.go * management/server/http/handlers/device_auth/handler_http_test.go </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
```go
stateDir := profilemanager.DefaultConfigPathDir
tpmProv := tpm.NewPlatformProvider(stateDir)
if tpmProv.Available() {
	enrollMgr := enrollment.NewManager(tpmProv, mgmClient, stateDir, myPrivateKey.PublicKey().String())

	enrollCtx, enrollCancel := context.WithTimeout(engineCtx, 30*time.Second)
	defer enrollCancel()
	if _, certErr := enrollMgr.EnsureCertificate(enrollCtx); certErr != nil {
		log.Warnf("device cert enrollment: %v — continuing without certificate", certErr)
	} else if tlsCert, buildErr := enrollMgr.BuildTLSCertificate(engineCtx); buildErr != nil && !errors.Is(buildErr, enrollment.ErrNotEnrolled) {
		log.Warnf("device cert: build TLS cert: %v — continuing without", buildErr)
	} else if tlsCert != nil && mgmTlsEnabled {
		// Reconnect with the device certificate as mTLS client cert.
		certMgmClient, dialErr := mgm.NewClientWithCert(engineCtx, c.config.ManagementURL.Host, myPrivateKey, tlsCert)
		if dialErr != nil {
			log.Warnf("device cert: mTLS reconnect failed: %v — continuing without certificate", dialErr)
		} else {
			// Swap plain → cert client. The existing defer mgmClient.Close() will
			// now close certMgmClient (Go closures capture the variable, not the value).
			if closeErr := mgmClient.Close(); closeErr != nil {
				log.Warnf("close plain mgm client before cert upgrade: %v", closeErr)
			}
			mgmClient = certMgmClient
			mgmClient.SetConnStateListener(mgmNotifier)
			cn := "<unknown>"
			if tlsCert.Leaf != nil {
				cn = tlsCert.Leaf.Subject.CommonName
			}
			log.Infof("device cert: management connection upgraded to mTLS (cert CN=%s)", cn)

			// Start the certificate renewal loop. When the cert is approaching expiry
			// and a fresh one is issued, cancel the engine context so the connection
			// restarts with the new certificate.
			enrollMgr.StartRenewalLoop(engineCtx, func(_ *x509.Certificate) {
				log.Info("device cert: certificate renewed — reconnecting to apply new cert")
				cancel()
			})
		}
	}
```
Don't enroll device certificates on plain-HTTP management connections.
This block runs EnsureCertificate whenever the platform provider is available, but the resulting certificate is only usable when mgmTlsEnabled is true. On http:// management URLs that means we still perform attestation/enrollment and create server-side enrollment state even though the client can never upgrade the channel to mTLS. Gate the whole enrollment path behind mgmTlsEnabled.
Suggested change
- stateDir := profilemanager.DefaultConfigPathDir
- tpmProv := tpm.NewPlatformProvider(stateDir)
- if tpmProv.Available() {
+ stateDir := profilemanager.DefaultConfigPathDir
+ tpmProv := tpm.NewPlatformProvider(stateDir)
+ if mgmTlsEnabled && tpmProv.Available() {
+ enrollMgr := enrollment.NewManager(tpmProv, mgmClient, stateDir, myPrivateKey.PublicKey().String())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@client/internal/connect.go` around lines 267 - 305, The code performs TPM
enrollment even when management TLS is disabled (mgmTlsEnabled), creating
server-side enrollment state unnecessarily; modify the block that starts with
tpm.NewPlatformProvider(stateDir) and tpmProv.Available() so that you only
construct enrollment.NewManager, call enrollMgr.EnsureCertificate,
enrollMgr.BuildTLSCertificate, mgm.NewClientWithCert, and
enrollMgr.StartRenewalLoop when mgmTlsEnabled is true; in practice, add a guard
checking mgmTlsEnabled before creating enrollMgr or invoking
EnsureCertificate/BuildTLSCertificate and skip the entire enrollment/renewal
path if mgmTlsEnabled is false.
```bash
if [[ -z "${NETBIRD_TOKEN:-}" ]]; then
  echo "[ERROR] NETBIRD_TOKEN is required."
  echo "        Set it before running:"
  echo "        export NETBIRD_TOKEN=<your-bearer-token>"
  exit 1
```
Send fatal diagnostics to stderr.
These branches exit non-zero but print their error output to stdout, which makes the script harder to compose in CI and shell pipelines.
Suggested change
- echo "[ERROR] NETBIRD_TOKEN is required."
- echo " Set it before running:"
- echo " export NETBIRD_TOKEN=<your-bearer-token>"
+ echo "[ERROR] NETBIRD_TOKEN is required." >&2
+ echo " Set it before running:" >&2
+ echo " export NETBIRD_TOKEN=<your-bearer-token>" >&2
@@
- echo "[ERROR] Failed to apply device auth settings (HTTP $SETTINGS_CODE):"
- echo "$SETTINGS_BODY"
+ echo "[ERROR] Failed to apply device auth settings (HTTP $SETTINGS_CODE):" >&2
+ echo "$SETTINGS_BODY" >&2
@@
- echo "[ERROR] Failed to fetch enrollment requests (HTTP $ENROLLMENTS_CODE):"
- echo "$ENROLLMENTS_BODY"
+ echo "[ERROR] Failed to fetch enrollment requests (HTTP $ENROLLMENTS_CODE):" >&2
+ echo "$ENROLLMENTS_BODY" >&2
@@
- echo "[ERROR] Failed to fetch trusted CAs (HTTP $CAS_CODE):"
- echo "$CAS_BODY"
+ echo "[ERROR] Failed to fetch trusted CAs (HTTP $CAS_CODE):" >&2
+ echo "$CAS_BODY" >&2

Also applies to: 43-46, 63-65, 82-84
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@infrastructure_files/scripts/setup-device-auth.sh` around lines 20 - 24, The
error messages in setup-device-auth.sh (e.g., the NETBIRD_TOKEN check and the
other failing branches) are printed to stdout but should go to stderr; update
the echo/printing calls in the failing branches (the NETBIRD_TOKEN check and the
similar branches at the other ranges) to write to stderr (for example by
redirecting to >&2 or using printf to >&2) and keep the non-zero exit unchanged
so diagnostic output goes to stderr while the script still exits with failure.
```go
func TestBeginTPMAttestation_InvalidEKCert_ReturnsError(t *testing.T) {
	s := &Server{attestationSessions: NewAttestationSessionStore()}
	req := &proto.BeginTPMAttestationRequest{
		EkCertPem: "not-a-valid-cert",
		AkPubPem:  "also-invalid",
		CsrPem:    "csr",
	}
	_, err := s.BeginTPMAttestation(context.Background(), req)
	require.Error(t, err, "invalid EK cert must return error")
	st, ok := status.FromError(err)
	require.True(t, ok)
	assert.Equal(t, codes.InvalidArgument, st.Code())
}

func TestBeginTPMAttestation_EmptyEKCert_ReturnsError(t *testing.T) {
	s := &Server{attestationSessions: NewAttestationSessionStore()}
	req := &proto.BeginTPMAttestationRequest{
		EkCertPem: "",
		AkPubPem:  "",
		CsrPem:    "csr",
	}
	_, err := s.BeginTPMAttestation(context.Background(), req)
	require.Error(t, err)
	st, ok := status.FromError(err)
	require.True(t, ok)
	assert.Equal(t, codes.InvalidArgument, st.Code())
}
```
Isolate EK-certificate validation in these tests.
Both cases also make AkPubPem/CsrPem invalid, so they can still pass if EK validation regresses and another field is rejected first. Keep AK + CSR valid here and vary only EkCertPem.
Suggested test tightening
func TestBeginTPMAttestation_InvalidEKCert_ReturnsError(t *testing.T) {
s := &Server{attestationSessions: NewAttestationSessionStore()}
+ _, akKey := buildSelfSignedCertPEM(t)
req := &proto.BeginTPMAttestationRequest{
EkCertPem: "not-a-valid-cert",
- AkPubPem: "also-invalid",
- CsrPem: "csr",
+ AkPubPem: buildPublicKeyPEM(t, akKey),
+ CsrPem: buildCSRPEM(t),
}
_, err := s.BeginTPMAttestation(context.Background(), req)
require.Error(t, err, "invalid EK cert must return error")
st, ok := status.FromError(err)
@@
func TestBeginTPMAttestation_EmptyEKCert_ReturnsError(t *testing.T) {
s := &Server{attestationSessions: NewAttestationSessionStore()}
+ _, akKey := buildSelfSignedCertPEM(t)
req := &proto.BeginTPMAttestationRequest{
EkCertPem: "",
- AkPubPem: "",
- CsrPem: "csr",
+ AkPubPem: buildPublicKeyPEM(t, akKey),
+ CsrPem: buildCSRPEM(t),
}
 _, err := s.BeginTPMAttestation(context.Background(), req)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/internals/shared/grpc/attestation_handler_test.go` around lines 21
- 47, The tests TestBeginTPMAttestation_InvalidEKCert_ReturnsError and
TestBeginTPMAttestation_EmptyEKCert_ReturnsError currently make AkPubPem and
CsrPem invalid too; update these tests so only EkCertPem is invalid/empty while
AkPubPem and CsrPem contain valid PEMs/CSR strings that pass
Server.BeginTPMAttestation's AK and CSR validation. Locate the tests in
attestation_handler_test.go and modify the request construction for each (the
proto.BeginTPMAttestationRequest) to keep AkPubPem and CsrPem valid, leaving
EkCertPem as the only varying/invalid field to ensure the test isolates
EK-certificate validation for Server.BeginTPMAttestation.
```go
func TestAttestationSessionStore_StartCleanup_ContextCancel(t *testing.T) {
	s := NewAttestationSessionStore()
	ctx, cancel := context.WithCancel(context.Background())
	s.StartCleanup(ctx, 10*time.Millisecond)
	// Cancel and verify the goroutine stops without blocking.
	cancel()
	time.Sleep(50 * time.Millisecond)
}
```
This cancellation test never observes the worker exiting.
After cancel(), the test just sleeps and returns, so it will still pass if StartCleanup leaks a goroutine. Please make shutdown observable here, e.g. by waiting on a done signal or using a goroutine-leak check.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/internals/shared/grpc/attestation_sessions_test.go` around lines
80 - 87, The test never observes the cleanup worker exit so it can’t detect
goroutine leaks; change StartCleanup to return a done signal (e.g. return <-chan
struct{} or context-aware wait channel) that is closed when its goroutine exits,
and update TestAttestationSessionStore_StartCleanup_ContextCancel to call
s.StartCleanup(ctx, 10*time.Millisecond), cancel(), then select waiting on that
done channel with a short timeout (failing the test if the done channel isn’t
closed in time) so shutdown is observable; ensure the goroutine in StartCleanup
closes the returned channel when it returns.
```go
if cfg.Intune != nil && cfg.Intune.Enabled {
	intuneCfgJSON, err := json.Marshal(IntuneConfig{
		TenantID:     cfg.Intune.TenantID,
		ClientID:     cfg.Intune.ClientID,
		ClientSecret: cfg.Intune.ClientSecret,
	})
	if err != nil {
		return nil, fmt.Errorf("deviceinventory: marshal intune config: %w", err)
	}
	inv, err := NewIntuneInventory(string(intuneCfgJSON))
	if err != nil {
		return nil, err
	}
	inventories = append(inventories, inv)
}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if IntuneConfig has a RequireCompliance field that should be populated
rg -n "RequireCompliance" --type go management/server/deviceinventory/
Repository: netbirdio/netbird
Length of output: 164

🏁 Script executed:
#!/bin/bash
# Find IntuneConfig struct definition
rg -n "type IntuneConfig struct" --type go management/server/deviceinventory/
Length of output: 134

🏁 Script executed:
#!/bin/bash
# Find NewIntuneInventory function and check its implementation
rg -n "func NewIntuneInventory" --type go management/server/deviceinventory/
Length of output: 178

🏁 Script executed:
#!/bin/bash
# Search for where IntuneInventory or intune-related code handles RequireCompliance
rg -n "RequireCompliance" --type go -A 3 -B 3 management/server/deviceinventory/
Length of output: 672

🏁 Script executed:
#!/bin/bash
# Read the IntuneConfig struct definition
sed -n '24,40p' management/server/deviceinventory/intune.go
Length of output: 873

🏁 Script executed:
#!/bin/bash
# Read NewIntuneInventory implementation
sed -n '95,125p' management/server/deviceinventory/intune.go
Length of output: 955

🏁 Script executed:
#!/bin/bash
# Check if RequireCompliance is used anywhere in intune.go
rg -n "RequireCompliance" management/server/deviceinventory/intune.go
Length of output: 43

🏁 Script executed:
#!/bin/bash
# Check the full IntuneInventory struct definition
rg -n "type IntuneInventory struct" --type go -A 10 management/server/deviceinventory/intune.go
Length of output: 372

🏁 Script executed:
#!/bin/bash
# Check RequireCompliance usage across all files in deviceinventory
rg -n "RequireCompliance" --type go management/server/deviceinventory/
Length of output: 164

🏁 Script executed:
#!/bin/bash
# Check if there are any TODOs or comments about compliance in intune.go
rg -n "compliance|Compliance" --type go -i management/server/deviceinventory/intune.go
Length of output: 43

🏁 Script executed:
#!/bin/bash
# Check how RequireCompliance is used for JAMF
rg -n "Jamf|jamf" --type go -B 3 -A 3 management/server/deviceinventory/inventory.go | head -60
Length of output: 1302

🏁 Script executed:
#!/bin/bash
# Check if any source actually uses RequireCompliance
rg -n "RequireCompliance" --type go -B 5 -A 5 management/server/deviceinventory/
Length of output: 1032

🏁 Script executed:
#!/bin/bash
# Check if cfg.Intune.RequireCompliance is read or used anywhere
rg -n "cfg\.Intune\.RequireCompliance|intune\.RequireCompliance" --type go management/server/deviceinventory/
Length of output: 43

🏁 Script executed:
#!/bin/bash
# Check the complete code section where configs are marshaled to see full picture
sed -n '150,180p' management/server/deviceinventory/inventory.go
Length of output: 960
RequireCompliance field in MultiSourceIntuneConfig is not propagated to IntuneInventory.
The MultiSourceIntuneConfig struct defines a RequireCompliance field (line 80), but this field is never passed to IntuneConfig when creating the Intune inventory (lines 151-155). Additionally, the IntuneConfig struct itself does not have a RequireCompliance field, meaning the Intune inventory implementation does not support compliance checking configuration. If compliance checking is intended, the IntuneConfig struct must be extended to include and handle this field.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/deviceinventory/inventory.go` around lines 150 - 164,
MultiSourceIntuneConfig's RequireCompliance is never passed into the Intune
inventory because IntuneConfig lacks that field; add a RequireCompliance bool
(or appropriate type) to the IntuneConfig struct, update any unmarshalling/usage
in NewIntuneInventory/IntuneInventory to honor that flag, and include
cfg.Intune.RequireCompliance when constructing the IntuneConfig JSON in the
inventory creation block so the RequireCompliance setting is propagated to
NewIntuneInventory.
```go
func TestJamfInventory_TokenURL_TrimsTrailingSlash(t *testing.T) {
	tokenCalled := false

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/api/oauth/token" {
			tokenCalled = true
			w.Header().Set("Content-Type", "application/json")
			_ = json.NewEncoder(w).Encode(map[string]interface{}{
				"access_token": "tok",
				"expires_in":   3600,
			})
			return
		}
		// Computers endpoint — return empty.
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]interface{}{"results": []interface{}{}})
	}))
	defer srv.Close()

	// Provide URL with trailing slash — implementation trims it.
	cfg, err := json.Marshal(map[string]interface{}{
		"url":           srv.URL + "/",
		"client_id":     "c",
		"client_secret": "s",
	})
	require.NoError(t, err)

	inv, err := deviceinventory.NewJamfInventory(string(cfg))
	require.NoError(t, err)

	_, err = inv.IsRegistered(context.Background(), "12345")
	require.NoError(t, err)
	assert.True(t, tokenCalled, "token endpoint must be called at /api/oauth/token")
```
🧩 Analysis chain
🏁 Script executed:
find . -name "jamf_test.go" -type f
Repository: netbirdio/netbird
Length of output: 109

🏁 Script executed:
sed -n '303,335p' ./management/server/deviceinventory/jamf_test.go
Length of output: 1164

🏁 Script executed:
grep -n "tokenCallCount" ./management/server/deviceinventory/jamf_test.go
Length of output: 199

🏁 Script executed:
sed -n '160,200p' ./management/server/deviceinventory/jamf_test.go
Length of output: 1455
Use atomic operations to prevent data race on tokenCalled.
The handler writes tokenCalled from the server goroutine while the test reads it from the test goroutine without synchronization. Replace with atomic.Bool or an int32, consistent with the tokenCallCount pattern used in TestJamfInventory_GetAccessToken_CachesTokenUntilExpiry.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/deviceinventory/jamf_test.go` around lines 303 - 335, The
test TestJamfInventory_TokenURL_TrimsTrailingSlash currently uses a plain bool
tokenCalled that is written by the server goroutine and read by the test
goroutine; replace tokenCalled with a concurrency-safe atomic (use atomic.Bool
or an int32 like the tokenCallCount pattern in
TestJamfInventory_GetAccessToken_CachesTokenUntilExpiry), update the handler to
set the atomic (Store(true) or atomic.AddInt32) and update the assertion to read
the atomic (Load() or atomic.LoadInt32) so there is no data race.
```go
if resp.StatusCode != http.StatusOK {
	body, _ := io.ReadAll(io.LimitReader(resp.Body, maxHTTPResponseBytes))
	return false, fmt.Errorf("deviceinventory/jamf: API returned %d: %s", resp.StatusCode, string(body))
```
Response body is read twice; the second read will be empty.
At line 130, io.ReadAll consumes the response body. Then at line 139, json.NewDecoder tries to read from the same body, which is already drained. This means successful responses will fail to decode.
🐛 Fix: read body once and decode from bytes
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(io.LimitReader(resp.Body, maxHTTPResponseBytes))
return false, fmt.Errorf("deviceinventory/jamf: API returned %d: %s", resp.StatusCode, string(body))
}
+ body, err := io.ReadAll(io.LimitReader(resp.Body, maxHTTPResponseBytes))
+ if err != nil {
+ return false, fmt.Errorf("deviceinventory/jamf: read response: %w", err)
+ }
+
var result struct {
Results []struct {
ID string `json:"id"`
} `json:"results"`
}
- if err := json.NewDecoder(io.LimitReader(resp.Body, maxHTTPResponseBytes)).Decode(&result); err != nil {
+ if err := json.Unmarshal(body, &result); err != nil {
return false, fmt.Errorf("deviceinventory/jamf: decode response: %w", err)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/deviceinventory/jamf.go` around lines 129 - 131, The
response body is being consumed by io.ReadAll when checking non-200 responses,
so later json.NewDecoder(resp.Body).Decode(...) fails because the body is
drained; fix by reading the body once into a byte slice (respecting
maxHTTPResponseBytes), use that byte slice for both the error message and for
JSON decoding (e.g., json.Unmarshal or
json.NewDecoder(bytes.NewReader(bodyBytes))). Update the code paths around
resp.Body, maxHTTPResponseBytes, and the json decoding logic in the function
that handles the HTTP response so all reads come from the saved bytes and errors
include the same body content.
```go
func TestBuiltinCA_SignCSR_BadSignature(t *testing.T) {
	ctx := context.Background()
	ca := newBuiltinCA(t)

	// Tamper with CSR: use a different key to sign the request.
	csr, _ := newTestCSR(t, "key1")
	_, err := ca.SignCSR(ctx, csr, "key1", 365) // should succeed (the csr is valid)
	require.NoError(t, err)

	// Now create a structurally valid CSR but corrupt the signature bytes.
	// We can do this by generating a CSR with one key but substituting the
	// public key from another key — creating a signature mismatch.
	otherKey, err2 := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	require.NoError(t, err2)
	tmpl := &x509.CertificateRequest{
		Subject:   pkix.Name{CommonName: "tampered"},
		PublicKey: otherKey.Public(),
	}
	// We intentionally build a CSR template with mismatched public key.
	// x509.CreateCertificateRequest will sign with the provided key; let's sign
	// with otherKey but place csr's public key — not directly possible through
	// the standard library. Instead, verify that CheckSignature on the DER is
	// what SignCSR calls. We'll pass a CSR whose CheckSignature will fail by
	// mutating RawSubjectPublicKeyInfo in a freshly parsed struct.
	// Easiest reliable approach: use a CSR with signature over wrong data.
	// We rely on the fact that x509.CertificateRequest.CheckSignature verifies
	// the signature over RawTBSCertificateRequest. We can't easily forge this,
	// so just verify the happy path and trust the stdlib for the sad path.
	_ = tmpl
}
```
Incomplete test: TestBuiltinCA_SignCSR_BadSignature contains no assertions.
This test creates variables but doesn't verify any behavior. The extensive comments explain the difficulty in creating a CSR with an invalid signature using the standard library, but the test body ends with _ = tmpl discarding the value.
Either implement the test properly or remove it to avoid confusion. Consider using the makeBadSignatureCSR helper from vault_adapter_test.go which handles this case with DER corruption:
💡 Suggested approach

```diff
 func TestBuiltinCA_SignCSR_BadSignature(t *testing.T) {
 	ctx := context.Background()
 	ca := newBuiltinCA(t)
-	// Tamper with CSR: use a different key to sign the request.
-	csr, _ := newTestCSR(t, "key1")
-	_, err := ca.SignCSR(ctx, csr, "key1", 365) // should succeed (the csr is valid)
-	require.NoError(t, err)
-
-	// Now create a structurally valid CSR but corrupt the signature bytes.
-	// ... [removed lengthy comment block] ...
-	_ = tmpl
+	// Use DER corruption to create a CSR with invalid signature
+	badCSR := makeBadSignatureCSR(t) // reuse helper from vault_adapter_test.go
+	if badCSR == nil {
+		t.Skip("could not construct bad-signature CSR")
+	}
+	_, err := ca.SignCSR(ctx, badCSR, "tampered", 365)
+	require.Error(t, err)
+	assert.ErrorIs(t, err, devicepki.ErrInvalidCSR)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/devicepki/builtin_ca_test.go` around lines 106 - 135, The
test TestBuiltinCA_SignCSR_BadSignature is incomplete — either implement it or
delete it; to implement, replace the current no-op CSR setup with a
DER-corrupted CSR produced by the existing helper makeBadSignatureCSR (from
vault_adapter_test.go), then call ca.SignCSR(ctx, badCSR, "key1", 365) and
assert it returns an error (e.g., require.Error(t, err) and optionally assert
the error message/type). Ensure you reference
TestBuiltinCA_SignCSR_BadSignature, the ca.SignCSR method, and the
makeBadSignatureCSR helper when adding the assertion so the test verifies the
bad-signature path instead of discarding tmpl.
```go
func TestNewCA_VaultCAType_ValidConfig_ReturnsVaultCA(t *testing.T) {
	cfg, _ := json.Marshal(map[string]interface{}{
		"address": "https://vault.example.com:8200",
		"token":   "test-token",
		"mount":   "pki",
		"role":    "netbird-device",
	})
	settings := &types.DeviceAuthSettings{
		CAType:   types.DeviceAuthCATypeVault,
		CAConfig: string(cfg),
	}
	ca, err := devicepki.NewCA(context.Background(), settings, "acct-vault", nil, "")
	require.NoError(t, err)
	assert.NotNil(t, ca)
}
```
These success cases won't catch a wrong factory dispatch.
Each "valid config" test only checks that NewCA returned a non-nil devicepki.CA, so the suite still passes if the switch in management/server/devicepki/factory.go accidentally routes Vault/Smallstep/SCEP to the wrong implementation. Assert the concrete type or some backend-specific behavior here.
Example assertion style

```diff
 ca, err := devicepki.NewCA(context.Background(), settings, "acct-vault", nil, "")
 require.NoError(t, err)
 assert.NotNil(t, ca)
+assert.IsType(t, &devicepki.VaultCA{}, ca)
```

Also applies to: 91-103, 115-127
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/devicepki/factory_test.go` around lines 65 - 79, The tests
like TestNewCA_VaultCAType_ValidConfig_ReturnsVaultCA call devicepki.NewCA and
only assert non-nil; update these tests (including the ones at the other ranges)
to verify the concrete implementation returned for each CA type by either
performing a type assertion against the expected backend concrete type (e.g.,
the Vault CA concrete struct returned for types.DeviceAuthCATypeVault) or
invoking a backend-specific method/behavior and asserting its expected result;
locate the NewCA call in the test functions
(TestNewCA_VaultCAType_ValidConfig_ReturnsVaultCA and the analogous
Smallstep/SCEP tests) and replace or add assertions that ensure the returned ca
is the expected backend implementation rather than just non-nil.
```go
transport := &http.Transport{}
if cfg.RootPEM != "" {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM([]byte(cfg.RootPEM)) {
		return nil, fmt.Errorf("devicepki/smallstep: failed to parse RootPEM")
	}
	transport.TLSClientConfig = &tls.Config{RootCAs: pool}
}
```
Missing TLS MinVersion in client configuration.
The static analyzer correctly flags that the TLS config lacks MinVersion. While Go defaults to TLS 1.2 for clients, explicitly setting MinVersion: tls.VersionTLS12 (or TLS 1.3 if the CA supports it) improves security posture and makes the requirement explicit.
🔒 Add explicit TLS minimum version

```diff
 transport := &http.Transport{}
 if cfg.RootPEM != "" {
 	pool := x509.NewCertPool()
 	if !pool.AppendCertsFromPEM([]byte(cfg.RootPEM)) {
 		return nil, fmt.Errorf("devicepki/smallstep: failed to parse RootPEM")
 	}
-	transport.TLSClientConfig = &tls.Config{RootCAs: pool}
+	transport.TLSClientConfig = &tls.Config{
+		RootCAs:    pool,
+		MinVersion: tls.VersionTLS12,
+	}
 }
```

🧰 Tools
🪛 ast-grep (0.42.1)

[warning] 101-101: `MinVersion` is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported and insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. Add `MinVersion: tls.VersionTLS13` to the TLS configuration to bump the minimum version to TLS 1.3.
Context: `tls.Config{RootCAs: pool}`
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@management/server/devicepki/smallstep_adapter.go` around lines 96 - 103, The
TLS client config for the Smallstep adapter (when setting
transport.TLSClientConfig in smallstep_adapter.go) is missing an explicit
MinVersion; update the TLS config assigned to transport.TLSClientConfig to
include MinVersion: tls.VersionTLS12 (or tls.VersionTLS13 if you verify
CA/support) so the config becomes &tls.Config{RootCAs: pool, MinVersion:
tls.VersionTLS12} to enforce the minimum TLS version.
Thanks for fixing the PR and for the contribution in general. I might be misunderstanding the GitHub issue, but the request there is more around a posture check to validate company ownership. This is similar to using mTLS, but not necessarily the same approach to resolve it. We offer something for it as a cloud feature with MDM/EDR integrations, but we could also go with a posture-check approach.

Overall, as we mention in our contribution guide, we need to discuss changes like this beforehand, as it touches protocol, APIs, and behavior in general. Let me check the proposal here, and we will discuss it soon. But being transparent, the change is far too large for a one-shot PR, and we are also working towards 1.0, so we will be rejecting a few PRs for the time being to avoid large code conflicts and potential instability.
I agree that this is a really big feature, but it is still one feature ) And you said at https://github.com/netbirdio/netbird/blob/main/CONTRIBUTING.md?plain=1#L294 that the team is OK with huge PRs )
That comment is more about not force-pushing to remove commit history, which doesn't mean large changes are OK. You can see our Sonar config, which got updated due to the only large PR we accepted (reverse-proxy), but even that was set to a max of 10K lines.
Thanks for the comment! Just to be clear — I’m not trying to overrule anything here 😊 I just smiled and shared my own understanding of what I read in the repo (yes, I actually did read it 🙂). The reason I jumped in is that I came across the idea of certificate‑based authentication, noticed that others also wanted it, and decided to give it a try. Personally, I really liked both the result and the process. That said, the final decision is of course up to the project’s core team.
@mlsmaycon, hi! Did your team look at this?
Hi @dexion, we did. We will take a different approach for this feature. We have a few concerns around the UX, the user need, and the amount of changes as well, especially because we are focusing on the 1.0 release.
Describe your changes
Implement hardware-backed device certificate authentication using TPM 2.0 and Apple Secure Enclave. Device certificates are issued by an in-process CA and used for mTLS on the management gRPC channel — not the WireGuard data plane.
Architecture
Certificates authenticate the management-plane gRPC connection, not WireGuard tunnels. Two layers:

- mTLS handshake: the client presents its hardware-backed device certificate, verified against the device CA.
- Login-time revocation check: on every `Login`, the server re-validates the presented certificate against stored revocation state.

This ensures a revoked device is blocked at reconnect even if its certificate is still within its validity period.
What's added
Management server (`management/`):

- `server/devicepki/` — `BuiltinCA` (in-process ECDSA P-256 CA, AES-256-GCM key at rest, PKCS#12 export), `ExternalCA` adapter (signs CSRs against an admin-provided CA), `CertificateFactory` (selects backend)
- `server/secretenc/` — `FileKeyProvider`, AES-256-GCM envelope encryption for the CA private key
- `internals/shared/grpc/` — `AttestationSessionStore` (in-memory with TTL + background cleanup), `AttestationHandler` (`BeginTPMAttestation`, `CompleteTPMAttestation`, `AttestAppleSE`), device enrollment RPC, revocation check wired into `Login`
- HTTP API: `POST /api/accounts/{accountId}/device-ca`, `GET /api/accounts/{accountId}/device-ca`, `GET /api/accounts/{accountId}/device-ca/cacert`, `POST /api/accounts/{accountId}/device-enrollments/{enrollmentId}/approve`, `POST /api/accounts/{accountId}/device-enrollments/{enrollmentId}/reject`, `POST /api/accounts/{accountId}/device-certs/{certId}/revoke`
- `cert_approver` RBAC role — may approve/reject enrollments without full admin privileges
- `DeviceCertificate` and `DeviceEnrollment` store models
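The AES-256-GCM key-at-rest scheme can be sketched with the standard library alone. `sealKey`/`openKey` below are illustrative names assuming a 32-byte key-encryption key, not the actual `FileKeyProvider` API:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// sealKey encrypts plaintext (e.g. a CA private key in PKCS#8 DER) with
// AES-256-GCM. The random nonce is prepended to the ciphertext.
func sealKey(kek, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(kek) // kek must be 32 bytes for AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openKey reverses sealKey, authenticating the ciphertext in the process;
// any tampering with the sealed blob makes decryption fail.
func openKey(kek, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(kek)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	kek := make([]byte, 32) // in practice loaded from a file or KMS
	if _, err := rand.Read(kek); err != nil {
		panic(err)
	}
	sealed, err := sealKey(kek, []byte("ca-private-key-der"))
	if err != nil {
		panic(err)
	}
	plain, err := openKey(kek, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain)) // ca-private-key-der
}
```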
Management proto (`proto/`):

- `BeginTPMAttestation` / `CompleteTPMAttestation` — two-round TPM credential-activation protocol (EK → AK binding)
- `AttestAppleSE` — single-round Apple Secure Enclave attestation with CBOR decode + chain verification
- `GetDeviceEnrollmentStatus`, `SubmitDeviceCertRenewal` RPCs
Client (`client/`):

- `internal/enrollment/` — `Manager` drives the full enrollment lifecycle: CSR generation, submission, poll-until-approved, store cert, build mTLS config; `StartRenewalLoop` (7-day threshold, 6-hour check interval)
- `tpm/` — `TPMProvider` (TPM 2.0 via `google/go-tpm`): `GenerateAK`, `ActivateCredential`
- `se/` — `SEProvider` (Apple Secure Enclave via `github.com/fxamacker/cbor`): `CreateAttestation`
- `connect.go` — wraps enrollment in a `tpmProv.Available()` guard (skips on iOS/Android/WASM), upgrades the connection to mTLS after enrollment
Documentation:

- `docs/DEVICE-SECURITY.md` — full architecture description, sequence diagrams for all three enrollment modes (admin-issued, TPM, Apple SE), `issueDeviceCert` step-by-step, revocation algorithm, known limitations (L-1 through L-6)

Known limitations (documented, deferred)
- `Sync` streams are not interrupted on revocation — the peer is blocked at the next reconnect
- `cert_approver` role not yet surfaced in dashboard RBAC UI

Issue ticket number and link
Closes #4773
Related PRs
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
See `docs/DEVICE-SECURITY.md` in this PR. External docs PR to follow.
Summary by CodeRabbit