Fix Cloud Run service-to-service auth for doc-service-xliff #1

@jasperdew

Problem

Cloud Run service-to-service authentication broke after deploying new revisions of doc-service-xliff on 2026-04-14. The EU API (langbly-api-eu) gets a 403 Forbidden when calling doc-service-xliff's /extract and /inject endpoints.

Current workaround

doc-service-xliff is deployed with --allow-unauthenticated. This works but removes the auth layer between internal services.

Root cause analysis

  • IAM binding is correct: cloud-run-service-account@langbly.iam.gserviceaccount.com has roles/run.invoker on doc-service-xliff
  • Service account is correct on both services
  • ID token fetch via metadata server appears to work (cloudRunAuth.ts)
  • The 403 persisted even after rolling back BOTH services to previously-working revisions
  • Re-adding the IAM binding didn't help
  • The issue started after deploying doc-service-xliff-00021 (but rollback to 00020 didn't fix it)
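For reference while debugging, here is a minimal sketch of what an ID token fetch from the metadata server usually looks like. The actual contents of cloudRunAuth.ts are not shown in this issue, so the function names below are hypothetical, and the sketch assumes Node 18+ (global fetch). The audience is assumed to be the base URL of the receiving service, with no path.

```typescript
// Hypothetical sketch; not the actual cloudRunAuth.ts.
// The metadata server is only reachable from inside GCP (e.g. a Cloud Run container).
const METADATA_IDENTITY_URL =
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity";

// Build the metadata-server URL that requests an ID token for the given audience.
function buildIdentityUrl(audience: string): string {
  return `${METADATA_IDENTITY_URL}?audience=${encodeURIComponent(audience)}`;
}

// Fetch an ID token for the calling service's service account.
async function fetchIdToken(audience: string): Promise<string> {
  const res = await fetch(buildIdentityUrl(audience), {
    // The metadata server rejects requests that lack this header.
    headers: { "Metadata-Flavor": "Google" },
  });
  if (!res.ok) {
    throw new Error(`metadata server returned ${res.status}`);
  }
  return res.text();
}
```

The returned token would then be sent to doc-service-xliff as an `Authorization: Bearer <token>` header. If the audience passed here does not match what Cloud Run expects for the service, a 403 is a plausible outcome even when the IAM binding is correct.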

Steps to restore auth

  1. Investigate why the ID token from the metadata server isn't accepted by Cloud Run's auth proxy
  2. Check if there's a GCP-side IAM propagation issue or if the audience URL format changed
  3. Re-enable auth: gcloud run deploy doc-service-xliff --no-allow-unauthenticated
  4. Test end-to-end document translation
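Steps 1 and 2 could start with inspecting the claims of the token that langbly-api-eu actually sends. The sketch below decodes the JWT payload and compares the aud and exp claims against expectations. It assumes Node 16+ (for the "base64url" Buffer encoding), does not verify the signature, and uses hypothetical helper names; it is a diagnostic aid only, not a replacement for Cloud Run's own validation.

```typescript
// Hypothetical diagnostic helpers; names are illustrative only.
interface IdTokenClaims {
  aud?: string;
  exp?: number; // expiry, seconds since the Unix epoch
  email?: string;
  iss?: string;
}

// Decode the payload (middle segment) of a JWT without verifying its signature.
function decodeJwtPayload(token: string): IdTokenClaims {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("not a JWT: expected three dot-separated parts");
  }
  const json = Buffer.from(parts[1] ?? "", "base64url").toString("utf8");
  return JSON.parse(json) as IdTokenClaims;
}

// Return a list of human-readable problems; an empty list means nothing obvious is wrong.
function checkClaims(
  claims: IdTokenClaims,
  expectedAudience: string,
  nowSeconds: number,
): string[] {
  const problems: string[] = [];
  if (claims.aud !== expectedAudience) {
    problems.push(`aud is ${String(claims.aud)}, expected ${expectedAudience}`);
  }
  if (claims.exp === undefined || claims.exp <= nowSeconds) {
    problems.push("token is expired or has no exp claim");
  }
  return problems;
}
```

Logging the result of `checkClaims(decodeJwtPayload(token), serviceUrl, Math.floor(Date.now() / 1000))` along with `claims.email` (never the token itself) should show whether the audience or the calling identity differs from what the IAM binding assumes.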

Risk assessment

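Low risk: doc-service-xliff only processes XLIFF files and its URL is not published. However, the service has no API key auth of its own, so while --allow-unauthenticated is set, anyone who learns the URL can call /extract and /inject. Restoring service-to-service auth is still best practice.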
