Updating api server env to production#1398
Updating api server env to production#1398maknop wants to merge 55 commits intoambient-code:mainfrom
Conversation
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
removing label vendor value
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
…d-main Red Hat Konflux update ambient-code-backend-main
…nd-main Red Hat Konflux update ambient-code-frontend-main
…or-main Red Hat Konflux update ambient-code-operator-main
…-api-main Red Hat Konflux update ambient-code-public-api-main
…t-api-server-main Red Hat Konflux update ambient-code-ambient-api-server-main
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
…nt-runner-main Red Hat Konflux update ambient-code-ambient-runner-main
Creates kustomize overlay for deploying to hcmais01ue1 via app-interface: - Uses Konflux images from redhat-services-prod/hcm-eng-prod-tenant - Scales down in-cluster databases (using external RDS from app-interface Phase 2) - Scales down MinIO (using external S3 from app-interface Phase 2) - Includes CRDs, RBAC, routes, and all application components - Patches operator to use Konflux runner image Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Creates kustomize overlay for deploying to hcmais01ue1 via app-interface: - Uses Konflux images from redhat-services-prod/hcm-eng-prod-tenant - Scales down in-cluster databases (using external RDS from app-interface Phase 2) - Scales down MinIO (using external S3 from app-interface Phase 2) - Includes CRDs, RBAC, routes, and all application components - Patches operator to use Konflux runner image Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…-overlay Add app-interface overlay for AppSRE platform deployment
Convert kustomize overlay to OpenShift Template format for app-interface SaaS deployment. Split into two templates: 1. template-operator.yaml (CRDs, ClusterRoles, operator deployment) - Operator and ambient-runner images - Cluster-scoped resources (CRDs, RBAC) - Operator deployment and its ConfigMaps 2. template-services.yaml (Application services) - Backend, frontend, public-api, ambient-api-server images - All deployments, services, routes, configmaps - Scales in-cluster services to 0 (minio, postgresql, unleash) Both templates use IMAGE_TAG parameter (auto-generated from git commit SHA) and support Konflux image gating through app-interface. This allows app-interface to use provider: openshift-template with proper parameter substitution instead of the directory provider which doesn't run kustomize build.
Add openshift template
The objects field must be a YAML array with proper list indicators. Previous version was missing the '-' prefix on array items, causing: 'unable to decode STDIN: json: cannot unmarshal object into Go struct field Template.objects of type []runtime.RawExtension' Changes: - Rebuild templates using Python yaml library for correct formatting - Objects now properly formatted as YAML array with '- apiVersion:' - Add validate.sh script for testing with oc process - Both templates validated successfully Generated from kustomize overlay output with proper YAML structure.
fix: correct OpenShift Template objects array format
Remove minio, postgresql, unleash, ambient-api-server-db. Using external RDS and S3 from app-interface. Removed 12 resources (4 Deployments, 4 Services, 3 PVCs, 1 Secret) Remaining: ambient-api-server, backend-api, frontend, public-api
refactor: remove in-cluster services from template
updating postgresql db name
enabling SSL mode for RDS
Disables OTEL metrics export by commenting out OTEL_EXPORTER_OTLP_ENDPOINT environment variable in operator deployment manifests. The operator was configured to send metrics to otel-collector.ambient-code.svc:4317, but this service does not exist in the cluster, causing repeated gRPC connection failures every 30 seconds with error: "failed to upload metrics: context deadline exceeded: rpc error: code = Unavailable desc = name resolver error: produced zero addresses" With OTEL_EXPORTER_OTLP_ENDPOINT unset, InitMetrics() will skip metrics export and log "metrics export disabled" instead of throwing connection errors. Changes: - Comment out OTEL_EXPORTER_OTLP_ENDPOINT in base operator deployment - Comment out OTEL_EXPORTER_OTLP_ENDPOINT in OpenShift template - Add clarifying comment about re-enabling when collector is deployed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
fix: disable OpenTelemetry metrics export in operator
enabling ssl mode for RDS
Changes: - Add oauth-proxy component to frontend deployment (dashboard-ui port on 8443) - Enable SSL for ambient-api-server RDS connection (db-sslmode=require) - Set AMBIENT_ENV to 'stage' for ambient-api-server - Enable OpenShift service-ca for ambient-api-server TLS cert provisioning - Regenerate templates with new oauth-proxy and api-server patches This enables: - Authenticated access to frontend via OpenShift OAuth - Secure connections to external RDS database - Automatic TLS certificate rotation for ambient-api-server Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add OAuth proxy and SSL/TLS configuration for app-interface overlay
Remove postgresql, minio, unleash, and ambient-api-server-db resources from the services template. These services are scaled to 0 via kustomize patches because we use external RDS and S3 instead. Including them in the template causes app-interface to try deploying them, which fails imagePattern validation and wastes resources. Excluded resources: - Deployment/postgresql, Service/postgresql - Deployment/minio, Service/minio, PVC/minio-data - Deployment/unleash, Service/unleash - Deployment/ambient-api-server-db, Service/ambient-api-server-db Template now has 21 service resources (down from 30). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ices Remove in-cluster services from template-services.yaml
Switch from custom vault secrets to OpenShift service account-based OAuth: - Use Red Hat's official ose-oauth-proxy-rhel9 image - Use service account token for cookie secret (no vault needed) - Enable HTTPS on OAuth proxy with OpenShift service-ca auto-generated certs - Add system:auth-delegator ClusterRoleBinding for OAuth delegation - Add OAuth redirect reference annotation to frontend ServiceAccount - Fix service account reference from 'nginx' to 'frontend' - Add missing NAMESPACE and UPSTREAM_TIMEOUT parameters Benefits: - No manual vault secret management - Automatic TLS cert rotation via service-ca - Standard OpenShift OAuth integration pattern - Follows app-interface team recommendations Files changed: - frontend-rbac.yaml: Added OAuth annotations and auth-delegator binding - oauth-proxy component patches: Updated to new configuration - Templates: Regenerated with OAuth fixes (27 operator, 21 service resources) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix OAuth proxy configuration to use OpenShift service account auth
The RDS credentials secret should not be in the OpenShift template - it's
provided by the external resource provider (terraform) in app-interface.
The namespace's externalResources section already defines:
- provider: rds
output_resource_name: ambient-code-rds
This automatically creates the secret with the correct RDS credentials.
Including the secret in the template with VAULT_INJECTED placeholders
caused deployment failures.
Changes:
- Excluded ambient-code-rds secret from template generation
- Template now has 20 service resources (down from 21)
- Deployment still references the secret via volumeMount (correct)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Exclude ambient-code-rds secret from services template
Signed-off-by: Chris Mitchell <cmitchel@redhat.com>
…ination fix: fix frontent route termination
Signed-off-by: Chris Mitchell <cmitchel@redhat.com>
fix: revert https changes for oauth pods
Changes GCP service account configuration to align with app-interface deployment where credentials are provided via Vault. Changes: - template-services.yaml: Update backend vertex-credentials secret name from 'ambient-vertex' to 'stage-gcp-creds' (matches Vault secret) - template-operator.yaml: Update GOOGLE_APPLICATION_CREDENTIALS path to match Vault secret key name 'itpc-gcp-hcm-pe-eng.json' The secret is provided by app-interface via: path: engineering-productivity/ambient-code/stage-gcp-creds This allows the backend and operator to use Vertex AI for Claude and Gemini API calls with the service account configured with roles/aiplatform.user permissions. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Chris Mitchell <cmitchel@redhat.com>
Update Vertex AI credentials to use app-interface Vault secret
Configure OAuth proxy sidecar to inject authentication token into forwarded requests, fixing 401 errors on /api/projects endpoints. Changes: - Add --pass-access-token=true flag to inject X-Forwarded-Access-Token header - Change upstream from frontend-service:3000 to localhost:3000 (correct sidecar pattern) - Remove --request-logging to reduce log noise Backend logs showed: tokenSource=none hasAuthHeader=false hasFwdToken=false The backend expects the X-Forwarded-Access-Token header, which is now injected by the OAuth proxy for all authenticated requests. Flow: 1. User authenticates via OpenShift OAuth ✓ 2. OAuth proxy injects token header ✓ (new) 3. Frontend forwards token to backend API ✓ (fixed) This resolves the 401 authentication errors while maintaining the working OpenShift OAuth integration. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix OAuth proxy to pass access token to backend API
❌ Deploy Preview for cheerful-kitten-f556a0 failed.
|
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (50)
Cache: Disabled due to data retention organization setting Knowledge base: Disabled due to data retention organization setting 📝 WalkthroughWalkthroughPull request adds comprehensive Tekton CI/CD pipelines for automated container image builds across six service components with conditional security scanning, renames the database secret to Changes
✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This change updates the environment to be
productionwhich should resolve a crashing pod in OpenShift containing the following error:Summary by CodeRabbit
New Features
Chores