This is a GitHub Action to deploy a service with Cloud Deploy. It supports three types of deployment targets:

- Managed Cloud Run
- Kubernetes Deployment
- Kubernetes StatefulSet

See action.yml.

See the schema documentation for details about `cloud-deploy.yaml`.

This action requires a GCP service account key with permission to deploy Cloud Run services. Once created, the JSON key should be base64 encoded and added as a secret in the GitHub repository.

Important: Always store service account keys as GitHub secrets. Never commit them to your repository.
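As an illustrative sketch of the encoding step (the file name `key.json` is an assumption), the downloaded JSON key can be base64 encoded with Python's standard library before being pasted into a GitHub secret:

```python
import base64

def encode_key(raw: bytes) -> str:
    """Base64-encode a service account key so it can be stored as a GitHub secret."""
    return base64.b64encode(raw).decode("ascii")

# In practice you would read the downloaded key file:
# encoded = encode_key(open("key.json", "rb").read())
encoded = encode_key(b'{"type": "service_account"}')
print(encoded)
```

The same result can be obtained with `base64 key.json` on the command line.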
```yaml
name: Deploy Service
on:
  push:
    branches:
      - master
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: docker build -t my-image:${{ github.sha }} .
      - name: Deploy to Cloud
        uses: extenda/actions/cloud-deploy@v0
        with:
          secrets-account-key: ${{ secrets.SECRET_AUTH }}
          service-account-key: ${{ secrets.GCLOUD_AUTH_STAGING }}
          image: my-image:${{ github.sha }}
          update-dns: if-missing
```

The action will read a `cloud-deploy.yaml` file for its configuration.
The following examples demonstrate common deployment configurations for different use cases. Each example includes inline explanations and can be adapted to your specific requirements.
This example shows the minimal configuration for a Cloud Run service with IAM-based authentication. The security sidecar will use default resources.
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      env: &env
        KEY: value
    staging:
      min-instances: 0
      max-instances: 1
      domain-mappings:
        - my-service.retailsvc.dev
      env:
        <<: *env
```

The next example passes a secret from Secret Manager as an environment variable:

```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      env: &env
        SQL_INSTANCE_NAME: sm://*/secret-name # Important ENV variable
    staging:
      min-instances: 0
      max-instances: 1
      domain-mappings:
        - my-service.retailsvc.dev
      env:
        <<: *env
```

To restrict which identities may invoke the service, configure explicit consumers instead of a permission prefix:

```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    consumers:
      service-accounts:
        - my-service@project-id.iam.gserviceaccount.com
        - user:user-account@extendaretail.com
        - group:group-account@extendaretail.com
      audiences:
        - https://my-service.retailsvc.com
        - my-service
```
Configuration:

- `service-accounts`: List of service accounts, users (prefix with `user:`), or groups (prefix with `group:`) allowed to invoke the service
- `audiences`: List of allowed audiences in JWT tokens for authentication
Protect your Cloud Run service with a Cloud Armor security policy to defend against DDoS attacks and other web-based threats.
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    cloud-armor:
      policy-name: cloud-armor-policy-name
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
```

Note: The Cloud Armor policy must be created beforehand in your GCP project.
This example demonstrates advanced Cloud Run features for optimizing performance and resource usage:
- `startup-cpu-boost`: Allocates extra CPU during container startup for faster cold starts
- `cpu-throttling`: Disabled to keep CPU available even when not serving requests (useful for background tasks)
- `session-affinity`: Enabled to route requests from the same client to the same instance
- `request-logs`: Configured to reduce logging overhead while keeping error logs
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 2
    memory: 1Gi
  protocol: http
  scaling:
    concurrency: 100
  traffic:
    static-egress-ip: false
    startup-cpu-boost: true
    cpu-throttling: false
    session-affinity: true
  request-logs:
    cloud-run: false
    load-balancer: false
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 2
      max-instances: 50
      domain-mappings:
        - my-service.retailsvc.com
```

Note: Request logs for 429 and 5xx responses are always logged, even when `cloud-run: false`.
Automatically scale your service up and down based on time of day to optimize costs. This example scales up at 07:00 UTC and down to 0 instances at 22:00 UTC.
Important notes:

- All times are in UTC
- Scaling triggers run every 30 minutes, at :00 and :30
- If `scale-hours` is set to 07:50, scaling will occur at 08:00
- The `min-instances` value in the environment is used during scale-hours
- Outside scale-hours, instances scale to 0
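The trigger rounding described above (every 30 minutes at :00 and :30, so a 07:50 boundary takes effect at 08:00) can be sketched as:

```python
def next_trigger(hhmm: str) -> str:
    """Round a HH:MM time up to the next scaling trigger (:00 or :30, UTC)."""
    h, m = map(int, hhmm.split(":"))
    total = h * 60 + m
    rounded = ((total + 29) // 30) * 30  # round up to a 30-minute boundary
    rounded %= 24 * 60                   # wrap past midnight
    return f"{rounded // 60:02d}:{rounded % 60:02d}"

print(next_trigger("07:50"))  # 08:00, matching the behavior described above
print(next_trigger("07:30"))  # already on a boundary: 07:30
```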
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
    schedule:
      - scale-hours: 07:00-22:00
        region: europe-west1
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
```

Route different URL paths to different backend services or Cloud Storage buckets using path-based routing. This example shows how to route different API versions to separate services.
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      path-mappings:
        - paths:
            - /login/*
            - /api/v2/*
            - /api/beta/*
          service: my-service-a
          path-rewrite: /
        - paths:
            - /api/v3/*
          service: my-service-b
```

Example: A request to my-service.retailsvc.com/api/v2/feature will be routed to my-service-a with the path rewritten to /feature.
Configuration:

- `service` or `bucket`: Target backend service or Cloud Storage bucket (must exist in the same project)
- `paths`: URL path patterns to match for this target
- `path-rewrite`: Optional path prefix to rewrite on the downstream service
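The rewrite behavior in the example above can be sketched as follows (a simplified illustration of trailing-wildcard matching, not the load balancer's actual implementation):

```python
def rewrite_path(path: str, pattern: str, rewrite: str) -> str:
    """Rewrite a request path that matched a wildcard pattern like /api/v2/*."""
    assert pattern.endswith("/*"), "only trailing-wildcard patterns are handled here"
    prefix = pattern[:-2]           # '/api/v2/*' -> '/api/v2'
    remainder = path[len(prefix):]  # '/api/v2/feature' -> '/feature'
    return rewrite.rstrip("/") + remainder

print(rewrite_path("/api/v2/feature", "/api/v2/*", "/"))  # /feature
```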
Configure your Cloud Run service to route all outgoing traffic through a NAT router with a static public IP address. This is useful when your service needs to communicate with external APIs or services that require IP whitelisting.
Note: While IP whitelisting is generally discouraged in favor of more secure authentication methods, it may be required when integrating with legacy systems or third-party services.
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: true
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      env:
        EXTERNAL_API_URL: https://api.example.com
```

Configuration:
- `static-egress-ip`: Set to `true` (default) to route egress traffic through a NAT router with a static public IP
- When enabled, all outbound requests from the service will appear to originate from the same static IP address
- The static IP address is managed by the platform and shared across services in the same region
- Set to `false` to use direct VPC egress without a static IP (better performance, no IP whitelisting capability)
A collector sidecar can be configured on Cloud Run to collect Prometheus and Open Telemetry metrics and traces.
The same sidecar supports both Prometheus and Open Telemetry.
Prometheus configuration:

- Set the `prometheus` section if your service exposes a metrics endpoint to be scraped
- Scraping is performed on the internal service port 8080 and will not pass through the security sidecar
- Default scraping path is `/metrics`, but it can be customized

Open Telemetry configuration:

- Set the `open-telemetry` key to activate the Open Telemetry feature in the collector
- The default `auto` configuration should be correct for most solutions on Google Cloud
- Configuration is injected as environment variables on the user container to enable auto instrumentation
- The collector exposes an OTLP endpoint which is the default exporter
- By default, only `traces` are collected. Use `collect: [traces, metrics]` to also collect OTEL metrics
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  monitoring:
    prometheus:
      interval: 60
      path: /metrics
      port: 8080
    open-telemetry:
      config: auto
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
```

For more control over Open Telemetry behavior, you can customize the collector configuration:
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  monitoring:
    open-telemetry:
      config:
        sampler: parentbased_traceidratio
        sampler-ratio: 0.5
        propagators: [tracecontext, baggage]
        otlp-exporter-protocol: grpc
        collect: [traces, metrics]
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
```

Configuration options:
- `sampler`: Sampling strategy (e.g., `parentbased_traceidratio`, `always_on`, `always_off`)
- `sampler-ratio`: Sampling ratio (0.0 to 1.0) when using ratio-based sampling
- `propagators`: Context propagation formats (e.g., `tracecontext`, `baggage`, `b3`)
- `otlp-exporter-protocol`: Protocol for exporting data (`grpc` or `http/protobuf`)
- `collect`: Signals to export (`traces`, `metrics`, or both)
Enable CORS (Cross-Origin Resource Sharing) support in the security proxy for services that need to handle preflight requests from web browsers. By default, CORS preflight requests are denied.
Important: CORS settings only work when using an auth-proxy (default: envoy-opa). They are ignored when `security: none` is configured.
Basic CORS configuration - Enable CORS with default allowed origins (only domain-mapped origins):
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
    cors:
      enabled: true
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
```

Advanced CORS configuration - Allow additional origins and custom headers:
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
    cors:
      enabled: true
      additional-allow-origins:
        - https://external-app.example.com
        - .trusted-partners.com # Matches any subdomain of trusted-partners.com
      additional-allow-headers:
        - X-Custom-Header
        - X-API-Key
      additional-expose-headers:
        - X-Request-Id
        - X-RateLimit-Remaining
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
```

CORS Configuration Reference:
- `enabled`: Set to `true` to enable CORS preflight request handling (default: `false`)
- `additional-allow-origins`: List of allowed origins for CORS requests
  - Use a full URI (e.g., `https://example.com`) for an exact match
  - Use a leading dot (e.g., `.example.com`) for a suffix match (matches any subdomain)
  - Domain-mapped origins are automatically allowed when `enabled: true`
- `additional-allow-headers`: List of allowed request headers from browsers
  - Cannot include `Tenant-Id` (reserved header)
  - Standard headers like `Content-Type` and `Authorization` are always allowed
- `additional-expose-headers`: List of response headers accessible to browsers
  - Cannot include `Tenant-Id` (reserved header)
  - Useful for custom headers clients need to read (e.g., pagination, rate limit info)
Example request headers and responses:

A browser making a CORS request might send:

```
Origin: https://external-app.example.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: X-Custom-Header
```

The service will respond with:

```
Access-Control-Allow-Origin: https://external-app.example.com
Access-Control-Allow-Methods: GET, POST, PUT, DELETE, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization, X-Custom-Header, X-API-Key
Access-Control-Expose-Headers: X-Request-Id, X-RateLimit-Remaining
Access-Control-Allow-Credentials: true
```
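The origin-matching rules described above (exact match for full URIs, suffix match for leading-dot entries) can be sketched as follows; this is a simplified illustration, not the proxy's actual matcher:

```python
def origin_allowed(origin: str, rules: list[str]) -> bool:
    """Check an Origin header against allow-origin rules.

    A rule with a leading dot ('.example.com') is a suffix match on the host;
    any other rule is an exact match on the full origin URI.
    """
    host = origin.split("://", 1)[-1]
    for rule in rules:
        if rule.startswith("."):
            if host.endswith(rule):
                return True
        elif origin == rule:
            return True
    return False

rules = ["https://external-app.example.com", ".trusted-partners.com"]
print(origin_allowed("https://app.trusted-partners.com", rules))  # True
print(origin_allowed("https://evil.example.org", rules))          # False
```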
Configure whether your service should be accessible via the internal load balancer at `service-name.internal`. This is enabled by default and is required for service-to-service communication within your VPC.
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    concurrency: 80
  traffic:
    static-egress-ip: false
    internal-traffic: true
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      domain-mappings:
        - my-service.retailsvc.com
      env:
        SERVICE_MODE: production
```

Configuration:
- `internal-traffic: true` (default): Service is accessible via the internal load balancer at `my-service.internal` for VPC-internal communication
- `internal-traffic: false`: Disables internal load balancer access; the service is only reachable via external domain mappings
- Both internal and external access can be enabled simultaneously by keeping `internal-traffic: true` and configuring `domain-mappings`

Note: Most services should keep `internal-traffic: true` to enable service-to-service communication within your infrastructure.
Warning: Multi-region deployments are currently in preview and may be subject to change without notice.
Deploy your service to multiple Google Cloud regions for high availability and reduced latency. Each region can have its own configuration.
```yaml
cloud-run:
  service: my-service
  resources:
    cpu: 2
    memory: 1Gi
  protocol: http
  scaling:
    concurrency: 100
  traffic:
    static-egress-ip: false
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 2
      max-instances: 100
      domain-mappings:
        - my-service.retailsvc.com
      regions:
        - europe-west1
        - us-central1
        - asia-northeast1
      env: &env
        DATABASE_URL: sm://*/database-connection-string
    staging:
      min-instances: 0
      max-instances: 5
      regions:
        - europe-west1
      env:
        <<: *env
```

Note: Multi-region deployments automatically set up global load balancing across all specified regions.
This example demonstrates combining multiple advanced features in a production-ready configuration:
```yaml
cloud-run:
  service: my-production-service
  resources:
    cpu: 2
    memory: 2Gi
  protocol: http
  timeout: 60
  scaling:
    concurrency: 100
    schedule:
      - scale-hours: 06:00-23:00
        region: europe-west1
  traffic:
    static-egress-ip: true
    direct-vpc-connection: false
    startup-cpu-boost: true
    cpu-throttling: true
    session-affinity: false
  monitoring:
    prometheus:
      interval: 60
      path: /metrics
      port: 8080
    open-telemetry:
      config:
        sampler: parentbased_traceidratio
        sampler-ratio: 0.1
        collect: [traces, metrics]
  request-logs:
    cloud-run: true
    load-balancer: false
  security:
    permission-prefix: mye
    cloud-armor:
      policy-name: production-armor-policy
    cors:
      enabled: true
    resources:
      cpu: 0.5
      memory: 512Mi
  labels:
    product: my-product
    component: my-component
    iso-country: global
  environments:
    production:
      min-instances: 3
      max-instances: 100
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      path-mappings:
        - paths:
            - /api/v1/*
          service: my-service-v1
          path-rewrite: /
      regions:
        - europe-west1
        - us-central1
      env: &env
        DATABASE_URL: sm://*/database-url
        REDIS_URL: sm://*/redis-url
        LOG_LEVEL: info
    staging:
      min-instances: 0
      max-instances: 5
      domain-mappings:
        - my-service.retailsvc.dev
      env:
        <<: *env
        LOG_LEVEL: debug
```

The action supports deploying to Google Kubernetes Engine (GKE) Autopilot clusters with both Deployment and StatefulSet resources.
A Kubernetes deployment with IAM security and customized resource allocation for the security sidecar.
```yaml
kubernetes:
  type: Deployment
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    cpu: 50
  security:
    permission-prefix: mye
    resources:
      cpu: 1
      memory: 1Gi
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      max-instances: 20
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      env: &env
        KEY: value
    staging:
      min-instances: 1
      max-instances: 1
      domain-mappings:
        - my-service.retailsvc.dev
      env:
        <<: *env
```

Configure vertical scaling to automatically adjust CPU and memory resources based on actual usage. This is useful for workloads with variable resource requirements.
```yaml
kubernetes:
  type: Deployment
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    cpu: 50
    vertical:
      threshold: 50
      kafka-lag-threshold: 1000
      increments-cpu: 1
      max-cpu: 5
      max-memory: 8Gi
      scale-up-interval: 8
      scale-up-threshold: 5
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 2
      max-instances: 20
      domain-mappings:
        - my-service.retailsvc.com
```

Vertical scaling behavior:
- Every minute, the autoscaler checks pod CPU usage and the Kafka lag metrics
- If any pod uses CPU above the threshold (50%), a scale-up is triggered
- Scale-up increases CPU by `increments-cpu` (1 core) up to `max-cpu` (5 cores)
- Scale-up can only occur 8 minutes after the previous scale-up
- Scale-down occurs 30 minutes after the last scale-up when CPU is below the threshold
- A scale-up requires 5 consecutive successful checks before triggering
Configuration:

- `threshold`: CPU utilization percentage that triggers vertical scaling (10-100%)
- `kafka-lag-threshold`: (Optional) Kafka consumer lag threshold to trigger vertical scaling (number of records)
- `increments-cpu`: CPU increase per scale-up event
- `max-cpu`: Maximum CPU cores allowed
- `max-memory`: Maximum memory allowed (scales proportionally with CPU)
- `scale-up-interval`: Minimum minutes between scale-up events
- `scale-up-threshold`: Number of consecutive checks before scaling up
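The scale-up rules above can be replayed in a small simulation; this is an illustration of the documented behavior, not the autoscaler's code (scale-down is omitted for brevity):

```python
def simulate(cpu_samples, threshold=50, increments_cpu=1, max_cpu=5,
             scale_up_interval=8, scale_up_threshold=5, start_cpu=1):
    """Replay per-minute CPU readings (%) and apply the documented scale-up rules."""
    cpu = start_cpu
    consecutive = 0
    last_scale_up = -scale_up_interval  # allow an immediate first scale-up
    for minute, usage in enumerate(cpu_samples):
        consecutive = consecutive + 1 if usage > threshold else 0
        if (consecutive >= scale_up_threshold
                and minute - last_scale_up >= scale_up_interval
                and cpu < max_cpu):
            cpu = min(cpu + increments_cpu, max_cpu)
            last_scale_up = minute
            consecutive = 0
    return cpu

# Ten minutes above threshold: one scale-up after five consecutive checks,
# then the 8-minute interval blocks a second one.
print(simulate([80] * 10))  # 2
```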
An internal Kubernetes gRPC service without IAM security. The request timeout has been increased from the default 300s to 900s (15 minutes) for long-running operations. This service will only be available on the internal domain.
```yaml
kubernetes:
  type: Deployment
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http2
  timeout: 900
  scaling:
    cpu: 50
  security: none
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      env: &env
        KEY: value
    staging:
      min-instances: 1
      max-instances: 1
      env:
        <<: *env
```

Note: Use `protocol: http2` for gRPC services. Set `security: none` to disable IAM authentication.
GKE Autopilot includes managed Prometheus. Configure a PodMonitoring resource to automatically
scrape metrics from your pods. This example collects metrics every 60 seconds from the /metrics endpoint.
Scraping is performed on the internal service port 8080 and does not pass through the security sidecar, so your application can expose metrics without authentication overhead.
```yaml
kubernetes:
  type: Deployment
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    cpu: 50
  monitoring:
    prometheus:
      interval: 60
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      max-instances: 20
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      env: &env
        KEY: value
    staging:
      min-instances: 0
      max-instances: 1
      domain-mappings:
        - my-service.retailsvc.dev
      env:
        <<: *env
```

Open Telemetry is supported on GKE Autopilot using a collector sidecar. The configuration is similar to Cloud Run, but Prometheus scraping in Kubernetes always uses PodMonitoring resources instead of the collector sidecar. The collector sidecar is only used for Open Telemetry traces and metrics. By default, only traces are collected; to also collect OTEL metrics, use `collect: [traces, metrics]`.
This configuration works the same way for both Deployment and StatefulSet.
```yaml
kubernetes:
  type: Deployment
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    cpu: 50
  monitoring:
    open-telemetry:
      config: auto
  security:
    permission-prefix: mye
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 1
      max-instances: 20
      domain-mappings:
        - my-service.retailsvc.com
        - my-service.retailsvc-test.com
      env:
        KEY: value
```

A Kubernetes StatefulSet for stateful applications that require persistent storage. This example doesn't use IAM security and has a persistent SSD volume mounted at `/mnt/shared/data`.
StatefulSets are ideal for applications that need:
- Stable, unique network identifiers
- Stable, persistent storage
- Ordered, graceful deployment and scaling
```yaml
kubernetes:
  type: StatefulSet
  service: my-service
  resources:
    cpu: 1
    memory: 512Mi
  protocol: http
  scaling:
    cpu: 50
  volumes:
    - disk-type: ssd
      size: 5Gi
      mount-path: /mnt/shared/data
  security: none
  labels:
    product: my-product
    component: my-component
  environments:
    production:
      min-instances: 3
      max-instances: 3
      env: &env
        KEY: value
    staging:
      min-instances: 1
      max-instances: 1
      env:
        <<: *env
```

Volume configuration:

- `disk-type`: Type of persistent disk (`ssd` or `standard`)
- `size`: Disk size (e.g., `5Gi`, `10Gi`, `100Gi`)
- `mount-path`: Directory where the volume will be mounted in the container

Note: StatefulSets maintain a sticky identity for each pod. If a pod is rescheduled, it keeps the same name and persistent volume.
Use Secret Manager references for sensitive configuration:
```yaml
environments:
  production:
    env:
      DATABASE_URL: sm://*/database-url
      API_KEY: sm://*/api-key
      PUBLIC_URL: https://my-service.retailsvc.com
```

The `sm://*/secret-name` syntax automatically retrieves secrets from Google Secret Manager.
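A sketch of how such a reference could be split into its parts (illustrative only; the action's own resolver may differ):

```python
def parse_secret_ref(value: str):
    """Split an 'sm://<project>/<secret-name>' reference into its parts.

    Per the examples above, '*' stands for the current project.
    Returns None for plain values so callers can pass them through unchanged.
    """
    if not value.startswith("sm://"):
        return None
    project, _, name = value[len("sm://"):].partition("/")
    return (project, name)

print(parse_secret_ref("sm://*/database-url"))  # ('*', 'database-url')
print(parse_secret_ref("plain-value"))          # None
```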
Start with conservative resource allocations and monitor usage:

Cloud Run:

- Start with `cpu: 1` and `memory: 512Mi`
- Enable `startup-cpu-boost: true` if cold starts are slow
- Use `cpu-throttling: false` only for background tasks that need continuous CPU

Kubernetes:

- Start with `cpu: 1` and `memory: 512Mi`
- Set `scaling.cpu: 50` for moderate autoscaling responsiveness
- Consider vertical scaling for workloads with unpredictable resource needs

For Cloud Run:

- Set `min-instances: 0` in staging to save costs
- Set `min-instances: 1-3` in production to avoid cold starts
- Use scheduled scaling for predictable traffic patterns
- Set an appropriate `concurrency` based on your application (50-100 for I/O bound, lower for CPU bound)

For Kubernetes:

- Set `min-instances` based on required availability (1-3 for production)
- Configure `max-instances` based on expected peak load
- Use `availability: high` for critical services

Monitoring:

- Enable Prometheus for application metrics
- Enable Open Telemetry for distributed tracing
- Use `request-logs.cloud-run: false` in high-traffic services to reduce log costs
- Always monitor error logs (429 and 5xx are logged regardless of settings)

Security:

- Always use the `permission-prefix` for IAM-based authentication
- Use Cloud Armor for DDoS protection on public-facing services
- Enable CORS only when needed for browser-based clients
- Keep `internal-traffic: true` (default) to enable service-to-service communication within your VPC
- Use IAM bindings (`consumers.service-accounts`) for service-to-service authentication

Multi-region:

- Deploy to multiple regions for high availability
- Use `europe-west1` as the primary region for European services
- Add `us-central1` and `asia-northeast1` for global services
- Global load balancing is automatically configured

DNS and routing:

- Use `update-dns: if-missing` (default) to avoid accidental DNS changes
- Use `update-dns: always` only when migrating DNS entries
- Configure path-mappings for API versioning and routing
The cloud-deploy JSON schema documentation is generated with json-schema-for-humans. Install it with pip:

```shell
pip3 install json-schema-for-humans
```

And use the command below to update the generated documentation:

```shell
generate-schema-doc --config template_name=md cloud-deploy/src/utils/cloud-deploy.schema.json cloud-deploy/schema_doc.md
```