Evaluate switching CLOUD_ML_REGION from us-east5 to global for Vertex AI cost optimization #1451

@redhat-ship-help

Summary

Engineering leadership has requested that all Claude Code / Vertex AI users switch from regional endpoints (e.g. us-east5) to the global endpoint to reduce AI spend. Google charges a premium for tokens served from regional endpoints, and using global allows better cost management across the organization.

Source: ambient-code-go Google Group — forwarded announcement from engineering leadership (2026-04-22).

The official Claude Code installation instructions have been updated to reflect CLOUD_ML_REGION=global.

Current State in This Repo

The platform currently has multiple touchpoints for this setting:

  1. operator-config-openshift.yaml: the ConfigMap already sets CLOUD_ML_REGION: "global", but application code overrides this value at runtime.

  2. display_name.go → getAnthropicClient(): contains explicit fallback logic that overrides global back to us-east5:

    region := os.Getenv("CLOUD_ML_REGION")
    // Default to us-east5 - claude-haiku-4-5 is not available in global region
    if region == "" || region == "global" {
        region = "us-east5"
    }

  3. vertex.go → ValidateVertexConfig(): reads CLOUD_ML_REGION at startup but does not transform the value.

  4. Benchmark scripts (referenced in RHOAIENG-48735): default to us-east5 via ${CLOUD_ML_REGION:-us-east5}.
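
To make the net effect of touchpoints 1 and 2 concrete, here is a minimal Go sketch that isolates the fallback quoted above into a standalone function. The name effectiveRegion is hypothetical and does not exist in the repo; only the condition itself is taken from getAnthropicClient(). Under that assumption, setting CLOUD_ML_REGION to global in the ConfigMap has no effect on this client.

```go
package main

import "fmt"

// effectiveRegion is a hypothetical helper that mirrors the fallback
// condition quoted from getAnthropicClient(): an empty value and the
// literal "global" are both rewritten to us-east5.
func effectiveRegion(configured string) string {
	if configured == "" || configured == "global" {
		return "us-east5"
	}
	return configured
}

func main() {
	// Sample values for CLOUD_ML_REGION; "europe-west1" is only an
	// illustrative regional value, not one taken from the issue.
	for _, configured := range []string{"", "global", "us-east5", "europe-west1"} {
		fmt.Printf("CLOUD_ML_REGION=%q resolves to %s\n", configured, effectiveRegion(configured))
	}
}
```

Only explicit regional values pass through unchanged, which is why the ConfigMap setting alone cannot move this client to the global endpoint.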

Known Technical Constraint

The override in getAnthropicClient() was added because claude-haiku-4-5 was not available in the global region at the time of implementation. See: https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude

Action Items

  • Verify whether claude-haiku-4-5 (and all other Claude models used by the platform) are now available in the global region
  • If models are available globally, remove the global → us-east5 override in getAnthropicClient() so the ConfigMap value is respected
  • If models are NOT yet available globally, document the constraint and set a follow-up date to re-check
  • Update any hardcoded us-east5 defaults in benchmark/test scripts
  • Validate the change in staging before promoting to production
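
One possible shape for the region lookup after these action items is sketched below in Go. Everything named here is an assumption for illustration: resolveRegion, defaultRegion, and regionalOnlyModels are hypothetical and not part of the repo, and defaulting to global when the variable is unset is a guess at the intended behavior rather than something the issue specifies. The exception map starts empty; it would only be populated if verification shows a model is still missing from the global endpoint.

```go
package main

import (
	"fmt"
	"os"
)

// defaultRegion is an ASSUMED default for an unset CLOUD_ML_REGION.
const defaultRegion = "global"

// regionalOnlyModels is a HYPOTHETICAL exception list mapping a model name
// to the region it must be pinned to. It is intentionally empty here; any
// entry must come from verifying availability against the Vertex AI docs.
var regionalOnlyModels = map[string]string{}

// resolveRegion respects the configured value, including "global", and only
// pins a region when the requested model is listed as regional-only.
func resolveRegion(configured, model string) string {
	if configured == "" {
		configured = defaultRegion
	}
	if pinned, ok := regionalOnlyModels[model]; ok && configured == "global" {
		return pinned
	}
	return configured
}

func main() {
	region := resolveRegion(os.Getenv("CLOUD_ML_REGION"), "claude-haiku-4-5")
	fmt.Println("resolved region:", region)
}
```

Keeping the exception per model, rather than rewriting global for every request, means a single unavailable model would not force all traffic back onto a regional endpoint.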

/cc @jeremyeder

Metadata

Assignees

No one assigned

    Labels

    ambient-code:auto-fix (Amber agent: automated low-risk fixes such as formatting and linting)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests