chore(pricing): Update vertex-ai pricing #721

Open: siddharthsambharia-portkey wants to merge 91 commits into main from pricing-update/vertex-ai

@siddharthsambharia-portkey (Collaborator) commented Apr 16, 2026

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 5 |
| 🔄 Models updated (merged) | 11 |

➕ New Models

  • gemini-2.5-pro-tts
  • gemini-2.5-flash-tts
  • gemma-4-26b-a4b-it-maas
  • gemini-3.1-flash-tts-preview
  • veo-3.1-lite-generate-001

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-2.5-computer-use-preview-10-2025
  • gemini-3-pro-preview
  • gemini-3-flash-preview
  • gemini-3-pro-image-preview
  • gemini-3.1-pro-preview
  • gemini-3.1-flash-image-preview
  • gemini-3.1-flash-lite-preview
  • veo-3.0-fast-generate-001
  • veo-3.1-fast-generate-001
  • multimodalembedding

Model-to-Pricing-Page Mapping

Google – Gemini (text/multimodal)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 Pro | API | Standard ≤200K prices used; long-context (>200K): input $2.50, output $15 |
| gemini-2.5-flash | Google – Gemini 2.5 Flash | API | web_search $35/1K → 3.5¢; enterprise $45/1K → 4.5¢ |
| gemini-2.5-flash-lite | Google – Gemini 2.5 Flash-Lite | API | |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 Flash | API | Preview alias; priced same as gemini-2.5-flash |
| gemini-2.5-flash-lite-preview-09-2025 | Google – Gemini 2.5 Flash-Lite | API | Preview alias; priced same as gemini-2.5-flash-lite |
| gemini-2.5-computer-use-preview-10-2025 | Google – Gemini 2.5 Pro (Computer Use) | API | Matched to Gemini 2.5 Pro pricing row |
| gemini-2.5-flash-image | Google – Gemini 2.5 Flash Image | API | Image output model; image_token $30/1M added |
| gemini-2.5-pro-tts | Google – Gemini | API – price not found | TTS model; no dedicated pricing row; added with price 0 |
| gemini-2.5-flash-tts | Google – Gemini | API – price not found | TTS model; no dedicated pricing row; added with price 0 |
| gemini-2.0-flash-001 | Google – Gemini 2.0 Flash | API | Batch pricing added; web_search 3.5¢ |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 Flash-Lite | API | Batch pricing added |
| gemma-4-26b-a4b-it-maas | Google – Gemma 4 26B | API | MaaS Gemma; priced at $0.15/$0.60 per the pricing page |
| gemini-3-pro-preview | Google – Gemini 3 Pro Preview | API | Matched to Gemini 3 Pro Preview row |
| gemini-3-flash-preview | Google – Gemini 3 Flash Preview | API | Matched to Gemini 3 Flash Preview row |
| gemini-3-pro-image-preview | Google – Gemini 3 Pro Image Preview | API | Image output model; image_token $30/1M |
| gemini-3.1-pro-preview | Google – Gemini 3.1 Pro Preview | API | |
| gemini-3.1-flash-image-preview | Google – Gemini 3.1 Flash Image Preview | API | Image output model; image_token $30/1M |
| gemini-3.1-flash-lite-preview | Google – Gemini 3.1 Flash-Lite Preview | API | |
| gemini-3.1-flash-tts-preview | Google – Gemini | API – price not found | TTS preview; no pricing row; added with price 0 |
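
The web_search notes above ($35/1K → 3.5¢, $45/1K → 4.5¢) are a unit change from dollars per 1,000 requests to cents per request. A minimal sketch of that arithmetic follows; the helper name is hypothetical and is not taken from the pricing agent's code.

```python
from decimal import Decimal


def usd_per_thousand_to_cents_each(usd_per_thousand: str) -> Decimal:
    """Convert USD per 1,000 requests into cents per single request."""
    usd_each = Decimal(usd_per_thousand) / Decimal(1000)
    return usd_each * Decimal(100)


# Values quoted in the gemini-2.5-flash row above.
standard_cents = usd_per_thousand_to_cents_each("35")    # web_search
enterprise_cents = usd_per_thousand_to_cents_each("45")  # enterprise
```

`Decimal` is used here so the result compares exactly against the quoted cent values instead of picking up binary floating-point error.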

Google – Imagen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4.0 Ultra Generate | API | $0.06/image |
| imagen-4.0-generate-001 | Google – Imagen 4.0 Generate | API | $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4.0 Fast Generate | API | $0.02/image |
| imagen-3.0-generate-002 | Google – Imagen 3.0 Generate | API | $0.04/image |
| imagen-3.0-capability-001 | Google – Imagen 3.0 Capability | API | Capability model; uses imagen-3.0-generate pricing ($0.04/image) |
| imagen-3.0-capability-002 | Google – Imagen 3.0 Capability | API | Capability model; uses imagen-3.0-generate pricing ($0.04/image) |

Google – Veo (video)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| veo-2.0-generate-001 | Google – Veo 2.0 | API | $0.50/s video; 8s default |
| veo-3.0-generate-001 | Google – Veo 3.0 | API | $0.20/s video-only 720p/1080p; 8s default |
| veo-3.0-fast-generate-001 | Google – Veo 3.0 Fast | API | $0.08/s video-only 720p; 8s default |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | $0.20/s video-only 720p/1080p; 8s default |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | $0.08/s video-only 720p; 8s default |
| veo-3.1-lite-generate-001 | Google – Veo 3.1 Lite | API | $0.03/s video-only 720p; 8s default |
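
For a rough sense of what the per-second Veo rates mean per clip, the sketch below multiplies each rate by a clip length. It assumes that "8s default" refers to the clip duration used when none is specified; the dictionary and function names are illustrative only.

```python
from decimal import Decimal

# Per-second rates copied from the Veo table above (USD, video-only).
VEO_USD_PER_SECOND = {
    "veo-2.0-generate-001": Decimal("0.50"),
    "veo-3.0-generate-001": Decimal("0.20"),
    "veo-3.0-fast-generate-001": Decimal("0.08"),
    "veo-3.1-generate-001": Decimal("0.20"),
    "veo-3.1-fast-generate-001": Decimal("0.08"),
    "veo-3.1-lite-generate-001": Decimal("0.03"),
}

# Assumption: "8s default" is the clip length used when none is given.
DEFAULT_CLIP_SECONDS = 8


def clip_cost_usd(model_id: str, seconds: int = DEFAULT_CLIP_SECONDS) -> Decimal:
    """Return the estimated USD cost of one clip of the given length."""
    return VEO_USD_PER_SECOND[model_id] * seconds
```

Under that assumption, a default-length clip costs eight times the listed per-second rate, so the per-model differences scale linearly with duration.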

Google – Embedding

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens |
| text-embedding-005 | Google – Text Embedding (excl. Gemini) | API | $0.000025/1K chars (as per_thousand_tokens) |
| text-multilingual-embedding-002 | Google – Text Embedding (excl. Gemini) | API | $0.000025/1K chars |
| textembedding-gecko | Google – Text Embedding (excl. Gemini) | API | Legacy; same family pricing $0.000025/1K |
| text-embedding-large-exp-03-07 | Google – Text Embedding (excl. Gemini) | API | Experimental; no dedicated row; shares text-embedding pricing |
| multimodalembedding | Google – Multimodal Embedding | API | Text $0.0002/1K + image $0.0001 + video tiers |

Anthropic – Claude

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| claude-opus-4-1@20250805 | Anthropic – Claude Opus 4.1 | API | Pinned date version; input $15, output $75 |
| claude-sonnet-4-5@20250929 | Anthropic – Claude Sonnet 4.5 | API | Pinned date version; ≤200K standard pricing |
| claude-haiku-4-5@20251001 | Anthropic – Claude Haiku 4.5 | API | Pinned date version |
| claude-opus-4-5@20251101 | Anthropic – Claude Opus 4.5 | API | Pinned date version; input $5, output $25 |
| claude-opus-4-6 | Anthropic – Claude Opus 4.6 | API | @default stripped; input $5, output $25 |
| claude-sonnet-4-6 | Anthropic – Claude Sonnet 4.6 | API | @default stripped |
| claude-opus-4-7 | Anthropic – Claude Opus 4.7 | API | @default stripped; input $5, output $25 |
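
The notes above distinguish pinned date versions (kept as `@YYYYMMDD`) from IDs where `@default` was stripped. One way to express that normalization is sketched below; the function name and the assumption that upstream IDs arrive as `name@default` are illustrative, not confirmed by this PR.

```python
DEFAULT_SUFFIX = "@default"


def normalize_model_id(raw_id: str) -> str:
    """Drop a trailing '@default' but leave pinned '@YYYYMMDD' versions intact."""
    # str.removesuffix requires Python 3.9 or newer.
    return raw_id.removesuffix(DEFAULT_SUFFIX)


# Hypothetical upstream IDs mapped to the keys used in the table above.
normalized = [
    normalize_model_id(raw)
    for raw in ("claude-opus-4-6@default", "claude-opus-4-5@20251101")
]
```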

OpenAI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gpt-oss-120b-maas | OpenAI – GPT OSS 120B | API | MaaS; input $0.09, output $0.36 |

Meta – Llama

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| llama-3.3-70b-instruct-maas | Meta – Llama 3.3 70B | API | MaaS; $0.72/$0.72 |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 Maverick | API | MaaS; $0.35/$1.15 |

Mistral

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| mistral-small-2503 | Mistral – Mistral Small 3.1 | API | $0.10/$0.30 |
| mistral-medium-3 | Mistral – Mistral Medium 3 | API | $0.40/$2.00 |
| codestral-2 | Mistral – Codestral 2 | API | $0.30/$0.90 |

DeepSeek

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| deepseek-v3.1-maas | DeepSeek – DeepSeek-V3.1 | API | MaaS; cache_read $0.06/1M |
| deepseek-v3.2-maas | DeepSeek – DeepSeek-V3.2 | API | MaaS; cache_read $0.056/1M |
| deepseek-r1-0528-maas | DeepSeek – DeepSeek-R1-0528 | API | MaaS; $1.35/$5.40 |

Qwen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| qwen3-235b-a22b-instruct-2507-maas | Qwen – Qwen3-235B | API | MaaS; $0.22/$0.88 |
| qwen3-coder-480b-a35b-instruct-maas | Qwen – Qwen3 Coder 480B | API | MaaS; cache_read $0.022/1M |
| qwen3-next-80b-a3b-instruct-maas | Qwen – Qwen3-Next-80B Instruct | API | MaaS; $0.15/$1.20 |
| qwen3-next-80b-a3b-thinking-maas | Qwen – Qwen3-Next-80B Thinking | API | MaaS; $0.15/$1.20 |

Moonshot / Kimi

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| kimi-k2-thinking-maas | Moonshot – Kimi K2 Thinking | API | MaaS; cache_read $0.06/1M |

MiniMax

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| minimax-m2-maas | MiniMax – MiniMax M2 | API | MaaS; cache_read $0.03/1M |

ZAI.org – GLM

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| glm-4.7-maas | ZAI.org – GLM-4.7 | API | MaaS; $0.60/$2.20 |
| glm-5-maas | ZAI.org – GLM-5 | API | MaaS; cache_read $0.10/1M |

Excluded Models (not added to pricing)

| Model / Pattern | Publisher | Reason |
| --- | --- | --- |
| gemini-live-2.5-flash-native-audio | Google | Gemini Live streaming; excluded |
| lyria-\* (lyria-002, lyria-3-pro-preview, lyria-3-clip-preview) | Google | Music generation; excluded |
| model-optimizer-\* | Google | Meta-endpoint; excluded |
| imagegeneration | Google | Legacy superseded model; excluded |
| virtual-try-on-001 | Google | Retail product model; excluded per google.md |
| shieldgemma2 | Google | Guard model; excluded |
| chirp-2, chirp-3 | Google | Audio transcription; excluded |
| translate-llm | Google | Non-generative translation; excluded |
| weathernext, weather-next-v2 | Google | Non-generative forecast; excluded |
| All self-deploy Gemma/PaLM/Codey/etc. | Google | Self-deploy, non-generative, or fine-tuning only |
| All traditional CV/NLP models | Google | Non-generative ML |
| pretrained-ocr, pretrained-form-parser | Google | OCR; excluded |
| image-segmentation-001, sam3 | Google/Meta | Non-generative segmentation; excluded |
| jamba-large-1.6 | AI21 | Self-deploy (has_deploy: true, no -maas); excluded |
| qwen-image | Qwen | Excluded by policy |
| glm-image, glm-ocr | ZAI.org | glm-image excluded by policy; glm-ocr is OCR |
| glm-4.7, glm-5, glm-4.5 | ZAI.org | Self-deploy; excluded |
| kimi-k2, kimi-k2-5 | Moonshot | Self-deploy; excluded |
| minimax-m2 | MiniMax | Self-deploy; excluded |
| deepseek-r1, deepseek-v3, deepseek-v4, deepseek-v3-1, deepseek-v3-2 | DeepSeek | Self-deploy; excluded |
| deepseek-ocr-maas, deepseek-ocr, deepseek-ocr-2 | DeepSeek | OCR; excluded |
| All self-deploy Qwen variants | Qwen | Self-deploy (no -maas); excluded |
| mistral-ocr-2505 | Mistral | OCR; excluded |
| ministral-3, mistral-large-3, codestral-2501-self-deploy | Mistral | Self-deploy; excluded |
| llama-guard, prompt-guard | Meta | Guard models; excluded |
| sam3 | Meta | Non-generative segmentation; excluded |
| All self-deploy Llama variants | Meta | Self-deploy (no -maas); excluded |
| Non-generative Meta models (roberta, imagebind, nllb, faster-rcnn, etc.) | Meta | Non-generative ML; excluded |
| gpt-oss-20b, clip-\*, openclip, whisper-\* | OpenAI | gpt-oss-20b self-deploy; clip non-generative; whisper transcription |
| xAI Grok models | xAI | Pricing page only; not returned by get_vertex_models; not added |
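
Two of the exclusion rules above are mechanical: wildcard patterns such as lyria-\*, and self-deploy models that lack a -maas suffix. The sketch below covers only those two rules. The function name and the `has_deploy` parameter shape are assumptions based on the "has_deploy: true, no -maas" note, and category-based exclusions (OCR, guard models, policy) are deliberately not modeled.

```python
from fnmatch import fnmatchcase

# Wildcard patterns taken from the exclusion table above.
EXCLUDED_PATTERNS = ("lyria-*", "model-optimizer-*", "clip-*", "whisper-*")


def is_excluded(model_id: str, has_deploy: bool = False) -> bool:
    """Apply the pattern rule and the self-deploy rule; other rules are out of scope."""
    if any(fnmatchcase(model_id, pattern) for pattern in EXCLUDED_PATTERNS):
        return True
    # Assumption: a self-deployable model without a "-maas" suffix is skipped.
    return has_deploy and not model_id.endswith("-maas")
```

Note that this partial check would not catch deepseek-ocr-maas, which the table excludes as an OCR model despite its -maas suffix, so a real filter would need an additional category check.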

Generated by Pricing Agent on 2026-04-27
