@yawlabs/ai-pricing

Community-maintained pricing data for AI infrastructure services beyond the LLM layer — vector databases, inference hosts, managed agents, embedding services, MCP tool calls, fine-tuning, and evaluation platforms. YAML-first, schema-validated, supply-chain-hardened.

For LLM pricing specifically, we recommend LiteLLM's model_prices_and_context_window.json. This project complements that file; it does not replace it.


Why this exists

Modern AI cost attribution needs more than LLM pricing. An observability or FinOps tool running cost queries today needs pricing for:

  • Vector databases — Pinecone per-query + storage-GB, Weaviate cluster pricing, Qdrant tier pricing
  • Inference hosts — Replicate per-second, Modal per-second, Together per-token-per-model, Groq per-token, Fireworks per-token
  • Managed agents — OpenAI Assistants API, Anthropic-managed agents, agent orchestration services
  • Embedding services — many now priced separately from the parent provider
  • MCP tool calls — when metered per-invocation
  • Fine-tuning — per-token training cost + per-inference host cost
  • Evaluation platforms — per-eval-run cost

LiteLLM's file is excellent for LLM pricing and we inherit it. This repo covers the rest.

Why supply-chain-hardened

In March 2026, two LiteLLM PyPI releases were compromised via a poisoned GitHub Action. The poisoned packages were downloaded 40,000 times in 40 minutes before PyPI quarantined them. The compromise wasn't in source code — it was in the build pipeline.

If pricing data is a single supply-chain surface, a compromise means every downstream cost dashboard shows the wrong numbers. Customers reconciling against invoices find discrepancies weeks later. Trust takes years to rebuild.

This project treats pricing data as security-critical infrastructure. Practices we've adopted from the outset:

  • Every GitHub Action SHA-pinned (never tag-pinned)
  • SLSA Level 3 provenance on releases — build attestation signed by GitHub's OIDC identity
  • SBOM published with every release — human + machine readable dependency inventory
  • Signed commits required on the main branch
  • CodeQL + dependency review on every PR
  • Reproducible builds — verify locally that what's released matches what's in the repo
  • Public audit log — every price change is a signed PR with human review
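
SHA-pinning (the first bullet) means referencing an action by its full commit SHA rather than a mutable tag. A sketch of what that looks like in a workflow file — the SHA below is illustrative, not a real pin:

```yaml
# .github/workflows/ci.yml (fragment)
# Tags like @v4 can be moved to point at malicious code; a full commit SHA cannot.
steps:
  - uses: actions/checkout@<full-40-char-commit-sha>  # resolved from the v4 tag, then frozen
```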

See SECURITY.md and THREAT_MODEL.md for the full posture.

Scope — what's in, what's out

In scope (this project maintains)

  • Vector databases: Pinecone, Weaviate, Qdrant, Chroma, Milvus/Zilliz
  • Inference hosts: Replicate, Modal, Together, Groq, Fireworks, Anyscale
  • Managed agent platforms: OpenAI Assistants pricing, Anthropic managed agents, crew/swarm platforms
  • Embedding services (where priced separately): Cohere, Voyage, Jina
  • MCP server monetization platforms (vend.sh compatible)
  • Fine-tuning pricing across providers
  • Evaluation platforms: Langfuse, Arize, Phoenix, Weights & Biases

Out of scope (use LiteLLM)

  • Core LLM inference pricing (GPT-4o, Claude, Gemini, Mistral, etc.)
  • LLM context-window and mode metadata
  • LLM rate-limit data

Explicitly excluded

  • Cloud infrastructure pricing (AWS, GCP, Azure) — those providers have rich pricing APIs; reinventing isn't useful
  • Historical pricing archive — this is current pricing only, with git history providing audit trail
  • Usage data — this repo is pricing data; usage belongs in observability tools
  • Benchmarks — use Artificial Analysis or similar

Data model

Every provider's pricing lives in a single YAML file under data/. Files follow schema/pricing-schema.yaml. Example:

provider: pinecone
homepage: https://www.pinecone.io/pricing
pricing_url: https://www.pinecone.io/pricing/
last_reviewed: 2026-04-15
reviewed_by: '@jeffyaw'
services:
  - id: serverless-reads
    category: vector_db
    description: Pinecone serverless read units
    pricing:
      read_units:
        unit: per_1m
        price_usd: 8.25
        notes: Billed per million read units consumed
  - id: serverless-storage
    category: vector_db
    description: Pinecone serverless storage
    pricing:
      storage_gb:
        unit: per_gb_month
        price_usd: 0.33

The schema validates: required fields, category enum (defined in pricing-schema.yaml), pricing unit enum, ISO-8601 date on last_reviewed, and cross-references.
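
A minimal sketch of those checks in Python, operating on an already-parsed provider dict. The category and unit enums below are inferred from the example above and the project's stated scope; the authoritative lists live in schema/pricing-schema.yaml:

```python
from datetime import date

# Inferred enums — the real ones are defined in schema/pricing-schema.yaml.
CATEGORIES = {"vector_db", "inference_host", "managed_agent",
              "embedding", "mcp_tool_call", "fine_tuning", "evaluation"}
UNITS = {"per_1m", "per_gb_month", "per_second", "per_token", "per_eval_run"}

def validate(provider: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the file passes."""
    errors = []
    for field in ("provider", "pricing_url", "last_reviewed", "reviewed_by", "services"):
        if field not in provider:
            errors.append(f"missing required field: {field}")
    try:
        date.fromisoformat(str(provider.get("last_reviewed", "")))
    except ValueError:
        errors.append("last_reviewed is not an ISO-8601 date")
    for svc in provider.get("services", []):
        if svc.get("category") not in CATEGORIES:
            errors.append(f"{svc.get('id')}: unknown category {svc.get('category')}")
        for name, entry in svc.get("pricing", {}).items():
            if entry.get("unit") not in UNITS:
                errors.append(f"{svc.get('id')}.{name}: unknown unit {entry.get('unit')}")
            if not isinstance(entry.get("price_usd"), (int, float)):
                errors.append(f"{svc.get('id')}.{name}: price_usd must be numeric")
    return errors

pinecone = {
    "provider": "pinecone",
    "pricing_url": "https://www.pinecone.io/pricing/",
    "last_reviewed": "2026-04-15",
    "reviewed_by": "@jeffyaw",
    "services": [
        {"id": "serverless-reads", "category": "vector_db",
         "pricing": {"read_units": {"unit": "per_1m", "price_usd": 8.25}}},
    ],
}
print(validate(pinecone))
```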

Installation + usage

As a git dependency (recommended for auditability)

# Pin to a specific commit SHA — never a tag or branch
git submodule add \
  https://github.com/YawLabs/ai-pricing \
  vendor/ai-pricing
cd vendor/ai-pricing
git checkout <specific-commit-sha>

As a release tarball (with SLSA verification)

# Download from the GitHub release for the version you want:
curl -LO https://github.com/YawLabs/ai-pricing/releases/download/v0.1.4/ai-pricing-0.1.4.tar.gz
curl -LO https://github.com/YawLabs/ai-pricing/releases/download/v0.1.4/ai-pricing.intoto.jsonl
# Verify with slsa-verifier (see Provenance verification section below), then:
tar -xzf ai-pricing-0.1.4.tar.gz

Reading the data

The YAML files under data/ are designed to be human-inspectable and loaded directly from whatever runtime reads them — no package wrapper needed.
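
Once parsed (with any YAML library, e.g. `yaml.safe_load`), entries are plain dicts, and cost attribution is unit arithmetic. A sketch using the Pinecone example above — the usage quantities are hypothetical, and the divisor table covers only the units shown:

```python
# Divisors for the units used in the Pinecone example above;
# the full unit enum lives in schema/pricing-schema.yaml.
UNIT_DIVISORS = {"per_1m": 1_000_000, "per_gb_month": 1}

def cost_usd(entry: dict, quantity: float) -> float:
    """Convert a metered quantity (read units, GB-months, ...) into dollars."""
    return entry["price_usd"] * quantity / UNIT_DIVISORS[entry["unit"]]

# Entries as they would appear after parsing data/pinecone.yaml
reads = {"unit": "per_1m", "price_usd": 8.25}
storage = {"unit": "per_gb_month", "price_usd": 0.33}

# Hypothetical billing period: 3.5M read units plus 20 GB-months of storage
monthly = cost_usd(reads, 3_500_000) + cost_usd(storage, 20)
print(f"${monthly:.2f}")
```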

Provenance verification

Every release ships with a signed SLSA provenance attestation. To verify a release yourself:

# Install slsa-verifier v2.5.1 (verify the binary's checksum against the release before use)
curl -sSL https://github.com/slsa-framework/slsa-verifier/releases/download/v2.5.1/slsa-verifier-linux-amd64 \
  -o slsa-verifier && chmod +x slsa-verifier

# Verify release tarball
./slsa-verifier verify-artifact \
  ai-pricing-0.1.4.tar.gz \
  --provenance-path ai-pricing.intoto.jsonl \
  --source-uri github.com/YawLabs/ai-pricing \
  --source-tag v0.1.4

SLSA provenance binds the released artifact to the exact GitHub Actions workflow that built it, signed by GitHub's OIDC-issued certificate. A compromised build pipeline can't produce a valid attestation.

Contributing

See CONTRIBUTING.md. Short version:

  1. Signed commits only on main. git commit -S -m "..." or set commit signing as default.
  2. One provider per PR. Don't mix.
  3. Include last_reviewed date + verifier handle in the YAML.
  4. Link to the provider's pricing page in the PR description.
  5. Screenshots welcome for nonstandard pricing structures.
  6. CI will validate schema, lint YAML, and fail on any unvalidated entries.
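
For item 1, signing can be made the default once per machine. A sketch (GPG shown; replace `<KEY_ID>` with your own key — SSH-key signing also works in git 2.34+):

```shell
# Make every commit signed by default
git config --global commit.gpgsign true
git config --global user.signingkey <KEY_ID>

# Or, for SSH-based signing instead of GPG:
# git config --global gpg.format ssh
# git config --global user.signingkey ~/.ssh/id_ed25519.pub
```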

Security reporting

Vulnerability reports: open a private advisory (preferred) or email security@mcp.hosting. Details in SECURITY.md.

Never open a public issue for a vulnerability — use the private channel, coordinate disclosure.

License

  • Pricing data (data/): CC BY 4.0 — use it anywhere; attribution appreciated.
  • Tooling (scripts/, validation code): MIT.

Maintainers

  • Yaw Labs (YawLabs on GitHub) — @jeffyaw primary, plus community contributors as recognized.

Relationship to other YawLabs projects

  • @yawlabs/mcp-compliance — open methodology for grading MCP server spec compliance. Same supply-chain practices.
  • Spend — AI spend tracking, cost estimation, and provider comparison. Uses this repo + LiteLLM's file as its pricing data sources, with both sources independently maintained for supply-chain resilience. On the reconciliation path (e.g. "list this month's usage," "fetch invoice," "get quota remaining") Spend is MCP-native — each provider's admin API is wrapped as its own MCP server and Spend consumes them as a client, so the same protocol it sells to users is the one it uses to reason about provider billing.
  • vend.sh — license-key-native billing for builders. Includes MCP tool-call pricing pattern that informs the mcp_tool_call category here.
