docs: enrich tool descriptions with gotchas, add API reference and examples #4

Merged
nialloulton merged 6 commits into main from docs/api-reference-and-tool-tips
Apr 6, 2026

@nialloulton commented Apr 6, 2026

Addresses documentation gaps identified during MCP agent testing. Enriches MCP tool descriptions with inline gotchas so AI agents get tips automatically. Adds Gotchas and Tips, Common Errors, Direct API Access (MCP vs API comparison, Python and curl quick-start) sections to the public README. Bumps version to 0.1.2.

Made with Cursor


Note

Medium Risk
Mostly documentation and metadata updates, but it also changes the ASGI export path (create_app -> lazy app) which could break existing uvicorn deployments/imports if consumers relied on the old symbol.

Overview
Adds a substantial README troubleshooting and usage expansion: Gotchas & Tips, a Common Errors table, and a Direct API Access section with MCP-vs-API guidance plus Python/curl quick starts; also updates example URLs to https://app.simba-mmm.com and refreshes branding/packaging metadata.
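For callers who use the REST API directly rather than MCP, the polling that an assistant would otherwise handle has to be written by hand. The following is a minimal sketch of a poll-until-terminal helper, not code from this PR. `poll_until_done`, `fetch_status`, and the timeout default are hypothetical names and values; the only assumption taken from the README is that the status payload has a `status` field that ends at `"complete"` or `"failed"`.

```python
import time


def poll_until_done(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until it reports a terminal status.

    fetch_status is any zero-argument callable returning a dict with a
    "status" key (hypothetical wrapper around a status endpoint).
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status()
        # "complete" and "failed" are the terminal states named in the README.
        if body.get("status") in ("complete", "failed"):
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no terminal status after {timeout} seconds")
        sleep(interval)
```

In practice `fetch_status` would wrap a GET against one of the status endpoints; passing `sleep` as a parameter keeps the helper testable without real waiting.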

Enriches MCP tool docstrings (e.g., upload_data, get_model_results, run_optimizer, get_scenario_template, run_scenario) with inline warnings about exact channel-name matching, required array shapes, NaN/null cleanup, and async polling patterns.
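The optimizer warnings listed above (shared channel keys, arrays of length `num_periods`, positive CPM values) can be checked on the client before a request is sent. The sketch below is an illustration only: `validate_optimizer_payload` is a hypothetical helper, its messages are modeled on the README's Common Errors table rather than taken from the server, and it inspects only the keys of `bounds` because the exact shape of a bounds entry is not shown in this PR.

```python
def validate_optimizer_payload(bounds, laydown_weights, period_cpm, num_periods):
    """Return a list of problems; an empty list means the shapes look right."""
    problems = []

    # All three dicts must use the same, exactly matching channel names.
    keys = set(bounds)
    if set(laydown_weights) != keys or set(period_cpm) != keys:
        problems.append(
            "bounds, laydown_weights and period_cpm must share the same channel keys"
        )

    # Each per-channel value must be an array with one entry per period.
    for name, per_channel in (("laydown_weights", laydown_weights),
                              ("period_cpm", period_cpm)):
        for channel, values in per_channel.items():
            if not isinstance(values, list) or len(values) != num_periods:
                problems.append(
                    f"{name}[{channel!r}] must be an array of length {num_periods}"
                )

    # CPM values must be strictly positive.
    for channel, values in period_cpm.items():
        if isinstance(values, list) and any(v <= 0 for v in values):
            problems.append(f"period_cpm[{channel!r}] values must all be positive")

    return problems
```

Returning a list instead of raising on the first problem lets a script report every mismatch in one pass.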

Bumps version to 0.1.2 and adjusts deployment guidance/code to serve uvicorn via simba_mcp.server:app, implementing a lazy ASGI app accessor in server.py and removing create_app from the public package exports.

Reviewed by Cursor Bugbot for commit f6a797e. Bugbot is set up for automated code reviews on this repo.

cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Eager module-level app creation removes intentional lazy initialization
    • Restored lazy initialization using module-level `__getattr__` to defer HTTP app creation until explicitly accessed, avoiding overhead in stdio mode.
Preview (f6a797ee41)
diff --git a/CHANGELOG.md b/CHANGELOG.md
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,15 @@
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/), and this project adheres to [Semantic Versioning](https://semver.org/).
 
+## 0.1.2 — 2026-04-06
+
+### Added
+
+- Enriched MCP tool descriptions with inline gotchas (channel name matching, array requirements, NaN cleaning, async polling patterns) so AI agents get tips automatically.
+- README: Gotchas & Tips section covering the 6 most common pitfalls.
+- README: Common Errors table with causes and fixes.
+- README: Direct API Access section with MCP vs API comparison, Python and curl quick-start examples.
+
 ## 0.1.1 — 2026-04-06
 
 ### Fixed

diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -1,28 +1,11 @@
-# SIMBA MCP Server — Marketing Mix Modeling for AI Assistants
+# Simba MCP Server
 
 [![PyPI](https://img.shields.io/pypi/v/simba-mcp)](https://pypi.org/project/simba-mcp/)
-[![CI](https://github.com/getsimba-ai/simba-mcp/actions/workflows/ci.yml/badge.svg)](https://github.com/getsimba-ai/simba-mcp/actions/workflows/ci.yml)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
 [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
 
-An open-source [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) server that connects AI assistants to [Simba](https://getsimba.ai), a Bayesian Marketing Mix Modeling (MMM) platform built on [PyMC-Marketing](https://www.pymc-marketing.io/). Upload marketing data, build MMM models, measure channel ROI, optimize budgets, and run scenario forecasts — all through natural language in Claude, Cursor, or Claude Code.
+[Simba](https://simba-mmm.com) is a Bayesian Marketing Mix Modeling (MMM) platform. This [MCP server](https://modelcontextprotocol.io/) lets AI assistants interact with your Marketing Mix Models directly — upload data, build models, check results, and run budget optimizations through natural language in Claude, Cursor, or Claude Code.
 
-## Why use this?
-
-- **Marketing Mix Modeling via AI** — ask your AI assistant to build a Bayesian MMM, measure media attribution, or optimize your marketing budget instead of writing code
-- **Full MMM workflow** — data upload, model fitting, channel contribution analysis, response curves, budget optimization, and scenario planning in one integration
-- **Built on PyMC** — transparent Bayesian models with uncertainty quantification, adstock effects, saturation curves, and lift test calibration under the hood
-- **Any MCP client** — works with Claude Desktop, Cursor IDE, Claude Code, or any Model Context Protocol-compatible client
-
-## Prerequisites
-
-> **This MCP server requires a [Simba](https://getsimba.ai) account.** You'll need an active subscription and API key to connect. If you're not already a customer, **[book a call](https://calendly.com/niall-oulton)** to get set up with a demo and access.
-
-## Resources
-
-- **[getsimba.ai](https://getsimba.ai)** — SIMBA platform (features, pricing, demos)
-- **[getsimba-ai/simba-mmm](https://github.com/getsimba-ai/simba-mmm)** — full documentation on MMM concepts, data requirements, model configuration, incrementality measurement, and more
-
 ## Installation
 
 ```bash
@@ -48,7 +31,7 @@
       "command": "uvx",
       "args": ["simba-mcp"],
       "env": {
-        "SIMBA_API_URL": "https://demo.simba-mmm.com",
+        "SIMBA_API_URL": "https://app.simba-mmm.com",
         "SIMBA_API_KEY": "simba_sk_..."
       }
     }
@@ -67,7 +50,7 @@
       "command": "uvx",
       "args": ["simba-mcp"],
       "env": {
-        "SIMBA_API_URL": "https://demo.simba-mmm.com",
+        "SIMBA_API_URL": "https://app.simba-mmm.com",
         "SIMBA_API_KEY": "simba_sk_..."
       }
     }
@@ -91,7 +74,7 @@
     mcp_servers=[
         {
             "type": "url",
-            "url": "https://demo.simba-mmm.com/mcp",
+            "url": "https://app.simba-mmm.com/mcp",
             "name": "simba",
             "authorization_token": "simba_sk_...",
         }
@@ -145,12 +128,170 @@
 **Full workflow:**
 > "I have marketing data I want to analyze. First get the schema so I know what format is needed, then upload my data, create a model, and once it's done show me the ROI by channel."
 
+## Gotchas & Tips
+
+Things that commonly trip up both AI agents and humans:
+
+### Channel names are exact-match
+
+Channel names in model results can contain spaces (e.g. `"Digital impressions"`, `"TV_impressions"`). The optimizer uses these as dictionary keys — matching is **case-sensitive and space-sensitive**.
+
+**Always** call `get_model_results` with `sections="channel_summary"` first to see exact channel names, then use those verbatim in optimizer payloads.
+
+### Models are identified by `model_hash`
+
+All model endpoints use the string `model_hash` (e.g. `"f835671a25"`) returned by `create_model` and `list_models`.
+
+### Optimizer arrays, not scalars
+
+`laydown_weights` and `period_cpm` must be **objects of arrays**, each array having exactly `num_periods` elements:
+
+```json
+// Wrong
+"period_cpm": {"TV": 10}
+
+// Correct
+"period_cpm": {"TV": [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]}
+```
+
+The same channel keys must appear in `bounds`, `laydown_weights`, and `period_cpm`. Bounds values are **percentages** (0-100) of `total_budget`, not currency amounts.
+
+### Clean NaN from scenario templates
+
+The template from `get_scenario_template` may contain `NaN`/`null` for channels without historical data. Replace them with `0` before passing to `run_scenario`:
+
+```python
+import math
+for row in scenario_data:
+    for key, val in row.items():
+        if val is None or (isinstance(val, float) and math.isnan(val)):
+            row[key] = 0
+```
+
+### Three endpoints are async
+
+These return 202 and require polling:
+
+| Action | Start | Poll |
+|--------|-------|------|
+| Fit model | `create_model` | `get_model_status` |
+| Optimize | `run_optimizer` | `get_optimizer_results` |
+| Scenario | `run_scenario` | `get_scenario_results` |
+
+Poll every 5-10 seconds. Check the `status` field for `"complete"` or `"failed"`.
+
+### Data upload requirements
+
+- **CSV only** (not Excel). Maximum 50 MB.
+- Minimum **52 rows** (104+ recommended).
+- Media columns: `{channel}_activity` and `{channel}_spend` per channel.
+- Use `0` for inactive periods, not blank or NA.
+
+## Common Errors
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `Authentication required` | No API key or expired key | Check `SIMBA_API_KEY` env var |
+| `API key missing required scope: <scope>` | Key doesn't have the needed scope | Create a key with all scopes |
+| `Missing required fields: [...]` | Payload missing required keys | Check the tool's parameter list |
+| `Model status is '<status>'. Optimization requires a 'complete' model.` | Model still fitting or failed | Poll `get_model_status` until complete |
+| `laydown_weights['TV'] must be an array of length 12` | Scalar instead of array, or wrong length | Use arrays matching `num_periods` |
+| `period_cpm['TV'] values must all be positive` | Zero or negative CPM | All CPM values must be > 0 |
+| `Channels in bounds missing from period_cpm: [...]` | Mismatched channel names | Same keys in bounds, laydown_weights, and period_cpm |
+| `Columns not found in data: [...]` | Column name typo | Check CSV headers match exactly |
+| `File exceeds 50 MB limit` | CSV too large | Reduce file size or aggregate data |
+
+## Direct API Access
+
+The MCP server wraps the Simba REST API. For scripting, CI/CD, or environments without MCP, you can call the API directly.
+
+### When to use MCP vs direct API
+
+| | MCP (via AI assistant) | Direct API (curl / Python) |
+|---|---|---|
+| **Best for** | Exploratory analysis, conversational workflows | Automated pipelines, scheduled jobs, scripts |
+| **Async polling** | Assistant handles it automatically | You implement poll-until-complete logic |
+| **Data cleaning** | Assistant cleans NaN/null, builds payloads | You write the data prep code |
+| **Reproducibility** | Conversational | Scriptable, version-controlled |
+
+Both use the same API keys with the same scopes.
+
+### Quick start (Python)
+
+```python
+import requests, time
+
+BASE = "https://app.simba-mmm.com"
+HEADERS = {"Authorization": "Bearer simba_sk_..."}
+
+# Upload data
+with open("marketing_data.csv", "rb") as f:
+    r = requests.post(f"{BASE}/api/v1/ingest",
+                      headers={**HEADERS, "Content-Type": "text/csv"},
+                      data=f.read(), params={"name": "q1_data"})
+file_id = r.json()["id"]
+
+# Create model
+r = requests.post(f"{BASE}/api/v1/models", headers=HEADERS, json={
+    "data_source": {"uploaded_file_id": file_id},
+    "date_column": "date",
+    "kpi_column": "revenue",
+    "hierarchy_column": "brand",
+    "channels": [
+        {"name": "TV", "activity_column": "tv_grps", "spend_column": "tv_spend"},
+        {"name": "Search", "activity_column": "search_impressions", "spend_column": "search_spend"},
+    ],
+    "total_media_effect": "Retail",
+})
+model_hash = r.json()["model_hash"]
+
+# Poll until complete
+while True:
+    status = requests.get(f"{BASE}/api/v1/models/{model_hash}/status",
+                          headers=HEADERS).json()
+    if status["status"] in ("complete", "failed"):
+        break
+    print(f"Fitting... {status.get('progress', '?')}%")
+    time.sleep(10)
+
+# Get results
+results = requests.get(f"{BASE}/api/v1/models/{model_hash}/results",
+                       headers=HEADERS,
+                       params={"sections": "channel_summary,model_stats"}).json()
+for ch in results["results"]["channel_summary"]:
+    print(f"{ch['Channel']}: ROI {ch['ROI']:.1f}")
+```
+
+### Quick start (curl)
+
+```bash
+API_KEY="simba_sk_..."
+BASE="https://app.simba-mmm.com"
+
+# Upload data
+curl -X POST "$BASE/api/v1/ingest?name=q1_data" \
+  -H "Authorization: Bearer $API_KEY" \
+  -H "Content-Type: text/csv" \
+  --data-binary @marketing_data.csv
+
+# Create model (replace uploaded_file_id with id from upload)
+curl -X POST "$BASE/api/v1/models" \
+  -H "Authorization: Bearer $API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"data_source": {"uploaded_file_id": 1}, "date_column": "date", "kpi_column": "revenue", "hierarchy_column": "brand", "channels": [{"name": "TV", "activity_column": "tv_grps", "spend_column": "tv_spend"}]}'
+
+# Poll status (replace MODEL_HASH)
+curl "$BASE/api/v1/models/MODEL_HASH/status" -H "Authorization: Bearer $API_KEY"
+
+# Get results
+curl "$BASE/api/v1/models/MODEL_HASH/results?sections=channel_summary,model_stats" \
+  -H "Authorization: Bearer $API_KEY"
+```
+
 ## API Key Setup
 
-The MCP server authenticates with the same API keys used by the Simba REST API. You'll need an active Simba account — **[book a call](https://calendly.com/niall-oulton)** if you don't have one yet.
+The MCP server authenticates with the same API keys used by the Simba REST API. Create a key with the required scopes:
 
-Once you have access, create a key with the required scopes:
-
 1. Go to **Profile > API Keys** in the Simba UI
 2. Click **Create Key**
 3. Set scopes: `ingest`, `read:models`, `read:results`, `create:models`, `optimize`, `scenario`
@@ -180,11 +321,9 @@
 simba-mcp --transport sse --port 8100
 
 # Or via uvicorn directly
-uvicorn "simba_mcp.server:create_app()" --host 0.0.0.0 --port 8100
+uvicorn simba_mcp.server:app --host 0.0.0.0 --port 8100
 ```
 
 ## License
 
-MIT — see LICENSE for details.
-
-Built by SIMBA on PyMC-Marketing.
+MIT

diff --git a/pyproject.toml b/pyproject.toml
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,12 +1,12 @@
 [project]
 name = "simba-mcp"
-version = "0.1.1"
-description = "Model Context Protocol (MCP) server for SIMBA — connect Claude, Cursor, and AI assistants to Bayesian Marketing Mix Models (MMM)"
+version = "0.1.2"
+description = "MCP server for Simba Marketing Mix Modeling — connect AI assistants to your MMM models"
 readme = "README.md"
 license = "MIT"
 requires-python = ">=3.11"
-authors = [{ name = "SIMBA", email = "info@1749.io" }]
-keywords = ["mcp", "model-context-protocol", "marketing-mix-model", "mmm", "bayesian", "simba", "pymc", "budget-optimization", "media-attribution", "roi"]
+authors = [{ name = "Simba", email = "hello@simba-mmm.com" }]
+keywords = ["mcp", "marketing-mix-model", "mmm", "bayesian", "simba"]
 classifiers = [
     "Development Status :: 4 - Beta",
     "Intended Audience :: Developers",
@@ -23,9 +23,9 @@
 ]
 
 [project.urls]
-Homepage = "https://getsimba.ai"
-Repository = "https://github.com/getsimba-ai/simba-mcp"
-Documentation = "https://github.com/getsimba-ai/simba-mmm"
+Homepage = "https://simba-mmm.com"
+Repository = "https://github.com/getsimba-ai/mcp-server"
+Documentation = "https://docs.simba-mmm.com"
 
 [project.scripts]
 simba-mcp = "simba_mcp.main:main"

diff --git a/src/simba_mcp/__init__.py b/src/simba_mcp/__init__.py
--- a/src/simba_mcp/__init__.py
+++ b/src/simba_mcp/__init__.py
@@ -1,8 +1,8 @@
-"""SIMBA MCP Server — connect AI assistants to SIMBA MMM."""
+"""Simba MCP Server — connect AI assistants to Simba MMM."""
 
-__version__ = "0.1.0"
+__version__ = "0.1.2"
 
 from .api_client import SimbaAPIClient
-from .server import create_app, mcp
+from .server import mcp
 
-__all__ = ["SimbaAPIClient", "create_app", "mcp"]
+__all__ = ["SimbaAPIClient", "mcp"]

diff --git a/src/simba_mcp/server.py b/src/simba_mcp/server.py
--- a/src/simba_mcp/server.py
+++ b/src/simba_mcp/server.py
@@ -1,12 +1,12 @@
 """
-SIMBA MCP Server — exposes Simba's API v1 as MCP tools.
+Simba MCP Server — exposes Simba's API v1 as MCP tools.
 
 Tools allow AI assistants (Claude, Cursor, etc.) to interact with
 Marketing Mix Models: upload data, create models, check status,
 get results, and run budget optimizations.
 
 Run locally: simba-mcp
-Run remote: uvicorn "simba_mcp.server:create_app()" --host 0.0.0.0 --port 8100
+Run remote: uvicorn simba_mcp.server:app --host 0.0.0.0 --port 8100
 """
 
 import logging
@@ -97,6 +97,12 @@
     with date, KPI, multiplier, hierarchy, media activity/spend columns,
     and optional control variables.
 
+    IMPORTANT:
+    - CSV only (not Excel). Maximum file size: 50 MB.
+    - Minimum 52 rows required (104+ recommended for robust estimation).
+    - Media columns must follow naming: {channel}_activity and {channel}_spend.
+    - Use 0 for inactive periods, not blank or NA.
+
     Args:
         csv_content: The full CSV text content (not base64, just raw CSV text).
         name: Optional dataset name for identification.
@@ -123,6 +129,9 @@
     Returns model name, hash, status (pending/under way/complete/failed),
     type (mmm/var), hierarchy value, and timestamps.
 
+    NOTE: All other model endpoints use model_hash (string, e.g. "f835671a25")
+    as the identifier. Use the model_hash from this response.
+
     Args:
         include_unsaved: Include draft/unsaved models (default false).
         limit: Maximum number of models to return (default 50, max 500).
@@ -243,6 +252,11 @@
     params, optimizer, predictions, model_stats, decay_curves,
     response_curves, marginal_curves, actual_vs_model.
 
+    IMPORTANT: Channel names in results may contain spaces (e.g. "Digital impressions").
+    These exact names (case-sensitive, space-sensitive) must be used as dictionary keys
+    in run_optimizer bounds, laydown_weights, and period_cpm. Always check channel_summary
+    first to get the exact channel names before calling run_optimizer.
+
     Args:
         model_hash: The model hash.
         sections: Comma-separated list of sections to include.
@@ -277,12 +291,18 @@
     Finds the optimal budget allocation across channels to maximize
     predicted revenue within the given constraints.
 
-    IMPORTANT: Channel names in bounds, laydown_weights, and period_cpm must
-    match the media channel names from get_model_results (channel_summary section).
-    Use get_scenario_template to discover channel names and their average CPM values.
+    IMPORTANT:
+    - Channel names must exactly match model results (case-sensitive, space-sensitive).
+      Call get_model_results with sections="channel_summary" first to get exact names,
+      or use get_scenario_template to discover channel names and their average CPM values.
+    - bounds values are percentages of total_budget (0-100), not currency amounts.
+    - laydown_weights and period_cpm must be ARRAYS of length num_periods, not scalars.
+      Wrong: {"TV": 10}. Correct: {"TV": [10, 10, 10, 10]}.
+    - The same channel keys must appear in all three: bounds, laydown_weights, and period_cpm.
+    - All period_cpm values must be positive (> 0).
+    - laydown_weights per channel must sum to a positive value (weights are normalized internally).
 
-    All three dicts (bounds, laydown_weights, period_cpm) must have the same
-    set of channel keys, and array values must have length equal to num_periods.
+    Returns 202 (async). Use get_optimizer_results to poll until status is "complete".
 
     Args:
         model_hash: Hash of a completed model.
@@ -303,8 +323,6 @@
             of length num_periods with positive values. Get baseline CPM from
             get_scenario_template (avg_cpu_by_channel field).
             Example: {"TV_Impressions": [10.5, 10.5, 10.5, 10.5]}
-
-    Returns immediately with status. Use get_optimizer_results to poll for results.
     """
     payload = {
         "total_budget": total_budget,
@@ -363,6 +381,10 @@
     - Baseline activity values per channel (rows — use as starting point for scenarios)
     - Media vs control channel classification
 
+    WARNING: Template data may contain NaN or null values for channels without
+    historical data. You MUST replace NaN/null with 0 before passing to run_scenario,
+    otherwise the prediction will fail downstream.
+
     Args:
         model_hash: Hash of a completed model.
         periods_forward: Number of future periods to generate (default 12).
@@ -389,9 +411,13 @@
     predicts the KPI outcome. Use get_scenario_template first to get
     the expected format, channel names, and baseline values.
 
-    This is async — returns immediately with status "pending".
-    You MUST poll get_scenario_results afterwards to get the actual prediction output.
+    IMPORTANT: Before submitting, replace any NaN/null values in scenario_data with 0.
+    The template from get_scenario_template may contain NaN for channels without
+    historical data, which will cause the prediction to fail.
+
+    This is async (returns 202 with status "pending"). Poll get_scenario_results
+    until status is "complete" or "failed".
+
+    Workflow: get_scenario_template -> modify values -> run_scenario -> poll get_scenario_results
 
     Args:
@@ -432,6 +458,9 @@
     channel contributions, confidence intervals, and base components
     (intercept, seasonality, trend).
 
+    NOTE: Failed scenarios return status "failed" with an error message in the
+    JSON body (not an HTTP error). Always check the status field.
+
     Args:
         model_hash: Hash of the model the scenario was run on.
     """
@@ -443,7 +472,16 @@
 # ---------------------------------------------------------------------------
 
-def create_app():
+def _create_app():
     """Create the ASGI app for uvicorn/Streamable HTTP deployment."""
     mcp.settings.streamable_http_path = "/"
     return mcp.streamable_http_app()
+
+
+def __getattr__(name: str):
+    """Lazy module-level attribute access to avoid creating the HTTP app in stdio mode."""
+    if name == "app":
+        global app
+        app = _create_app()
+        return app
+    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")

</details>



<sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit fedb9b7579a26b6aa59694d5200ca102d32dc2a9. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>

Comment thread src/simba_mcp/server.py Outdated
…io mode

The HTTP app creation has been moved back to a lazy pattern using module-level
__getattr__. This ensures that:

- Importing the module in stdio mode (default) does not create the HTTP app
- The app is only created when explicitly accessed (e.g., by uvicorn)
- Test imports don't trigger unnecessary HTTP app instantiation
- mcp.settings are not mutated as a side effect of import

The previous eager initialization at module level caused overhead in the most
common code path (stdio transport) where the HTTP app is never used.
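The lazy pattern described in this commit message relies on module-level `__getattr__` (PEP 562). The self-contained sketch below shows the behavior on a throwaway module, under the assumption that a plain `object()` can stand in for the real ASGI app; `lazy_demo` and the `calls` counter are hypothetical names used only for the demonstration.

```python
import types

# Source for a throwaway module that mirrors the shape of the change in server.py.
SOURCE = '''
calls = 0

def _create_app():
    global calls
    calls += 1
    return object()  # stand-in for the real ASGI app

def __getattr__(name):
    if name == "app":
        global app
        app = _create_app()  # cached as a real module attribute from now on
        return app
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

demo = types.ModuleType("lazy_demo")
exec(SOURCE, demo.__dict__)

calls_after_import = demo.calls  # nothing has been built yet
first = demo.app                 # first access goes through __getattr__
second = demo.app                # later accesses find the cached attribute
```

Because `global app` stores the result in the module namespace, `__getattr__` runs only for the first lookup of `app`; importing the module alone never builds the app.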

@nialloulton merged commit 8334384 into main Apr 6, 2026
4 checks passed