From 2a50aac1c51c06b286da61e795755b2006a3fddf Mon Sep 17 00:00:00 2001
From: "Aivin V. Solatorio"
Date: Wed, 1 Apr 2026 13:54:31 -0400
Subject: [PATCH 1/5] feat(docs): GitHub Pages landing site and Pages deploy workflow

- Add static site under docs/ (index.html, CSS, .nojekyll)
- Rename docs/index.md to overview.md with site cross-link
- Add GitHub Actions workflow to deploy docs/ to Pages
- Document site URL, preview, and Actions setup in README

Signed-off-by: Aivin V. Solatorio
---
 .github/workflows/pages.yml    |  39 +++
 README.md                      |   6 +-
 docs/.nojekyll                 |   0
 docs/assets/css/main.css       | 611 +++++++++++++++++++++++++++++++++
 docs/index.html                | 537 +++++++++++++++++++++++++++++
 docs/{index.md => overview.md} |   2 +
 6 files changed, 1194 insertions(+), 1 deletion(-)
 create mode 100644 .github/workflows/pages.yml
 create mode 100644 docs/.nojekyll
 create mode 100644 docs/assets/css/main.css
 create mode 100644 docs/index.html
 rename docs/{index.md => overview.md} (93%)

diff --git a/.github/workflows/pages.yml b/.github/workflows/pages.yml
new file mode 100644
index 0000000..20e8532
--- /dev/null
+++ b/.github/workflows/pages.yml
@@ -0,0 +1,39 @@
+# Deploy static site from docs/ to GitHub Pages (configure repo: Settings → Pages → Source: GitHub Actions)
+name: Deploy GitHub Pages
+
+on:
+  push:
+    branches:
+      - main
+      - dev
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: false
+
+jobs:
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Configure Pages
+        uses: actions/configure-pages@v4
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: docs
+
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
diff --git a/README.md b/README.md
index 5d1dac9..da7405b 100644
--- a/README.md
+++ b/README.md
@@ -126,7 +126,11 @@
For chatbot integration, copy `data360://system-prompt` into your system prompt. ## Documentation -Full documentation is available at: **https://worldbank.github.io/data360-mcp** +**Project site:** [worldbank.github.io/data360-mcp](https://worldbank.github.io/data360-mcp) — landing page with features, tools, and connection details. + +A markdown overview lives in [docs/overview.md](docs/overview.md). The site is deployed with [GitHub Actions](.github/workflows/pages.yml) on pushes to `main` or `dev`. In the repository **Settings → Pages**, set **Build and deployment** source to **GitHub Actions** (first-time setup). + +**Preview locally:** from the repository root, run `python -m http.server --directory docs` and open `http://127.0.0.1:8000/`. For developer setup, testing, and contribution instructions, see [DEVELOPMENT.md](DEVELOPMENT.md). diff --git a/docs/.nojekyll b/docs/.nojekyll new file mode 100644 index 0000000..e69de29 diff --git a/docs/assets/css/main.css b/docs/assets/css/main.css new file mode 100644 index 0000000..d4e4940 --- /dev/null +++ b/docs/assets/css/main.css @@ -0,0 +1,611 @@ +/* Data360 MCP — project site (GitHub Pages) */ + +:root { + --wb-navy: #002b5c; + --wb-navy-deep: #001e3c; + --wb-navy-soft: #0a3d6e; + --wb-gold: #e8b54d; + --wb-gold-muted: #c9a227; + --wb-cream: #f5f2eb; + --wb-paper: #faf9f6; + --wb-ink: #1a1a1a; + --wb-ink-muted: #4a5568; + --wb-border: rgba(0, 43, 92, 0.12); + --wb-card-shadow: 0 4px 24px rgba(0, 30, 60, 0.08); + --font-sans: "Source Sans 3", system-ui, -apple-system, "Segoe UI", Roboto, + "Helvetica Neue", Arial, sans-serif; + --font-mono: ui-monospace, "Cascadia Code", "Source Code Pro", Menlo, monospace; + --space-xs: 0.5rem; + --space-sm: 1rem; + --space-md: 1.5rem; + --space-lg: 2.5rem; + --space-xl: 4rem; + --radius: 10px; + --max-width: 68rem; +} + +*, +*::before, +*::after { + box-sizing: border-box; +} + +html { + scroll-behavior: smooth; +} + +body { + margin: 0; + font-family: var(--font-sans); + 
font-size: 1.0625rem; + line-height: 1.6; + color: var(--wb-ink); + background: var(--wb-paper); +} + +.skip-link { + position: absolute; + left: -9999px; + z-index: 100; + padding: var(--space-sm) var(--space-md); + background: var(--wb-gold); + color: var(--wb-navy-deep); + font-weight: 600; + text-decoration: none; + border-radius: 0 0 var(--radius) 0; +} + +.skip-link:focus { + left: 0; + top: 0; +} + +/* Header */ +.site-header { + background: linear-gradient( + 165deg, + var(--wb-navy-deep) 0%, + var(--wb-navy) 45%, + var(--wb-navy-soft) 100% + ); + color: var(--wb-cream); + padding: var(--space-md) var(--space-md) var(--space-xl); + border-bottom: 3px solid var(--wb-gold); +} + +.site-header__inner { + max-width: var(--max-width); + margin: 0 auto; +} + +.site-nav { + display: flex; + flex-wrap: wrap; + align-items: center; + justify-content: space-between; + gap: var(--space-sm); + margin-bottom: var(--space-lg); + font-size: 0.9375rem; +} + +.site-brand { + font-weight: 700; + font-size: 1.125rem; + letter-spacing: -0.02em; + color: var(--wb-cream); + text-decoration: none; +} + +.site-brand:hover, +.site-brand:focus-visible { + color: var(--wb-gold); +} + +.site-nav__links { + display: flex; + flex-wrap: wrap; + gap: var(--space-sm) var(--space-md); + list-style: none; + margin: 0; + padding: 0; +} + +.site-nav__links a { + color: rgba(245, 242, 235, 0.92); + text-decoration: none; + border-bottom: 1px solid transparent; +} + +.site-nav__links a:hover, +.site-nav__links a:focus-visible { + color: var(--wb-gold); + border-bottom-color: var(--wb-gold); +} + +.hero { + max-width: 42rem; +} + +.hero__eyebrow { + display: inline-block; + font-size: 0.8125rem; + font-weight: 600; + text-transform: uppercase; + letter-spacing: 0.12em; + color: var(--wb-gold); + margin-bottom: var(--space-xs); +} + +.hero h1 { + margin: 0 0 var(--space-sm); + font-size: clamp(1.85rem, 4vw, 2.5rem); + font-weight: 700; + line-height: 1.15; + letter-spacing: -0.03em; +} + 
+.hero__lead { + margin: 0 0 var(--space-md); + font-size: 1.125rem; + line-height: 1.65; + color: rgba(245, 242, 235, 0.88); +} + +.hero__lead a { + color: #f0c96a; + text-decoration: underline; + text-underline-offset: 2px; +} + +.hero__lead a:hover { + color: #fff; +} + +.hero__audience { + font-size: 1rem; + opacity: 0.92; +} + +.hero__actions { + display: flex; + flex-wrap: wrap; + gap: var(--space-sm); +} + +.hero__sub { + margin: 0 0 var(--space-md); + font-size: 0.9375rem; + line-height: 1.55; + color: rgba(245, 242, 235, 0.75); +} + +.hero__sub a { + color: #f0c96a; +} + +/* User-first sections (light band) */ +.section--user { + background: linear-gradient(180deg, #fff 0%, var(--wb-paper) 100%); + margin-left: calc(-1 * var(--space-md)); + margin-right: calc(-1 * var(--space-md)); + padding-left: var(--space-md); + padding-right: var(--space-md); + padding-top: var(--space-xl); + padding-bottom: var(--space-xl); + border-bottom: 1px solid var(--wb-border); +} + +.section--user .section { + margin-bottom: var(--space-xl); +} + +.section--user .section:last-of-type { + margin-bottom: 0; +} + +.outcome-grid { + display: grid; + grid-template-columns: repeat(auto-fill, minmax(15rem, 1fr)); + gap: var(--space-md); + margin: 0; + padding: 0; + list-style: none; +} + +.outcome-card { + background: #fff; + border: 1px solid var(--wb-border); + border-radius: var(--radius); + padding: var(--space-md); + box-shadow: var(--wb-card-shadow); + border-top: 3px solid var(--wb-gold); +} + +.outcome-card h3 { + margin: 0 0 var(--space-xs); + font-size: 1.0625rem; + font-weight: 700; + color: var(--wb-navy); +} + +.outcome-card p { + margin: 0; + font-size: 0.9375rem; + color: var(--wb-ink-muted); +} + +.question-list { + margin: 0; + padding: 0; + list-style: none; + display: flex; + flex-direction: column; + gap: var(--space-sm); +} + +.question-list li { + margin: 0; + padding: var(--space-md); + background: #fff; + border: 1px solid var(--wb-border); + border-radius: 
var(--radius); + border-left: 4px solid var(--wb-navy); + font-size: 1rem; + line-height: 1.5; + color: var(--wb-ink); + box-shadow: var(--wb-card-shadow); +} + +.question-list .question__note { + display: block; + margin-top: var(--space-xs); + font-size: 0.875rem; + color: var(--wb-ink-muted); +} + +.viz-panel { + background: #fff; + border: 1px solid var(--wb-border); + border-radius: var(--radius); + padding: var(--space-md); + box-shadow: var(--wb-card-shadow); +} + +.viz-panel ul { + margin: 0 0 var(--space-md); + padding-left: 1.25rem; + color: var(--wb-ink-muted); + font-size: 0.9375rem; +} + +.viz-panel ul:last-child { + margin-bottom: 0; +} + +.viz-panel__lead { + margin: 0 0 var(--space-sm); + color: var(--wb-ink); +} + +.viz-panel__footnote { + margin: 0; + font-size: 0.9375rem; + color: var(--wb-ink-muted); +} + +.viz-panel__tools { + margin-top: var(--space-md); + padding-top: var(--space-md); + border-top: 1px solid var(--wb-border); + font-size: 0.875rem; + color: var(--wb-ink-muted); +} + +.viz-panel__tools code { + font-family: var(--font-mono); + font-size: 0.8125rem; + background: rgba(0, 43, 92, 0.06); + padding: 0.1rem 0.35rem; + border-radius: 4px; +} + +/* Technical reference band */ +.section--technical-intro { + margin-top: var(--space-xl); + padding-top: var(--space-md); + border-top: 2px solid var(--wb-navy); +} + +.section__title--technical { + font-size: 1.35rem; + color: var(--wb-navy-deep); +} + +.section__technical-lede { + margin: 0 0 var(--space-lg); + color: var(--wb-ink-muted); + max-width: 40rem; + font-size: 0.9375rem; +} + +/* Buttons */ +.btn { + display: inline-flex; + align-items: center; + justify-content: center; + padding: 0.65rem 1.25rem; + font-family: inherit; + font-size: 0.9375rem; + font-weight: 600; + text-decoration: none; + border-radius: var(--radius); + border: 2px solid transparent; + cursor: pointer; + transition: + background 0.15s ease, + color 0.15s ease, + border-color 0.15s ease; +} + 
+.btn:focus-visible { + outline: 2px solid var(--wb-gold); + outline-offset: 3px; +} + +.btn--primary { + background: var(--wb-gold); + color: var(--wb-navy-deep); +} + +.btn--primary:hover { + background: #f0c96a; +} + +.btn--ghost { + background: transparent; + color: var(--wb-cream); + border-color: rgba(245, 242, 235, 0.45); +} + +.btn--ghost:hover { + border-color: var(--wb-gold); + color: var(--wb-gold); +} + +/* Sections */ +main { + padding: var(--space-xl) var(--space-md); +} + +.section { + max-width: var(--max-width); + margin: 0 auto var(--space-xl); +} + +.section:last-child { + margin-bottom: 0; +} + +.section__title { + margin: 0 0 var(--space-sm); + font-size: 1.5rem; + font-weight: 700; + color: var(--wb-navy-deep); + letter-spacing: -0.02em; +} + +.section__intro { + margin: 0 0 var(--space-md); + color: var(--wb-ink-muted); + max-width: 48rem; +} + +.section__title--sub { + margin-top: var(--space-lg); + font-size: 1.25rem; +} + +.section__title--compact { + margin-top: var(--space-md); + font-size: 1.15rem; +} + +.section__intro--after-table { + margin-top: var(--space-md); +} + +/* Feature grid */ +.feature-grid { + display: grid; + grid-template-columns: repeat(auto-fill, minmax(16rem, 1fr)); + gap: var(--space-md); + margin: 0; + padding: 0; + list-style: none; +} + +.feature-card { + background: #fff; + border: 1px solid var(--wb-border); + border-radius: var(--radius); + padding: var(--space-md); + box-shadow: var(--wb-card-shadow); +} + +.feature-card h3 { + margin: 0 0 var(--space-xs); + font-size: 1.0625rem; + font-weight: 700; + color: var(--wb-navy); +} + +.feature-card p { + margin: 0; + font-size: 0.9375rem; + color: var(--wb-ink-muted); +} + +/* Tables */ +.table-wrap { + overflow-x: auto; + border-radius: var(--radius); + border: 1px solid var(--wb-border); + background: #fff; + box-shadow: var(--wb-card-shadow); +} + +.table-wrap--flush { + border: none; + box-shadow: none; +} + +table { + width: 100%; + border-collapse: collapse; 
+ font-size: 0.9375rem; +} + +th, +td { + padding: 0.75rem 1rem; + text-align: left; + border-bottom: 1px solid var(--wb-border); +} + +th { + background: var(--wb-cream); + font-weight: 600; + color: var(--wb-navy-deep); + white-space: nowrap; +} + +td code { + font-family: var(--font-mono); + font-size: 0.875rem; + background: rgba(0, 43, 92, 0.06); + padding: 0.15rem 0.4rem; + border-radius: 4px; +} + +tr:last-child td { + border-bottom: none; +} + +/* Code block */ +.code-block { + margin: 0; + padding: var(--space-md); + background: var(--wb-navy-deep); + color: #e2e8f0; + border-radius: var(--radius); + font-family: var(--font-mono); + font-size: 0.8125rem; + line-height: 1.55; + overflow-x: auto; + border: 1px solid rgba(232, 181, 77, 0.2); +} + +.code-block .comment { + color: #94a3b8; +} + +/* Connect panel */ +.connect-grid { + display: grid; + gap: var(--space-md); +} + +@media (min-width: 640px) { + .connect-grid { + grid-template-columns: 1fr 1fr; + } +} + +.panel { + background: #fff; + border: 1px solid var(--wb-border); + border-radius: var(--radius); + padding: var(--space-md); + box-shadow: var(--wb-card-shadow); +} + +.panel h3 { + margin: 0 0 var(--space-sm); + font-size: 1.0625rem; + color: var(--wb-navy); +} + +.panel__text { + margin: 0 0 var(--space-sm); + font-size: 0.9375rem; + color: var(--wb-ink-muted); +} + +.panel__list { + margin: 0; + padding-left: 1.25rem; + font-size: 0.9375rem; +} + +.panel table { + font-size: 0.875rem; +} + +.panel th, +.panel td { + padding: 0.5rem 0.75rem; +} + +.workflow-box { + margin: 0; + padding: var(--space-md); + background: var(--wb-cream); + border-radius: var(--radius); + font-family: var(--font-mono); + font-size: 0.8125rem; + line-height: 1.55; + white-space: pre-wrap; + color: var(--wb-navy-deep); + border: 1px solid var(--wb-border); +} + +/* Footer */ +.site-footer { + margin-top: var(--space-xl); + padding: var(--space-lg) var(--space-md); + background: var(--wb-navy-deep); + color: rgba(245, 
242, 235, 0.85); + font-size: 0.9375rem; +} + +.site-footer__inner { + max-width: var(--max-width); + margin: 0 auto; +} + +.site-footer a { + color: var(--wb-gold); + text-decoration: underline; + text-underline-offset: 2px; +} + +.site-footer a:hover { + color: #f0c96a; +} + +.site-footer p { + margin: 0 0 var(--space-sm); +} + +.site-footer__meta { + margin-top: var(--space-md); + padding-top: var(--space-md); + border-top: 1px solid rgba(245, 242, 235, 0.15); + font-size: 0.875rem; + color: rgba(245, 242, 235, 0.65); +} + +.site-footer__built { + margin-top: var(--space-sm); + font-size: 0.8125rem; +} diff --git a/docs/index.html b/docs/index.html new file mode 100644 index 0000000..5e2df48 --- /dev/null +++ b/docs/index.html @@ -0,0 +1,537 @@ + + + + + + + Data360 MCP Server · World Bank + + + + + + + + + + +
+
+
+

What this enables for you

+

+ Whether you are prototyping a policy assistant or embedding data in + an internal tool, the server is designed so the model can + look up facts instead of guessing numbers or mixing + units. +

+
    +
  • +

    Country and regional insight

    +

    + Compare economies, track SDG-related series, and see whether an + indicator actually covers the places you care about. +

    +
  • +
  • +

    Trends over time

    +

    + Request years or periods you need—then reason over consistent + time series instead of a vague “recent” snapshot. +

    +
  • +
  • +

    Definitions you can cite

    +

    + Pull methodology, limitations, and field meanings so answers can + explain what the number measures—not only its value. +

    +
  • +
  • +

    Charts when it helps

    +

    + Turn suitable series into visualization specs (for example + Vega-Lite) so assistants can show trends, not only tables. +

    +
  • +
+
+ +
+

+ Questions your assistant can support +

+

+ The tools are built for natural, analyst-style questions. Your agent + resolves countries to standard codes, picks indicators, applies + filters (sex, age, urbanization, etc.), and checks coverage before + answering. +

+
    +
  • + How has GDP per capita changed in Kenya over the last two decades? + Search, confirm coverage, then return a time series. +
  • +
  • + Which indicators in WDI relate to female labor force participation, + and do they include Bangladesh? + Discovery plus metadata and disaggregation checks. +
  • +
  • + What is the official definition and source for this indicator—and + what footnotes should I mention? + Metadata retrieval for transparency and reporting. +
  • +
  • + Plot a line chart comparing access to electricity in three + countries since 2010. + Data first, then visualization specs when the series fits. +
  • +
+
+ +
+

Charts and visualizations

+

+ Beyond tables, the server can help agents produce chart-ready output + for time series and comparisons—so “show me the trend” becomes a + structured spec your client can render. +

+
+

+ Typical flow: fetch data with clear columns (for + example time period and observed value), then ask for a + visualization specification. Chart type hints such as line, bar, + or scatter are mapped to supported views; the agent can discover + valid options and constraints via dedicated tools. +

+
    +
  • + Line and area — trends over years or periods +
  • +
  • + Bar — discrete periods or categories when + appropriate +
  • +
  • + Scatter / point — relationships when the data + supports it +
  • +
+

+ Specs are produced in a standard chart grammar (Vega-Lite) so + compatible hosts can render them consistently. If you operate a + separate chart rendering service, you can point the server at it + with environment configuration for integration. +

+

+ Related tools: + data360_get_viz_spec, + data360_get_supported_chart_types +

+
+
+
+ +
+

+ Technical reference +

+

+ The sections below are for developers wiring an MCP + client, IDE, or custom agent: capabilities summary, tool and resource + tables, install steps, and connection URLs. +

+
+ +
+

Server capabilities

+

+ Implementation details behind the user-facing outcomes above—search, + metadata, series retrieval, codes, and LLM-oriented resources. +

+
    +
  • +

    Smart indicator discovery

    +

    + Search hundreds of indicators with enriched metadata and optional + country coverage checks. +

    +
  • +
  • +

    Rich metadata

    +

    + Methodology, definitions, limitations, and statistical concepts + on demand. +

    +
  • +
  • +

    Reliable time series

    +

    + Historical data with filters for country, period, sex, age, + urbanization, and more. +

    +
  • +
  • +

    LLM-oriented resources

    +

    + System prompts, codelists, and chain-of-thought guidance for + chatbot integration. +

    +
  • +
  • +

    Agent-friendly design

    +

    + Composable tools and guardrails that reduce common data retrieval + mistakes. +

    +
  • +
+
+ +
+

MCP tools

+

+ Use these tools in sequence: search, then fetch data with the right + codes and filters. +

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Tool | Description |
|------|-------------|
| data360_search_indicators | Search with enriched metadata; use required_country for coverage. Returns covers_country, latest_data, dimensions. |
| data360_get_data | Fetch data points with filters (country, time period, SEX, AGE, etc.). |
| data360_get_metadata | Indicator metadata; use select_fields to limit payload. |
| data360_get_disaggregation | Available filter values (countries, years, dimensions) per indicator. |
| data360_find_codelist_value | Resolve names to codes (e.g. “Kenya” → KEN, “female” → F). |
| data360_list_indicators | List all indicators for a database. |
| data360_get_viz_spec | Build a chart specification from retrieved data (e.g. Vega-Lite). |
| data360_get_supported_chart_types | List supported chart types and data requirements. |
| data360_get_data_api_url | Low-level: direct Data360 data API URL helper. |
+
+ +

+ Recommended agent workflow +

+
1. Search → data360_search_indicators(query, required_country="Kenya")
+             Returns: covers_country, latest_data, dimensions per indicator
+
+2. Get data → data360_get_data(database_id, indicator_id, filters)
+             Use REF_AREA from search; add time period filters
+
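The two-step workflow above can be sketched in Python as follows. This is only an illustration under stated assumptions: `call_tool` stands in for whatever MCP client call your host provides, and the result shape (a list of dicts carrying `covers_country`, `database_id`, and `indicator_id`) and the `filters` dict layout are guesses for the example, not a documented contract.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> tool result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def fetch_series(call_tool: ToolCaller, query: str, country_name: str, country_code: str) -> Any:
    """Search first, then fetch data only for an indicator that covers the country."""
    hits = call_tool(
        "data360_search_indicators",
        {"query": query, "required_country": country_name},
    )
    # Assumed result shape: a list of dicts with a covers_country flag.
    covered = [hit for hit in hits if hit.get("covers_country")]
    if not covered:
        # Nothing covers the country, so skip the data call entirely.
        return None
    best = covered[0]
    return call_tool(
        "data360_get_data",
        {
            "database_id": best["database_id"],
            "indicator_id": best["indicator_id"],
            # REF_AREA is the country dimension named in the workflow above.
            "filters": {"REF_AREA": country_code},
        },
    )
```

Passing the caller in as a parameter keeps the sketch independent of any particular MCP client library and makes the ordering easy to exercise with a stub.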
+ +
+

MCP resources

+

+ Subscribe to these URIs for prompts and reference data. For chatbots, + include data360://system-prompt in your system prompt. +

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Resource | Description |
|----------|-------------|
| data360://system-prompt | Chain-of-thought guidance for chatbot integration |
| data360://databases | Available databases (WB_WDI, WB_SSGD, …) |
| data360://codelists | Codelist reference (REF_AREA, SEX, AGE, …) |
| data360://metadata-fields | Field mapping for smart question routing |
| data360://data-filters | Available filters and usage guidance |
| data360://search-usage | Search examples and best practices |
+
+
+ +
+

Get started

+

+ Requires Python 3.11+ and + uv (recommended) or + pip. Copy .env.example to .env + and adjust API URL or port as needed. +

+ +

Install & run

+
# Clone and install (uv)
+git clone https://github.com/worldbank/data360-mcp.git
+cd data360-mcp
+uv sync
+cp .env.example .env
+uv run poe serve
+# Server: http://localhost:8000/mcp
+ +

Configuration

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Variable | Description | Default |
|----------|-------------|---------|
| DATA360_API_BASE_URL | Data360 API base URL | https://data360api.worldbank.org |
| MCP_PORT | MCP server port | 8000 |
| MCP_TRANSPORT | http or sse | http |
| MCP_CHARTS_API_URL | Optional chart rendering API URL | |
+
+ +

+ Try the bundled demo: + uv run scripts/llm_mcp_demo.py (use + DEBUG=true for verbose logs). +

+
+ +
+

Connect your agent

+

+ Point your MCP client at the local HTTP or SSE endpoint. Use + host.docker.internal instead of + localhost from containers. +

+
+
+

Endpoints

+
+ + + + + + + + + + + + + + + +
| Transport | http (default) or sse |
| URL (HTTP) | http://localhost:8000/mcp |
| URL (SSE) | http://localhost:8021/sse (example port) |
+
+
+
+

Databases (examples)

+

+ All Data360 databases are supported; list indicators with + data360_list_indicators. +

+
    +
  • WB_WDI — World Development Indicators
  • +
  • WB_SSGD — Social Sustainability Global Database
  • +
+
+
+
+
+ + + + diff --git a/docs/index.md b/docs/overview.md similarity index 93% rename from docs/index.md rename to docs/overview.md index d5680f2..eb1281c 100644 --- a/docs/index.md +++ b/docs/overview.md @@ -1,5 +1,7 @@ # Data360 MCP Server +> **Project site:** The marketing landing page for this repository is published at **[https://worldbank.github.io/data360-mcp](https://worldbank.github.io/data360-mcp)** (`docs/index.html`). This file is the markdown overview for readers browsing the repo on GitHub. + A [Model Context Protocol (MCP)](https://modelcontextprotocol.io) server that gives LLM agents direct access to the World Bank's [Data360 Platform](https://data360.worldbank.org/). ## Overview From 1b24260c83d22349969579ba4097b3c7c9cc49b1 Mon Sep 17 00:00:00 2001 From: "Aivin V. Solatorio" Date: Wed, 1 Apr 2026 13:56:58 -0400 Subject: [PATCH 2/5] Update ref branch Signed-off-by: Aivin V. Solatorio --- .github/workflows/pages.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/pages.yml b/.github/workflows/pages.yml index 20e8532..b065927 100644 --- a/.github/workflows/pages.yml +++ b/.github/workflows/pages.yml @@ -6,6 +6,7 @@ on: branches: - main - dev + - feat/landing-page workflow_dispatch: permissions: From fcf649ea175ee4268d5686312cfbb1c5ee60e086 Mon Sep 17 00:00:00 2001 From: "Aivin V. 
Solatorio"
Date: Wed, 1 Apr 2026 22:38:17 -0400
Subject: [PATCH 3/5] tasks: add MCP tasks from PR review (baseline eval)

Co-Authored-By: Claude Sonnet 4.6
---
 TODO/CROSS-REPO-GRAPH.md               |  3 ++
 TODO/MCP-001-system-prompt-sync.md     | 31 +++++++++++++++++
 TODO/MCP-002-vega-line-tooltips.md     | 33 ++++++++++++++++++
 TODO/MCP-003-beeswarm-multi-country.md | 32 ++++++++++++++++++
 TODO/MCP-004-viz-empty-data-guard.md   | 47 ++++++++++++++++++++++++++
 TODO/MCP-005-cache-configurable-ttl.md | 44 ++++++++++++++++++++++++
 TODO/README.md                         | 41 ++++++++++++++++++++++
 TODO/tasks.json                        |  8 +++++
 8 files changed, 239 insertions(+)
 create mode 100644 TODO/CROSS-REPO-GRAPH.md
 create mode 100644 TODO/MCP-001-system-prompt-sync.md
 create mode 100644 TODO/MCP-002-vega-line-tooltips.md
 create mode 100644 TODO/MCP-003-beeswarm-multi-country.md
 create mode 100644 TODO/MCP-004-viz-empty-data-guard.md
 create mode 100644 TODO/MCP-005-cache-configurable-ttl.md
 create mode 100644 TODO/README.md
 create mode 100644 TODO/tasks.json

diff --git a/TODO/CROSS-REPO-GRAPH.md b/TODO/CROSS-REPO-GRAPH.md
new file mode 100644
index 0000000..2efd32b
--- /dev/null
+++ b/TODO/CROSS-REPO-GRAPH.md
@@ -0,0 +1,3 @@
+# Cross-repo task dependencies
+
+See the canonical graph in **[data-ai-chatbot/TODO/CROSS-REPO-GRAPH.md](../../data-ai-chatbot/TODO/CROSS-REPO-GRAPH.md)**.
diff --git a/TODO/MCP-001-system-prompt-sync.md b/TODO/MCP-001-system-prompt-sync.md
new file mode 100644
index 0000000..880171c
--- /dev/null
+++ b/TODO/MCP-001-system-prompt-sync.md
@@ -0,0 +1,31 @@
+---
+id: MCP-001
+repo: data360-mcp
+title: SYSTEM_PROMPT — align with chatbot Writer/Planner rules
+status: pending
+depends_on:
+  - BE-001
+blocks: []
+external_ref: vercel-ai-chatbot/TODO/BE-001-writer-planner-prompts.md
+---
+
+# MCP-001 — Sync `data360://system-prompt` with chatbot prompts
+
+## Goal
+
+Reduce drift between MCP-only LLM clients and the Data AI Chatbot: update `SYSTEM_PROMPT` in `src/data360/mcp_server/prompts.py` so tool-loop behavior and **final answer structure / link expectations** stay consistent with `vercel-ai-chatbot` `backend/app/ai/prompts.py` after **BE-001**.
+
+## Context
+
+- Resource registration: `src/data360/mcp_server/resources.py` exposes `data360://system-prompt`.
+- Do not duplicate entire Writer prompt if inappropriate for MCP; align **non-conflicting** bullets: section order hints, link discipline, when to call `data360_get_viz_spec`.
+
+## Acceptance criteria
+
+- [ ] `SYSTEM_PROMPT` updated to reflect agreed structure and link/viz nudges (scoped to MCP tool use).
+- [ ] README/docs that quote the resource updated if needed (`README.md`, `docs/overview.md`).
+- [ ] No contradiction with existing mandatory tool-call rules in the same file.
+
+## Dependencies
+
+- **BE-001** (vercel-ai-chatbot) — source of truth for product wording; complete or sync in pair review.
diff --git a/TODO/MCP-002-vega-line-tooltips.md b/TODO/MCP-002-vega-line-tooltips.md
new file mode 100644
index 0000000..a22777f
--- /dev/null
+++ b/TODO/MCP-002-vega-line-tooltips.md
@@ -0,0 +1,33 @@
+---
+id: MCP-002
+repo: data360-mcp
+title: Vega-Lite — default tooltips for line/area charts
+status: pending
+depends_on: []
+blocks: []
+---
+
+# MCP-002 — Default chart tooltips
+
+## Goal
+
+Line (and similar) charts emitted by `get_viz_spec` show hover tooltips by default for readability in embedded previews (e.g. chat `ChartPreview`).
+
+## Context
+
+- Implementation likely in `src/data360/visualization.py` and/or `src/data360/viz_config.py` where Vega-Lite spec is built.
+- Frontend: `vercel-ai-chatbot` renders spec via vega-embed in `frontend/components/data360/chart-preview.tsx` — no change required if spec includes tooltips.
+
+## Acceptance criteria
+
+- [ ] Line/area (and other relevant marks) include `tooltip` encoding or mark-level tooltip defaults.
+- [ ] Tests in `tests/test_visualization.py` updated or added to assert tooltip presence in spec JSON.
+- [ ] Performance/tooltip density acceptable for many points (coordinate with MCP-003).
+
+## Dependencies
+
+- None.
+
+## Related
+
+- **MCP-003** often touches the same spec-building code; coordinate or sequence to reduce merge conflicts (soft dependency only).
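The mark-level option from the MCP-002 acceptance criteria could look roughly like the sketch below, assuming the spec is handled as a plain Vega-Lite dict. `with_default_tooltip` and `TOOLTIP_MARKS` are hypothetical names for illustration and do not exist in `src/data360/visualization.py`; the field names in the usage line are likewise assumed.

```python
import copy

# Hypothetical helper for illustration; not part of data360-mcp.
TOOLTIP_MARKS = {"line", "area", "point", "bar"}


def with_default_tooltip(spec: dict) -> dict:
    """Return a copy of a Vega-Lite spec with mark-level tooltips enabled."""
    out = copy.deepcopy(spec)
    mark = out.get("mark")
    # Vega-Lite accepts either a bare mark name or a mark definition object.
    if isinstance(mark, str):
        mark = {"type": mark}
    if isinstance(mark, dict) and mark.get("type") in TOOLTIP_MARKS:
        # Keep an explicit tooltip setting if the caller already chose one.
        mark.setdefault("tooltip", True)
        out["mark"] = mark
    return out


# Assumed field names, used only to make the example spec look realistic.
line_spec = {
    "mark": "line",
    "encoding": {"x": {"field": "TIME_PERIOD"}, "y": {"field": "OBS_VALUE"}},
}
print(with_default_tooltip(line_spec)["mark"])
```

Setting the default on the mark rather than adding a `tooltip` encoding channel leaves existing encodings untouched, which may make it easier to keep MCP-003's alternate marks compatible.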
diff --git a/TODO/MCP-003-beeswarm-multi-country.md b/TODO/MCP-003-beeswarm-multi-country.md
new file mode 100644
index 0000000..b6da28f
--- /dev/null
+++ b/TODO/MCP-003-beeswarm-multi-country.md
@@ -0,0 +1,32 @@
+---
+id: MCP-003
+repo: data360-mcp
+title: Many countries — beeswarm/strip or facet instead of clutter
+status: pending
+depends_on: []
+blocks: []
+soft_depends_on:
+  - MCP-002
+---
+
+# MCP-003 — High cardinality series (multi-country) visualization
+
+## Goal
+
+When many countries/series make line charts unreadable or data is truncated, prefer a strip/beeswarm-style layout, faceting, or another documented strategy in the viz pipeline.
+
+## Context
+
+- `src/data360/viz_config.py`, `src/data360/visualization.py`.
+- May require `data360_get_supported_chart_types` updates if new chart type names are exposed.
+- Product threshold (country count) may come from design — document default in code comments.
+
+## Acceptance criteria
+
+- [ ] Heuristic selects alternate mark/layout above a configurable threshold.
+- [ ] Unit tests cover threshold boundary and spec shape.
+- [ ] Planner/chatbot prompts (BE-001) mention behavior if user-facing copy needed — optional follow-up.
+
+## Dependencies
+
+- **MCP-002** (soft): tooltip behavior should remain sane on the chosen mark.
diff --git a/TODO/MCP-004-viz-empty-data-guard.md b/TODO/MCP-004-viz-empty-data-guard.md
new file mode 100644
index 0000000..8402c26
--- /dev/null
+++ b/TODO/MCP-004-viz-empty-data-guard.md
@@ -0,0 +1,47 @@
+---
+id: MCP-004
+repo: data360-mcp
+title: viz — guard against empty data in generate_vega_spec()
+status: pending
+depends_on: []
+blocks: []
+source: pr-review
+file_ref: src/data360/mcp_server/viz.py
+line_ref: 142
+---
+
+# MCP-004 — Handle empty `data` list in `generate_vega_spec()`
+
+## Goal
+
+Prevent `generate_vega_spec()` from crashing with a `KeyError` when `data` is an empty list. Return a user-friendly error message or a graceful empty chart spec instead.
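A rough sketch of the guard described in this goal, assuming rows arrive as a list of dicts and that failures are reported as a dict with `url=None` and an `error` message. `build_spec_or_error`, `EMPTY_DATA_MESSAGE`, and the return shape are illustrative assumptions, not the project's actual `generate_vega_spec()`.

```python
from typing import Any

EMPTY_DATA_MESSAGE = "Error: No data available after applying the requested filters."


def build_spec_or_error(data: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical stand-in for the spec builder; only the guard is the point."""
    # Check before any first-row access: data[0] raises IndexError on an empty list.
    if not data:
        return {"url": None, "spec": None, "error": EMPTY_DATA_MESSAGE}
    # Placeholder spec body; a real builder would choose marks and encodings.
    spec = {"mark": "line", "data": {"values": data}}
    return {"url": None, "spec": spec, "error": None}


print(build_spec_or_error([])["error"])
```

Returning an error dict rather than raising keeps the failure visible to the calling agent as ordinary tool output; an empty-chart spec would be the alternative convention the task mentions.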
+
+## Context
+
+- PR review flagged that `data[0]['date']` raises `KeyError` when `data=[]` (on a plain Python list the first-row access raises `IndexError`; an empty DataFrame is what produces `KeyError`).
+- The referenced file path in the review is `src/data360/mcp_server/viz.py` (line 142); the actual implementation lives in `src/data360/visualization.py` and/or `src/data360/viz_config.py` — confirm exact location before patching.
+- Empty data can legitimately occur when an API query returns no results for a given filter combination; callers should receive a clear signal rather than an unhandled exception.
+
+## Acceptance criteria
+
+- [ ] `generate_vega_spec()` (or equivalent entry point) checks for an empty `data` list before accessing `data[0]`.
+- [ ] Returns either a well-formed error dict/message or a minimal empty-chart Vega-Lite spec (document the chosen convention in a code comment).
+- [ ] Unit test added: `test_generate_vega_spec_empty_data` asserts no exception is raised and the return value is well-formed.
+- [ ] No regression on existing non-empty data tests.
+
+## Implementation hints
+
+- **Entry point:** `src/data360/visualization.py` → `get_viz_spec()` (function starting ~line 183). The `_fetch_data_internal()` helper at line ~109 already raises `ValueError("No data found...")` for empty API responses, and `get_viz_spec()` catches that at line ~243. However, if `data` reaches downstream spec-building steps (e.g. after filtering in the `relevant_fields` branch), an empty DataFrame/list can still cause `KeyError` or `IndexError` on first-row access.
+- **Current behavior:** When `data` is empty after filtering, the code may attempt `data[0]['date']` (or an equivalent first-row access) and raise `KeyError` uncaught.
+- **Desired behavior:** Add a guard immediately after any potential reduction to empty (e.g. after `viz_data = data[valid_cols].copy()`) that returns `err("Error: No data available after applying the requested filters.")`. The existing check at line ~439 (`if viz_data.empty: return err(...)`) provides the pattern to follow.
+- **Test file:** `tests/test_visualization.py` — add a test case in the existing `TestGetVizSpecDracoFallbackWarning` class or a new class. Mock `_fetch_data_internal` to return a DataFrame that becomes empty after field filtering, and assert the returned dict has `url=None` and a non-None `error`.
+- **Prior art:** `_fetch_data_internal()` at line ~109 already checks `if not raw_data: raise ValueError(...)` — follow the same early-exit pattern. The `if viz_data.empty: return err(...)` guard at line ~439 shows how to do it after cleaning.
+- **Gotchas:** The file path in the original PR review (`src/data360/mcp_server/viz.py`) does not exist yet. If a separate `viz.py` module is created as part of a refactor, place the guard there; otherwise add it to `src/data360/visualization.py`.
+
+## Dependencies
+
+- None.
+
+## Related
+
+- **MCP-002** / **MCP-003** — touch the same spec-building pipeline; sequence or coordinate to avoid merge conflicts.
diff --git a/TODO/MCP-005-cache-configurable-ttl.md b/TODO/MCP-005-cache-configurable-ttl.md
new file mode 100644
index 0000000..6a6fba7
--- /dev/null
+++ b/TODO/MCP-005-cache-configurable-ttl.md
@@ -0,0 +1,44 @@
+---
+id: MCP-005
+repo: data360-mcp
+title: cache — make TTL configurable via MCP_CACHE_TTL env var
+status: pending
+depends_on: []
+blocks: []
+source: pr-review
+file_ref: src/data360/mcp_server/cache.py
+line_ref: 23
+---
+
+# MCP-005 — Configurable cache TTL via `MCP_CACHE_TTL` environment variable
+
+## Goal
+
+Replace the hardcoded 300-second TTL in the cache module with a value read from the `MCP_CACHE_TTL` environment variable, keeping 300 as the default.
+
+## Context
+
+- PR review flagged a hardcoded `300` (seconds) in `src/data360/mcp_server/cache.py` line 23.
+- The file `src/data360/mcp_server/cache.py` does not currently exist in the repo; the cache logic may live elsewhere (e.g. `src/data360/config.py`). Locate the hardcoded value before patching.
+- Making the TTL configurable allows operators to tune cache aggressiveness in different deployment environments without code changes. + +## Acceptance criteria + +- [ ] Cache TTL is read from `os.environ.get("MCP_CACHE_TTL", 300)` (or equivalent via `src/data360/config.py` if that module centralises env vars). +- [ ] Value is cast to `int` with a clear `ValueError` or fallback if the env var is set to a non-integer. +- [ ] Default behaviour (TTL = 300 s) is unchanged when the env var is unset. +- [ ] `MCP_CACHE_TTL` documented in `README.md` (or equivalent config docs) under environment variables. +- [ ] Unit test added verifying TTL is picked up from the environment variable. + +## Implementation hints + +- **Entry point:** The file `src/data360/mcp_server/cache.py` (line 23 per the PR review) does not yet exist in the repo. The likely location for the hardcoded TTL is wherever a TTL-based caching layer is added. The existing settings system in `src/data360/config.py` uses `pydantic-settings` with `env_prefix="MCP_"` (`MCPServerSettings` class, line ~10). The cleanest approach is to add a `cache_ttl: int = Field(default=300, ...)` field to `MCPServerSettings` — with `env_prefix="MCP_"` already in place, it will automatically read `MCP_CACHE_TTL` from the environment. +- **Current behavior:** TTL is hardcoded to `300` (seconds) in the cache module (or will be when that module is created). +- **Desired behavior:** TTL is read from `MCP_CACHE_TTL` env var via `MCPServerSettings.cache_ttl`, defaulting to `300`. Cache module imports `get_mcp_server_settings()` and uses `settings.cache_ttl`. +- **Test file:** No dedicated cache test file exists yet. Create `tests/test_cache.py`. Test that `MCPServerSettings().cache_ttl == 600` after `monkeypatch.setenv("MCP_CACHE_TTL", "600")`, and that the default is `300` when the env var is unset. Constructor keywords use field names, so pass `cache_ttl=600` (not `MCP_CACHE_TTL=600`) when building settings directly. +- **Prior art:** `src/data360/config.py` `MCPServerSettings` — all other MCP tunables (port, log_level, charts_api_url) follow exactly this pattern.
Add `cache_ttl` here to stay consistent. +- **Gotchas:** `get_mcp_server_settings()` is decorated with `@ft.cache` (functools LRU cache), meaning it returns the same instance after first call. In tests, ensure settings are created fresh with `MCPServerSettings(...)` directly rather than calling the cached factory, or use `monkeypatch.setenv` before the factory is first called. + +## Dependencies + +- None. diff --git a/TODO/README.md b/TODO/README.md new file mode 100644 index 0000000..2feae1b --- /dev/null +++ b/TODO/README.md @@ -0,0 +1,41 @@ +# data360-mcp — task index + +Agent-oriented work items for visualization and MCP `SYSTEM_PROMPT` alignment with the Data AI Chatbot backlog. + +**Sibling repo (same parent folder as this repo):** [vercel-ai-chatbot/TODO](../../vercel-ai-chatbot/TODO/README.md) + +## Task IDs + +| ID | File | Summary | +|----|------|---------| +| MCP-001 | [MCP-001-system-prompt-sync.md](./MCP-001-system-prompt-sync.md) | Mirror Writer/Planner-facing rules into `SYSTEM_PROMPT` (`src/data360/mcp_server/prompts.py`) | +| MCP-002 | [MCP-002-vega-line-tooltips.md](./MCP-002-vega-line-tooltips.md) | Default Vega-Lite tooltips for line/area (and similar) in generated specs | +| MCP-003 | [MCP-003-beeswarm-multi-country.md](./MCP-003-beeswarm-multi-country.md) | High series cardinality → strip/beeswarm (or facet) instead of unreadable line clutter | +| MCP-004 | [MCP-004-viz-empty-data-guard.md](./MCP-004-viz-empty-data-guard.md) | Guard `generate_vega_spec()` against empty `data` list (KeyError crash) | +| MCP-005 | [MCP-005-cache-configurable-ttl.md](./MCP-005-cache-configurable-ttl.md) | Make cache TTL configurable via `MCP_CACHE_TTL` env var (default 300 s) | + +## Dependency graph (this repo) + +```mermaid +flowchart TB + MCP001[MCP-001 SYSTEM_PROMPT] + MCP002[MCP-002 tooltips] + MCP003[MCP-003 beeswarm] + MCP004[MCP-004 empty-data guard] + MCP005[MCP-005 configurable TTL] + MCP002 -.->|shared code| MCP003 + MCP002 -.->|shared viz pipeline| 
MCP004 + MCP003 -.->|shared viz pipeline| MCP004 +``` + +**Note:** MCP-002 → MCP-003 is a **soft** ordering (shared viz pipeline); MCP-001 is independent of MCP-002/003 but should follow chatbot **BE-001** for content sync. + +## Cross-repo + +See [CROSS-REPO-GRAPH.md](./CROSS-REPO-GRAPH.md) (mirrors vercel-ai-chatbot’s graph for MCP edges). + +## How to use + +1. Point an agent at this folder or a single `MCP-*.md`. +2. **MCP-001** should be coordinated with `vercel-ai-chatbot/TODO/BE-001-writer-planner-prompts.md`. +3. Update task `status` in frontmatter when done. diff --git a/TODO/tasks.json b/TODO/tasks.json new file mode 100644 index 0000000..0f7744f --- /dev/null +++ b/TODO/tasks.json @@ -0,0 +1,8 @@ +{ + "name": "data360-mcp", + "domains": ["MCP"], + "github_repo": "", + "peers": { + "data-ai-chatbot": "${DATA_AI_CHATBOT_PATH}" + } +} From c90e0062ffafca185c91e4347430fee6ca2dc47a Mon Sep 17 00:00:00 2001 From: "Aivin V. Solatorio" Date: Thu, 2 Apr 2026 00:14:32 -0400 Subject: [PATCH 4/5] =?UTF-8?q?tasks:=20upgrade=20MCP-001=E2=80=93003=20wi?= =?UTF-8?q?th=20priority=20+=20implementation=20hints?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add missing `priority:` frontmatter and `## Implementation hints` sections to pre-existing MCP task files. Hints include specific file paths, function names, line numbers, current vs desired behavior, test locations, and gotchas. 
Co-Authored-By: Claude Sonnet 4.6 --- TODO/MCP-001-system-prompt-sync.md | 9 +++++++++ TODO/MCP-002-vega-line-tooltips.md | 10 ++++++++++ TODO/MCP-003-beeswarm-multi-country.md | 11 +++++++++++ 3 files changed, 30 insertions(+) diff --git a/TODO/MCP-001-system-prompt-sync.md b/TODO/MCP-001-system-prompt-sync.md index 880171c..22fb015 100644 --- a/TODO/MCP-001-system-prompt-sync.md +++ b/TODO/MCP-001-system-prompt-sync.md @@ -3,6 +3,7 @@ id: MCP-001 repo: data360-mcp title: SYSTEM_PROMPT — align with chatbot Writer/Planner rules status: pending +priority: medium depends_on: - BE-001 blocks: [] @@ -20,6 +21,14 @@ Reduce drift between MCP-only LLM clients and the Data AI Chatbot: update `SYSTE - Resource registration: `src/data360/mcp_server/resources.py` exposes `data360://system-prompt`. - Do not duplicate entire Writer prompt if inappropriate for MCP; align **non-conflicting** bullets: section order hints, link discipline, when to call `data360_get_viz_spec`. +## Implementation hints +- **Entry point:** `src/data360/mcp_server/prompts.py` — `SYSTEM_PROMPT` string literal lines 11–56. +- **Current structure:** 4 sections — `### Non-negotiable rule` (line 15), `### Operating loop` (line 19, 5-step workflow: search → codelist → availability → get data → visualize), `### Defaults` (line 47, 20-year range hardcoded in text), `### Output behavior` (line 52). +- **Exposed via:** `src/data360/mcp_server/resources.py` line 159–162 as MCP resource `data360://system-prompt`. +- **Desired behavior:** After BE-001 finalises section order and link discipline in the chatbot Writer prompt, mirror the non-conflicting bullets here — specifically: section order hints for final answer, link discipline for API/indicator URLs, when to call `data360_get_viz_spec`. Do NOT duplicate the full Writer prompt. +- **Test file:** No tests for SYSTEM_PROMPT content. 
After update, manually verify that `data360://system-prompt` resource returns the updated string via MCP inspector or a quick test script. +- **Gotchas:** SYSTEM_PROMPT is a plain string — preserve existing markdown formatting exactly. The 20-year default is hardcoded in text (line 48–49), not a config variable. Sync changes in `prompts.py` to `README.md` and `docs/overview.md` if they quote the resource content. + ## Acceptance criteria - [ ] `SYSTEM_PROMPT` updated to reflect agreed structure and link/viz nudges (scoped to MCP tool use). diff --git a/TODO/MCP-002-vega-line-tooltips.md b/TODO/MCP-002-vega-line-tooltips.md index a22777f..33c2df8 100644 --- a/TODO/MCP-002-vega-line-tooltips.md +++ b/TODO/MCP-002-vega-line-tooltips.md @@ -3,6 +3,7 @@ id: MCP-002 repo: data360-mcp title: Vega-Lite — default tooltips for line/area charts status: pending +priority: medium depends_on: [] blocks: [] --- @@ -18,6 +19,15 @@ Line (and similar) charts emitted by `get_viz_spec` show hover tooltips by defau - Implementation likely in `src/data360/visualization.py` and/or `src/data360/viz_config.py` where Vega-Lite spec is built. - Frontend: `vercel-ai-chatbot` renders spec via vega-embed in `frontend/components/data360/chart-preview.tsx` — no change required if spec includes tooltips. +## Implementation hints +- **Entry point (Draco path):** `src/data360/visualization.py` line 555–556 — `tooltip_cols = list(viz_data.columns); chart = chart.encode(tooltip=tooltip_cols)`. This runs after Draco rendering. +- **Entry point (fallback path):** `src/data360/visualization.py` line 603 — `base = alt.Chart(viz_data).mark_line().encode(tooltip=fc)`. Fallback uses `fc` (field config) for tooltip. +- **Current behavior:** Tooltip is a flat list of all columns. Column names at this point are already renamed: `time_period → year`, `obs_value → value`, `ref_area → country` (renaming happens at lines 414–434). Tooltip shows raw renamed columns with no formatting. 
+- **Desired behavior:** Structured tooltips per chart type — for line/area: show `year`, `value`, `country` with human-readable labels and number formatting. Use `alt.Tooltip(field, title, format)` objects instead of raw column names. +- **Chart types to cover:** line (line 149), area (line 167) as minimum. bar (line 155), point (line 161), tick (line 173) as stretch. +- **Test file:** `tests/test_visualization.py` — existing class `TestGetVizSpecDracoFallbackWarning` (lines 11–147). Add a new test class `TestTooltipEncoding` that asserts tooltip fields and titles on the returned spec dict. +- **Gotchas:** Tooltip encoding happens AFTER Draco (line 555) — this is intentional to allow override. Fallback path (line 603) hardcodes `mark_line()` regardless of chart type — tooltip fix should handle both paths. Codelist mapping runs before viz (lines 421–431) so `country` column contains names not codes at tooltip time. + ## Acceptance criteria - [ ] Line/area (and other relevant marks) include `tooltip` encoding or mark-level tooltip defaults. diff --git a/TODO/MCP-003-beeswarm-multi-country.md b/TODO/MCP-003-beeswarm-multi-country.md index b6da28f..52251cc 100644 --- a/TODO/MCP-003-beeswarm-multi-country.md +++ b/TODO/MCP-003-beeswarm-multi-country.md @@ -3,6 +3,7 @@ id: MCP-003 repo: data360-mcp title: Many countries — beeswarm/strip or facet instead of clutter status: pending +priority: medium depends_on: [] blocks: [] soft_depends_on: @@ -21,6 +22,16 @@ When many countries/series make line charts unreadable or data is truncated, pre - May require `data360_get_supported_chart_types` updates if new chart type names are exposed. - Product threshold (country count) may come from design — document default in code comments. +## Implementation hints +- **Cardinality detection:** `src/data360/visualization.py` lines 490–515 — loops over breakdown dims (`country`, `sex`, `age`, `urbanisation`), picks first with `.nunique() > 1` as color dim. 
No upper-bound check for "too many series". +- **Chart type selection:** `src/data360/viz_config.py` → `should_use_temporal_x_axis()` lines 367–448. Returns `(True, None)` for time-series or `(False, categorical_field)` for categorical. Preference scores by chart type: tick=1.0, point=0.7, bar=0.5, line=0.2, area=0.1. +- **Current behavior:** With 20+ countries, Draco picks a line or bar chart with color encoding for each country. Above ~10 series, the chart becomes unreadable — overlapping lines, crowded legend. +- **Desired behavior:** When `country.nunique() > N` (suggested threshold: 10, make it configurable) AND single-year data (temporal cardinality = 1), switch to strip/beeswarm: `mark_point()` with jitter, value on y-axis, country as color limited to top-N or omitted. +- **Beeswarm in Vega-Lite:** No native beeswarm mark. Use `mark_point()` with `transform: [{"calculate": "random()", "as": "jitter"}]` and encode jitter on x-axis. Or use Altair's `mark_tick()` (already in the chart type list at line 173) as a simpler strip chart alternative. +- **Where to add:** Add `should_use_beeswarm(viz_data)` function in `viz_config.py` near `should_use_temporal_x_axis()`. Call it in `visualization.py` before Draco encoding (around line 490) when cardinality exceeds threshold. +- **Test file:** `tests/test_visualization.py`. Add `TestHighCardinalityChartSelection` — assert beeswarm/strip selected when country count > threshold, line chart when ≤ threshold. +- **Gotchas:** Vega-Lite jitter requires a `calculate` transform — adds complexity to the spec. `tick` mark already exists and may be sufficient as a simpler strip chart without jitter. Clarify with product whether jitter is required or if a tick/strip is acceptable. Cardinality thresholds should be centralised (add to `MCPServerSettings` in `config.py` alongside future `MCP_CACHE_TTL`). + ## Acceptance criteria - [ ] Heuristic selects alternate mark/layout above a configurable threshold. 
From e42bd0d9742b141770394bea15c1918f93f9ff27 Mon Sep 17 00:00:00 2001 From: "Aivin V. Solatorio" Date: Thu, 2 Apr 2026 00:23:41 -0400 Subject: [PATCH 5/5] tasks: regenerate TODO index Co-Authored-By: Claude Sonnet 4.6 --- TODO/README.md | 41 +++++++++++++++++------------------------ 1 file changed, 17 insertions(+), 24 deletions(-) diff --git a/TODO/README.md b/TODO/README.md index 2feae1b..55d9c99 100644 --- a/TODO/README.md +++ b/TODO/README.md @@ -1,41 +1,34 @@ # data360-mcp — task index -Agent-oriented work items for visualization and MCP `SYSTEM_PROMPT` alignment with the Data AI Chatbot backlog. +Agent-oriented work items. Each task is a standalone `.md` file with context, acceptance criteria, and dependency metadata. (5 active) -**Sibling repo (same parent folder as this repo):** [vercel-ai-chatbot/TODO](../../vercel-ai-chatbot/TODO/README.md) +**Sibling repos:** [data-ai-chatbot/TODO](../data-ai-chatbot/TODO/README.md) ## Task IDs | ID | File | Summary | |----|------|---------| -| MCP-001 | [MCP-001-system-prompt-sync.md](./MCP-001-system-prompt-sync.md) | Mirror Writer/Planner-facing rules into `SYSTEM_PROMPT` (`src/data360/mcp_server/prompts.py`) | -| MCP-002 | [MCP-002-vega-line-tooltips.md](./MCP-002-vega-line-tooltips.md) | Default Vega-Lite tooltips for line/area (and similar) in generated specs | -| MCP-003 | [MCP-003-beeswarm-multi-country.md](./MCP-003-beeswarm-multi-country.md) | High series cardinality → strip/beeswarm (or facet) instead of unreadable line clutter | -| MCP-004 | [MCP-004-viz-empty-data-guard.md](./MCP-004-viz-empty-data-guard.md) | Guard `generate_vega_spec()` against empty `data` list (KeyError crash) | -| MCP-005 | [MCP-005-cache-configurable-ttl.md](./MCP-005-cache-configurable-ttl.md) | Make cache TTL configurable via `MCP_CACHE_TTL` env var (default 300 s) | +| MCP-001 | [MCP-001-system-prompt-sync.md](./MCP-001-system-prompt-sync.md) | Reduce drift between MCP-only LLM clients and the Data AI Chatbot: update `SY... 
| +| MCP-002 | [MCP-002-vega-line-tooltips.md](./MCP-002-vega-line-tooltips.md) | Line (and similar) charts emitted by `get_viz_spec` show hover tooltips by de... | +| MCP-003 | [MCP-003-beeswarm-multi-country.md](./MCP-003-beeswarm-multi-country.md) | When many countries/series make line charts unreadable or data is truncated, ... | +| MCP-004 | [MCP-004-viz-empty-data-guard.md](./MCP-004-viz-empty-data-guard.md) | Prevent `generate_vega_spec()` from crashing with a `KeyError` when `data` is... | +| MCP-005 | [MCP-005-cache-configurable-ttl.md](./MCP-005-cache-configurable-ttl.md) | Replace the hardcoded 300-second TTL in the cache module with a value read fr... | ## Dependency graph (this repo) ```mermaid flowchart TB - MCP001[MCP-001 SYSTEM_PROMPT] - MCP002[MCP-002 tooltips] - MCP003[MCP-003 beeswarm] - MCP004[MCP-004 empty-data guard] - MCP005[MCP-005 configurable TTL] - MCP002 -.->|shared code| MCP003 - MCP002 -.->|shared viz pipeline| MCP004 - MCP003 -.->|shared viz pipeline| MCP004 + MCP001["MCP-001 SYSTEM_PROMPT — align with cha"] + MCP002["MCP-002 Vega-Lite — default tooltips f"] + MCP003["MCP-003 Many countries — beeswarm/stri"] + MCP004["MCP-004 viz — guard against empty data"] + MCP005["MCP-005 cache — make TTL configurable "] + MCP002 -.-> MCP003 ``` -**Note:** MCP-002 → MCP-003 is a **soft** ordering (shared viz pipeline); MCP-001 is independent of MCP-002/003 but should follow chatbot **BE-001** for content sync. - -## Cross-repo - -See [CROSS-REPO-GRAPH.md](./CROSS-REPO-GRAPH.md) (mirrors vercel-ai-chatbot’s graph for MCP edges). - ## How to use -1. Point an agent at this folder or a single `MCP-*.md`. -2. **MCP-001** should be coordinated with `vercel-ai-chatbot/TODO/BE-001-writer-planner-prompts.md`. -3. Update task `status` in frontmatter when done. +1. Point an agent at `TODO/` or a specific task file. +2. Check `depends_on` in the task frontmatter before starting — all hard deps must be `done`. +3. 
Claim a task by setting `status: in_progress` and committing immediately. +4. After completing a task, set `status: done`, tick acceptance criteria checkboxes, and check `blocks` for newly unblocked tasks.