A curated list of tools, frameworks, and resources for AI-driven data analytics.
Agentic analytics is the practice of using AI agents to autonomously query, analyze, and act on data. Instead of people writing SQL by hand or clicking through dashboards, agents reason over schemas, compose queries, and return governed answers.
- Agentic Analytics Platforms
- Text-to-SQL Engines
- Semantic Layers for Agents
- Agent Frameworks with Data Capabilities
- Notebook and Code Agent Tools
- Benchmarks
- Learning Resources
Tools that let AI agents query, analyze, and act on data autonomously.
- Bonnard - Agentic semantic layer with MCP server, multi-tenant publishable keys, React SDK, and markdown dashboards.
- Databricks Genie - Natural language interface that translates questions into SQL over governed Unity Catalog datasets.
- DataGPT - Conversational AI data analyst providing analyst-grade answers to data questions in seconds.
- Definite - AI-native analytics platform with semantic layer support for teams.
- Dot - AI data analyst that answers data questions via Slack and Teams, connecting to Snowflake, BigQuery, Redshift, and Databricks.
- MindsDB - Open-source query engine for AI analytics that lets you build self-reasoning agents across live data sources.
- Snowflake Cortex Analyst - LLM-powered feature that answers business questions via natural language using a YAML semantic model.
- SuperSonic - AI+BI platform from Tencent Music unifying Chat BI and Headless BI with a semantic layer.
- Tellius - Enterprise agentic analytics platform combining conversational analytics and multi-agent orchestration.
- ThoughtSpot - Search-driven analytics platform with agentic data prep and autonomous reporting.
- WrenAI - Open-source GenBI platform for natural language to SQL, charts, and insights with built-in semantic modeling.
- Zerve AI - Agentic data workspace where an intelligent agent plans and executes analysis tasks through chat or code.
Engines that translate natural language questions into SQL queries.
- DAIL-SQL - Prompt-engineering approach that systematically selects and organizes few-shot examples, achieving 86.6% execution accuracy on Spider with GPT-4.
- Dataherald - Enterprise natural language to SQL engine built on LangChain, providing a REST API for database Q&A.
- DIN-SQL - Decomposes text-to-SQL into sub-tasks with different prompts per complexity class.
- MAC-SQL - Multi-agent collaborative framework using Selector, Decomposer, and Refiner agents for text-to-SQL.
- PandasAI - Chat with your database or datalake using LLMs and RAG for conversational data analysis.
- QueryWeaver - Graph-powered text-to-SQL tool that maps schemas into knowledge graphs for contextual understanding.
- RESDSQL - Decouples schema linking from skeleton parsing for enhanced text-to-SQL accuracy.
- SQLChat - Chat-based SQL client using natural language to interact with databases.
- SQLCoder - Open-source LLM family for text-to-SQL with 7B, 15B, and 70B variants. Defog reports 93% accuracy for the 70B version on its own sql-eval framework.
- Vanna.ai - Open-source Python RAG framework for text-to-SQL, rewritten as a production-ready agent framework in v2.0.
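
Most of the engines above share one core loop: read the schema, put it in a prompt with the question, generate SQL, check it, and execute it. The following is a minimal sketch of that loop, not the implementation of any listed tool. It assumes SQLite as the database, and `fake_llm` is a hypothetical stand-in for a real model call that returns a fixed query so the example runs offline.

```python
import sqlite3


def describe_schema(conn):
    """Return the CREATE TABLE statements that the prompt will carry."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(schema, question):
    """Combine schema and question into a single prompt string."""
    return (
        "Given this SQLite schema:\n"
        f"{schema}\n\n"
        f"Write one SELECT statement answering: {question}\n"
    )


def fake_llm(prompt):
    # Hypothetical stand-in for a model call; a real engine would send
    # `prompt` to an LLM and return the generated SQL.
    return (
        "SELECT region, SUM(amount) AS revenue "
        "FROM orders GROUP BY region ORDER BY region"
    )


def is_read_only(sql):
    """Crude guard: accept a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(conn, question, llm=fake_llm):
    """Generate SQL for the question, validate it, and run it."""
    sql = llm(build_prompt(describe_schema(conn), question))
    if not is_read_only(sql):
        raise ValueError("refusing to run a non-SELECT statement")
    return sql, conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EU", 10.0), ("US", 5.0), ("EU", 2.5)],
)

sql, rows = answer(conn, "What is revenue by region?")
print(rows)  # [('EU', 12.5), ('US', 5.0)]
```

The string-prefix guard is deliberately simple. A production system would more likely rely on a read-only database role or a SQL parser than on string checks.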
Tools providing governed data access and consistent metric definitions to AI agents.
- AtScale - Enterprise semantic layer with its own Semantic Modeling Language (SML) and universal metric definitions across BI tools.
- Bonnard - Agent-native semantic layer with MCP server, multi-tenant publishable keys, React SDK, and CLI-first deployment.
- Cube - Open-source semantic layer with REST, GraphQL, and SQL APIs, pre-aggregation caching, and multi-tenant security contexts.
- Databricks Metric Views - Warehouse-native metrics layer within Unity Catalog powering Genie and other AI/BI features.
- dbt Semantic Layer - MetricFlow-powered metrics definitions integrated into the dbt workflow.
- dotML - Lightweight open-source semantic layer written in Python, developed by the Dot/Snowboard team.
- Lightdash - Open-source BI tool with a built-in semantic layer that turns dbt projects into full-stack BI platforms.
- Snowflake Semantic Views - Warehouse-native semantic layer allowing metric definitions directly within Snowflake.
- Synmetrix - Open-source semantic layer built on Cube for self-hosted metric management.
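
The idea common to the tools above is that an agent asks for a named metric and an allowed dimension, and the layer compiles that request into SQL, so the agent never writes the aggregate itself. The sketch below illustrates that pattern only. The `Metric` class, the `METRICS` registry, and `compile_query` are hypothetical names invented for this example and do not correspond to the API of any listed product.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    table: str
    expression: str    # aggregate SQL, defined once by the data team
    dimensions: tuple  # columns an agent is allowed to group by


# Hypothetical registry: the single source of truth for metric logic.
METRICS = {
    "revenue": Metric("revenue", "orders", "SUM(amount)", ("region", "status")),
}


def compile_query(metric_name, dimension=None):
    """Turn a (metric, dimension) request into SQL, rejecting anything ungoverned."""
    metric = METRICS.get(metric_name)
    if metric is None:
        raise KeyError(f"unknown metric: {metric_name}")
    if dimension is None:
        return f"SELECT {metric.expression} AS {metric.name} FROM {metric.table}"
    if dimension not in metric.dimensions:
        raise ValueError(
            f"{dimension!r} is not an allowed dimension for {metric_name}"
        )
    return (
        f"SELECT {dimension}, {metric.expression} AS {metric.name} "
        f"FROM {metric.table} GROUP BY {dimension} ORDER BY {dimension}"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, status, amount) VALUES (?, ?, ?)",
    [("EU", "paid", 10.0), ("US", "paid", 5.0), ("EU", "open", 2.5)],
)

by_region = conn.execute(compile_query("revenue", "region")).fetchall()
print(by_region)  # [('EU', 12.5), ('US', 5.0)]
```

Because identifiers come only from the registry and the allow-list, a request for an unlisted column fails before any SQL reaches the database.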
General-purpose agent frameworks with built-in data and analytics tooling.
- AutoGen - Microsoft's multi-agent conversation framework supporting data analysis workflows with code execution.
- AWS Strands SDK - AWS agent SDK with data analyst agent templates supporting SQL, pandas, and Matplotlib.
- Composio - Tool integration platform for AI agents with 250+ app connectors including database and analytics tools.
- CrewAI - Multi-agent orchestration framework with data analysis crew templates.
- LangChain - LLM framework with SQL Database toolkit, document loaders, and vector store integrations for data-aware agents.
- LlamaIndex - Data framework for LLM applications with SQL query engines and agentic RAG capabilities.
- Open Interpreter - Natural language interface that executes Python, JS, and Shell code locally for data analysis through conversation.
AI agents that write and execute analysis code in notebook environments.
- Jupyter AI - Official JupyterLab extension adding a chat sidebar and cell magic for LLM-powered code generation.
- Jupyter AI Agents - AI agents for JupyterLab with MCP tools for optimized notebook interaction and code execution.
- Julius AI - AI data analyst that processes uploaded files and writes Python code for analysis and visualization.
- LAMBDA - Open-source code-free multi-agent data analysis system using programmer and inspector agents.
- OpenAI Code Interpreter - Built-in Assistants API tool that writes and runs Python in a sandboxed environment for data analysis.
- RunCell - AI-native notebook for data analysis positioned as the next-generation alternative to Jupyter.
Datasets and evaluation frameworks for measuring text-to-SQL and agentic analytics accuracy.
- BIRD-SQL - 12,751 question-SQL pairs across 95 large databases. Introduces Valid Efficiency Score measuring both correctness and query efficiency.
- CoSQL - Conversational text-to-SQL challenge combining dialogue state tracking with SQL generation.
- dbt Semantic Layer LLM Benchmarking - Benchmark for evaluating LLM performance when querying through a semantic layer interface.
- SParC - Context-dependent multi-turn version of Spider, testing sequential question understanding.
- Spider 1.0 - 10,181 questions across 200 databases. The foundational cross-domain text-to-SQL benchmark from Yale.
- Spider 2.0 - 632 real-world enterprise problems with 3000+ column schemas and multiple SQL dialects. The best agent in the original paper solves only 21.3%.
- WikiSQL - 80,654 NL-SQL pairs over 24,241 tables. Single-table, simple queries. Largely saturated at >90% accuracy.
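
Several of these benchmarks report execution accuracy: a prediction counts as correct when running it returns the same result as running the gold query. The sketch below is a simplified version of that idea, assuming SQLite and an order-insensitive multiset comparison of rows. The official evaluation scripts of each benchmark differ in details, so this is an illustration rather than a replacement for them.

```python
import sqlite3
from collections import Counter


def run(conn, sql):
    """Execute a query and return its rows as a multiset, or None on error."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose result sets match."""
    if not pairs:
        return 0.0
    correct = 0
    for predicted, gold in pairs:
        gold_rows = run(conn, gold)
        predicted_rows = run(conn, predicted)
        # A prediction that fails to execute never counts as correct.
        if gold_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return correct / len(pairs)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EU", 10.0), ("US", 5.0), ("EU", 2.5)],
)

pairs = [
    # Different SQL text, same result: correct.
    ("SELECT COUNT(*) FROM orders", "SELECT COUNT(id) FROM orders"),
    # Wrong filter, different rows: incorrect.
    ("SELECT region FROM orders WHERE amount > 100",
     "SELECT region FROM orders WHERE amount > 3"),
    # Syntax error: incorrect.
    ("SELEC oops", "SELECT 1"),
    # Same rows in a different order: correct under multiset comparison.
    ("SELECT region FROM orders ORDER BY amount DESC",
     "SELECT region FROM orders ORDER BY amount ASC"),
]

score = execution_accuracy(conn, pairs)
print(score)  # 0.5
```

Ignoring row order is a design choice: it avoids penalizing queries where order is unspecified, but it would also accept a wrong `ORDER BY` on questions that do require a ranking.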
- Agentic Analytics: Complete Guide to AI-Driven Data Intelligence - GoodData overview of agentic analytics concepts and architecture patterns.
- Semantic Layer and AI: The Future of Data Querying - Cube's perspective on semantic layers enabling natural language data querying.
- Semantic Layers: A Buyers Guide - Independent comparison of semantic layer architectures.
- Text-to-SQL Benchmarks and the Current State of the Art - Analysis of major benchmarks and current performance levels.
- The Agentic Future Demands an Open Semantic Layer - Salesforce on open standards for agent interoperability.
- Why AI Agents Need a Semantic Layer - How semantic layers solve text-to-SQL governance problems for AI agents.
- Why Agentic AI Needs a Semantic Core - Why agents need governed metric definitions, not raw table access.
- Why Enterprise AI Agents Need a Semantic Layer - AtScale on enterprise requirements for agentic data access.
- Awesome AI Analytics - Curated list of AI analytics assistants and text-to-SQL tools.
- Awesome LLM-based Text2SQL - Papers, benchmarks, and projects for LLM-based text-to-SQL.
- Awesome Semantic Layer - Curated list of semantic layer tools, frameworks, and resources.
- Awesome Text2SQL - Tutorials and resources for text-to-SQL, text-to-DSL, and text-to-API.
- NL2SQL Handbook - Continuously updated handbook tracking the latest text-to-SQL techniques.
- Open Semantic Interchange (OSI) - Cross-vendor spec for interoperable semantic layer definitions. V1 released January 2026.
Contributions welcome! Read the contribution guidelines first.