bonnard-data/awesome-agentic-analytics

Awesome Agentic Analytics

A curated list of tools, frameworks, and resources for AI-driven data analytics.

Agentic analytics is the practice of using AI agents to autonomously query, analyze, and act on data. Rather than writing SQL by hand or clicking through dashboards, agents reason over schemas, compose queries, and return governed answers.

Agentic Analytics Platforms

Tools that let AI agents query, analyze, and act on data autonomously.

  • Bonnard - Agentic semantic layer with MCP server, multi-tenant publishable keys, React SDK, and markdown dashboards.
  • Databricks Genie - Natural language interface that translates questions into SQL over governed Unity Catalog datasets.
  • DataGPT - Conversational AI data analyst providing analyst-grade answers to data questions in seconds.
  • Definite - AI-native analytics platform with semantic layer support for teams.
  • Dot - AI data analyst that answers data questions via Slack and Teams, connecting to Snowflake, BigQuery, Redshift, and Databricks.
  • MindsDB - Open-source query engine for AI analytics that lets you build self-reasoning agents across live data sources.
  • Snowflake Cortex Analyst - LLM-powered feature that answers business questions via natural language using a YAML semantic model.
  • SuperSonic - AI+BI platform from Tencent Music unifying Chat BI and Headless BI with a semantic layer.
  • Tellius - Enterprise agentic analytics platform combining conversational analytics and multi-agent orchestration.
  • ThoughtSpot - Search-driven analytics platform with agentic data prep and autonomous reporting.
  • WrenAI - Open-source GenBI platform for natural language to SQL, charts, and insights with built-in semantic modeling.
  • Zerve AI - Agentic data workspace where an intelligent agent plans and executes analysis tasks through chat or code.

Text-to-SQL Engines

Engines that translate natural language questions into SQL queries.

  • DAIL-SQL - Systematic prompt-engineering approach (question representation and few-shot example selection) achieving 86.6% execution accuracy on Spider with GPT-4.
  • Dataherald - Enterprise natural language to SQL engine built on LangChain, providing a REST API for database Q&A.
  • DIN-SQL - Decomposes text-to-SQL into sub-tasks with different prompts per complexity class.
  • MAC-SQL - Multi-agent collaborative framework using Selector, Decomposer, and Refiner agents for text-to-SQL.
  • PandasAI - Chat with your database or datalake using LLMs and RAG for conversational data analysis.
  • QueryWeaver - Graph-powered text-to-SQL tool that maps schemas into knowledge graphs for contextual understanding.
  • RESDSQL - Decouples schema linking from skeleton parsing for enhanced text-to-SQL accuracy.
  • SQLChat - Chat-based SQL client using natural language to interact with databases.
  • SQLCoder - Open-source LLM for text-to-SQL with 7B, 15B, and 70B variants. The 70B version reports roughly 93% accuracy on Defog's own sql-eval framework.
  • Vanna.ai - Open-source Python RAG framework for text-to-SQL, rewritten as a production-ready agent framework in v2.0.
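
Most engines in this section share a common loop: read the database schema, put it in a prompt with the question, ask a model for SQL, and check the SQL before running it. The following is a minimal sketch of that loop, not the method of any listed project. The names `schema_prompt`, `is_read_only`, `fake_model`, and `answer` are hypothetical, and `fake_model` is a hard-coded stand-in for a real LLM call.

```python
import sqlite3


def schema_prompt(conn, question):
    """Build a prompt from the CREATE TABLE statements plus the user question."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    ddl = "\n".join(row[0] for row in rows)
    return f"-- Schema:\n{ddl}\n-- Question: {question}\n-- SQL:"


def is_read_only(sql):
    """Rough guard (assumption): accept only a single SELECT statement."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def fake_model(prompt):
    """Hypothetical stand-in for an LLM call; returns a fixed query."""
    return "SELECT COUNT(*) FROM orders WHERE status = 'paid';"


def answer(conn, question, model=fake_model):
    """Generate SQL for the question, validate it, then execute it."""
    sql = model(schema_prompt(conn, question))
    if not is_read_only(sql):
        raise ValueError("refusing to run non-SELECT SQL")
    return conn.execute(sql).fetchall()


# Tiny in-memory database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)", [("paid",), ("paid",), ("open",)]
)
result = answer(conn, "How many paid orders are there?")
```

The string-prefix check is deliberately crude. A production system would more likely rely on a read-only database role or a SQL parser than on string inspection.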

Semantic Layers for Agents

Tools providing governed data access and consistent metric definitions to AI agents.

  • AtScale - Enterprise semantic layer with its own Semantic Modeling Language (SML) and universal metric definitions across BI tools.
  • Bonnard - Agent-native semantic layer with MCP server, multi-tenant publishable keys, React SDK, and CLI-first deployment.
  • Cube - Open-source semantic layer with REST, GraphQL, and SQL APIs. Pre-aggregation caching and multi-tenant security contexts.
  • Databricks Metric Views - Warehouse-native metrics layer within Unity Catalog powering Genie and other AI/BI features.
  • dbt Semantic Layer - MetricFlow-powered metrics definitions integrated into the dbt workflow.
  • dotML - Lightweight open-source semantic layer written in Python, developed by the Dot/Snowboard team.
  • Lightdash - Open-source BI tool with a built-in semantic layer that turns dbt projects into full-stack BI platforms.
  • Snowflake Semantic Views - Warehouse-native semantic layer allowing metric definitions directly within Snowflake.
  • Synmetrix - Open-source semantic layer built on Cube for self-hosted metric management.
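
As a rough illustration of what "governed access and consistent metric definitions" can mean in practice, the sketch below lets an agent ask for a metric by name while a small compiler turns the request into SQL. It assumes a made-up `orders` table and is not the API of any tool listed here. `Metric`, `METRICS`, `ALLOWED_DIMENSIONS`, and `compile_query` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    table: str
    expression: str  # SQL aggregate, defined once and reused everywhere


# Hypothetical metric catalogue: the single source of truth for "revenue".
METRICS = {"revenue": Metric("revenue", "orders", "SUM(amount)")}

# Hypothetical governance rule: only these columns may be used for grouping.
ALLOWED_DIMENSIONS = {"orders": {"region", "status"}}


def compile_query(metric_name, dimensions=()):
    """Translate a (metric, dimensions) request into a SQL string."""
    metric = METRICS[metric_name]  # raises KeyError for unknown metrics
    allowed = ALLOWED_DIMENSIONS[metric.table]
    for dim in dimensions:
        if dim not in allowed:
            raise ValueError(f"dimension {dim!r} is not exposed for {metric.table}")
    select = ", ".join([*dimensions, f"{metric.expression} AS {metric.name}"])
    sql = f"SELECT {select} FROM {metric.table}"
    if dimensions:
        sql += " GROUP BY " + ", ".join(dimensions)
    return sql


sql = compile_query("revenue", ["region"])
```

Because the agent never writes the aggregate itself, two agents asking for `revenue` get the same definition, and requests for columns outside the allow-list fail before any query reaches the warehouse.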

Agent Frameworks with Data Capabilities

General-purpose agent frameworks with built-in data and analytics tooling.

  • AutoGen - Microsoft's multi-agent conversation framework supporting data analysis workflows with code execution.
  • AWS Strands SDK - AWS agent SDK with data analyst agent templates supporting SQL, pandas, and Matplotlib.
  • Composio - Tool integration platform for AI agents with 250+ app connectors including database and analytics tools.
  • CrewAI - Multi-agent orchestration framework with data analysis crew templates.
  • LangChain - LLM framework with SQL Database toolkit, document loaders, and vector store integrations for data-aware agents.
  • LlamaIndex - Data framework for LLM applications with SQL query engines and agentic RAG capabilities.
  • Open Interpreter - Natural language interface that executes Python, JS, and Shell code locally for data analysis through conversation.

Notebook and Code Agent Tools

AI agents that write and execute analysis code in notebook environments.

  • Jupyter AI - Official JupyterLab extension adding a chat sidebar and cell magic for LLM-powered code generation.
  • Jupyter AI Agents - AI agents for JupyterLab with MCP tools for optimized notebook interaction and code execution.
  • Julius AI - AI data analyst that processes uploaded files and writes Python code for analysis and visualization.
  • LAMBDA - Open-source code-free multi-agent data analysis system using programmer and inspector agents.
  • OpenAI Code Interpreter - Built-in Assistants API tool that writes and runs Python in a sandboxed environment for data analysis.
  • RunCell - AI-native notebook for data analysis positioned as the next-generation alternative to Jupyter.

Benchmarks

Datasets and evaluation frameworks for measuring text-to-SQL and agentic analytics accuracy.

  • BIRD-SQL - 12,751 question-SQL pairs across 95 large databases. Introduces Valid Efficiency Score measuring both correctness and query efficiency.
  • CoSQL - Conversational text-to-SQL challenge combining dialogue state tracking with SQL generation.
  • dbt Semantic Layer LLM Benchmarking - Benchmark for evaluating LLM performance when querying through a semantic layer interface.
  • SParC - Context-dependent multi-turn version of Spider, testing sequential question understanding.
  • Spider 1.0 - 10,181 questions across 200 databases. The foundational cross-domain text-to-SQL benchmark from Yale.
  • Spider 2.0 - 632 real-world enterprise problems with 3000+ column schemas and multiple SQL dialects. The best agent framework in the original paper solved only 21.3% of tasks.
  • WikiSQL - 80,654 NL-SQL pairs over 24,241 tables. Single-table, simple queries. Largely saturated at >90% accuracy.
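
Several of these benchmarks report execution accuracy, which scores a predicted query by running it and comparing its result to the result of the reference query. The sketch below is a simplified version of that idea using SQLite. It is not any benchmark's official evaluation script, which typically handles row ordering, value normalization, and timeouts more carefully. `run` and `execution_accuracy` are hypothetical names.

```python
import sqlite3
from collections import Counter


def run(conn, sql):
    """Execute a query and return its rows as a multiset, or None on SQL error."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match, ignoring row order."""
    correct = 0
    for predicted, gold in pairs:
        predicted_rows = run(conn, predicted)
        if predicted_rows is not None and predicted_rows == run(conn, gold):
            correct += 1
    return correct / len(pairs)


# Tiny in-memory database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,), (3,)])

pairs = [
    ("SELECT x FROM t WHERE x > 1", "SELECT x FROM t WHERE x >= 2"),  # same rows
    ("SELECT x FROM t", "SELECT x FROM t WHERE x > 1"),               # extra row
    ("SELEC x FROM t", "SELECT x FROM t"),                            # syntax error
    ("SELECT COUNT(*) FROM t", "SELECT COUNT(x) FROM t"),             # same count
]
score = execution_accuracy(conn, pairs)
```

Comparing results rather than SQL text means two differently written queries can both count as correct, which is why the first pair above matches even though the strings differ.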

Learning Resources

Articles

Curated Lists

Standards

Contributing

Contributions welcome! Read the contribution guidelines first.

