This project implements an automated data analysis system using multiple specialized agents. The system can download datasets from Hugging Face, perform comprehensive data analysis, and generate detailed reports automatically.
- Automated dataset download from Hugging Face repositories
- Comprehensive data quality assessment
- Statistical and qualitative analysis
- Automated visualization generation
- PDF report generation
- Multi-agent architecture for specialized tasks
- Python 3.x
- Git LFS (for downloading datasets)
- Required Python packages:
- autogen
- pandas
- numpy
- matplotlib
- seaborn
- fpdf
├── agents.py # Agent definitions and initialization
├── config.py # Configuration settings
├── main.py # Main application entry point
├── system_messages.py # Agent system messages/instructions
└── utils.py # Utility functions for data processing
- Data Source Manager: Handles dataset downloading and file management
- Data Quality Agent: Assesses dataset quality and completeness
- Statistical Analysis Agent: Performs numerical analysis and statistics
- Qualitative Analysis Agent: Analyzes data structure and patterns
- Visualization Agent: Creates data visualizations
- Report Generation Agent: Compiles findings into PDF reports
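The agents above hand work off in a fixed order. As a plain-Python sketch of that flow (the real project orchestrates it through an autogen group chat; the stage functions below are simplified stand-ins, not the project's actual implementations):

```python
from typing import Any, Dict

# Simplified stand-ins for the specialized agents; each returns the
# name of the artifact it is responsible for producing.
def assess_quality(data: Any) -> str:
    return "quality_assessment.txt"

def run_statistics(data: Any) -> str:
    return "insights.txt"

def make_visualizations(data: Any) -> str:
    return "correlation_heatmap.png"

def generate_report(data: Any) -> str:
    return "analysis_report.pdf"

# Stages run strictly in this order, each building on the last.
PIPELINE = [assess_quality, run_statistics, make_visualizations, generate_report]

def run_pipeline(data: Any) -> Dict[str, str]:
    """Run each stage in order, mapping stage name to its artifact."""
    return {stage.__name__: stage(data) for stage in PIPELINE}
```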
- Clone the repository:

  ```bash
  git clone [repository-url]
  cd [repository-name]
  ```

- Install dependencies:

  ```bash
  pip install autogen pandas numpy matplotlib seaborn fpdf
  ```

- Configure the API key:
  - Open `config.py`
  - Replace `"ENTER YOUR API KEY"` with your actual API key
- Basic usage:

  ```python
  from main import process_dataset

  # Process a dataset from Hugging Face
  process_dataset("https://huggingface.co/datasets/scikit-learn/iris")
  ```

- Output structure:
datasets/
├── quality_assessment/
│ └── quality_assessment.txt
├── insights/
│ └── insights.txt
├── visualizations/
│ ├── correlation_heatmap.png
│ └── feature_distributions.png
└── output/
└── analysis_report.pdf
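To confirm a run produced everything, you can check for the files in the tree above. A small helper (paths assumed relative to the working directory; this is not part of the project's own code):

```python
from pathlib import Path
from typing import List

# Expected artifacts, mirroring the output structure shown above.
EXPECTED = [
    "datasets/quality_assessment/quality_assessment.txt",
    "datasets/insights/insights.txt",
    "datasets/visualizations/correlation_heatmap.png",
    "datasets/visualizations/feature_distributions.png",
    "datasets/output/analysis_report.pdf",
]

def missing_artifacts(root: str = ".") -> List[str]:
    """Return any expected output files that were not produced."""
    return [p for p in EXPECTED if not (Path(root) / p).exists()]
```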
**agents.py**
- Defines and initializes all agent types
- Configures agent behaviors and capabilities
- Registers execution functions for each agent
**config.py**
- Contains the LLM configuration
- API key settings
- Model specifications
**main.py**
- Sets up the group chat between agents
- Manages the orchestration of the analysis workflow
- Provides the main entry point for processing datasets
**system_messages.py**
- Defines the role and responsibilities of each agent
- Contains system prompts for agent behavior
- Establishes workflow protocols
**utils.py**
- Implements core functionality for:
  - Dataset downloading
  - Quality assessment
  - Statistical analysis
  - Visualization generation
  - Report creation
The system evaluates datasets on four key dimensions:
- Completeness (25 points)
- Consistency (25 points)
- Accuracy (25 points)
- Uniqueness (25 points)
Total quality score is calculated out of 100 points.
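The project's exact metrics live in `utils.py`; the sketch below shows one plausible way the completeness and uniqueness components could be computed with pandas, with consistency and accuracy left as dataset-specific stubs:

```python
import pandas as pd

def quality_score(df: pd.DataFrame) -> float:
    """Illustrative 0-100 quality score; not the project's exact formula."""
    # Completeness: fraction of non-null cells (max 25 points)
    completeness = 25 * (1 - df.isna().mean().mean())
    # Uniqueness: fraction of non-duplicate rows (max 25 points)
    uniqueness = 25 * (1 - df.duplicated().mean())
    # Consistency and accuracy require dataset-specific rules;
    # assume full marks here purely for illustration.
    consistency, accuracy = 25.0, 25.0
    return completeness + consistency + accuracy + uniqueness
```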
The system includes comprehensive error handling for:
- Dataset download failures
- File access issues
- Data processing errors
- Report generation problems
Each function wraps its work in try/except blocks with detailed error messages.
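As one hedged illustration of that pattern (the project's real handlers live in `utils.py`; this standalone helper only shows the try/except shape applied to a download failure):

```python
from typing import Optional
import urllib.request
import urllib.error

def safe_download(url: str) -> Optional[bytes]:
    """Fetch raw bytes, returning None with a detailed message on failure."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read()
    except (urllib.error.URLError, ValueError, OSError) as exc:
        # Report the failing URL and the underlying cause instead of crashing.
        print(f"Dataset download failed for {url}: {exc}")
        return None
```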
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request