Distributed agent system using Google's A2A protocol for autonomous research and summarization of ArXiv papers.
A working implementation of Google's Agent-to-Agent (A2A) protocol demonstrating distributed AI agents that discover capabilities, delegate tasks, and collaborate over HTTP to run research and summarization workflows.
Implements a distributed multi-agent system where autonomous agents communicate via Google's A2A protocol. The ResearchAgent searches arXiv for academic papers, then delegates summarization work to a specialized SummarizerAgent via standardized Task objects. The system showcases true agent-to-agent communication with capability discovery, stateful task management, and typed artifacts.
Demonstrates scalable AI agent architecture that separates data retrieval (ResearchAgent) from LLM processing (SummarizerAgent). Unlike monolithic AI systems, this architecture enables:
- Independent scaling of specialized agents
- Capability discovery without hardcoded dependencies
- Agent delegation for complex multi-step workflows
- Fault isolation and distributed processing
Unlike typical AI applications that make direct LLM calls, this system:
- Uses standardized A2A protocol for agent interoperability
- Implements capability discovery via Agent Cards (no hardcoded URLs)
- Maintains stateful task lifecycle (SUBMITTED → WORKING → COMPLETED/FAILED)
- Demonstrates agent delegation via HTTP with retry logic and polling
- Features modular architecture where agents can be developed, deployed, and scaled independently
| Feature | Description |
|---|---|
| 🤖 A2A Protocol Implementation | Full implementation of Google's Agent-to-Agent protocol with Agent Cards, Tasks, and Artifacts |
| 🔄 Agent Delegation | ResearchAgent discovers and delegates work to SummarizerAgent dynamically |
| 📊 Stateful Task Lifecycle | Tasks progress through SUBMITTED → WORKING → COMPLETED with status tracking |
| 🔍 Capability Discovery | Agents advertise capabilities via /agent-card endpoints (JSON metadata) |
| 📦 Typed Artifacts | Structured data exchange with artifact_type fields (papers, summary, json) |
| 🌐 HTTP-based Communication | Agents communicate via RESTful endpoints (discovery and task delegation) |
| ⚡ Async Processing | Asynchronous HTTP client with timeout handling and retry logic |
| 🖥️ Streamlit UI | Optional web interface for user-friendly interaction (CLI also available) |
| 🔗 arXiv Integration | Direct API access to 2M+ academic papers with metadata |
| 🧠 LLM Summarization | Groq-powered extraction of insights and follow-up questions |
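The retry-and-timeout behavior described in the table can be sketched as a small asyncio helper. This is an illustrative pattern, not the project's actual implementation; `call_with_retry` and its parameters are made-up names.

```python
import asyncio

async def call_with_retry(coro_factory, retries=3, base_delay=0.01, timeout=30.0):
    """Retry an async call with exponential backoff (hypothetical helper)."""
    for attempt in range(retries):
        try:
            return await asyncio.wait_for(coro_factory(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == retries - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)

# Demo: a flaky coroutine that fails once, then succeeds
attempts = {"n": 0}

async def flaky():
    attempts["n"] += 1
    if attempts["n"] < 2:
        raise ConnectionError("transient failure")
    return "ok"

result = asyncio.run(call_with_retry(flaky))
print(result)  # ok
```

In the real system the coroutine factory would wrap an HTTPX request to the SummarizerAgent, so transient network failures are retried at the delegation layer rather than surfacing to the user.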
┌──────────────────────────┐ HTTP GET ┌──────────────────────────┐
│ User Input │ ─────────────────────► │ ResearchAgent │
│ (Streamlit / CLI) │ │ (Port 8000) │
└──────────────────────────┘ └───────────┬──────────────┘
│
│ 1. Query arXiv API
│ (External Tool)
▼
┌──────────────────────┐
│ arXiv API │
│ (REST Tool) │
└───────────┬──────────┘
│
│ 2. Papers Retrieved
│ (List[Dict])
▼
3. Create Task
with Papers
as Artifact
│
│ 4. HTTP POST
│ to SummarizerAgent
▼
┌──────────────────────┐
│ SummarizerAgent │
│ (Port 8001) │
└───────────┬──────────┘
│
│ 5. Process Papers
│ via Groq LLM
▼
6. Generate Summary
+ Questions
│
│ 7. Update Task
│ → COMPLETED
▼
┌──────────────────────┐
│ Return Task to │
│ ResearchAgent │
└───────────┬──────────┘
│
│ 8. Aggregate Results
│ (Papers + Summary)
▼
┌──────────────────────────┐ HTTP Response ┌──────────────────────┐
│ Display Results │ ◄─────────────────────── │ Final Output │
│ (CLI / Streamlit) │ │ (JSON) │
└──────────────────────────┘ └──────────────────────┘
a2a-research-agent/
├── agents/
│ ├── __init__.py # Package exports
│ ├── research_agent.py # FastAPI server: Queries arXiv, delegates to Summarizer
│ └── summarizer_agent.py # FastAPI server: Generates summaries via Groq LLM
│
├── shared/
│ ├── __init__.py # Shared package exports
│ ├── schemas.py # Pydantic models: Tasks, Artifacts, Agent Cards
│ ├── logger.py # Centralized logging configuration
│ └── utils.py # Helper functions
│
├── frontend/
│ └── app.py # Streamlit UI (optional client interface)
│
├── cli/
│ └── client.py # Command-line interface client
│
├── artifacts/
│ ├── screenshots/ # Project screenshots and demo images
│ │ ├── a2a_terminal_startup.jpg
│ │ ├── streamlit_frontend.jpg
│ │ ├── research_agent_card.jpg
│ │ ├── summarizer_agent_card.jpg
│ │ ├── task_delegation_flow.jpg
│ │ └── results_display.jpg
│ └── docs/ # Documentation
│
├── .env                       # Environment variables
├── .gitignore # Git exclusion rules
├── pyproject.toml # Project dependencies (uv-based)
├── uv.lock # Locked dependency versions
└── README.md # This file
⚠️ Prerequisites:
Create a `.env` file in the root directory with the following variables:

```
GROQ_API_KEY=your_groq_api_key_here
RESEARCH_AGENT_URL=http://localhost:8000
SUMMARIZER_AGENT_URL=http://localhost:8001
GROQ_MODEL=llama-3.3-70b-versatile
```

Notes:
- `GROQ_API_KEY` → Required for LLM summarization
- `RESEARCH_AGENT_URL` → Default: `http://localhost:8000`
- `SUMMARIZER_AGENT_URL` → Default: `http://localhost:8001`
- `GROQ_MODEL` → Default: `llama-3.3-70b-versatile`
```bash
git clone https://github.com/inv-fourier-transform/a2a-research-agent.git
cd a2a-research-agent
```

If you don't have uv installed:

```bash
pip install uv
```

Install project dependencies:

```bash
uv sync
```

Create a `.env` file in the project root:

```
GROQ_API_KEY=your_groq_api_key_here
RESEARCH_AGENT_URL=http://localhost:8000
SUMMARIZER_AGENT_URL=http://localhost:8001
GROQ_MODEL=llama-3.3-70b-versatile
```

Start the SummarizerAgent:

```bash
uv run agents/summarizer_agent.py
```

Listens on port 8001 and exposes the `/agent-card` endpoint.

Start the ResearchAgent:

```bash
uv run agents/research_agent.py
```

Listens on port 8000 and delegates to the SummarizerAgent on port 8001.

Run the CLI client:

```bash
uv run cli/client.py
```

Or launch the Streamlit UI:

```bash
uv run streamlit run frontend/app.py
```

| Reason | Explanation |
|---|---|
| Standardization | A2A provides a common language for agents to communicate, regardless of implementation details |
| Interoperability | Agents built by different teams can discover and work with each other via standardized endpoints |
| Capability Discovery | No hardcoded URLs or dependencies—agents advertise what they can do via Agent Cards |
| Scalability | Each agent can be scaled independently (e.g., 3 SummarizerAgents behind a load balancer) |
| Fault Isolation | If SummarizerAgent fails, ResearchAgent can retry or fail gracefully without crashing |
| Modularity | ResearchAgent focuses on data retrieval, SummarizerAgent on LLM processing—separation of concerns |
```
# ResearchAgent fetches SummarizerAgent's capabilities before delegating
GET http://localhost:8001/agent-card
```

Returns JSON metadata advertising skills such as `text_summarization` and `insight_extraction`.
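In practice the ResearchAgent fetches this JSON over HTTP and checks the skills array before delegating. A minimal sketch of that check, with a made-up card payload (the field names mirror the Agent Card description, but the exact shape is an assumption):

```python
# Hypothetical agent card payload, mirroring the JSON served at /agent-card
card = {
    "name": "SummarizerAgent",
    "url": "http://localhost:8001",
    "version": "1.0.0",
    "skills": ["text_summarization", "insight_extraction", "json_processing"],
}

def supports(card: dict, skill: str) -> bool:
    """Return True if the advertised card lists the required skill."""
    return skill in card.get("skills", [])

has_summarizer = supports(card, "text_summarization")
print(has_summarizer)  # True
```

Because delegation is gated on the advertised skill rather than a hardcoded assumption, swapping in a different summarizer only requires that its card advertise the same capability.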
```
# ResearchAgent creates a Task with arXiv papers as input artifact
POST http://localhost:8001/tasks/send
{
  "task": {
    "id": "task_001",
    "status": "SUBMITTED",
    "input_artifacts": [
      {"artifact_type": "papers", "data": [...]}
    ]
  }
}
```

- SummarizerAgent receives task → Status: WORKING
- Processes via Groq LLM → Generates summary
- Updates task → Status: COMPLETED
- ResearchAgent polls `/tasks/{task_id}` or receives an immediate response (sync mode)
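The polling side of this lifecycle can be sketched as a loop that re-fetches the task until it reaches a terminal state. Here `fetch_task` is a stand-in for the HTTP GET to `/tasks/{task_id}`; the function names and intervals are illustrative.

```python
import time

# Simulated task store: a real client would GET /tasks/{task_id} instead.
_states = iter(["SUBMITTED", "WORKING", "WORKING", "COMPLETED"])

def fetch_task(task_id: str) -> dict:
    return {"id": task_id, "status": next(_states)}

def poll_until_done(task_id: str, interval: float = 0.01, max_polls: int = 50) -> dict:
    """Poll until the task reaches a terminal state (COMPLETED or FAILED)."""
    for _ in range(max_polls):
        task = fetch_task(task_id)
        if task["status"] in ("COMPLETED", "FAILED"):
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish")

final = poll_until_done("task_001")
print(final["status"])  # COMPLETED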
ResearchAgent combines:
- Original papers (from arXiv)
- Generated summary (from SummarizerAgent)
- Follow-up questions (from SummarizerAgent)
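The aggregation step above amounts to merging the original papers with the summarizer's output artifact. A sketch, where the `summary`/`questions` field names inside the artifact are assumptions for illustration:

```python
def aggregate(papers: list, summary_artifact: dict) -> dict:
    """Combine arXiv papers with the summarizer's output artifact."""
    return {
        "papers": papers,
        "summary": summary_artifact.get("summary", []),
        "follow_up_questions": summary_artifact.get("questions", []),
    }

papers = [{"title": "Attention Is All You Need", "id": "1706.03762"}]
summary_artifact = {
    "summary": ["Introduces the Transformer architecture."],
    "questions": ["How does self-attention scale with sequence length?"],
}
result = aggregate(papers, summary_artifact)
print(result["follow_up_questions"][0])
```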
| Approach | Direct LLM Call | A2A Delegation |
|---|---|---|
| Coupling | High – ResearchAgent must know LLM details | Low – ResearchAgent only knows SummarizerAgent's interface |
| Scaling | ResearchAgent becomes bottleneck | Can add more SummarizerAgents independently |
| Flexibility | Hard to switch from Groq to OpenAI | Easy – just point to different SummarizerAgent |
| Complexity | Monolithic, harder to maintain | Modular, easier to test and deploy |
| Failure Handling | Single point of failure | Retry logic at delegation layer |
| Expertise Separation | One agent does everything | Research focuses on retrieval, Summarizer on LLM |
Key Insight:
ResearchAgent shouldn't care how summarization happens—only that a capable agent exists to do it. This is the delegation pattern in distributed systems.
| Component | Purpose | Implementation Details |
|---|---|---|
| Agent Card | Capability advertisement | JSON served at GET /agent-card; includes skills array (e.g., text_summarization), name, url, version |
| Task | Work unit with lifecycle | Pydantic model with id, status (SUBMITTED, WORKING, COMPLETED, FAILED), input_artifacts, output_artifacts, timestamps |
| TaskSendRequest | Task submission wrapper | Pydantic model wrapping Task with sync_mode boolean |
| Artifact | Typed data container | Contains artifact_type (papers, summary, json, text), data, optional description and metadata |
| Status Updates | Lifecycle management | SummarizerAgent updates status internally; ResearchAgent polls /tasks/{task_id} or receives immediate response |
- Accept user queries via `POST /tasks/send`
- Query arXiv API for academic papers
- Create Task with papers artifact
- Delegate to SummarizerAgent (port 8001)
- Aggregate results (papers + summary)
- Return final response to client
Skills: `arxiv_search`, `research_orchestration`
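The responsibilities listed above form a simple pipeline. A sketch of the orchestration, with stand-ins for the arXiv query and the HTTP delegation step (all function names here are hypothetical, not the project's API):

```python
# Stand-in for the arXiv API query
def search_arxiv(query: str) -> list:
    return [{"title": f"Paper about {query}", "id": "0000.00000"}]

# Stand-in for POST /tasks/send to the SummarizerAgent
def delegate_to_summarizer(task: dict) -> dict:
    task["status"] = "COMPLETED"
    task["output_artifacts"] = [{"artifact_type": "summary",
                                 "data": ["one key finding"]}]
    return task

def handle_query(query: str) -> dict:
    """Query arXiv, wrap results in a Task, delegate, then aggregate."""
    papers = search_arxiv(query)
    task = {"id": "task_001", "status": "SUBMITTED",
            "input_artifacts": [{"artifact_type": "papers", "data": papers}]}
    done = delegate_to_summarizer(task)
    return {"papers": papers,
            "summary": done["output_artifacts"][0]["data"]}

result = handle_query("multi-agent systems")
print(result["summary"])  # ['one key finding']
```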
- Accept Task with papers artifact
- Extract text from paper metadata
- Call Groq LLM for summarization
- Generate summary points and follow-up questions
- Update Task status to COMPLETED
- Return Task with summary artifact
Skills: `text_summarization`, `insight_extraction`, `json_processing`
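The Groq call itself requires an API key, but the step of turning paper metadata into an LLM prompt can be sketched independently. The prompt wording and `build_prompt` helper below are illustrative assumptions, not the project's actual prompt:

```python
def build_prompt(papers: list) -> str:
    """Build a summarization prompt from paper metadata (illustrative format)."""
    lines = ["Summarize the following papers and suggest follow-up questions:", ""]
    for p in papers:
        # Truncate abstracts so the prompt stays within model context limits
        lines.append(f"- {p['title']}: {p.get('abstract', '')[:200]}")
    return "\n".join(lines)

prompt = build_prompt([
    {"title": "Attention Is All You Need",
     "abstract": "We propose the Transformer, a model architecture..."},
])
print(prompt.splitlines()[2])
```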
Shows both ResearchAgent (port 8000) and SummarizerAgent (port 8001) starting and exposing their respective endpoints
Demonstrates A2A protocol's discovery mechanism—ResearchAgent fetching SummarizerAgent's advertised capabilities via /agent-card endpoint
ResearchAgent Card (Port 8000):

SummarizerAgent Card (Port 8001):

Clean web UI for entering research topics—replaces CLI with an interactive interface while maintaining the same A2A backend architecture
HTTP POST request from ResearchAgent to SummarizerAgent showing Task submission with papers artifact and COMPLETED response with summary artifact
Final aggregated output showing: arXiv papers with metadata, LLM-generated summary bullet points, and follow-up questions
Watch the demo
Complete walkthrough of the A2A protocol in action: discovery, delegation, and aggregation.
Click the preview below to download the video (~21 MB) and watch it locally.
- Python 3.10+ – Core language
- FastAPI – High-performance HTTP servers
- Pydantic – Data validation and serialization
- Groq – High-speed LLM inference
- arXiv API – Academic paper source
- HTTPX – Async HTTP client
- Uvicorn – ASGI server
- Streamlit – Optional web interface
- Python-dotenv – Environment configuration
- UV – Modern Python package manager
- Agent Registry (centralized discovery)
- Async Callbacks (true async task completion)
- Authentication (JWT-based security)
- Multi-Agent Orchestration (CitationAgent, FactCheckAgent, etc.)
- Docker Deployment
- Kubernetes Scaling for SummarizerAgent
- Google for the A2A Protocol specification
- arXiv.org for open access to academic papers
- Groq for fast LLM inference
- Streamlit team for the excellent web interface
This is a demonstration project showcasing the A2A protocol.
For production use, implement proper authentication, rate limiting, and error handling.
This project is licensed under the MIT License.
See the LICENSE.md file for full details.
Because why build one monolithic AI… when you can orchestrate an army?







