This project implements a multi-agent system using AutoGen that collaboratively searches for and downloads PDF files related to specific topics.
The system consists of three main agents:
- Research Agent: Searches for PDF files related to a given topic and compiles a list of URLs.
- Download Agent: Receives URLs from the Research Agent and handles the downloading process.
- User Proxy Agent: Coordinates the interaction between agents and manages the overall workflow.
 
The Multi-Agent PDF Discovery System is an AI-powered collaborative system that automatically searches for and downloads research PDFs on specified topics. It uses AutoGen's multi-agent framework to coordinate specialized agents, each handling one part of the PDF discovery and download process.
- User Proxy Agent
  - Acts as an intermediary between the user and other agents
  - Coordinates the workflow between Research and Download agents
  - Handles user input and system output
- Research Agent
  - Specializes in finding relevant PDF URLs
  - Uses semantic search to identify appropriate academic papers
  - Filters and ranks results based on relevance
  - Returns structured URL data with paper titles
- Download Agent
  - Handles PDF file downloading
  - Implements robust error handling
  - Manages file naming and storage
  - Reports download status and results
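
The hand-off from the Research Agent to the Download Agent can be pictured as a small, ranked list of records. The sketch below is illustrative only: the `PdfCandidate` class, its `relevance` score, the 0.5 cut-off, and the JSON layout are all assumptions made for this example, not the project's actual data format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class PdfCandidate:
    """Hypothetical record for one search hit (not taken from the project)."""
    title: str
    url: str
    relevance: float  # assumed score in [0, 1]; higher means more relevant


def rank_candidates(candidates: List[PdfCandidate], min_relevance: float = 0.5) -> List[PdfCandidate]:
    """Drop low-scoring hits, then sort the rest from most to least relevant."""
    kept = [c for c in candidates if c.relevance >= min_relevance]
    return sorted(kept, key=lambda c: c.relevance, reverse=True)


def to_message(candidates: List[PdfCandidate]) -> str:
    """Serialize titles and URLs as JSON text that one agent could send to another."""
    return json.dumps([{"title": c.title, "url": c.url} for c in candidates], indent=2)


hits = [
    PdfCandidate("Paper A", "https://example.org/a.pdf", 0.91),
    PdfCandidate("Paper B", "https://example.org/b.pdf", 0.32),
    PdfCandidate("Paper C", "https://example.org/c.pdf", 0.77),
]
ranked = rank_candidates(hits)
print(to_message(ranked))
```

Keeping the message as plain JSON means the receiving agent only needs `json.loads` to recover the list, whatever produced it.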
 
 
Technical specifications:
- Python Version: 3.8+
- Primary Framework: AutoGen v0.2.0
- Key Dependencies:
  - `pyautogen`: Multi-agent orchestration
  - `requests`: HTTP handling
  - `beautifulsoup4`: Web scraping
  - `python-dotenv`: Environment variable management
  - `hashlib`: Secure filename generation (part of the Python standard library, so no install is needed)
 
```python
# Agent Configuration
# config_list must be defined before this point.
assistant_config = {
    "seed": 42,
    "temperature": 0,
    "config_list": config_list,
    "timeout": 120
}

# System Parameters
MAX_CONVERSATIONS = 5
DOWNLOAD_TIMEOUT = 120
```
- API Key Management
  - OpenAI API keys stored in a `.env` file
  - Secure environment variable loading
  - No hardcoded credentials
- File Security
  - URL sanitization
  - Secure filename generation
  - Download validation checks
- Error Handling
  - Network timeout management
  - Invalid URL detection
  - Corrupt file checking
  - Rate limiting compliance
 
Workflow:
- Initialization

  ```python
  import autogen
  from dotenv import load_dotenv

  # Load environment variables
  load_dotenv()

  # Initialize agents
  user_proxy = autogen.UserProxyAgent(...)
  research_agent = autogen.AssistantAgent(...)
  download_agent = autogen.AssistantAgent(...)
  ```
- Research Phase
  - User provides research topic
  - Research Agent searches for relevant PDFs
  - Returns structured list of URLs and titles
- Download Phase
  - Download Agent processes each URL
  - Implements retry logic for failed downloads
  - Validates downloaded files
  - Reports success/failure status
- Error Management

  ```python
  import logging
  import requests

  try:
      response = requests.get(url, timeout=DOWNLOAD_TIMEOUT)
      response.raise_for_status()  # treat HTTP 4xx/5xx responses as failures
      # Download handling
  except requests.exceptions.RequestException as e:
      logging.error(f"Download failed: {e}")
  ```
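
The retry and validation steps of the download phase can be sketched as follows. This is a minimal illustration under stated assumptions: `download_with_retry`, `looks_like_pdf`, the three-attempt limit, and the doubling delay are hypothetical choices, and the actual HTTP call is passed in as a `fetch` function so the example runs without network access.

```python
import time
from typing import Callable, Optional


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: a PDF file normally begins with the '%PDF-' marker."""
    return data.startswith(b"%PDF-")


def download_with_retry(
    fetch: Callable[[str], bytes],
    url: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch(url) up to `attempts` times, doubling the wait after each failure."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            data = fetch(url)
            if not looks_like_pdf(data):
                raise ValueError("response does not look like a PDF")
            return data
        except (OSError, ValueError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error
```

With `requests`, `fetch` could be something like `lambda u: requests.get(u, timeout=DOWNLOAD_TIMEOUT).content`; `requests.exceptions.RequestException` derives from `IOError`, so the `OSError` clause above also catches it.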
 
Hardware requirements:
- CPU: Modern multi-core processor
- RAM: Minimum 4 GB
- Storage: Sufficient for PDF storage

Software prerequisites:
- Python 3.8 or higher
- pip package manager
- Virtual environment (recommended)
 
Installation:
- Environment Setup

  ```bash
  python -m venv venv
  source venv/bin/activate  # Unix/macOS
  pip install -r requirements.txt
  ```
- Configuration

  ```bash
  cp .env.example .env
  # Edit .env with your OpenAI API key
  ```
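
Once the key is in the environment, it has to reach the `config_list` that the agent configuration refers to. The sketch below shows one way this might be done; `build_config_list` is a hypothetical helper, and the default model name `gpt-4` is an assumption, since the document does not say which model the project uses. In the real script, `load_dotenv()` would run first so that the `.env` values are present in `os.environ`.

```python
import os
from typing import Dict, List, Mapping


def build_config_list(env: Mapping[str, str] = os.environ, model: str = "gpt-4") -> List[Dict[str, str]]:
    """Return a one-entry config list, or fail early if the API key is missing."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("E001: OPENAI_API_KEY is not set; check the .env file")
    # Each entry pairs a model name with the credentials used to call it.
    return [{"model": model, "api_key": api_key}]
```

Failing at startup with a clear message is usually easier to debug than an authentication error raised in the middle of an agent conversation.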
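To run the system, start a chat between the user proxy and the Research Agent:

```python
# Initialize the system
user_proxy.initiate_chat(
    research_agent,
    message="Find PDFs about artificial intelligence in healthcare"
)
```

The system logs at INFO level, with a timestamp and severity on every message: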
```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
```

| Error Code | Description | Resolution |
|---|---|---|
| E001 | API Key Missing | Check `.env` file |
| E002 | Download Failed | Check network, then retry |
| E003 | Invalid URL | Verify URL format |
| E004 | File Corruption | Retry download |
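
One way to keep these codes consistent in code is a small exception hierarchy that carries the code and the suggested resolution. The class names below are hypothetical and exist only for this sketch; the document does not show how the project actually raises or reports these errors.

```python
import logging


class PdfFinderError(Exception):
    """Hypothetical base class carrying an error code from the table above."""
    code = "E000"
    resolution = ""

    def __str__(self) -> str:
        detail = super().__str__()
        return f"{self.code}: {detail}" if detail else self.code


class ApiKeyMissing(PdfFinderError):
    code, resolution = "E001", "Check .env file"


class DownloadFailed(PdfFinderError):
    code, resolution = "E002", "Check network, then retry"


class InvalidUrl(PdfFinderError):
    code, resolution = "E003", "Verify URL format"


class FileCorruption(PdfFinderError):
    code, resolution = "E004", "Retry download"


def report(error: PdfFinderError) -> str:
    """Log the error with its suggested resolution and return the logged text."""
    text = f"{error} (resolution: {error.resolution})"
    logging.error(text)
    return text
```

A caller can then write `except PdfFinderError as err: report(err)` once, instead of formatting each code by hand.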
Performance:
- Implements rate limiting for API calls
- Chunked downloading for large files
- Efficient memory management
- Parallel download capabilities
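
Chunked and parallel downloading can be sketched with the standard library alone. The example below is an illustration, not the project's code: `copy_in_chunks`, `download_all`, the 64 KiB chunk size, and the four-worker pool are assumed names and values, and the per-URL download is passed in as a function so the sketch runs offline.

```python
import io
from concurrent.futures import ThreadPoolExecutor
from typing import BinaryIO, Callable, Dict, Iterable

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for your network and disk


def copy_in_chunks(source: BinaryIO, sink: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Stream source into sink one chunk at a time; return the total bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def download_all(urls: Iterable[str], download_one: Callable[[str], int], max_workers: int = 4) -> Dict[str, object]:
    """Run download_one(url) on a thread pool; map each URL to its result or its exception."""
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(download_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed URL should not abort the batch
                results[url] = exc
    return results


# Demo with in-memory streams instead of real network responses.
buffer = io.BytesIO()
written = copy_in_chunks(io.BytesIO(b"%PDF-1.7 demo"), buffer)
```

Because only one chunk is held in memory at a time, peak memory stays near `chunk_size` regardless of file size. With `requests`, the same pattern applies to `requests.get(url, stream=True)` combined with `Response.iter_content(chunk_size=...)`.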
 
Planned enhancements:
- Advanced PDF content filtering
- Multiple search engine support
- Machine learning-based relevance scoring
- Academic database integration
- Enhanced metadata extraction
 
Contributing:
- Fork the repository
- Create a feature branch
- Submit a pull request
- Follow the project's coding standards
 
MIT License - See LICENSE file for details
Quick start:
- Install the required dependencies: `pip install -r requirements.txt`
- Create a `.env` file in the project root with your OpenAI API key: `OPENAI_API_KEY=your_api_key_here`
- Run the script: `python pdf_finder_agents.py`

Project files:
- `pdf_finder_agents.py`: Main script containing the agent implementations
- `requirements.txt`: List of Python dependencies
- `pdf_downloads/`: Directory where downloaded PDFs are stored

Features:
- Collaborative multi-agent system using AutoGen
- Automated PDF discovery based on topics
- Coordinated downloading of found PDFs
- Error handling and download verification
- Clean separation of agent responsibilities