SARA is an agentic research assistant that searches, extracts, and analyzes academic papers from arXiv using a multi-agent workflow. It combines local paper search, section extraction, vector-database (RAG) embedding, and web search to produce synthesized answers to user queries.

## Features
- Search arXiv for papers using custom queries (via arxivxplorer.com)
- Extract structured sections from arXiv papers (using ar5iv.org HTML)
- Embed paper sections into a LanceDB vector database using Ollama embeddings
- Perform RAG (Retrieval-Augmented Generation) search over local papers
- Supplement answers with DuckDuckGo web search
- Multi-agent workflow: Librarian, Web Researcher, Lead Analyst
- Streamlit-based interactive chat UI
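At its core, the RAG step is a nearest-neighbour lookup over embedded paper sections: the query is embedded, compared against stored section embeddings, and the closest sections are fed to the LLM. A minimal, dependency-free sketch of that ranking step (in the real app, Ollama produces the embeddings and LanceDB does the search; the helper names here are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, sections, top_k=3):
    """Rank stored (section_text, embedding) pairs by similarity to the query."""
    scored = [(cosine(query_vec, emb), text) for text, emb in sections]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```

LanceDB performs the same ranking internally with an approximate index, so the app never has to scan every section linearly.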
## Project Structure

- `agent.py`: main Streamlit app, agent logic, RAG, embedding, and chat interface
- `search_arxiv.py`: scrapes arxivxplorer.com for papers matching a query
- `extract_paper_sections.py`: extracts and saves structured sections from arXiv papers using ar5iv.org
## Installation

- Download and install Miniconda or Anaconda from conda.io.
- Create and activate a conda environment:

  ```
  conda create -n arxiv_agent python=3.10
  conda activate arxiv_agent
  ```

- Download and install Ollama from ollama.com/download.
- Start the Ollama server by running `ollama serve` in a terminal.
- Install Python dependencies and the Playwright browsers:

  ```
  pip install -r requirements.txt
  python -m playwright install
  ```

- Launch the app:

  ```
  streamlit run agent.py
  ```

## Usage

- Use `/search [topic]` to search arXiv and build the local knowledge base.
- Use `/analysis [question]` to analyze and synthesize answers using local and web data.
- Or just chat for web-based answers.
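The `/search` and `/analysis` prefixes are ordinary chat messages that the app inspects before dispatching. A sketch of how that routing might look (the actual dispatch lives in `agent.py`; the function name and fallthrough behavior here are assumptions):

```python
def route(message):
    """Split a chat message into a (command, argument) pair.

    Messages without a known slash prefix fall through to plain chat.
    """
    for cmd in ("/search", "/analysis"):
        if message.startswith(cmd):
            return cmd.lstrip("/"), message[len(cmd):].strip()
    return "chat", message.strip()
```

For example, `route("/search diffusion models")` yields `("search", "diffusion models")`, while an ordinary question yields `("chat", ...)` and goes straight to the LLM.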
## Notes

- Ollama must be running for LLM and embedding features.
- LanceDB is used for vector storage (no external database setup needed).
- All data is stored locally in the `arxiv_data` folder.
See `requirements.txt` for the full list of Python dependencies.
## Troubleshooting

- If scraping fails, check the Playwright browser installation.
- If embeddings fail, ensure Ollama is running and the required models are pulled (e.g., `ollama pull phi3:3.8b` and `ollama pull embeddinggemma:latest`).
- On Windows, run commands in Anaconda Prompt or PowerShell.
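Since most embedding failures come down to the Ollama server not running, a quick programmatic check can help: by default Ollama listens on `localhost:11434` and answers plain HTTP requests. A small stdlib-only probe (the helper name is an assumption; the port is Ollama's documented default):

```python
from urllib.request import urlopen
from urllib.error import URLError

def ollama_is_up(url="http://localhost:11434", timeout=2):
    """Return True if an Ollama server answers on its default port."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False
```

Calling this before embedding lets the app show a clear "start Ollama first" message instead of a raw connection error.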
## License

MIT License
## Examples

Below are example scenarios and screenshots illustrating how SARA works in practice:
When you search for papers, SARA finds relevant arXiv papers, splits them into sections, and embeds each section into the knowledge base:
Papers are searched, split into sections, and embedded into LanceDB for semantic retrieval.
When you use `/analysis`, SARA first searches the local knowledge base for relevant information, then generates targeted web queries based on both your question and the local findings. All results are synthesized into a structured answer:
The agent first searches the knowledge base, then uses the findings to generate web queries, and finally combines all information for a comprehensive answer.
Final output: A well-structured, referenced answer combining local and web knowledge.
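The synthesis step ultimately hands both evidence pools to the LLM in a single prompt. A hedged sketch of that assembly (the prompt wording and function name are illustrative, not the app's actual template):

```python
def build_synthesis_prompt(question, local_hits, web_snippets):
    """Combine knowledge-base excerpts and web results into one LLM prompt."""
    lines = [f"Question: {question}", "", "Local paper excerpts:"]
    lines += [f"- {hit}" for hit in local_hits] or ["- (none found)"]
    lines += ["", "Web search results:"]
    lines += [f"- {snip}" for snip in web_snippets] or ["- (none found)"]
    lines += ["", "Write a structured, referenced answer using the sources above."]
    return "\n".join(lines)
```

Keeping the two source lists visibly separate in the prompt makes it easier for the model to attribute claims to local papers versus the web.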
You can also chat directly with the LLM, without using any agentic workflow:
Direct conversation with the LLM
In normal chat mode, SARA can also perform web searches to provide up-to-date information:
The LLM supplements its answers with real-time web search results for current topics.