A personalized daily arXiv paper recommendation system powered by AI. Get relevant papers delivered to your inbox based on your research interests and Zotero library.
- Smart Filtering: Pre-filters papers by category and keywords
- LLM Evaluation: Uses OpenAI GPT models to intelligently match papers to your interests
- Zotero Integration: Learns from your existing research library
- Email Delivery: Sends daily digest with relevant papers, abstracts, and links
- Automated: Set it up once, get digests daily via cron/launchd
- Configurable: Customize categories, keywords, relevance threshold
- Python 3.7+
- pip
- OpenAI API key
- Gmail account (or other SMTP email)
- Zotero library export (optional but recommended)
- Clone the repository

    git clone https://github.com/yourusername/whatsup.git
    cd whatsup

- Install dependencies

    pip install -r requirements.txt

- Configure the system

    cp config.yaml.example config.yaml
    nano config.yaml  # Edit with your settings

Required configuration:
- Email SMTP settings (see docs/GMAIL_SETUP.md)
- OpenAI API key
- ArXiv categories and keywords
- Research interests description
- Zotero library path (see docs/ZOTERO_EXPORT.md)
- Run manually (test)

    python src/main.py

- Set up automation (optional)
See docs/CRON_SETUP.md for instructions on scheduling daily runs.
Find categories at https://arxiv.org/category_taxonomy
Common physics/astronomy categories:
- cond-mat.supr-con - Superconductivity
- cond-mat.mes-hall - Mesoscale and Nanoscale Physics
- physics.ins-det - Instrumentation and Detectors
- astro-ph.IM - Instrumentation and Methods
- quant-ph - Quantum Physics
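As a rough illustration of how a list of categories could translate into an arXiv API search string, the sketch below joins category codes with `OR` using the API's `cat:` field prefix. The helper name `build_category_query` is hypothetical and is not taken from `arxiv_client.py`; the project's actual query logic may differ.

```python
def build_category_query(categories):
    """Join arXiv category codes into a single search query string.

    Hypothetical helper for illustration only; not part of this project's code.
    """
    # Drop empty entries and stray whitespace before building the query.
    cleaned = [c.strip() for c in categories if c.strip()]
    if not cleaned:
        raise ValueError("at least one arXiv category is required")
    return " OR ".join(f"cat:{c}" for c in cleaned)


query = build_category_query(["cond-mat.supr-con", "quant-ph"])
print(query)
```

Narrower category lists produce smaller result sets, which in turn means fewer papers passed on to the LLM.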
Use keywords for pre-filtering to improve speed and reduce OpenAI API costs. Papers matching ANY keyword will be evaluated by the LLM.
Example:
    keywords:
      - superconductor
      - quasiparticle
      - THz detector
      - single photon

Papers are scored 0-10 by the LLM. The default threshold is 7.0. Adjust based on your needs:
- 5.0-6.0: More papers, some less relevant
- 7.0: Balanced (recommended)
- 8.0-9.0: Fewer papers, highly relevant only
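The "match ANY keyword" pre-filter could look roughly like the sketch below. It assumes a case-insensitive substring match over title and abstract, and it assumes that an empty keyword list disables the filter; neither detail is confirmed by the project code. The function name `matches_any_keyword` and the second paper title are made up for illustration.

```python
def matches_any_keyword(title, abstract, keywords):
    """Return True if any keyword occurs in the title or abstract.

    Assumption: matching is a case-insensitive substring test, and an
    empty keyword list lets every paper through.
    """
    if not keywords:
        return True
    text = f"{title} {abstract}".lower()
    return any(kw.lower() in text for kw in keywords)


keywords = ["superconductor", "quasiparticle", "THz detector", "single photon"]
papers = [
    {"title": "Quasiparticle Dynamics in Disordered Superconductors", "abstract": "..."},
    {"title": "Galaxy Clustering at High Redshift", "abstract": "..."},  # made-up title
]

# Only papers that pass the cheap keyword check would be sent to the LLM.
candidates = [p for p in papers if matches_any_keyword(p["title"], p["abstract"], keywords)]
print([p["title"] for p in candidates])
```

Because this check runs locally, it costs nothing per paper, which is why tightening keywords reduces API spend.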
Be specific about your current interests. The LLM uses this along with your Zotero library to evaluate papers.
Example:
    interests:
      description: |
        I study superconducting quantum devices, specifically:
        1. Athermal quasiparticle dynamics and trapping
        2. THz single-photon detectors for cosmology
        3. Microwave kinetic inductance detectors (MKIDs)
        4. Quantum sensing applications

Project structure:

    whatsup/
    ├── config.yaml              # Your configuration
    ├── config.yaml.example      # Configuration template
    ├── requirements.txt         # Python dependencies
    ├── README.md                # This file
    ├── src/
    │   ├── __init__.py
    │   ├── config_parser.py     # Load and validate config
    │   ├── zotero_parser.py     # Parse Zotero exports
    │   ├── arxiv_client.py      # Fetch papers from arXiv
    │   ├── llm_evaluator.py     # OpenAI-based evaluation
    │   ├── email_sender.py      # Send digest emails
    │   └── main.py              # Main orchestrator
    └── docs/
        ├── ZOTERO_EXPORT.md     # Zotero export guide
        ├── GMAIL_SETUP.md       # Gmail app password setup
        └── CRON_SETUP.md        # Automation scheduling

Run the digest manually:

    python src/main.py

With custom config file:

    python src/main.py /path/to/custom_config.yaml

See docs/CRON_SETUP.md for full instructions.
Quick example (runs daily at 8 AM):

    crontab -e

Add:

    0 8 * * * cd /path/to/whatsup && python src/main.py

The system will:
- Load your configuration
- Parse your Zotero library
- Fetch papers from arXiv (last 24 hours by default)
- Evaluate each paper with OpenAI
- Send email with relevant papers (score >= threshold)
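The last step of the list above (keep papers with score >= threshold) can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `select_relevant`, the dictionary layout, and the sample titles and scores are all hypothetical. Sorting by score is an extra assumption added for readability of the digest.

```python
def select_relevant(scored_papers, threshold=7.0):
    """Keep papers whose score meets the threshold, highest score first.

    Hypothetical helper; the dict keys "title" and "score" are assumptions.
    """
    kept = [p for p in scored_papers if p["score"] >= threshold]
    return sorted(kept, key=lambda p: p["score"], reverse=True)


# Made-up scores standing in for LLM output.
scored = [
    {"title": "Paper A", "score": 6.5},
    {"title": "Paper B", "score": 8.5},
    {"title": "Paper C", "score": 7.0},
]

for paper in select_relevant(scored):
    print(f'{paper["score"]:.1f}/10  {paper["title"]}')
```

A paper scoring exactly at the threshold is included, matching the `>=` comparison described above.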
Example email:
ArXiv Digest: 3 relevant papers - 2025-11-01
======================================================================
1. Quasiparticle Dynamics in Disordered Superconductors
Authors: Smith, J., Doe, A.
Published: 2025-10-31
Relevance: 8.5/10 - Directly addresses athermal
quasiparticle dynamics
URL: https://arxiv.org/abs/2510.12345
PDF: https://arxiv.org/pdf/2510.12345.pdf
Abstract:
We study the dynamics of athermal quasiparticles...
- Zotero Export Guide - How to export your library
- Gmail Setup Guide - Configure email sending
- Cron Setup Guide - Automate daily runs
Using gpt-4o-mini (recommended):
- ~$0.15 per 1M input tokens
- ~$0.60 per 1M output tokens
Typical daily run:
- 10-50 papers to evaluate
- ~1000 tokens per paper evaluation
- Estimated cost: $0.01-0.05 per day (~$1-2/month)
Using gpt-4o:
- ~10x more expensive
- Better quality, but usually unnecessary
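To make the arithmetic behind these estimates explicit, here is a small calculator using the gpt-4o-mini prices listed above. The split of roughly 1000 tokens per paper into 900 input and 100 output tokens is an assumption for illustration, as is the function name `estimate_daily_cost`; actual token counts depend on the prompt and on how much Zotero context is included.

```python
def estimate_daily_cost(papers, input_tokens_per_paper, output_tokens_per_paper,
                        input_price_per_million=0.15, output_price_per_million=0.60):
    """Estimate the daily API cost in USD for evaluating a number of papers.

    Default prices are the approximate gpt-4o-mini rates quoted in this README.
    """
    input_cost = papers * input_tokens_per_paper * input_price_per_million / 1_000_000
    output_cost = papers * output_tokens_per_paper * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Assumed split: 900 input + 100 output tokens per paper evaluation.
for papers in (10, 50):
    cost = estimate_daily_cost(papers, input_tokens_per_paper=900, output_tokens_per_paper=100)
    print(f"{papers} papers: ${cost:.4f}/day, ${cost * 30:.2f}/month")
```

Under this particular split the totals come out below the estimate quoted above; longer prompts or responses would raise them, so treat the quoted range as a conservative budget.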
- Use narrow arXiv categories
- Add specific keywords for pre-filtering
- Reduce max_days_back to 1
- Increase relevance threshold
- Check arXiv categories are correct
- Verify papers were published recently (within max_days_back)
- Remove or broaden keyword filters
- Lower the relevance threshold (try 6.0)
- Update your interests description to be more specific
- Check OpenAI API key is valid
- Verify Gmail app password (see docs/GMAIL_SETUP.md)
- Check spam folder
- Review email configuration in config.yaml
- Check logs for errors
- Ensure config.yaml exists (copy from config.yaml.example)
- Use absolute path or run from project directory
- Verify API key is correct
- Check you have API credits
- Check OpenAI service status
The code is modular:
- config_parser.py: Add new config sections
- zotero_parser.py: Support new export formats
- arxiv_client.py: Change arXiv query logic
- llm_evaluator.py: Modify evaluation prompt/scoring
- email_sender.py: Change email format
Test individual components:
    from src.config_parser import ConfigParser

    config = ConfigParser('config.yaml')
    print(config.get_arxiv_config())

Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests if applicable
- Submit a pull request
See LICENSE file for details.
- Built with arxiv Python package
- Powered by OpenAI
- Inspired by researchers tired of manually browsing arXiv
For issues, questions, or feature requests:
- Open an issue on GitHub
- Check existing documentation in docs/
Current version: 0.0.0
Happy paper hunting!