ArXiv Paper Recommender

A personalized daily paper recommendation system that learns from your preferences and feedback to suggest relevant papers from ArXiv. Built with FastAPI (backend) and React (frontend).

Features

🎯 Personalized Recommendations: Machine learning algorithm that learns from your likes, dislikes, and preferences
📚 Customizable Categories: Select from 123+ ArXiv categories and subcategories (cs.AI, cs.LG, math.*, physics.*, etc.)
🔍 Smart Search: Search through all papers in your database
💾 Save Papers: Bookmark papers for later reading
📊 Learning System: Improves recommendations based on:
- Your initial keywords
- Papers you like/dislike
- Semantic similarity using TF-IDF
- Category preferences (for filtering, not scoring)
📄 Export: Export saved papers as JSON or CSV
⚡ Fast & Responsive: Paginated results (100 papers per page) for quick loading

Screenshots

Your personalized paper recommendations, ranked by relevance to your interests

Quick Start

Prerequisites

Python 3.8 or higher
Node.js 16 or higher
npm or yarn

Installation

Clone the repository:
```
git clone https://github.com/jonstraveladventures/arxivrecommendations.git
cd arxivrecommendations
```

Install Python dependencies:
```
pip install -r requirements.txt
```
Install frontend dependencies:
```
cd frontend
npm install
cd ..
```

Running the Application

Start the backend server:
```
cd backend
python main.py
```
The API will be available at http://localhost:8000
Start the frontend (in a new terminal):
```
cd frontend
npm run dev
```
The web interface will be available at http://localhost:3000
Open your browser: Navigate to http://localhost:3000 to start using the recommender!

Usage

Initial Setup

Set Your Preferences:
- Click "⚙️ Preferences" in the header
- Add keywords that describe your research interests (comma-separated)
- Select ArXiv categories you want to follow
- Click "Save Preferences"
View Today's Papers:
- Papers are automatically fetched from your selected categories
- They're ranked by relevance score (shown as a percentage badge)
- Click on a paper title to expand and see the full abstract
Provide Feedback:
- 👍 Like: Improves recommendations for similar papers
- 👎 Dislike: Reduces similar papers in future recommendations
- 💾 Save: Bookmark to read later (doesn't affect recommendations)
- ⏭️ Skip: No opinion, just move on
You can click both Like and Save on the same paper!
Load More Papers:
- Click "Load More" at the bottom to see additional papers
- Papers are paginated (100 per page) for fast loading
Search Papers:
- Click "🔍 Search" to search through all papers in the database
- Search by title, abstract, or authors
View Saved Papers:
- Click "💾 Saved Papers" to see all your bookmarked papers
- Export as JSON or CSV for reference

How It Works

Recommendation Algorithm

The system uses a multi-factor scoring approach:

Base Score: 0.5 (neutral starting point)
Keyword Matching (+0.05 per keyword, up to +0.3):
- Matches your provided keywords against paper titles/abstracts
- Also extracts keywords from papers you've liked
Semantic Similarity (+0.3 max):
- Uses TF-IDF vectorization to convert papers to numerical vectors
- Builds a "user profile" from the average of all your liked papers
- Computes cosine similarity between each paper and your profile
- Papers similar to ones you like get higher scores
Negative Feedback (-0.2 max):
- Penalizes papers similar to ones you've disliked
- Helps filter out unwanted content
Ranking Boost (+0.1 max):
- Top-ranked papers get a slight boost to maintain ordering

Note: Category matching is not used in scoring. Categories only determine which papers are fetched.
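The scoring factors above can be sketched in a few lines of Python. This is a minimal illustration, not the code in backend/recommender.py: it uses a small hand-rolled TF-IDF in place of scikit-learn so that it runs with only the standard library, and the names `tokenize`, `tfidf_vectors`, `cosine`, `mean_vector`, and `score_papers` are hypothetical. It leaves out the ranking boost and the extraction of keywords from liked papers, because this README does not say how those are computed, and it assumes the final score is clamped to the range 0 to 1.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Map each text to a {term: weight} dict using smoothed TF-IDF."""
    docs = [Counter(tokenize(t)) for t in texts]
    # Document frequency: the number of texts that contain each term.
    df = Counter(term for doc in docs for term in doc)
    n = len(docs)
    return [
        {term: tf * (math.log((1 + n) / (1 + df[term])) + 1)
         for term, tf in doc.items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Average several sparse vectors into one 'user profile' vector."""
    total = Counter()
    for vec in vectors:
        total.update(vec)  # Counter.update adds the weights term by term
    return {term: w / len(vectors) for term, w in total.items()}


def score_papers(papers, keywords, liked, disliked):
    """Return one relevance score in [0, 1] per paper text."""
    # Vectorize everything together so all texts share the same IDF weights.
    vecs = tfidf_vectors(papers + liked + disliked)
    paper_vecs = vecs[: len(papers)]
    liked_vecs = vecs[len(papers): len(papers) + len(liked)]
    disliked_vecs = vecs[len(papers) + len(liked):]
    profile = mean_vector(liked_vecs) if liked_vecs else {}

    scores = []
    for text, vec in zip(papers, paper_vecs):
        score = 0.5  # base score
        hits = sum(1 for kw in keywords if kw.lower() in text.lower())
        score += min(0.05 * hits, 0.3)  # keyword matching, capped at +0.3
        score += 0.3 * cosine(vec, profile)  # similarity to liked papers
        if disliked_vecs:
            # Penalize by the closest disliked paper, up to -0.2.
            score -= 0.2 * max(cosine(vec, d) for d in disliked_vecs)
        scores.append(min(max(score, 0.0), 1.0))  # assumed clamp to [0, 1]
    return scores
```

With no keywords and no feedback, every paper keeps the neutral base score of 0.5. A keyword hit or overlap with a liked paper moves the score up, and overlap with a disliked paper moves it down.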

Data Storage

The system uses SQLite to store:

Papers: Fetched papers with metadata
User Preferences: Your keywords and selected categories
User Feedback: Your likes, dislikes, saves, and skips

The database file (arxiv_recommender.db) is created automatically in the backend directory.
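To check what the backend has stored, the database file can be opened with Python's built-in sqlite3 module. The sketch below only lists table names, since the actual schema is defined in backend/models.py and is not described here. The helper name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example call, run from the repository root after the backend has started once:
# list_tables("backend/arxiv_recommender.db")
```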

Configuration

Selecting Categories

Click "⚙️ Preferences"
Scroll to "Select ArXiv Categories"
Check the categories you want to include
Click "Save Preferences"
Click "🔄 Force Refresh" to fetch papers from your new categories

Available Categories

The system supports 123+ ArXiv categories organized by subject:

Computer Science: cs.AI, cs.LG, cs.CV, cs.CL, cs.RO, etc.
Mathematics: math.AG, math.AT, math.CO, etc.
Statistics: stat.ML, stat.AP, stat.CO, etc.
Physics: physics.optics, quant-ph, etc.
Quantitative Biology: q-bio.*
Quantitative Finance: q-fin.*
Economics: econ.*
Electrical Engineering: eess.*

See the full list in the Preferences panel!

Project Structure

arxivrecommendations/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── database.py          # Database setup
│   ├── models.py            # SQLAlchemy models
│   ├── arxiv_fetcher.py     # ArXiv API integration
│   ├── arxiv_categories.py  # ArXiv category definitions
│   └── recommender.py       # Recommendation algorithm
├── frontend/
│   ├── src/
│   │   ├── App.jsx          # Main React component
│   │   ├── App.css          # Styles
│   │   ├── main.jsx         # React entry point
│   │   └── index.css        # Global styles
│   ├── package.json
│   └── vite.config.js
├── requirements.txt         # Python dependencies
├── .gitignore
└── README.md

API Endpoints

GET /api/health - Health check
GET /api/papers/today?page=1&per_page=100 - Get today's papers (paginated)
GET /api/papers/search?q=query - Search papers
GET /api/papers/saved - Get saved papers
GET /api/papers/saved/export?format=json|csv - Export saved papers
GET /api/categories - Get all available categories
GET /api/categories/selected - Get user's selected categories
GET /api/preferences - Get user preferences
POST /api/preferences - Update preferences
POST /api/feedback - Submit feedback on a paper
GET /api/stats - Get user statistics
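As a rough illustration of calling the GET endpoints from a script, the sketch below builds request URLs with the standard library and decodes the response. It assumes the backend is running on http://localhost:8000 and that responses are JSON, which this README does not spell out. The helper names `api_url` and `get_json` are hypothetical, and the POST endpoints are left out because their request bodies are not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # backend address from "Running the Application"


def api_url(path, **params):
    """Build a full URL for an API path, appending query parameters if any."""
    query = urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path, **params):
    """Send a GET request and decode the response body as JSON (assumed format)."""
    with urlopen(api_url(path, **params), timeout=10) as response:
        return json.load(response)


# Example calls (these need a running backend):
# get_json("/api/health")
# get_json("/api/papers/today", page=1, per_page=100)
# get_json("/api/papers/search", q="graph neural networks")
```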

Customization

Changing Default Categories

Edit backend/arxiv_categories.py and modify DEFAULT_SELECTED.

Adjusting Recommendation Weights

Edit backend/recommender.py to adjust how different factors contribute to the relevance score.

Changing Paper Fetch Settings

Edit backend/main.py to modify:

days_back: How many days of papers to fetch (default: 1 for last 24 hours)
max_results: Max papers per category (default: 2000, the ArXiv API's per-request limit)
per_page: Papers per page (default: 100)
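To show how days_back and max_results might translate into a fetch, here is a sketch built on the public ArXiv API query parameters (search_query, start, max_results, sortBy, sortOrder). It is not taken from backend/arxiv_fetcher.py. The function names `build_query_url` and `is_recent` are hypothetical, and filtering by publication date on the client side is an assumption about how a days_back window could be applied.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category, max_results=2000, start=0):
    """Build an ArXiv API URL for the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "start": start,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def is_recent(published, days_back=1, now=None):
    """Return True if a timezone-aware datetime falls within the last days_back days."""
    now = now or datetime.now(timezone.utc)
    return published >= now - timedelta(days=days_back)
```

Raising days_back widens the window of papers that pass the filter, while max_results caps how many entries a single request returns.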

Troubleshooting

Backend won't start

Check Python version: python --version (should be 3.8+)
Install dependencies: pip install -r requirements.txt
Check if port 8000 is available
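One quick way to check the port from Python is to try connecting to it. This is a small sketch using only the standard library, and the function name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Example: port_in_use(8000) returning True means another process holds the port.
```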

Frontend won't load

Make sure backend is running on port 8000
Install frontend dependencies: cd frontend && npm install
Check browser console for errors (F12)
Try hard refresh: Cmd+Shift+R (Mac) or Ctrl+Shift+R (Windows/Linux)

No papers showing

Click "🔄 Force Refresh" to fetch new papers
Check that you've selected at least one category in Preferences
Check backend terminal for error messages
Verify ArXiv API is accessible

Categories not loading

Check browser console for errors
Verify backend is running: curl http://localhost:8000/api/categories
Try refreshing the page

Development

Running with Auto-Reload

Backend:
```
cd backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

Frontend: The Vite dev server already auto-reloads on file changes.

Database Reset

To reset your database and start fresh:
```
rm backend/arxiv_recommender.db
# Restart the backend - it will create a new database
```

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - feel free to use this project for your own ArXiv paper recommendations!

Acknowledgments

Built with FastAPI
Frontend with React and Vite
Uses ArXiv API for paper fetching
Recommendation algorithm uses scikit-learn for TF-IDF and similarity calculations

Future Enhancements

Potential improvements:

Email notifications for daily papers
More sophisticated ML models (e.g., transformer-based embeddings)
User accounts and cloud sync
Paper summaries using AI
Collaborative filtering (recommendations based on similar users)
Paper clustering and topic modeling

Enjoy discovering new papers! 📚✨
