A personalized daily paper recommendation system that learns from your preferences and feedback to suggest relevant papers from ArXiv. Built with FastAPI (backend) and React (frontend).
- Personalized Recommendations: Machine learning algorithm that learns from your likes, dislikes, and preferences
- Customizable Categories: Select from 123+ ArXiv categories and subcategories (cs.AI, cs.LG, math.*, physics.*, etc.)
- Smart Search: Search through all papers in your database
- Save Papers: Bookmark papers for later reading
- Learning System: Improves recommendations based on:
  - Your initial keywords
  - Papers you like/dislike
  - Semantic similarity using TF-IDF
  - Category preferences (for filtering, not scoring)
- Export: Export saved papers as JSON or CSV
- Fast & Responsive: Paginated results (100 papers per page) for quick loading
Your personalized paper recommendations, ranked by relevance to your interests
- Python 3.8 or higher
- Node.js 16 or higher
- npm or yarn
- Clone the repository:
      git clone https://github.com/jonstraveladventures/arxivrecommendations.git
      cd arxivrecommendations
- Install Python dependencies:
      pip install -r requirements.txt
- Install frontend dependencies:
      cd frontend
      npm install
      cd ..
- Start the backend server:
      cd backend
      python main.py
  The API will be available at http://localhost:8000
- Start the frontend (in a new terminal):
      cd frontend
      npm run dev
  The web interface will be available at http://localhost:3000
- Open your browser: Navigate to http://localhost:3000 to start using the recommender!
- Set Your Preferences:
  - Click "Preferences" in the header
  - Add keywords that describe your research interests (comma-separated)
  - Select ArXiv categories you want to follow
  - Click "Save Preferences"
- View Today's Papers:
  - Papers are automatically fetched from your selected categories
  - They're ranked by relevance score (shown as a percentage badge)
  - Click on a paper title to expand and see the full abstract
- Provide Feedback:
  - Like: Improves recommendations for similar papers
  - Dislike: Reduces similar papers in future recommendations
  - Save: Bookmark to read later (doesn't affect recommendations)
  - Skip: No opinion, just move on

  You can click both Like and Save on the same paper!
- Load More Papers:
  - Click "Load More" at the bottom to see additional papers
  - Papers are paginated (100 per page) for fast loading
- Search Papers:
  - Click "Search" to search through all papers in the database
  - Search by title, abstract, or authors
- View Saved Papers:
  - Click "Saved Papers" to see all your bookmarked papers
  - Export as JSON or CSV for reference
The system uses a multi-factor scoring approach:
- Base Score: 0.5 (neutral starting point)
- Keyword Matching (+0.05 per keyword, up to +0.3):
  - Matches your provided keywords against paper titles/abstracts
  - Also extracts keywords from papers you've liked
- Semantic Similarity (+0.3 max):
  - Uses TF-IDF vectorization to convert papers to numerical vectors
  - Builds a "user profile" from the average of all your liked papers
  - Computes cosine similarity between each paper and your profile
  - Papers similar to ones you like get higher scores
- Negative Feedback (-0.2 max):
  - Penalizes papers similar to ones you've disliked
  - Helps filter out unwanted content
- Ranking Boost (+0.1 max):
  - Top-ranked papers get a slight boost to maintain ordering
Note: Category matching is NOT used in scoring. Categories only determine which papers are fetched.
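
The factors listed above can be sketched in a few lines of Python. This is a minimal illustration, not the code in backend/recommender.py: the function names, the tokenizer, the additive combination clamped to the 0..1 range, and the use of an averaged profile for disliked papers are all assumptions. It also swaps scikit-learn for a small standard-library TF-IDF so that it runs without extra dependencies.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF dict per document (smoothed IDF)."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    df = Counter(term for counts in counts_per_doc for term in counts)
    vectors = []
    for counts in counts_per_doc:
        vec = {t: tf * (math.log((1 + n) / (1 + df[t])) + 1) for t, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def mean_vector(vectors):
    """Average several sparse vectors into one profile vector."""
    profile = {}
    for vec in vectors:
        for term, weight in vec.items():
            profile[term] = profile.get(term, 0.0) + weight / len(vectors)
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def relevance_score(paper, keywords, liked, disliked, rank_boost=0.0):
    """Combine the documented factors additively (assumed) and clamp to [0, 1]."""
    vecs = tfidf_vectors([paper] + liked + disliked)
    paper_vec = vecs[0]
    liked_vecs = vecs[1:1 + len(liked)]
    disliked_vecs = vecs[1 + len(liked):]

    score = 0.5  # neutral base score
    hits = sum(1 for kw in keywords if kw.lower() in paper.lower())
    score += min(0.05 * hits, 0.3)  # +0.05 per keyword, capped at +0.3
    if liked_vecs:
        score += 0.3 * cosine(paper_vec, mean_vector(liked_vecs))  # up to +0.3
    if disliked_vecs:
        score -= 0.2 * cosine(paper_vec, mean_vector(disliked_vecs))  # up to -0.2
    score += min(max(rank_boost, 0.0), 0.1)  # ranking boost, capped at +0.1
    return max(0.0, min(1.0, score))
```

Under these assumptions, a paper with no keyword matches and no overlap with any feedback stays at the neutral 0.5, which the interface would presumably show as a 50% badge.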
The system uses SQLite to store:
- Papers: Fetched papers with metadata
- User Preferences: Your keywords and selected categories
- User Feedback: Your likes, dislikes, saves, and skips
The database file (arxiv_recommender.db) is created automatically in the backend directory.
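
To see what the backend has stored, the database file can be inspected with Python's built-in sqlite3 module. The sketch below makes no assumption about the schema: it only lists whatever tables exist and counts their rows. The helper name table_row_counts is hypothetical, and the file is opened read-only so a missing database is reported as an error instead of being created empty.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in the SQLite file."""
    # Open read-only via a file: URI so the call never creates or modifies the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )]
        counts = {}
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier safely
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

From the repository root, calling table_row_counts("backend/arxiv_recommender.db") after the backend has run at least once should list the tables it created.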
- Click "Preferences"
- Scroll to "Select ArXiv Categories"
- Check the categories you want to include
- Click "Save Preferences"
- Click "Force Refresh" to fetch papers from your new categories
The system supports 123+ ArXiv categories organized by subject:
- Computer Science: cs.AI, cs.LG, cs.CV, cs.CL, cs.RO, etc.
- Mathematics: math.AG, math.AT, math.CO, etc.
- Statistics: stat.ML, stat.AP, stat.CO, etc.
- Physics: physics.optics, quant-ph, etc.
- Quantitative Biology: q-bio.*
- Quantitative Finance: q-fin.*
- Economics: econ.*
- Electrical Engineering: eess.*
See the full list in the Preferences panel!
arxivrecommendations/
├── backend/
│   ├── main.py                 # FastAPI application
│   ├── database.py             # Database setup
│   ├── models.py               # SQLAlchemy models
│   ├── arxiv_fetcher.py        # ArXiv API integration
│   ├── arxiv_categories.py     # ArXiv category definitions
│   └── recommender.py          # Recommendation algorithm
├── frontend/
│   ├── src/
│   │   ├── App.jsx             # Main React component
│   │   ├── App.css             # Styles
│   │   ├── main.jsx            # React entry point
│   │   └── index.css           # Global styles
│   ├── package.json
│   └── vite.config.js
├── requirements.txt            # Python dependencies
├── .gitignore
└── README.md
- GET /api/health - Health check
- GET /api/papers/today?page=1&per_page=100 - Get today's papers (paginated)
- GET /api/papers/search?q=query - Search papers
- GET /api/papers/saved - Get saved papers
- GET /api/papers/saved/export?format=json|csv - Export saved papers
- GET /api/categories - Get all available categories
- GET /api/categories/selected - Get user's selected categories
- GET /api/preferences - Get user preferences
- POST /api/preferences - Update preferences
- POST /api/feedback - Submit feedback on a paper
- GET /api/stats - Get user statistics
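
The GET endpoints can also be called from a script. The following is a small sketch using only the Python standard library; the helper names api_url and get_json are hypothetical, and because the response shapes are not documented here, the results are simply decoded as JSON without assuming any particular fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def api_url(path, **params):
    """Build a full endpoint URL, appending query parameters when given."""
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def get_json(path, **params):
    """Send a GET request to the backend and decode the JSON response."""
    with urlopen(api_url(path, **params), timeout=10) as resp:
        return json.load(resp)


# Example calls (these need the backend to be running):
#   get_json("/api/health")
#   get_json("/api/papers/today", page=1, per_page=100)
#   get_json("/api/papers/search", q="diffusion models")
```

The CSV export (format=csv) presumably does not return JSON, so it would need to be read as plain text instead of going through get_json.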
Edit backend/arxiv_categories.py and modify DEFAULT_SELECTED.
Edit backend/recommender.py to adjust how different factors contribute to the relevance score.
Edit backend/main.py to modify:
- days_back: How many days of papers to fetch (default: 1 for last 24 hours)
- max_results: Max papers per category (default: 2000, ArXiv API limit)
- per_page: Papers per page (default: 100)
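
To make the first two settings concrete, here is a sketch of how days_back and max_results could map onto a request to the public ArXiv API. Whether backend/arxiv_fetcher.py builds its requests this way is an assumption; the function names are hypothetical, and only the ArXiv query parameters (search_query, sortBy, sortOrder, start, max_results) come from the ArXiv API itself.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category, max_results=2000, start=0):
    """Build an ArXiv API URL for the newest submissions in one category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def is_recent(published, days_back=1, now=None):
    """Return True if a timezone-aware publication time falls inside the window."""
    now = now or datetime.now(timezone.utc)
    return published >= now - timedelta(days=days_back)
```

With the defaults, build_query_url("cs.AI") requests up to 2000 entries sorted newest first, and is_recent would then keep only those published within the last 24 hours.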
- Check Python version: python --version (should be 3.8+)
- Install dependencies: pip install -r requirements.txt
- Check if port 8000 is available

- Make sure backend is running on port 8000
- Install frontend dependencies: cd frontend && npm install
- Check browser console for errors (F12)
- Try hard refresh: Cmd+Shift+R (Mac) or Ctrl+Shift+R (Windows/Linux)

- Click "Force Refresh" to fetch new papers
- Check that you've selected at least one category in Preferences
- Check backend terminal for error messages
- Verify ArXiv API is accessible

- Check browser console for errors
- Verify backend is running: curl http://localhost:8000/api/categories
- Try refreshing the page
Backend:
cd backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend: The Vite dev server already auto-reloads on file changes.
To reset your database and start fresh:
rm backend/arxiv_recommender.db
# Restart the backend - it will create a new database

Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - feel free to use this project for your own ArXiv paper recommendations!
- Built with FastAPI
- Frontend with React and Vite
- Uses ArXiv API for paper fetching
- Recommendation algorithm uses scikit-learn for TF-IDF and similarity calculations
Potential improvements:
- Email notifications for daily papers
- More sophisticated ML models (e.g., transformer-based embeddings)
- User accounts and cloud sync
- Paper summaries using AI
- Collaborative filtering (recommendations based on similar users)
- Paper clustering and topic modeling
Enjoy discovering new papers!