A web application that allows users to upload PDF files and ask questions about their content, with answers generated by the `microsoft/Phi-3-mini-4k-instruct-fast` model via the Hugging Face Inference API. Answers are strictly based on the PDF text, and unrelated questions return "This question is outside the PDF content." Ideal for technical use cases like analyzing API documentation or test plans.
- Upload PDFs and extract text using `pdf.js`.
- Ask questions about the PDF, receiving precise, detailed answers with quotes/references.
- Rejects unrelated questions to ensure accuracy.
- No input/output restrictions for handling large PDFs.
- Frontend hosted on GitHub Pages, backend on Render’s free plan.
- Frontend: React, pdf.js, Tailwind CSS, Babel, GitHub Pages
- Backend: FastAPI, Python 3.10, requests, Pydantic, Uvicorn, Render
- AI: Hugging Face Inference API, `microsoft/Phi-3-mini-4k-instruct-fast`
- DevOps: Docker, GitHub
- Security: Environment variables for API key, CORS for frontend access
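As a sketch of how the backend might call the model: the endpoint URL, prompt wording, and generation parameters below are illustrative assumptions, not taken from this project's `main.py`.

```python
import os

import requests

# Assumed Inference API endpoint for the model; verify it against your Hugging Face setup.
API_URL = "https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-4k-instruct-fast"

REFUSAL = "This question is outside the PDF content."


def build_prompt(pdf_text: str, user_input: str) -> str:
    """Instruct the model to answer strictly from the extracted PDF text."""
    return (
        "Answer the question using only the PDF text below. "
        f'If the answer is not in the text, reply exactly: "{REFUSAL}"\n\n'
        f"PDF text:\n{pdf_text}\n\nQuestion: {user_input}\nAnswer:"
    )


def ask_model(pdf_text: str, user_input: str) -> str:
    """Send the prompt to the Hugging Face Inference API (network call)."""
    headers = {"Authorization": f"Bearer {os.environ['HF_API_KEY']}"}
    payload = {
        "inputs": build_prompt(pdf_text, user_input),
        "parameters": {"max_new_tokens": 512, "return_full_text": False},
    }
    resp = requests.post(API_URL, headers=headers, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()[0]["generated_text"].strip()
```

`max_new_tokens` and `return_full_text` are standard text-generation task parameters of the Inference API; adjust them to match the actual backend.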
- GitHub Account: For hosting frontend and backend repositories.
- Render Account: For deploying the backend (free plan).
- Hugging Face Account: For API access to `microsoft/Phi-3-mini-4k-instruct-fast`.
  - Generate an API key (`HF_API_KEY`) from https://huggingface.co/settings/tokens.
  - Free tier (~10k tokens/day) may hit rate limits with large PDFs; consider a Pro plan ($9/month).
- Docker: For local testing (optional).
- Git: For version control.
- `chat-with-pdf` repository: Frontend (React, hosted on GitHub Pages)
  - `index.html`: Main file with UI and logic
- `chat-with-pdf-backend` repository: Backend (FastAPI, hosted on Render)
  - `main.py`: FastAPI app with the `/api/chat` endpoint
  - `Dockerfile`: Containerizes the backend
  - `requirements.txt`: Python dependencies
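A minimal `Dockerfile` for this layout might look like the following sketch; the base image and start command are assumptions, not the project's actual file.

```dockerfile
# Python base matching the stack described above
FROM python:3.10-slim

WORKDIR /app

# requirements.txt is assumed to list fastapi, uvicorn, requests, and pydantic
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .

# Port 8000 matches the FastAPI startup logs mentioned in the Render setup below
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```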
- Create Repository:
  - Create a public GitHub repository named `chat-with-pdf`.
- Add `index.html`:
  - Copy the `index.html` file from this project to the repository root.
  - Update the `fetch` URL in `handleSendMessage` (line ~70) to your Render backend URL (e.g., `https://chat-with-pdf-backend.onrender.com/api/chat`).
- Enable GitHub Pages:
  - Go to repository Settings > Pages.
  - Set Source to `Deploy from a branch`, select the `main` branch and the `/ (root)` folder.
  - Save and note the URL (e.g., `https://your-username.github.io/chat-with-pdf/`).
- Push Changes:
  - Run `git add index.html`, `git commit -m "Add frontend"`, and `git push origin main`.
  - GitHub Pages will deploy the frontend (may take a few minutes).
- Create Repository:
  - Create a public GitHub repository named `chat-with-pdf-backend`.
- Add Backend Files:
  - Copy `main.py`, `Dockerfile`, and `requirements.txt` to the repository root.
  - In `main.py`, ensure `allow_origins` includes your GitHub Pages URL (e.g., `https://your-username.github.io`).
- Set Up Render:
  - Sign in to Render (https://render.com).
  - Create a new Web Service, selecting your `chat-with-pdf-backend` repository.
  - Configure:
    - Runtime: Docker
    - Region: Choose the closest (e.g., Oregon)
    - Plan: Free
    - Environment Variables:
      - Key: `HF_API_KEY`
      - Value: Your Hugging Face API key (`hf_xxxxxxxxxxxxxxxxxxxxxxxx`)
  - Deploy the service.
- Verify Deployment:
- Check Render logs to confirm FastAPI starts on port 8000.
- Note the Render URL (e.g., `https://chat-with-pdf-backend.onrender.com`).
- Access Frontend:
  - Visit your GitHub Pages URL (e.g., `https://your-username.github.io/chat-with-pdf/`).
- Upload PDF:
- Upload a PDF (e.g., API documentation).
- Verify the system message: “PDF uploaded successfully. You can now ask questions about the content.”
- Ask Questions:
- PDF-related (e.g., “What are the endpoint parameters?”): Expect a detailed response with PDF quotes.
- Unrelated (e.g., “What is Python?”): Expect “This question is outside the PDF content.”
- Debug with Curl:
  - Run: `curl -X POST https://your-render-service.onrender.com/api/chat -H "Content-Type: application/json" -d '{"pdfText":"The endpoint requires a JWT token.","userInput":"What does the endpoint need?"}'`
  - Expected: `{"reply": "The endpoint requires a JWT token."}`
  - Run: `curl -X POST https://your-render-service.onrender.com/api/chat -H "Content-Type: application/json" -d '{"pdfText":"The endpoint requires a JWT token.","userInput":"What is Java?"}'`
  - Expected: `{"reply": "This question is outside the PDF content."}`
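The two curl checks above can also be scripted. `BACKEND_URL` is a placeholder for your Render URL, not a real endpoint.

```python
import requests

BACKEND_URL = "https://your-render-service.onrender.com/api/chat"  # placeholder


def make_payload(pdf_text: str, user_input: str) -> dict:
    """Request body matching the /api/chat schema used in the curl examples."""
    return {"pdfText": pdf_text, "userInput": user_input}


def check(pdf_text: str, user_input: str) -> str:
    """POST one question to the backend and return the model's reply (network call)."""
    resp = requests.post(BACKEND_URL, json=make_payload(pdf_text, user_input), timeout=30)
    resp.raise_for_status()
    return resp.json()["reply"]


if __name__ == "__main__":
    pdf = "The endpoint requires a JWT token."
    print(check(pdf, "What does the endpoint need?"))
    print(check(pdf, "What is Java?"))  # should print the refusal message
```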
- Hugging Face Free Tier: Limited to ~10k tokens/day. Large PDFs (e.g., 10,000 characters) + questions + 512-token responses can consume ~5,000-10,000 tokens per request, hitting limits quickly.
- Recommendation: Use a Hugging Face Pro plan (~$9/month) or compute credits for reliable access.
- Monitor: Check usage at https://huggingface.co/settings/billing.
- Mitigation: If rate limits occur, add temporary truncation in `main.py`: `pdf_text = request.pdfText[:5000]` and `user_input = request.userInput[:1000]`.
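To sanity-check a request's budget before sending it, a rough character-based estimate can help. The 4-characters-per-token ratio below is a common rule of thumb for English text, not a measured property of this model.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


def request_budget(pdf_text: str, user_input: str, max_new_tokens: int = 512) -> int:
    """Approximate total tokens one request consumes (prompt plus response)."""
    return estimate_tokens(pdf_text) + estimate_tokens(user_input) + max_new_tokens
```

Running `request_budget` over a 10,000-character PDF shows how a handful of questions can exhaust a small daily quota.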
- Rate Limit Exceeded:
- Check Hugging Face Console for usage.
- Reduce `max_tokens` to 200 or add truncation.
- Upgrade to a Pro plan.
- Model Unavailable:
- Verify access to `microsoft/Phi-3-mini-4k-instruct-fast` on the Hugging Face Hub.
- Test with `meta-llama/Llama-3.2-8B-Instruct` as an alternative.
- Invalid API Key:
- Update `HF_API_KEY` in the Render dashboard.
- CORS Issues:
- Ensure `allow_origins` in `main.py` matches your GitHub Pages URL.
- Vague Responses:
- Share example PDF and response for prompt refinement.
- Fork the repositories (`chat-with-pdf`, `chat-with-pdf-backend`).
- Submit pull requests with improvements (e.g., prompt tuning, UI enhancements).
MIT License. See LICENSE file (add to repositories if needed).
For issues or feature requests, open a GitHub issue in either repository or contact the project maintainer.