WIP: Backend PDF Parsing by harshkhandeparkar · Pull Request #152 · metakgp/iqps-go

harshkhandeparkar · 2026-04-19T05:02:26Z

Fixes #146

Description

WIP: Adds a testing-sandbox binary that takes a PDF, renders its pages to images, runs Tesseract OCR on each page, and then detects question papers (QPs) in the OCR output.
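To illustrate the detection step, here is a minimal sketch of how per-page OCR text could be grouped into separate QPs. It is not the code in this PR: the function name `split_into_papers` and the "Subject No" header heuristic are assumptions made only for illustration, and the input is assumed to be a list of already-OCRed page strings.

```python
import re
from typing import List, Pattern

# Hypothetical heuristic: a page that mentions "Subject No/Number/Code"
# is treated as the first page of a new question paper.
HEADER_RE = re.compile(r"subject\s*(no|number|code)", re.IGNORECASE)


def split_into_papers(
    page_texts: List[str], header_re: Pattern[str] = HEADER_RE
) -> List[List[int]]:
    """Group page indices into papers, starting a new group at each header page."""
    papers: List[List[int]] = []
    for index, text in enumerate(page_texts):
        # The very first page always opens a group, even without a header,
        # so that no page is silently dropped.
        if header_re.search(text) or not papers:
            papers.append([index])
        else:
            papers[-1].append(index)
    return papers
```

For example, a four-page PDF whose first and third pages contain a header line would be split into two papers with page indices `[0, 1]` and `[2, 3]`.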

- Detect multiple QPs in one PDF
- Remove OCR on the frontend, which takes too long, especially on mobile phones
  - The upload endpoint should create DB entries and dump all files in a directory
  - The PDF parser watches the uploaded papers directory for changes and parses new files one by one
  - If duplicate papers are found, add them as unapproved and show them on the admin dashboard
  - If the detected paper is not a duplicate, approve it directly (unless important details are missing, in which case also add it to the admin dashboard)
- FUTURE: Report paper feature for incorrect detections
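
The approval rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `DetectedPaper` fields (`course_code`, `year`, `exam`), the `Decision` type, and the idea of identifying duplicates by a `(course_code, year, exam)` key are hypothetical and not taken from this PR.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

PaperKey = Tuple[Optional[str], Optional[int], Optional[str]]


@dataclass(frozen=True)
class DetectedPaper:
    # Hypothetical "important details"; None means detection failed for that field.
    course_code: Optional[str]
    year: Optional[int]
    exam: Optional[str]

    def key(self) -> PaperKey:
        return (self.course_code, self.year, self.exam)

    def is_complete(self) -> bool:
        return None not in self.key()


@dataclass(frozen=True)
class Decision:
    approved: bool
    reason: str


def triage(paper: DetectedPaper, existing_keys: Set[PaperKey]) -> Decision:
    """Decide whether a parsed paper is auto-approved or sent to the admin dashboard."""
    if not paper.is_complete():
        # Missing details: leave unapproved for manual review.
        return Decision(approved=False, reason="missing_details")
    if paper.key() in existing_keys:
        # Duplicate: leave unapproved for manual review.
        return Decision(approved=False, reason="duplicate")
    return Decision(approved=True, reason="new_paper")
```

Completeness is checked before duplication here because a paper with missing fields cannot be compared reliably against existing entries; either outcome leaves the paper unapproved for the admin dashboard.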

vercel · 2026-04-19T05:02:31Z


Project	Deployment	Actions	Updated (UTC)
iqps	Ready	Preview, Comment	Apr 19, 2026 5:02am

feat[WIP]: basic pdf parsing stuff

c40931c

harshkhandeparkar marked this pull request as draft April 19, 2026 05:02