Add PDF text extraction feature using PyPDF2 #85

haragam22 · 2025-10-19T11:57:32Z

Summary

This pull request introduces a Python script that reads all .pdf files from a folder and extracts their text into corresponding .txt files using the PyPDF2 library. This resolves issue #81.

Changes Made

Added main.py for reading and processing PDF files.
Created a pdfs/ folder for input PDFs.
Created an output/ folder for saving extracted text files.
Added a requirements.txt file listing dependencies.

How It Works

The script scans the pdfs/ folder for .pdf files.
For each PDF, it extracts text using PyPDF2.PdfReader.
It writes the extracted text to a corresponding .txt file in the output/ folder.
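
The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of main.py from this pull request: the function names extract_pdf_text and extract_folder, and the injectable extract parameter, are assumptions made for the example. The only PyPDF2 calls used are PdfReader, its pages attribute, and page.extract_text().

```python
from pathlib import Path
from typing import Callable, List


def extract_pdf_text(pdf_path: Path) -> str:
    """Return the text of every page in one PDF, joined by newlines."""
    # Imported lazily so the folder handling below can run without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(pdf_path))
    # Image-only pages may produce no text, so fall back to an empty string.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def extract_folder(
    input_dir: Path = Path("pdfs"),
    output_dir: Path = Path("output"),
    extract: Callable[[Path], str] = extract_pdf_text,  # hypothetical hook for swapping the extractor
) -> List[Path]:
    """Write one .txt file per .pdf found in input_dir and return the written paths."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        # report.pdf becomes output/report.txt
        txt_path = output_dir / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling extract_folder() with no arguments would process pdfs/ and write to output/, matching the folder layout described in this pull request. Passing the extractor as a parameter is a design choice for the sketch only; it lets the folder logic be checked without real PDF files.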

Commands Used

git checkout -b feature/pdf-text-extraction
git add .
git commit -m "Add PDF text extraction script using PyPDF2"
git push -u origin feature/pdf-text-extraction

Notes

Works best for text-based PDFs.
Future improvement: integrate OCR for scanned PDFs.
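
One possible first step toward the OCR improvement is detecting which PDFs yielded almost no text. The sketch below is a heuristic under stated assumptions and not part of this pull request: the function name probably_scanned and the 20-character threshold are hypothetical and would need tuning against real documents.

```python
from typing import Iterable


def probably_scanned(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Guess whether a PDF is image-based from its per-page extracted text.

    min_chars is an assumed threshold: a page with fewer non-whitespace-trimmed
    characters than this counts as "sparse".
    """
    texts = list(page_texts)
    if not texts:
        # No pages means there is nothing to send to OCR.
        return False
    sparse = sum(1 for text in texts if len((text or "").strip()) < min_chars)
    # Flag the document when more than half of its pages are sparse.
    return sparse > len(texts) / 2
```

A caller could collect page.extract_text() for each page, pass the list to this function, and log the files it flags so they can be routed to OCR once that is integrated.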

devmalik7 · 2025-10-23T08:42:40Z

@haragam22, great work, merging it now.

haragam22 added 2 commits October 19, 2025 17:21

Initial commit: PDF text extractor script

43a055b

Merge branch 'devmalik7:main' into main

9c1a7b4

devmalik7 assigned haragam22 Oct 23, 2025

devmalik7 added the good first issue, hacktoberfest-accepted, hacktoberfest, and python labels Oct 23, 2025

devmalik7 approved these changes Oct 23, 2025

devmalik7 merged commit 9f99f72 into devmalik7:main Oct 23, 2025
