🔍 OCR Enhanced - Advanced Document Processing

A comprehensive OCR solution that combines local (Tesseract) and cloud (Mistral AI) processing with dynamic folder selection, searchable PDF generation, and hybrid processing modes.

✨ Features

🎯 Core Functionality

Hybrid OCR Processing: Try local first, fall back to cloud if needed
Multiple Engines: Tesseract (local) + Mistral AI (cloud)
Searchable PDFs: Generate PDFs with invisible text layer
Batch Processing: Handle multiple files efficiently
Dynamic Folders: Choose input/output directories via GUI

🔧 Processing Modes

🔄 Hybrid: Local first, cloud fallback (recommended)
☁️ Cloud Only: Mistral AI processing only
💻 Local Only: Tesseract processing only
🔒 Privacy: Force local processing (no data sent to cloud)
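
The hybrid mode described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: `OCRResult`, `Engine`, and `run_hybrid` are hypothetical names, and using a 0.75 confidence cutoff as the fallback trigger is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OCRResult:
    """Hypothetical result container; the real project may use another shape."""
    text: str
    confidence: float  # assumed to be in the range 0.0 to 1.0
    engine: str


# Hypothetical engine signature: takes a file path, returns an OCRResult.
Engine = Callable[[str], OCRResult]


def run_hybrid(path: str, local: Engine, cloud: Engine,
               threshold: float = 0.75) -> OCRResult:
    """Try the local engine first; call the cloud engine only if needed."""
    try:
        result = local(path)
        if result.confidence >= threshold:
            return result  # local result is good enough, nothing leaves the machine
    except Exception:
        # The local engine failed outright; fall through to the cloud engine.
        pass
    return cloud(path)
```

Passing the engines in as plain callables keeps the fallback rule independent of Tesseract and Mistral AI, so it can be exercised with stubs.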

🎨 User Experience

Modern GUI: Intuitive Tkinter interface with drag & drop
Real-time Progress: Detailed progress tracking and logging
Folder Selection: Choose custom input/output directories
Multi-format Output: JSON, Markdown, and searchable PDF
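
As a rough illustration of the JSON and Markdown outputs, the sketch below writes one file of each kind from a list of page texts. The function name `write_outputs`, the `{"pages": [...]}` JSON layout, and the per-page Markdown headings are all assumptions made for this example; the project's real output schema may differ.

```python
import json
from pathlib import Path


def write_outputs(pages, stem, output_dir):
    """Write <stem>.json and <stem>.md for a list of per-page text strings."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # JSON: keep non-ASCII characters (e.g. Portuguese accents) readable.
    json_path = out / f"{stem}.json"
    json_path.write_text(
        json.dumps({"pages": pages}, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )

    # Markdown: one section per page, numbered from 1.
    md_path = out / f"{stem}.md"
    sections = [f"## Page {i}\n\n{text.strip()}" for i, text in enumerate(pages, 1)]
    md_path.write_text("\n\n".join(sections) + "\n", encoding="utf-8")

    return json_path, md_path
```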

🚀 Quick Start

Installation

# Install from PyPI (recommended)
pip install ocr-enhanced

# Or install from source
git clone https://github.com/leo-dower/ocr-enhanced-projec.git
cd ocr-enhanced-projec
pip install -e .

System Dependencies

Ubuntu/Debian:

sudo apt install tesseract-ocr tesseract-ocr-por tesseract-ocr-eng poppler-utils

Windows:

Download Tesseract from UB-Mannheim
Install Poppler from conda-forge

macOS:

brew install tesseract poppler

Usage

GUI Application:

ocr-enhanced-gui

Command Line:

ocr-cli --input /path/to/pdfs --output /path/to/results

Python API:

from src.core import OCRProcessor

processor = OCRProcessor(mode='hybrid')
result = processor.process_file('document.pdf')
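
To process a whole folder through the Python API, a small loop like the one below may be enough. It is a sketch that relies only on the `process_file` method shown above; the helper name `process_folder` and its return shape are hypothetical, and nothing is assumed about what `process_file` returns.

```python
from pathlib import Path


def process_folder(processor, input_dir, pattern="*.pdf"):
    """Call processor.process_file on each matching file in input_dir.

    Returns two dicts keyed by file name: successful results and error messages.
    """
    results, errors = {}, {}
    for pdf in sorted(Path(input_dir).glob(pattern)):
        try:
            results[pdf.name] = processor.process_file(str(pdf))
        except Exception as exc:
            # Record the failure and continue so one bad file does not stop the batch.
            errors[pdf.name] = str(exc)
    return results, errors
```

With the package installed, this could be called as `process_folder(OCRProcessor(mode='hybrid'), '/path/to/pdfs')`.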

📁 Project Structure

ocr-enhanced/
├── src/                    # Source code
│   ├── core/              # Core processing logic
│   ├── gui/               # User interface
│   ├── ocr/               # OCR engines
│   └── utils/             # Utilities
├── tests/                 # Test suite
├── docs/                  # Documentation
├── examples/              # Usage examples
└── requirements/          # Dependencies

🔧 Configuration

Environment Variables

# API Configuration
MISTRAL_API_KEY=your_api_key_here

# Default Folders
OCR_INPUT_PATH=/path/to/input
OCR_OUTPUT_PATH=/path/to/output

# Processing Settings
OCR_MODE=hybrid
OCR_LANGUAGE=por+eng
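
A minimal sketch of how these variables might be read in Python follows. The variable names come from the list above; the helper name `settings_from_env` and the fallback values used when a variable is unset are assumptions, not documented defaults.

```python
import os


def settings_from_env(env=None):
    """Collect OCR settings from environment variables (or a dict for testing)."""
    env = os.environ if env is None else env
    return {
        # No key means cloud processing cannot be used.
        "api_key": env.get("MISTRAL_API_KEY"),
        # Fallback values below are illustrative assumptions.
        "input_path": env.get("OCR_INPUT_PATH", "."),
        "output_path": env.get("OCR_OUTPUT_PATH", "."),
        "mode": env.get("OCR_MODE", "hybrid"),
        "language": env.get("OCR_LANGUAGE", "por+eng"),
    }
```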

Configuration File

Create ~/.ocr-enhanced.json:

{
  "default_mode": "hybrid",
  "tesseract_path": "/usr/bin/tesseract",
  "default_language": "por+eng",
  "max_pages_per_batch": 200,
  "confidence_threshold": 0.75
}
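
The sketch below shows one way such a file could be loaded and merged over defaults. The keys and example values are taken from the file above; the function name `load_config`, the use of those values as defaults, and the range check on `confidence_threshold` are assumptions for illustration.

```python
import json
from pathlib import Path

# Assumed defaults, copied from the example configuration file.
DEFAULTS = {
    "default_mode": "hybrid",
    "tesseract_path": "/usr/bin/tesseract",
    "default_language": "por+eng",
    "max_pages_per_batch": 200,
    "confidence_threshold": 0.75,
}


def load_config(path=None):
    """Return DEFAULTS overlaid with values from the JSON config file, if present."""
    path = Path(path) if path else Path.home() / ".ocr-enhanced.json"
    config = dict(DEFAULTS)
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            config.update(json.load(fh))
    # Sanity check: the threshold is treated here as a fraction between 0 and 1.
    if not 0.0 <= config["confidence_threshold"] <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    return config
```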

🧪 Development

Setup Development Environment

# Clone repository
git clone https://github.com/leo-dower/ocr-enhanced-projec.git
cd ocr-enhanced-projec

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Run Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test types
pytest -m unit          # Unit tests only
pytest -m integration   # Integration tests only
pytest -m "not slow"    # Skip slow tests

Code Quality

# Format code
black src tests

# Sort imports
isort src tests

# Lint code
flake8 src tests

# Type checking
mypy src

# Security check
bandit -r src

📊 Performance

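| Mode   | Speed   | Accuracy  | Privacy  | Cost  |
|--------|---------|-----------|----------|-------|
| Local  | Fast    | Good      | 100%     | Free  |
| Cloud  | Medium  | Excellent | Depends  | Paid  |
| Hybrid | Optimal | Best      | Balanced | Mixed |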

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes
Add tests for your changes
Ensure all tests pass (pytest)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Tesseract OCR for local processing
Mistral AI for cloud OCR capabilities
PyMuPDF for PDF manipulation

📞 Support

Made with ❤️ by the OCR Enhanced Team - Leo-dower and claudecode =)
