MukundaKatta/OCRLite

OCRLite

CI | Python 3.10+ | License: MIT | Code style: black

OCRLite is a lightweight Python library that extracts printed text from images using pixel analysis and pattern matching. No ML models are needed.


Architecture

graph TD
    A[Input Image] --> B[ImageLoader]
    B --> C[Preprocessor]
    C --> D[Binarizer]
    D --> E[Line Segmenter]
    E --> F[Character Segmenter]
    F --> G[Pattern Matcher]
    G --> H[Text Output]

    subgraph OCRLite Pipeline
        B
        C
        D
        E
        F
        G
    end

    style A fill:#e1f5fe
    style H fill:#e8f5e9
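
To make the Binarizer and Line Segmenter stages concrete, here is a minimal NumPy sketch of global thresholding followed by horizontal projection. It is an illustration under stated assumptions, not OCRLite's actual implementation: the function names and the synthetic image are hypothetical, and only the default values (128 for the threshold, a minimum line height) echo the configuration shown later in this README.

```python
import numpy as np


def binarize(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Global threshold: True marks ink (dark pixels), False marks background."""
    return gray < threshold


def segment_lines(ink: np.ndarray, min_line_height: int = 1) -> list[tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines via horizontal projection.

    A row belongs to a line if it contains at least one ink pixel; runs of
    consecutive ink rows become one line. `bottom` is exclusive.
    """
    row_has_ink = ink.sum(axis=1) > 0
    lines: list[tuple[int, int]] = []
    start = None
    for y, has_ink in enumerate(row_has_ink):
        if has_ink and start is None:
            start = y  # a new line begins
        elif not has_ink and start is not None:
            if y - start >= min_line_height:
                lines.append((start, y))
            start = None
    # Close a line that runs to the bottom edge of the image.
    if start is not None and len(row_has_ink) - start >= min_line_height:
        lines.append((start, len(row_has_ink)))
    return lines


# Synthetic 8x6 grayscale page: white background with two dark bands of "text".
page = np.full((8, 6), 255, dtype=np.uint8)
page[1:3, 1:5] = 0
page[5:7, 2:4] = 0

ink = binarize(page)
line_ranges = segment_lines(ink)
print(line_ranges)  # [(1, 3), (5, 7)]
```

Each `(top, bottom)` pair can then be used to slice the binary image (`ink[top:bottom]`) before character segmentation.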
classDiagram
    class OCRLite {
        +OCRConfig config
        +load_image(path) NDArray
        +preprocess(image) NDArray
        +binarize(image, threshold) NDArray
        +segment_lines(image) list~NDArray~
        +segment_characters(image) list~NDArray~
        +extract_text(image) OCRResult
        +detect_orientation(image) int
        +get_confidence(result) float
        +export(result, format) str
    }
    class OCRConfig {
        +int default_threshold
        +int min_line_height
        +int min_char_width
        +float confidence_threshold
        +bool auto_orient
    }
    class OCRResult {
        +str text
        +float confidence
        +int orientation
        +list~LineResult~ lines
    }
    OCRLite --> OCRConfig
    OCRLite --> OCRResult

Quickstart

Installation

pip install -e .

Usage

from ocrlite import OCRLite

ocr = OCRLite()

# Extract text from an image
result = ocr.extract_text(ocr.load_image("document.png"))

print(result.text)
print(f"Confidence: {ocr.get_confidence(result):.2%}")

# Export as JSON
json_output = ocr.export(result, format="json")
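
As a rough idea of what exporting a result could involve, the sketch below serializes a result object to plain text, JSON, or CSV using only the standard library. It is not the library's code: the `OCRResult` fields follow the class diagram above, but the `LineResult` fields, the CSV column layout, and the format names `"text"` and `"csv"` are assumptions made for illustration.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LineResult:
    # Hypothetical fields; the README does not specify LineResult's shape.
    text: str
    confidence: float


@dataclass
class OCRResult:
    # Field names taken from the class diagram in this README.
    text: str
    confidence: float
    orientation: int
    lines: list[LineResult] = field(default_factory=list)


def export(result: OCRResult, format: str = "text") -> str:
    """Serialize a result as plain text, JSON, or CSV (one row per line)."""
    if format == "text":
        return result.text
    if format == "json":
        return json.dumps(asdict(result), indent=2)
    if format == "csv":
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow(["line", "text", "confidence"])
        for number, line in enumerate(result.lines, start=1):
            writer.writerow([number, line.text, line.confidence])
        return buffer.getvalue()
    raise ValueError(f"Unsupported format: {format!r}")


sample = OCRResult(
    text="HELLO\nWORLD",
    confidence=0.9,
    orientation=0,
    lines=[LineResult("HELLO", 0.95), LineResult("WORLD", 0.85)],
)
print(export(sample, format="json"))
```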

Configuration

from ocrlite.config import OCRConfig

config = OCRConfig(
    default_threshold=128,
    min_line_height=10,
    min_char_width=5,
    confidence_threshold=0.5,
    auto_orient=True,
)

ocr = OCRLite(config=config)

CLI

python -m ocrlite extract document.png
python -m ocrlite extract document.png --format json
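
For readers curious how a command line of this shape is typically parsed, here is a small `argparse` sketch. It mirrors the two invocations above but is not the project's actual `__main__` module; the accepted `--format` values other than `json` and the default of `text` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `ocrlite extract <image> [--format FORMAT]`."""
    parser = argparse.ArgumentParser(prog="ocrlite")
    subcommands = parser.add_subparsers(dest="command", required=True)

    extract = subcommands.add_parser("extract", help="Extract text from an image")
    extract.add_argument("image", help="Path to the input image")
    extract.add_argument(
        "--format",
        choices=["text", "json", "csv"],  # assumed names for the export formats
        default="text",
        help="Output format",
    )
    return parser


# Parse the second invocation shown above without touching sys.argv.
args = build_parser().parse_args(["extract", "document.png", "--format", "json"])
print(args.command, args.image, args.format)
```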

Features

  • No ML models: pure pixel analysis and pattern matching
  • Binarization: adaptive and global thresholding with NumPy
  • Line segmentation: horizontal projection-based splitting
  • Character segmentation: connected component analysis
  • Orientation detection: automatic rotation correction
  • Multiple export formats: plain text, JSON, CSV
  • Configurable: full control over thresholds and parameters via Pydantic models
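
The character segmentation feature relies on connected component analysis. The sketch below shows one common way to do it, a breadth-first flood fill with 4-connectivity over a binary line image, returning one bounding box per blob. The function name, the connectivity choice, and the test image are assumptions for illustration; OCRLite's own implementation may differ.

```python
from collections import deque

import numpy as np


def connected_components(ink: np.ndarray) -> list[tuple[int, int, int, int]]:
    """Return bounding boxes (top, left, bottom, right) of 4-connected ink blobs.

    `bottom` and `right` are exclusive; boxes are sorted left to right.
    """
    height, width = ink.shape
    seen = np.zeros((height, width), dtype=bool)
    boxes = []
    for y in range(height):
        for x in range(width):
            if not ink[y, x] or seen[y, x]:
                continue
            # Flood-fill a new blob starting from this unvisited ink pixel.
            seen[y, x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and ink[ny, nx]
                        and not seen[ny, nx]
                    ):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom + 1, right + 1))
    return sorted(boxes, key=lambda box: box[1])


# Synthetic binary text line containing three separate glyph-like blobs.
line = np.zeros((5, 9), dtype=bool)
line[1:4, 1:3] = True  # first glyph: a 3x2 block
line[0:5, 5:6] = True  # second glyph: a vertical stroke
line[2, 7] = True      # third glyph: a single dot

char_boxes = connected_components(line)
print(char_boxes)  # [(1, 1, 4, 3), (0, 5, 5, 6), (2, 7, 3, 8)]
```

One known limitation of this approach in general: glyphs made of disconnected parts, such as "i" or "j", come out as separate components and need to be merged in a later step.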

Development

make install      # Install dependencies
make test         # Run tests
make lint         # Run linter
make format       # Format code

Inspired by OCR and document AI trends


Built by Officethree Technologies | Made with love and AI

About

Lightweight OCR text extraction — binarization, line segmentation, connected components with PIL/NumPy
