mikemccabe/analyze_ocr
Some code for analyzing OCR'd documents. It is currently fairly specific to Internet Archive OCR'd books, but it may be generalizable. Entry point: analyze_ocr.py. Run it against a scanned book from the Internet Archive. Functionality: finds headers and footers, page numbers, and tables of contents.
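To give a rough idea of what finding headers and footers can involve, here is a minimal sketch. It is not taken from analyze_ocr.py and does not use its input format: it assumes each page is already a plain list of text lines, and the names normalize, find_running_lines, position, and min_fraction are hypothetical. The heuristic treats a line as a running header or footer if, after digits are masked, the same text recurs at the top or bottom of many pages.

```python
import re
from collections import Counter


def normalize(line):
    """Lowercase, replace each run of digits with '#', collapse whitespace."""
    line = re.sub(r"\d+", "#", line.lower())
    return " ".join(line.split())


def find_running_lines(pages, position=0, min_fraction=0.3):
    """Return normalized lines that recur at `position` on many pages.

    `pages` is a list of pages, each a list of text lines (an assumed,
    simplified input, not the Internet Archive OCR format).
    `position` is 0 for the first line of a page or -1 for the last.
    A line qualifies if it appears on at least `min_fraction` of the
    non-empty pages.
    """
    candidates = [normalize(page[position]) for page in pages if page]
    counts = Counter(candidates)
    threshold = min_fraction * len(candidates)
    return {text for text, n in counts.items() if text and n >= threshold}


# Toy input: three pages share a running header, all four end in a page number.
pages = [
    ["MOBY DICK 12", "Call me Ishmael.", "12"],
    ["MOBY DICK 13", "Some years ago", "13"],
    ["CHAPTER II", "The Carpet-Bag", "14"],
    ["MOBY DICK 15", "I stuffed a shirt", "15"],
]

headers = find_running_lines(pages, position=0)
footers = find_running_lines(pages, position=-1)
print(headers)
print(footers)
```

Masking digits lets "MOBY DICK 12" and "MOBY DICK 13" count as the same header, and a footer that normalizes to a bare "#" is a plausible page-number line. Real OCR output is noisier, so a practical version would likely need fuzzy matching rather than exact equality.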
About
Parse OCR result files for page numbers, tables of contents, etc.