AME-OCRacle/Final-Task

Assignment 2: Optical Character Recognition (OCR)

Objective
The objective of this assignment is to implement and compare two Optical Character Recognition (OCR) pipelines on a common evaluation task. The assignment involves working with both a traditional OCR system and a neural network–based OCR system using transfer learning, followed by a comparative analysis of their performance.

Part A: OCR using PyTesseract

In this part, an OCR pipeline is implemented using PyTesseract. PyTesseract is a Python wrapper around the Tesseract OCR engine, which ships with pre-trained models and does not require any training. The focus of this part is on preprocessing, configuration, and evaluation.

Tasks performed

  • Load the provided test image(s) and perform OCR using PyTesseract
  • Apply suitable image preprocessing techniques such as resizing, thresholding, noise removal, and morphological operations
  • Experiment with different Page Segmentation Modes (PSM) and document the chosen configuration
  • Report the recognized text output
  • Compute and report evaluation metrics such as Character Error Rate (CER) and/or Word Error Rate (WER) on the provided evaluation page
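
The metric and PSM steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: CER and WER are computed here as Levenshtein edit distance divided by the reference length, and `sweep_psm` is a hypothetical helper that scores each Page Segmentation Mode by CER. The OCR call is passed in as a function so the sketch runs without Tesseract installed; the mode list `(3, 4, 6, 11)` is only an example selection.

```python
from typing import Callable, Dict, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the hypothesis
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def sweep_psm(ocr_fn: Callable[[str], str], reference: str,
              psm_modes: Sequence[int] = (3, 4, 6, 11)) -> Dict[int, float]:
    """Run ocr_fn once per PSM config string and return {psm: CER}."""
    return {psm: cer(reference, ocr_fn(f"--psm {psm}")) for psm in psm_modes}
```

With PyTesseract, `ocr_fn` could be `lambda cfg: pytesseract.image_to_string(image, config=cfg)`, where `image` is the preprocessed evaluation page. The mode with the lowest CER is then `min(scores, key=scores.get)`. Note that both metrics can exceed 1.0 when the OCR output contains many spurious insertions.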

Part B: OCR using Transfer Learning (EasyOCR)

In this part, a neural network–based OCR system is implemented using transfer learning. EasyOCR is used as the recognition framework.

Tasks performed

  • Use the provided training dataset to fine-tune an EasyOCR recognition model
  • Clearly document the transfer learning strategy used, including freezing layers, partial fine-tuning, and learning rate selection
  • Train the model and report training metrics such as loss curves
  • Run inference on the same evaluation image(s) used in Part A
  • Compute and report CER and/or WER for the EasyOCR model
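
One way to express a freezing and learning-rate strategy like the one described above is sketched below. It is an illustration under stated assumptions, not the configuration used in the notebook: `apply_freeze_plan` is a hypothetical helper, the prefixes `FeatureExtraction`, `SequenceModeling`, and `Prediction` are assumed names for the recognizer's sub-modules (check them against `model.named_parameters()`), and the learning rates are placeholders. The function is duck-typed, so stand-in parameter objects are used here in place of real PyTorch tensors.

```python
from types import SimpleNamespace
from typing import Dict, Iterable, List, Sequence, Tuple


def apply_freeze_plan(named_params: Iterable[Tuple[str, object]],
                      frozen_prefixes: Sequence[str],
                      base_lr: float,
                      head_prefix: str = "Prediction",
                      head_lr_scale: float = 10.0) -> List[Dict]:
    """Freeze parameters by name prefix and build optimizer parameter groups.

    Returns a list of {"params": [...], "lr": ...} dicts, the per-group
    format that torch.optim optimizers accept.
    """
    backbone, head = [], []
    for name, param in named_params:
        if name.startswith(tuple(frozen_prefixes)):
            param.requires_grad = False  # frozen: left out of the optimizer
            continue
        param.requires_grad = True
        (head if name.startswith(head_prefix) else backbone).append(param)

    groups = []
    if backbone:
        groups.append({"params": backbone, "lr": base_lr})
    if head:
        # Assumption: the prediction head gets a larger learning rate.
        groups.append({"params": head, "lr": base_lr * head_lr_scale})
    return groups


# Stand-in parameters (hypothetical names) so the sketch runs without PyTorch.
params = [
    ("FeatureExtraction.conv.weight", SimpleNamespace(requires_grad=True)),
    ("SequenceModeling.lstm.weight", SimpleNamespace(requires_grad=True)),
    ("Prediction.weight", SimpleNamespace(requires_grad=True)),
]
groups = apply_freeze_plan(params, frozen_prefixes=["FeatureExtraction"], base_lr=1e-4)
```

With a real model, the call would take `model.named_parameters()` and the result could be passed to an optimizer such as `torch.optim.Adam(groups)`. Recording the returned groups alongside the frozen prefixes gives a compact record of which components were fine-tuned and at what learning rate.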

Documentation details
The implementation clearly documents:

  • Which components of the model were fine-tuned
  • What changes were made compared to the default pre-trained setup
  • The reasoning behind these changes

Common Evaluation and Comparison

To ensure a fair comparison, both OCR systems are evaluated on the same provided test page.

Tasks performed

  • Present the OCR outputs from both PyTesseract and EasyOCR
  • Compare the quantitative metrics (CER and/or WER)
  • Provide a brief qualitative discussion explaining which method performs better and why
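
For the qualitative discussion, it can help to see which characters each system gets wrong, not only how many. The sketch below uses the standard library's `difflib.SequenceMatcher` to tally character-level differences between a reference and an OCR output. `error_breakdown` is a hypothetical helper; the alignment it produces is a readable diff, not necessarily the minimal edit script behind CER, so its counts are meant for inspection rather than as a metric.

```python
from collections import Counter
from difflib import SequenceMatcher


def error_breakdown(reference: str, hypothesis: str) -> Counter:
    """Count differing spans as (operation, reference_text, hypothesis_text)."""
    errors = Counter()
    matcher = SequenceMatcher(None, reference, hypothesis, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is "replace", "delete", or "insert"
            errors[(tag, reference[i1:i2], hypothesis[j1:j2])] += 1
    return errors
```

Running this once per system on the same reference text and comparing `errors.most_common()` shows, for example, whether one system tends to confuse visually similar characters while the other drops spaces.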

Deliverables

  • One Colab notebook for Part A (PyTesseract)
  • One Colab notebook for Part B (EasyOCR with transfer learning)
  • A short written report (1–2 pages) summarizing the results and comparison