LP-ICR - License Plate Intelligent Character Recognition

LP-ICR is a Seq2Seq deep learning model for license plate intelligent character recognition. It is based on a CRNN (Convolutional Recurrent Neural Network) architecture with an Attention and Spatial Transformer Network (STN) head. It was trained with the Connectionist Temporal Classification (CTC) loss function and achieves a Word Error Rate (WER) of 0.16 and a Character Error Rate (CER) of 0.03.
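
For background, CTC lets the model be trained without per-character alignment annotations: the loss sums over all alignments between the per-timestep outputs and the target string. A minimal, generic PyTorch sketch (not the repository's training loop; the shapes and blank index are assumptions):

import torch
import torch.nn as nn

# Generic CTC setup: log-probs shaped (T, B, C), blank index 0 (assumed).
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
log_probs = torch.randn(32, 4, 40).log_softmax(2)   # T=32 timesteps, batch 4, 40 classes
targets = torch.randint(1, 40, (4, 10))             # label indices; 0 is reserved for blank
input_lengths = torch.full((4,), 32, dtype=torch.long)
target_lengths = torch.full((4,), 10, dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
print(loss.item())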

You can download the pretrained model weights & datasets from the following link: LP-ICR Model Weights & datasets

Model Details

The LP-ICR model is designed to recognize the characters and numbers on single- and multi-row license plates in images. It combines Convolutional Neural Networks (CNNs) for feature extraction, Recurrent Neural Networks (RNNs) for sequence modeling, an Attention mechanism for selective focus, and a Spatial Transformer Network (STN) for spatial normalization of the input. LP-ICR is a Seq2Seq, end-to-end model.
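
For intuition, here is a minimal sketch of how such a CRNN pipeline can be wired up in PyTorch. The class name, layer sizes, and the omission of the Attention and STN heads are all illustrative assumptions, not the repository's actual code:

import torch
import torch.nn as nn

class CRNNSketch(nn.Module):
    """Illustrative CRNN: CNN features -> BiLSTM sequence model -> per-step logits.
    An LP-ICR-style model would also prepend an STN and add an attention head."""
    def __init__(self, num_classes: int, embed_size: int = 512):
        super().__init__()
        # CNN backbone turns a 32x128 plate crop into a feature map.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # A BiLSTM models the horizontal feature columns as a sequence.
        self.rnn = nn.LSTM(128 * 8, embed_size, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * embed_size, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.cnn(x)                                   # (B, 128, 8, 32) for a 32x128 input
        b, c, h, w = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # width becomes the time axis
        out, _ = self.rnn(seq)
        return self.head(out)                             # (B, W, num_classes), ready for CTC

logits = CRNNSketch(num_classes=40)(torch.randn(1, 3, 32, 128))
print(logits.shape)  # torch.Size([1, 32, 40])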

Model Performance

  • Word Error Rate (WER): 0.16
  • Character Error Rate (CER): 0.03

The model is compact, with a weight size of approximately 250MB (before optimizations such as ONNX export), making it suitable for a range of deployment scenarios. Inference is also fast, taking just 0.027 seconds per image on a CPU.
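
For reference, CER is the character-level edit distance between the prediction and the ground truth, divided by the ground-truth length; WER is the same idea at the word level. A minimal sketch (not the repository's evaluation code):

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: Levenshtein distance / reference length."""
    m, n = len(reference), len(hypothesis)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + (reference[i - 1] != hypothesis[j - 1]))  # substitution
            prev = cur
    return dp[n] / max(m, 1)

print(cer("KA05MT4918", "KA05MT4919"))  # 0.1: one wrong character out of ten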

Here are the performance metrics and loss curves for the LP-ICR model:

[Figures: WER curve, CER curve, and training loss curve]

Results

Here is a table showcasing the results of LP-ICR on various license plate images:

Image      Recognized Text
Image 1    KA05MT4918
Image 2    KA01AJ9528
Image 3    KA04A9383
Image 4    KA51AE0104
Image 5    KA05AC2170

The table above gives a snapshot of the model's recognition accuracy on different license plate images. The model is robust to different plate backgrounds, dark images, and two-row license plates.

Usage

To use the LP-ICR model for license plate recognition, follow these steps:

  1. Clone this GitHub repository to your local machine.

    git clone https://github.com/IshantPundir/LP-ICR.git
  2. Install the required dependencies (see Dependencies).

  3. Load the pretrained model weights.

  4. Use the model to recognize characters on license plates in your own images.

Example code snippet for license plate recognition:

import cv2
import numpy as np

from lpicr import LPICREngine

def load_image(path: str) -> np.ndarray:
    """Load an image from disk as a BGR numpy array."""
    image = cv2.imread(path)
    return image

# Initialize the engine with the pretrained weights.
lpicr = LPICREngine(model_path="model.pth")

# Load an image...
image = load_image("image_path.jpg")

# Run inference: the engine returns the recognized plate text.
text = lpicr(image)

print(f"Results: {text}")

You can also test the model from the command line:

python lpicr.py model.pth image.png

Dependencies

You can install the dependencies using pip:

pip install -r requirements.txt

Training

You can start the training by running train.py like this:

python train.py --run_name TestRun --config path/to/config.yaml

The script will create a folder output/TestRun, where it will save the model and checkpoints.

Config files can be found in the config directory. A config file looks like this:

train:
  epochs: 1000
  batch_size: 64
  learning_rate: 0.0001
  train_denoiser: false

image_transform_config:
  image_width: 128
  image_height: 32

label_encoder:
  max_sequence_length: 32

model_config:
  embed_size: 512
  
datasets:
  downsample: 1
  make_2_row: false
  train_datasets:
    - data/INLP-RAW-augmented/train
    - data/voc_plate-augmented/train
    - data/2-row-lp-augmented/train
    - data/2-row-lp-v2-augmented/train
    - data/INLP-RAW/train
    - data/voc_plate/train
    - data/2-row-lp/train
    - data/2-row-lp-v2/train
  test_datasets:
    - data/INLP-RAW/test
    - data/2-row-lp/test
    - data/2-row-lp-v2/test
    - data/voc_plate/test
    - data/INLP-RAW-augmented/test

Most of the configuration options are self-explanatory. You can list multiple datasets, and the E2EDataset class will automatically combine them into a single dataset object.
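
Conceptually this is similar to concatenating PyTorch datasets. A toy sketch, under the assumption that each listed path maps to one LMDB-backed dataset (ToyDataset below is a stand-in, not the repository's class):

from torch.utils.data import ConcatDataset, Dataset

class ToyDataset(Dataset):
    """Stand-in for a single LMDB-backed dataset (illustrative only)."""
    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]

# Several datasets combined behave like one larger dataset, which is
# what E2EDataset does with the paths listed in the config.
combined = ConcatDataset([ToyDataset(["KA05MT4918"]),
                          ToyDataset(["KA01AJ9528", "KA04A9383"])])
print(len(combined))  # 3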

make_2_row is an optional data-augmentation flag; if set to true, images are randomly split in half and the halves are rejoined one above the other, artificially creating two-row text images. This is only used during pre-training on the OCR dataset.
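
A minimal numpy sketch of that idea (illustrative, not the repository's implementation):

import numpy as np

def make_two_row(image: np.ndarray) -> np.ndarray:
    """Split a single-row plate image into left/right halves and stack them
    vertically, producing an artificial two-row text image."""
    h, w = image.shape[:2]
    half = w // 2
    left, right = image[:, :half], image[:, half:half * 2]
    return np.concatenate([left, right], axis=0)

two_row = make_two_row(np.zeros((32, 128, 3), dtype=np.uint8))
print(two_row.shape)  # (64, 64, 3): twice the height, half the width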

downsample is an artifact from a previous architecture that included a denoiser layer; keep its value at 1.

NOTE: The image transformer config, label encoder config, and model config are all stored in the final .pth weight file, allowing you to initialize the image transformer, label encoder, and model from a single weights file.
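
In practice a single torch.load call gives you everything. A quick way to see what is bundled (the exact key names inside the checkpoint are assumptions, so inspect them first):

import torch

# Load the checkpoint on CPU; the weights and the bundled configs
# travel together in the one .pth file.
checkpoint = torch.load("output/TestRun/model.pth", map_location="cpu")
print(checkpoint.keys())  # inspect the stored config and weight entries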

Fine-tuning

You can resume training using the --resume argument like this:

python train.py --run_name TestRun2 --config config/lpicr.yaml --resume output/TestRun1/model.pth

Logging

wandb is supported for logging and monitoring the model's performance; enable it by passing the --wandb flag.

python train.py --run_name TestRun --config config/ocr.yaml --wandb

Datasets

LP-ICR supports LMDB datasets, which live inside the data directory. As mentioned above, you can link multiple LMDB datasets together in the configuration file.

Custom datasets:

You can label your own images by running the following command:

python data/lp-annotation-tool.py --images path/to/image/directory  --output path/to/output/directory --split --split_size 0.2

The labeled dataset is stored in JSON format; you can run the script below to convert it to LMDB format:

python data/dataset_to_lmdb.py --data path/to/json/directory

This will save the dataset in LMDB format, ready to be used for training.
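
For reference, an LMDB dataset is usually written with the lmdb package roughly as follows. The key schema below is a common OCR convention and an assumption, not necessarily the one dataset_to_lmdb.py uses:

import lmdb

# Write one image/label pair into an LMDB environment.
env = lmdb.open("data/plates-lmdb", map_size=1 << 30)  # 1 GiB map size
with env.begin(write=True) as txn:
    with open("plate.jpg", "rb") as f:
        txn.put(b"image-000000001", f.read())   # raw encoded image bytes
    txn.put(b"label-000000001", b"KA05MT4918")  # ground-truth plate text
    txn.put(b"num-samples", b"1")               # total record count
env.close()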
