LP-ICR is a Seq2Seq deep learning model for License Plate Intelligent Character Recognition. It is based on a CRNN (Convolutional Recurrent Neural Network) architecture with an Attention mechanism and a Spatial Transformer Network (STN) head, trained with the Connectionist Temporal Classification (CTC) loss. It achieves a Word Error Rate (WER) of 0.16 and a Character Error Rate (CER) of 0.03.
You can download the pretrained model weights & datasets from the following link: LP-ICR Model Weights & datasets
The LP-ICR model is designed to recognize characters and numbers on single- and multi-row license plates in images. Its architecture combines Convolutional Neural Networks (CNNs) for feature extraction, Recurrent Neural Networks (RNNs) for sequence modeling, an Attention mechanism for selective focus, and a Spatial Transformer Network (STN) for spatial normalization of the input. LP-ICR is a Seq2Seq, end-to-end model.
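For intuition, here is a minimal PyTorch sketch of how such a CNN-to-RNN pipeline is typically composed (layer sizes and module names are illustrative only, and the STN and attention heads are omitted for brevity; this is not LP-ICR's actual implementation):

```python
import torch
import torch.nn as nn

class CRNNSketch(nn.Module):
    """Illustrative CRNN: CNN features -> width-wise sequence -> BiLSTM -> per-step logits."""
    def __init__(self, num_classes: int, embed_size: int = 512):
        super().__init__()
        # CNN backbone: collapses the height axis, keeps width as the sequence axis.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((1, None)),  # -> (B, 128, 1, W')
        )
        # BiLSTM models left-to-right character context along the width axis.
        self.rnn = nn.LSTM(128, embed_size, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * embed_size, num_classes)  # num_classes includes the CTC blank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.cnn(x).squeeze(2).permute(0, 2, 1)  # (B, W', 128)
        seq, _ = self.rnn(feats)
        return self.fc(seq)  # (B, W', num_classes), fed to nn.CTCLoss during training

logits = CRNNSketch(num_classes=37)(torch.zeros(1, 3, 32, 128))
print(logits.shape)  # torch.Size([1, 32, 37])
```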
- Word Error Rate (WER): 0.16
- Character Error Rate (CER): 0.03
The model is compact, with a weight file of approximately 250 MB (before optimizations such as ONNX export), making it suitable for a variety of deployment scenarios. Inference is also fast, at roughly 0.027 seconds per image on a CPU.
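For reference, CER is the character-level edit distance between the prediction and the ground truth divided by the ground-truth length (WER is the analogous word-level metric). A minimal way to compute it, shown with an illustrative plate string:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

reference, hypothesis = "KA05MT4918", "KA05MT4B18"  # one substituted character
cer = edit_distance(reference, hypothesis) / len(reference)
print(f"CER: {cer:.2f}")  # 1 edit / 10 chars = 0.10
```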
Here are the performance metrics and loss curves for the LP-ICR model:
| Word Error Rate (WER) | Character Error Rate (CER) | Loss Curve |
|---|---|---|
| ![]() | ![]() | ![]() |
Here is a table showcasing the results of LP-ICR on various license plate images:
| Image | Recognized Text |
|---|---|
| ![]() | KA05MT4918 |
| ![]() | KA01AJ9528 |
| ![]() | KA04A9383 |
| ![]() | KA51AE0104 |
| ![]() | KA05AC2170 |
The above table provides a snapshot of the model's recognition accuracy on different license plate images. The model is robust to different plate backgrounds, dark images, and two-row license plates.
To use the LP-ICR model for license plate recognition, follow these steps:
- Clone this GitHub repository to your local machine:

  ```bash
  git clone https://github.com/IshantPundir/LP-ICR.git
  ```

- Install the required dependencies (see Dependencies).
- Load the pretrained model weights.
- Use the model to recognize characters on license plates in your own images.
Example code snippet for license plate recognition:
```python
import cv2
import numpy as np

from lpicr import LPICREngine

def load_image(path: str) -> np.ndarray:
    """Load an image from disk as a BGR array."""
    image = cv2.imread(path)
    return image

# Initialize the engine with the pretrained weights.
lpicr = LPICREngine(model_path="model.pth")

# Load an image of a license plate.
image = load_image("image_path.jpg")

# Run inference.
text = lpicr(image)
print(f"Results: {text}")
```

You can run this command to test the model as well:
```bash
python lpicr.py model.pth image.png
```

You can install the dependencies using pip:
```bash
pip install -r requirements.txt
```

You can start the training by running train.py like this:
```bash
python train.py --run_name TestRun --config path/to/config.yaml
```

The script will create a folder output/TestRun, where it will save the model and checkpoints.
Config files can be found in the config directory. A config file looks like this:
```yaml
train:
  epochs: 1000
  batch_size: 64
  learning_rate: 0.0001
  train_denoiser: false

image_transform_config:
  image_width: 128
  image_height: 32

label_encoder:
  max_sequence_length: 32

model_config:
  embed_size: 512

datasets:
  downsample: 1
  make_2_row: false
  train_datasets:
    - data/INLP-RAW-augmented/train
    - data/voc_plate-augmented/train
    - data/2-row-lp-augmented/train
    - data/2-row-lp-v2-augmented/train
    - data/INLP-RAW/train
    - data/voc_plate/train
    - data/2-row-lp/train
    - data/2-row-lp-v2/train
  test_datasets:
    - data/INLP-RAW/test
    - data/2-row-lp/test
    - data/2-row-lp-v2/test
    - data/voc_plate/test
    - data/INLP-RAW-augmented/test
```

Most of the configurations are self-explanatory. You can list multiple datasets, and the E2EDataset class will automatically combine them all into a single dataset object (see the sketch below).
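E2EDataset's internals are not shown here, but the combining behavior is conceptually the same as PyTorch's ConcatDataset; a rough sketch of the idea, where the per-path class is a hypothetical stand-in:

```python
from torch.utils.data import ConcatDataset, Dataset

class PathDataset(Dataset):
    """Stand-in for a single LMDB dataset (hypothetical, for illustration only)."""
    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

# One dataset object per configured path, then a single combined view over all of them.
parts = [PathDataset(["KA05MT4918"]), PathDataset(["KA01AJ9528", "KA51AE0104"])]
combined = ConcatDataset(parts)
print(len(combined))  # 3: indices map transparently across the underlying datasets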
make_2_row is an optional argument for data augmentation: if true, it randomly splits an image in half and stacks the two halves, artificially creating a 2-row text image (see the sketch below). This is only used during pre-training on the OCR dataset.
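As an illustration of that augmentation (a sketch of the idea, not the project's exact code), the image is cut at the horizontal midpoint and the two halves are stacked as rows:

```python
import numpy as np

def make_two_row(image: np.ndarray) -> np.ndarray:
    """Turn a single-row plate image into a synthetic 2-row one."""
    h, w = image.shape[:2]
    # Truncate to an even width so both halves match, then stack them vertically.
    left, right = image[:, : w // 2], image[:, w // 2 : 2 * (w // 2)]
    return np.concatenate([left, right], axis=0)

plate = np.zeros((32, 128, 3), dtype=np.uint8)  # dummy 1-row plate image
print(make_two_row(plate).shape)  # (64, 64, 3)
```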
downsample is an artifact from a previous architecture that included a denoiser layer; keep its value at 1.
NOTE: The image transformer config, label encoder config, and model config are all stored in the final .pth weight file, allowing you to initialize the image transformer, label encoder, and model from a single weight file.
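A checkpoint that bundles its configs can be inspected roughly like this (the keys printed are whatever the checkpoint actually contains; inspect your own .pth file for the exact layout):

```python
import torch

# Load the checkpoint on CPU and list its top-level entries,
# e.g. model weights plus the bundled config dicts.
checkpoint = torch.load("output/TestRun/model.pth", map_location="cpu")
print(checkpoint.keys())
```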
You can resume training using the --resume argument like this:
```bash
python train.py --run_name TestRun2 --config config/lpicr.yaml --resume output/TestRun1/model.pth
```

I support wandb for logging and monitoring the model's performance. You can start logging by passing the --wandb flag:
```bash
python train.py --run_name TestRun --config config/ocr.yaml --wandb
```

LP-ICR supports LMDB datasets, which can be found inside the data directory. As mentioned above, you can link multiple LMDB datasets together in the configuration file.
You can label your own images by running the following command:
```bash
python data/lp-annotation-tool.py --images path/to/image/directory --output path/to/output/directory --split --split_size 0.2
```

The labeled dataset will be stored in JSON format. You can run the script below to convert the dataset to LMDB format:
```bash
python data/dataset_to_lmdb.py --data path/to/json/directory
```

This will save the dataset in LMDB format, ready to be used for training.
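If you want to sanity-check the converted data, a generic LMDB readback looks like this (the dataset path is illustrative, and the entry layout depends on how the conversion script stores keys):

```python
import lmdb

# Open the converted dataset read-only and list a few raw entries.
env = lmdb.open("path/to/output/lmdb", readonly=True, lock=False)
with env.begin() as txn:
    for key, value in list(txn.cursor())[:5]:
        print(key, len(value), "bytes")
```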