Logs digital meter readings from standalone devices by reading them from video.
All you need is a video recording of the readings.
As a use case, this project converts video of oximetry readings into logs. Internally, it relies on pretrained OCR models from the MMOCR library.
- OpenCV
- Qt
- MMOCR
- Pandas
- Ubuntu 22.04.3 LTS (Windows is possible but not tested)
- An Nvidia GPU that supports CUDA 11.8 and its drivers
- Anaconda (Python) for the virtual environment
- git
Open a terminal and go to the folder where you would like to clone the repo, creating it if needed.
Run the following command to clone this repo.

    git clone https://github.com/cpoohee/MeterVideo2Txt

Assuming Anaconda is properly installed, run the following to create the environment.

    conda env create -f requirements.yml

Activate the environment.

    conda activate MV2T

Then, run the app.

    python src/main.py

The first time the app runs, it will appear unresponsive while it downloads and caches the AI models. This takes some time.
- Load the video under File->Load video.
- Select the appropriate rotation of the video under Orientation->.
- Select the choice of text detection model under Detector Model->.
  - If Fixed Area is selected, click and drag a bounding box on the video for tracking. A label will be created for the area.
  - Otherwise, double-click on the detected bounding box for tracking. A label will be attached to the box.
- Select the choice of text recognition model under Recognizer Model->.
- No duplicated labels are allowed.
- Navigate the video using the slider, or use the spinbox to go to an exact frame.
- You can choose the frame where detection starts.
- Click on Track Next Frame to track values that are labelled.
- Click on Track Subsequent Frames to track values from the current frame onwards.
- Click on Export to save the tracked values to a CSV, Excel xlsx, or pandas pickle file.
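To give a feel for what an exported log could look like, here is a minimal sketch that writes tracked values as CSV with one row per frame and one column per label. The layout, the function name `tracked_values_to_csv`, and the labels `SpO2` and `PR` are assumptions made for illustration, not the app's actual export code. It uses only the standard library; with pandas, the equivalent calls would be `DataFrame.to_csv`, `DataFrame.to_excel`, and `DataFrame.to_pickle`.

```python
import csv
import io


def tracked_values_to_csv(tracked, labels):
    """Render tracked values as CSV text.

    tracked: dict mapping frame index -> dict of label -> recognised text
    labels:  ordered list of label names (hypothetical, chosen by the user)
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["frame"] + list(labels),
        restval="",            # leave the cell empty if a label has no value
        lineterminator="\n",
    )
    writer.writeheader()
    for frame in sorted(tracked):
        row = {"frame": frame}
        row.update(tracked[frame])
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical oximetry readings; frame 1 has no reading for "PR".
tracked = {0: {"SpO2": "98", "PR": "72"}, 1: {"SpO2": "97"}}
csv_text = tracked_values_to_csv(tracked, ["SpO2", "PR"])
print(csv_text)
```

Leaving a cell empty when a label has no value in a frame keeps the rows aligned by frame number, which makes gaps in detection easy to spot afterwards.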
- If the mmcv library encounters errors, it might need to be recompiled.
- Why did the tracking drift too far?
  - Currently, the tracking works by finding the bounding box nearest to the previous location. Unfortunately, some frames may be missing the detected text, in which case the nearest box belongs to a different piece of text and the tracking becomes erroneous. In the future, I might add a threshold to prevent drifting.
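The drift described above can be illustrated with a minimal sketch of nearest-box matching. This is not the project's actual code: the function names, the `max_distance` parameter, and the use of axis-aligned `(x_min, y_min, x_max, y_max)` boxes instead of detected polygons are all assumptions made for illustration.

```python
import math


def box_center(box):
    """Return the center (x, y) of an axis-aligned box (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def nearest_box(previous_center, boxes, max_distance=None):
    """Pick the detected box whose center is closest to the previous location.

    If max_distance is given (a hypothetical threshold) and even the closest
    box is farther away than that, return None instead of drifting to it.
    """
    best_box, best_dist = None, math.inf
    for box in boxes:
        cx, cy = box_center(box)
        dist = math.hypot(cx - previous_center[0], cy - previous_center[1])
        if dist < best_dist:
            best_box, best_dist = box, dist
    if max_distance is not None and best_dist > max_distance:
        return None
    return best_box


previous = (50, 20)
# The tracked text was not detected in this frame; only an unrelated box remains.
detections = [(200, 10, 220, 30)]
print(nearest_box(previous, detections))                   # jumps to the unrelated box
print(nearest_box(previous, detections, max_distance=30))  # rejected
```

Without a threshold, the tracker always accepts the closest box, even when it is far away. With one, a frame with a missed detection simply yields no value for that label.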
- Test the experience with a very long video
- Allow editing/repositioning of tracked Labels.
- Add threshold for tracking nearest polygon in tracking mode.
- Add a Kalman filter for tracking between points
- Separate settings to download detectors and recognisers
- Utilise an LLM to match words/headers