8 changes: 8 additions & 0 deletions docs/_apps/tesseract/index.md
@@ -0,0 +1,8 @@
---
layout: posts
classes: wide
title: tesseract
date: 1970-01-01T00:00:00+00:00
---
This tool applies Tesseract OCR to a video or image and generates text boxes and OCR results. Currently only supports the English language.
- [v2.1](v2.1) ([`@keighrim`](https://github.com/keighrim))
104 changes: 104 additions & 0 deletions docs/_apps/tesseract/v2.1/index.md
@@ -0,0 +1,104 @@
---
layout: posts
classes: wide
title: "TesseractOCR Wrapper (v2.1)"
date: 2025-09-05T18:11:26+00:00
---
## About this version

- Submitter: [keighrim](https://github.com/keighrim)
- Submission Time: 2025-09-05T18:11:26+00:00
- Prebuilt Container Image: [ghcr.io/clamsproject/app-tesseractocr-wrapper:v2.1](https://github.com/clamsproject/app-tesseractocr-wrapper/pkgs/container/app-tesseractocr-wrapper/v2.1)
- Release Notes

> SDK update

## About this app (See raw [metadata.json](metadata.json))

**This tool applies Tesseract OCR to a video or image and generates text boxes and OCR results. Currently only supports the English language.**

- App ID: [http://apps.clams.ai/tesseract/v2.1](http://apps.clams.ai/tesseract/v2.1)
- App License: Apache 2.0
- Source Repository: [https://github.com/clamsproject/app-tesseractocr-wrapper](https://github.com/clamsproject/app-tesseractocr-wrapper) ([source tree of the submitted version](https://github.com/clamsproject/app-tesseractocr-wrapper/tree/v2.1))
- Analyzer Version: tesseract5.3
- Analyzer License: Apache 2.0


#### Inputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

- [http://mmif.clams.ai/vocabulary/VideoDocument/v1](http://mmif.clams.ai/vocabulary/VideoDocument/v1) (required)
(of any properties)

- [http://mmif.clams.ai/vocabulary/TimeFrame/v6](http://mmif.clams.ai/vocabulary/TimeFrame/v6) (required)
- _representatives_ = "?"

> The TimeFrame annotation that represents the video segment to be processed. When the `representatives` property is present, the app will process the video's still frames at the underlying time point annotations that are referred to by the `representatives` property. Otherwise, the app will process the middle frame of the video segment.
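
For orientation, below is a minimal, hand-written sketch of an input MMIF of the shape described above: one `VideoDocument` plus a view holding a `TimeFrame` over it. The file location, IDs, `label` value, `timeUnit`, and the upstream app URI are all placeholders; in a real pipeline the `TimeFrame` view would be produced by an upstream CLAMS app.

```python
import json

# Hand-written sketch of a minimal input MMIF; all IDs, the file path, the
# upstream app URI, and the timeUnit/label values are placeholders.
input_mmif = {
    "metadata": {"mmif": "http://mmif.clams.ai/1.1.0"},
    "documents": [
        {
            "@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
            "properties": {
                "id": "d1",
                "mime": "video/mp4",
                "location": "file:///data/example.mp4",  # placeholder path
            },
        }
    ],
    "views": [
        {
            "id": "v1",
            "metadata": {
                # placeholder URI for the upstream app that produced the TimeFrame
                "app": "http://apps.clams.ai/some-timeframe-app/v1",
                "contains": {
                    "http://mmif.clams.ai/vocabulary/TimeFrame/v6": {"document": "d1"}
                },
            },
            "annotations": [
                {
                    "@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v6",
                    "properties": {
                        "id": "tf1",
                        "start": 30000,
                        "end": 45000,
                        "timeUnit": "milliseconds",
                        "label": "slate",
                    },
                }
            ],
        }
    ],
}

with open("input.mmif", "w") as f:
    json.dump(input_mmif, f, indent=2)
```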


#### Configurable Parameters
(**Note**: _Multivalued_ means the parameter can have one or more values.)

- `tfLabel`: optional, defaults to `[]`

- Type: string
- Multivalued: True


> The label of the TimeFrame annotation to be processed. By default (`[]`), all TimeFrame annotations will be processed, regardless of their `label` property values.
- `pretty`: optional, defaults to `false`

- Type: boolean
- Multivalued: False
- Choices: **_`false`_**, `true`


> The JSON body of the HTTP response will be re-formatted with 2-space indentation
- `runningTime`: optional, defaults to `false`

- Type: boolean
- Multivalued: False
- Choices: **_`false`_**, `true`


> The running time of the app will be recorded in the view metadata
- `hwFetch`: optional, defaults to `false`

- Type: boolean
- Multivalued: False
- Choices: **_`false`_**, `true`


> The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata
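
As a usage sketch, these parameters are passed as URL query strings when an input MMIF file is POSTed to a running instance of the app. The host and port below are assumptions (CLAMS app containers conventionally serve HTTP on port 5000); start the prebuilt container image listed above, map the port, and adjust as needed.

```python
import requests  # third-party: pip install requests

# Sketch: POST an input MMIF to a locally running instance of the app.
# Host and port are assumptions; CLAMS app containers conventionally listen on 5000.
with open("input.mmif") as f:
    mmif_json = f.read()

resp = requests.post(
    "http://localhost:5000/",
    data=mmif_json,
    params={
        "pretty": "true",       # re-format the JSON response with 2-space indentation
        "tfLabel": ["slate"],   # multivalued: only process TimeFrames with this label
        "runningTime": "true",  # record the running time in the view metadata
    },
)
resp.raise_for_status()

with open("output.mmif", "w") as f:
    f.write(resp.text)
```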


#### Outputs
(**Note**: "*" as a property value means that the property is required but can be any value.)

(**Note**: Not all output annotations are always generated.)

- [http://mmif.clams.ai/vocabulary/TextDocument/v1](http://mmif.clams.ai/vocabulary/TextDocument/v1)
- _@lang_ = "en"

> Fully serialized text content of the recognized text in the input images. Serialization is done by concatenating `text` values of `Paragraph` annotations with two newline characters.
- [http://vocab.lappsgrid.org/Token](http://vocab.lappsgrid.org/Token)
- _text_ = "*"
- _word_ = "*"

> Translation of the recognized tesseract "words" in the input images. `token` properties store the string values of the recognized text. The duplication is for keeping backward compatibility and consistency with `Paragraph` and `Sentence` annotations.
- [http://vocab.lappsgrid.org/Sentence](http://vocab.lappsgrid.org/Sentence)
- _text_ = "*"

> Translation of the recognized tesseract "lines" in the input images. `sentence` property from LAPPS vocab stores the string value of space-joined words.
- [http://vocab.lappsgrid.org/Paragraph](http://vocab.lappsgrid.org/Paragraph)
- _text_ = "*"

> Translation of the recognized tesseract "blocks" in the input images. `paragraph` property from LAPPS vocab stores the string value of newline-joined sentences.
- [http://mmif.clams.ai/vocabulary/Alignment/v1](http://mmif.clams.ai/vocabulary/Alignment/v1)
(of any properties)

> Alignments between 1) `TimePoint` <-> `TextDocument`, 2) `TimePoint` <-> `Token`/`Sentence`/`Paragraph`, 3) `BoundingBox` <-> `Token`/`Sentence`/`Paragraph`
- [http://mmif.clams.ai/vocabulary/BoundingBox/v5](http://mmif.clams.ai/vocabulary/BoundingBox/v5)
- _label_ = "text"

> Bounding boxes of the detected text regions in the input images. No corresponding box for the entire image (`TextDocument`) region
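
To show how these outputs fit together, here is a minimal sketch that pulls the serialized OCR text and the text-region bounding boxes out of an output MMIF with plain JSON parsing (the `mmif-python` SDK offers typed accessors for the same data). The `text.@value` and `coordinates` property layout assumed below follows the MMIF vocabulary; adjust if your MMIF version differs.

```python
import json

TEXT_DOC = "http://mmif.clams.ai/vocabulary/TextDocument/v1"
BBOX = "http://mmif.clams.ai/vocabulary/BoundingBox/v5"

with open("output.mmif") as f:
    mmif = json.load(f)

for view in mmif["views"]:
    for ann in view["annotations"]:
        props = ann.get("properties", {})
        if ann["@type"] == TEXT_DOC:
            # full OCR text for one processed frame (paragraphs joined by blank lines)
            print(props.get("text", {}).get("@value", ""))
        elif ann["@type"] == BBOX and props.get("label") == "text":
            # coordinates of one detected text region
            print("text region:", props.get("coordinates"))
```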
97 changes: 97 additions & 0 deletions docs/_apps/tesseract/v2.1/metadata.json
@@ -0,0 +1,97 @@
{
"name": "TesseractOCR Wrapper",
"description": "This tool applies Tesseract OCR to a video or image and generates text boxes and OCR results. Currenly only support English language.",
"app_version": "v2.1",
"mmif_version": "1.1.0",
"analyzer_version": "tesseract5.3",
"app_license": "Apache 2.0",
"analyzer_license": "Apache 2.0",
"identifier": "http://apps.clams.ai/tesseract/v2.1",
"url": "https://github.com/clamsproject/app-tesseractocr-wrapper",
"input": [
{
"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
"required": true
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v6",
"description": "The Time frame annotation that represents the video segment to be processed. When `representatives` property is present, the app will process videos still frames at the underlying time point annotations that are referred to by the `representatives` property. Otherwise, the app will process the middle frame of the video segment.",
"properties": {
"representatives": "?"
},
"required": true
}
],
"output": [
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"description": "Fully serialized text content of the recognized text in the input images. Serialization isdone by concatenating `text` values of `Paragraph` annotations with two newline characters.",
"properties": {
"@lang": "en"
}
},
{
"@type": "http://vocab.lappsgrid.org/Token",
"description": "Translation of the recognized tesseract \"words\" in the input images. `token` properties store the string values of the recognized text. The duplication is for keepingbackward compatibility and consistency with `Paragraph` and `Sentence` annotations.",
"properties": {
"text": "*",
"word": "*"
}
},
{
"@type": "http://vocab.lappsgrid.org/Sentence",
"description": "Translation of the recognized tesseract \"lines\" in the input images. `sentence` property from LAPPS vocab stores the string value of space-joined words.",
"properties": {
"text": "*"
}
},
{
"@type": "http://vocab.lappsgrid.org/Paragraph",
"description": "Translation of the recognized tesseract \"blocks\" in the input images. `paragraph` property from LAPPS vocab stores the string value of newline-joined sentences.",
"properties": {
"text": "*"
}
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"description": "Alignments between 1) `TimePoint` <-> `TextDocument`, 2) `TimePoint` <-> `Token`/`Sentence`/`Paragraph`, 3) `BoundingBox` <-> `Token`/`Sentence`/`Paragraph`"
},
{
"@type": "http://mmif.clams.ai/vocabulary/BoundingBox/v5",
"description": "Bounding boxes of the detected text regions in the input images. No corresponding box for the entire image (`TextDocument`) region",
"properties": {
"label": "text"
}
}
],
"parameters": [
{
"name": "tfLabel",
"description": "The label of the TimeFrame annotation to be processed. By default (`[]`), all TimeFrame annotations will be processed, regardless of their `label` property values.",
"type": "string",
"default": [],
"multivalued": true
},
{
"name": "pretty",
"description": "The JSON body of the HTTP response will be re-formatted with 2-space indentation",
"type": "boolean",
"default": false,
"multivalued": false
},
{
"name": "runningTime",
"description": "The running time of the app will be recorded in the view metadata",
"type": "boolean",
"default": false,
"multivalued": false
},
{
"name": "hwFetch",
"description": "The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata",
"type": "boolean",
"default": false,
"multivalued": false
}
]
}
6 changes: 6 additions & 0 deletions docs/_apps/tesseract/v2.1/submission.json
@@ -0,0 +1,6 @@
{
"time": "2025-09-05T18:11:26+00:00",
"submitter": "keighrim",
"image": "ghcr.io/clamsproject/app-tesseractocr-wrapper:v2.1",
"releasenotes": "SDK update\n\n"
}
10 changes: 10 additions & 0 deletions docs/_data/app-index.json
@@ -1,4 +1,14 @@
{
"http://apps.clams.ai/tesseract": {
"description": "This tool applies Tesseract OCR to a video or image and generates text boxes and OCR results. Currenly only support English language.",
"latest_update": "2025-09-05T18:11:26+00:00",
"versions": [
[
"v2.1",
"keighrim"
]
]
},
"http://apps.clams.ai/parakeet-wrapper": {
"description": "A CLAMS wrapper for NVIDIA NeMo Parakeet ASR models available on huggingface-hub with support for punctuation, capitalization, and word-level timestamping.",
"latest_update": "2025-07-29T15:07:13+00:00",
2 changes: 1 addition & 1 deletion docs/_data/apps.json

Large diffs are not rendered by default.
