fix(parser): clean up resources on parse failure by Co-Messi · Pull Request #101 · run-llama/liteparse

Co-Messi · 2026-03-29T14:45:06Z

Problem

LiteParse.parse() only cleans up the converted PDF, closes the PDF document, and terminates Tesseract on the happy path. If parsing throws after conversion, the temporary conversion directory and OCR worker cleanup are skipped.

What changed

Moved parser resource cleanup into a finally block.
Ensured PDF document close, Tesseract termination, and temp conversion cleanup all run after parse failures.
Added parser tests covering failure after conversion, with and without OCR enabled.

Why this fixes it

This keeps resource cleanup tied to parser lifetime instead of parser success, so failed parses do not leak temp files or OCR workers.
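
For readers skimming the thread, here is a minimal sketch of the pattern described above, not the actual LiteParse code. The name cleanupResource appears in the diff, but its body here is an assumption (run one step and swallow its error). Closable, parseWithCleanup, removeTempDir, and work are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for a resource such as the opened PDF document.
interface Closable {
  close(): Promise<void>;
}

// Assumed behaviour: run one cleanup step and swallow its error so that
// the remaining steps still run. The real helper in the PR may differ.
async function cleanupResource(step: () => Promise<void>): Promise<void> {
  try {
    await step();
  } catch {
    // Ignored on purpose: a failed cleanup should not mask the parse error.
  }
}

// Cleanup is tied to the lifetime of the call, not to its success.
async function parseWithCleanup(
  doc: Closable,
  removeTempDir: () => Promise<void>,
  work: () => Promise<string>,
): Promise<string> {
  try {
    return await work();
  } finally {
    // Runs on success and on failure alike.
    await cleanupResource(() => doc.close());
    await cleanupResource(removeTempDir);
  }
}
```

Because each step goes through cleanupResource, an error thrown while closing the document does not prevent the temp directory from being removed, and the original parse error is still the one the caller sees.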

Verification

npm test -- src/core/parser.test.ts
npm run format:check
npm run build

Fixes #100

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5cebae6fd


chatgpt-codex-connector · 2026-03-29T14:49:26Z

+      if (this.ocrEngine && "terminate" in this.ocrEngine) {
+        await cleanupResource(async () => {
+          await (this.ocrEngine as TesseractEngine).terminate();
+        });


Guard OCR termination to the current parse invocation

LiteParse keeps this.ocrEngine on the parser instance, but this finally block now calls terminate() on every parse exit, including failures that occur before OCR work starts. If callers run parse() concurrently on the same instance, a fast-failing call can tear down Tesseract while another call is still in runOCR, and processPageOcr will swallow those OCR errors and return degraded output. Please only terminate OCR resources that were actually used by this invocation (or otherwise prevent concurrent use) to avoid this regression.
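
The interleaving described in this comment can be reproduced with a small model. This is a sketch under stated assumptions, not LiteParse code: FakeOcrEngine, NaiveParser, and demoRace are hypothetical names, and the inner catch stands in for the way processPageOcr is said to swallow OCR errors.

```typescript
// Hypothetical shared OCR engine: recognize() fails once terminate() has run.
class FakeOcrEngine {
  private alive = true;

  async recognize(page: string): Promise<string> {
    // Yield once so that another parse() call can interleave here.
    await Promise.resolve();
    if (!this.alive) throw new Error("engine terminated");
    return `text(${page})`;
  }

  async terminate(): Promise<void> {
    this.alive = false;
  }
}

// Models a parser whose finally block terminates the shared engine on every exit.
class NaiveParser {
  private readonly ocrEngine: FakeOcrEngine;

  constructor(ocrEngine: FakeOcrEngine) {
    this.ocrEngine = ocrEngine;
  }

  async parse(page: string, failEarly = false): Promise<string> {
    try {
      if (failEarly) throw new Error("failed before OCR");
      try {
        return await this.ocrEngine.recognize(page);
      } catch {
        // OCR errors are swallowed, so the caller only sees degraded output.
        return "";
      }
    } finally {
      // Unconditional termination: this is the hazard.
      await this.ocrEngine.terminate();
    }
  }
}

// One call is mid-OCR while a second call on the same instance fails early.
async function demoRace(): Promise<[string, string]> {
  const parser = new NaiveParser(new FakeOcrEngine());
  const slow = parser.parse("p1");
  const fast = parser.parse("p2", true).catch((e: Error) => e.message);
  return [await slow, await fast];
}
```

In this model the slow call resolves to an empty string instead of the recognized text, because the fast-failing call has already terminated the engine it shares.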

logan-markewich · 2026-03-30

This feels like a sensible suggestion? I'm a little worried that this might break concurrent calls to parse().

Co-Messi · 2026-03-30

Adjusted this so OCR termination only runs if that parse invocation actually entered OCR work. A failure before runOCR() now still closes the PDF and cleans up temp files, but it no longer tears down the shared Tesseract engine for another concurrent parse. I also updated the parser tests to cover that pre-OCR failure path while keeping termination on successful OCR parses.
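
One way to express the adjustment described above is a per-invocation flag that is set just before OCR work begins. The sketch below is an assumption about the shape of the change, not the code in commit 93e66a9: CountingOcrEngine, GuardedParser, and enteredOcr are hypothetical names.

```typescript
// Hypothetical shared OCR engine that counts how often it is terminated.
class CountingOcrEngine {
  terminateCalls = 0;
  private alive = true;

  async recognize(page: string): Promise<string> {
    // Yield once so that another parse() call can interleave here.
    await Promise.resolve();
    if (!this.alive) throw new Error("engine terminated");
    return `text(${page})`;
  }

  async terminate(): Promise<void> {
    this.alive = false;
    this.terminateCalls += 1;
  }
}

class GuardedParser {
  private readonly ocrEngine: CountingOcrEngine;

  constructor(ocrEngine: CountingOcrEngine) {
    this.ocrEngine = ocrEngine;
  }

  async parse(page: string, failEarly = false): Promise<string> {
    // Local to this invocation, so concurrent calls do not share it.
    let enteredOcr = false;
    try {
      if (failEarly) throw new Error("failed before OCR");
      enteredOcr = true;
      return await this.ocrEngine.recognize(page);
    } finally {
      // Only tear down the engine if this call actually started OCR work.
      if (enteredOcr) {
        await this.ocrEngine.terminate();
      }
    }
  }
}
```

With this guard, a call that fails before OCR leaves the engine alone, so a concurrent call that is mid-OCR completes normally. It does not cover two concurrent calls that both enter OCR; that case would need the other option the review mentions, preventing concurrent use of one instance.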

logan-markewich · 2026-04-07T21:56:17Z

whoops, I let this sit a little too long. Seems like some gnarly merge conflicts

fix(parser): clean up resources on parse failure

b5cebae

chatgpt-codex-connector Bot reviewed Mar 29, 2026

fix(parser): only terminate OCR after use

93e66a9

Co-Messi wants to merge 2 commits into run-llama:main from Co-Messi:codex/parser-finally-cleanup

