Architecture upgrade: multi-layer viewer, derived revisions, and processing pipeline scaffolding#38
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee201b3b13
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| .map_err(|e| e.to_string())?; | ||
| conn.execute( | ||
| "INSERT INTO document_revisions (id, document_id, revision_number, managed_file_path, original_file_name, page_count, file_size_bytes, imported_at, source_revision_id, derivation_type) VALUES (?1, ?2, 1, ?3, ?4, NULL, ?5, ?6, NULL, 'imported_original')", |
There was a problem hiding this comment.
Migrate existing revision schema before writing new columns
This insert assumes document_revisions already has source_revision_id and derivation_type, but init_schema only uses CREATE TABLE IF NOT EXISTS and does not migrate databases created by the previous schema (which used source_path instead). On upgraded installs with an existing gitplant.db, this statement fails with no column named source_revision_id, blocking imports and any workflow that creates a new revision until the DB is manually reset.
Useful? React with 👍 / 👎.
| setBytes(new Uint8Array(base)); | ||
| setOverlayBytes(new Uint8Array(overlay)); | ||
| setCurrentRevisionId(derived.derivedRevisionId); |
There was a problem hiding this comment.
Keep displayed document bytes aligned with active revision
In the extract-proof flow you render the source bytes (base) but set currentRevisionId to the derived revision, and the onPageCount callback writes page counts to updatePageCount(currentRevisionId, n). That means opening the proof view can overwrite the derived revision's page_count with the source document's page count, corrupting revision metadata after extractPagesToDerivedRevision runs.
Useful? React with 👍 / 👎.
Motivation
Description
packages/viewer-coreincludingRenderLayer,LayerKind,RenderScene,createBaseRenderScene, andwithOverlayPdfLayerto represent stacked layers.packages/viewer-pdfjsto implement the extended contract (getPageInfo,renderLayer, etc.) while keeping PDF.js confined to the adapter.packages/document-transform-core,packages/document-transform-pdflib(adapter slot),packages/processing-core, andpackages/text-extraction-coreto model transformation requests/results and processing jobs/providers.packages/shared-typesandpackages/persistence-coreto include derived revision fields,ProcessingJobtypes/status andExtractedPageTextDTOs and persistence gateway APIs (e.g.extractPagesToDerivedRevision,triggerTextExtraction).apps/desktop/src-tauri/src/lib.rs): extended SQLite schema fordocument_revisions(lineage +derivation_type),processing_jobs,extracted_page_text, andaudit_events; addedextract_pages_to_derived_revisionthat creates a new derived revision usinglopdfand writes audit events; addedtrigger_text_extractionthat extracts text from text PDFs intoextracted_page_textand updates job status.apps/desktopto consume the new persistence contract (apps/desktop/src/lib/tauriGateway.ts) and surfaced minimal dev scaffolding controls in the UI (Trigger text extraction,Extract page 1 to derived revision) plus aPdfViewerupdate to show scene layer info and accept optional overlay bytes for comparison proof.docs/adr/0002-multi-layer-transform-processing-boundaries.md, updated architecture package and README to document the new boundaries and how they should be used going forward.PdfViewer.test.tsx,App.test.tsx) to assert scene layering and isolation of PDF.js from UI; Rust unit tests for schema, transform extraction helper, and processing/job table behaviors were added to Tauri service.Testing
vitesttest runner invocation vianpm testwas attempted but could not run in this environment due to missing workspace dependencies and registry access; outcome:npm testfailed (missingvitest/npm installblocked by registry403).npm run typecheckwas attempted and failed because TypeScript type dependencies could not be installed in the environment (missing@testing-library/jest-dom/vitest/globals).cargo testwas attempted but failed due to blocked crates.io access (network/registry403), preventing fetching dependencies; Rust unit tests were added (schema checks,extract_pagesproof, processing/job table tests) but could not execute here.npm install/cargo testcan fetch dependencies and run locally or in CI.To validate locally: run
npm installat repo root, thennpm testandnpm run typecheck, and runcargo testunderapps/desktop/src-taurito exercise Rust-side unit tests and proof paths.Codex Task