@rpoelstra

Input to the TJ command is a single string interspersed with glyph positioning commands.
This PR adds the whole string as a single entity to the canvas.strings property.

@maxpmaxp
Owner

The array is used intentionally to reflect the source document "mark down". If you need a single string, it's better to join the TJs after parsing.

@maxpmaxp closed this Aug 11, 2025
@rpoelstra
Author

Hi,

Thank you for reviewing this PR.
I think you misunderstood the intention of the PR.
It does not combine multiple TJs; it combines the parts inside a single TJ.
Take, for instance, the attached PDF, 29261JEGR08239 kopie.pdf.
Without this PR, the (cropped) text output of the first four TJs is:

['Pr', 'ogr', 'ess Report', 'Serial Number:', 'Softwar', 'e V', 'ersion:', 'OS V', 'ersion:',

With this PR this becomes:

['Progress Report', 'Serial Number:', 'Software Version:', 'OS Version:',

I think this better reflects the markdown of the document, since the inter-character spacing (kerning) is a font property and is not related to the markdown.
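
To illustrate the difference, here is a minimal sketch in plain Python. It is not the library's actual code: the function name `join_tj_operands` and the operand values are hypothetical, and the kerning numbers are made up for illustration. It only assumes that a TJ array is a list of strings interspersed with numeric position adjustments.

```python
def join_tj_operands(operands):
    """Concatenate the string parts of ONE TJ array.

    Numeric entries are glyph position adjustments (kerning) and carry
    no text, so they are skipped.
    """
    return "".join(part for part in operands if isinstance(part, str))


# Hypothetical operands of a single TJ command; the numbers are illustrative.
tj_operands = ["Pr", 20, "ogr", 15, "ess Report"]

# Keeping every string part as its own entry:
per_part = [part for part in tj_operands if isinstance(part, str)]
print(per_part)  # ['Pr', 'ogr', 'ess Report']

# Treating the whole TJ array as a single string:
joined = join_tj_operands(tj_operands)
print(joined)  # Progress Report
```

Note that this joins only the parts within one TJ array; separate TJ commands still produce separate entries.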

@maxpmaxp
Owner

Good point. I'm going to revisit it in a day or two. Thanks for the contribution!

@maxpmaxp reopened this Aug 12, 2025
@rpoelstra
Author

@maxpmaxp Can you please review these PRs? They are becoming critical, and I would like to know how to proceed.
