Skip to content

fix(indesign): exclude rich media blocks from export output#4614

Open
wil-gerken wants to merge 5 commits intotrunkfrom
fix/indesign-exclude-rich-media-blocks
Open

fix(indesign): exclude rich media blocks from export output#4614
wil-gerken wants to merge 5 commits intotrunkfrom
fix/indesign-exclude-rich-media-blocks

Conversation

@wil-gerken
Copy link
Copy Markdown
Contributor

All Submissions:

Changes proposed in this Pull Request:

When a post contains a core/file block (PDF embed) or core/embed block (YouTube, etc.), the InDesign export was including raw HTML markup in the output — such as <object> tags, download link text, and bare embed URLs — instead of skipping those blocks entirely. This resulted in import errors and corrupted print output for publishers using the InDesign export tool.

Rich media blocks have no print equivalent and should be excluded from the export. This fix adds an explicit exclusion list in process_blocks() for core/file, core/embed, core/video, and core/audio.

These blocks are now skipped before serialization, ensuring that only printable content is exported while leaving surrounding content intact.

Closes NPPM-2677.

How to test the changes in this Pull Request:

Reproduce the problem (without this PR):

  1. Enable the InDesign export optional module: wp newspack optional-modules enable indesign-export
  2. Create a post containing a paragraph, a PDF file block (core/file), and another paragraph after it.
  3. From the Posts list, click Export as Adobe InDesign in the row actions for that post.
  4. Open the downloaded .txt file — observe that <object> tags, the PDF filename, and "Download" text appear between the two paragraphs in the export output.

Verify the fix (with this PR):

  1. Export the same post again.
  2. Open the downloaded .txt file — the two paragraphs should appear cleanly with no <object> tag, PDF filename, or "Download" text between them.
  3. Repeat with a post containing a YouTube embed block (core/embed) — the YouTube URL should not appear in the export.
  4. Test a post with only standard content (paragraphs, headings, blockquotes, lists) — verify all content is present and unchanged.

Other information:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes, as applicable?
  • Have you successfully ran tests with your changes locally?

Copilot AI review requested due to automatic review settings March 30, 2026 23:17
@wil-gerken wil-gerken requested a review from a team as a code owner March 30, 2026 23:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses an InDesign export issue where rich media Gutenberg blocks (e.g., PDF file embeds and oEmbed content) leak raw HTML/URLs into the exported tagged-text output, causing InDesign import problems.

Changes:

  • Add an exclusion list in the InDesign converter to skip core/file, core/embed, core/video, and core/audio blocks during block processing.
  • Add unit tests verifying core/file and core/embed blocks do not contribute raw markup/URLs to the export output.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
includes/optional-modules/indesign-export/class-indesign-converter.php Skips specific rich media block types during process_blocks() to prevent raw markup/URLs from entering the export.
tests/unit-tests/indesign-exporter/indesign-exporter.php Adds unit tests asserting core/file and core/embed content is excluded from exported output.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread includes/optional-modules/indesign-export/class-indesign-converter.php Outdated
Comment thread tests/unit-tests/indesign-exporter/indesign-exporter.php
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@wil-gerken wil-gerken added the [Status] Needs Review The issue or pull request needs to be reviewed label Mar 30, 2026
Copy link
Copy Markdown
Member

@adekbadek adekbadek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested in an isolated env by running InDesign_Converter::convert_post() against six fixture posts: file/embed/video/audio at top level, file inside a core/group, embed inside core/columns, and a plain control. All exclusions worked as expected and the surrounding content was preserved in every case. No PHP notices in debug.log.

The fix is correct and the recursion handles nested containers as advertised. Approving – a few non-blocking suggestions inline. The main one is that the unit tests only cover core/file/core/embed nested in core/group, but the implementation (and docblock) supports core/video, core/audio, and core/columns too. I verified those work at runtime, but it would be nice to lock them in with tests.

'core/file',
'core/embed',
'core/video',
'core/audio',
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: expose this as a filter (e.g. newspack_indesign_export_excluded_blocks) so publishers with custom rich-media blocks can extend the list without patching. Also worth promoting to a private const since it's rebuilt on every process_blocks() call.

Minor related note: legacy core-embed/* block names (pre-WP 5.6 content) won't match core/embed – a filter sidesteps that, or a strpos( $blockName, 'core-embed/' ) === 0 check would catch them. Low priority, depends on how old your content is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is a great suggestion.

  • I've promoted the array to an EXCLUDED_BLOCK_TYPES class constant. I left it as unscoped const rather than private const so the default list remains accessible if needed, but I'm happy to change if you think it should be private.

  • I added the newspack_indesign_export_excluded_blocks filter so publishers can extend the list for custom block types. The filter result is cast to an array and filtered to strings, so a misbehaving callback won't break the export.

  • I added an is_excluded_block() helper which handles the explicit list and legacy core-embed/* names via a strpos prefix check. Legacy variants are tied to core/embed's filter state, so if a publisher removes core/embed from the filter, the legacy variants are also un-blocked for consistency.

  • Last, I added a null guard in that helper since parse_blocks() returns blockName === null for freeform chunks. I have a pre-push review command that caught that as a PHP 8.3 deprecation.

I don't think this fix is urgent, and the changes were extensive enough that it seems best to re-review. Can you please give it another look when you have a chance?

if ( is_string( $chunk ) ) {
$new_inner_content[] = $chunk;
} else {
$inner_block = $block['innerBlocks'][ $inner_index++ ];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defensive nit: $block['innerBlocks'][ $inner_index++ ] assumes the parse_blocks() invariant that null placeholders in innerContent line up positionally with innerBlocks entries. That's safe in practice, but a ! isset(...) guard would harden it against malformed input. Take it or leave it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Taking it / added. I like safeguards like this much better as well.

$this->assertStringContainsString( 'Before the group.', $content );
$this->assertStringContainsString( 'After the group.', $content );
$this->assertStringNotContainsString( 'youtube.com', $content );
$this->assertStringNotContainsString( 'abc123', $content );
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage gap worth filling: the implementation also strips core/video and core/audio and the docblock calls out core/columns as a supported container, but neither is exercised here. I verified at runtime that all three work, but suggest adding cases for:

  1. core/video and/or core/audio exclusion (locks in the full list).
  2. An embed nested in core/columns (different innerContent shape than core/group).
  3. Two excluded siblings in a row inside a container, to exercise the $inner_index increment path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestions. I added all three:

  • core/video and core/audio top-level exclusion
  • an embed nested in core/columns
  • two excluded siblings in a row to exercise the $inner_index increment path.

I also added a test for legacy core-embed/youtube blocks while I was in there, to lock in the prefix check from the other thread.

All 27 InDesign tests pass.

@github-actions github-actions bot added [Status] Approved The pull request has been reviewed and is ready to merge and removed [Status] Needs Review The issue or pull request needs to be reviewed labels Apr 15, 2026
@wil-gerken wil-gerken removed the [Status] Approved The pull request has been reviewed and is ready to merge label Apr 16, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread includes/optional-modules/indesign-export/class-indesign-converter.php Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@wil-gerken wil-gerken added the [Status] Needs Review The issue or pull request needs to be reviewed label Apr 16, 2026
@wil-gerken wil-gerken requested a review from adekbadek April 16, 2026 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Status] Needs Review The issue or pull request needs to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants