Parsing converts UTF-8 characters (like öäüß) to HTML entities #242

@mockmeup

My settings

  • TYPO3 v13.4
  • PHP 8.3
  • latest version of dpn_glossary

My issue

I noticed that after enabling dpn_glossary, the DOM output changed: all umlauts and similar non-ASCII characters were converted to HTML entities.

(I don't want this, for several reasons.)

Reason

I found that

$parsedHtml = $DOM->saveHTML();

in ParserService.php appears to be the cause.

On https://php.net/manual/en/domdocument.savehtml.php the following is mentioned:

If you load HTML from a string ensure the charset is set.

...
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"></head><body>';
...

Otherwise the charset will be ISO-8859-1!

So I added <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> to the HEAD of my page via TypoScript (in the Setup):

page.headerData.1234 = TEXT
page.headerData.1234.value = <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

and, Bingo, the conversion was gone!
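
The effect can probably be reproduced outside TYPO3 with a few lines of plain PHP. The following is a minimal sketch, not the code from ParserService.php: the helper name roundTrip() and the sample markup are made up for illustration, and the exact output without the meta tag may depend on the libxml2 version PHP was built against.

```php
<?php
// Hypothetical helper (not part of dpn_glossary): parse an HTML string
// with DOMDocument and serialize the whole document again via saveHTML().
function roundTrip(string $html): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();
    return (string) $dom->saveHTML();
}

$meta = '<meta content="text/html; charset=utf-8" http-equiv="Content-Type">';
$body = '<p>öäüß</p>'; // assumes this file itself is saved as UTF-8

// Same body, once with and once without the charset declaration in <head>.
$withMeta    = roundTrip("<html><head>$meta</head><body>$body</body></html>");
$withoutMeta = roundTrip("<html><head></head><body>$body</body></html>");

echo $withMeta, PHP_EOL;    // the umlauts should come back as literal UTF-8 characters
echo $withoutMeta, PHP_EOL; // the umlauts should no longer appear as literal characters
```

Comparing the two outputs shows whether the charset declaration alone decides how saveHTML() serializes non-ASCII characters in a given environment.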

Questions

  • Is it correct that saveHTML() converts UTF-8 content to ISO-8859-1?
  • If yes: Can you change this? (Or introduce a setting to configure this?)

(If the whole setup, that is database, server and TYPO3, is configured to use UTF-8, why convert to ISO-8859-1?)

Can you please look into this?
