Error when string contains "Word Joiner", "BOM", "OGHAM Space Mark" or "zero-width space" characters #15

@JamoCA

While importing international data from a Microsoft Excel file (using ColdFusion 2016u11 with Java 11.0.2), I encountered a java.lang.NullPointerException when a string contained certain space or control characters. Is this a known issue?

Here's a list of characters (decimal & hex codes provided) that I tested against.
https://gist.github.com/JamoCA/42c3be286185aff0476d5888f0a819ff

My initial tests covered the Word Joiner (decimal 8288, U+2060), BOM (decimal 65279, U+FEFF), Ogham space mark (decimal 5760, U+1680), and zero-width space (decimal 8203, U+200B); each caused the same error.

To work around it, I wrote a separate function to sanitize "unsafe" characters prior to using Junidecode.
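
For anyone hitting the same problem, here is a minimal sketch of what such a sanitizing step could look like in Java. It is not the function I actually used; the class name `UnsafeCharSanitizer`, the method name `sanitize`, and the choice of replacements (a plain space for the Ogham space mark, removal for the three zero-width characters) are assumptions made for illustration. It only covers the four code points named above and would need extending for the rest of the gist.

```java
import java.util.Map;

// Hypothetical helper: pre-cleans a string before it is handed to a transliterator.
final class UnsafeCharSanitizer {

    // Code points reported in this issue. The replacement values are an assumption.
    private static final Map<Integer, String> REPLACEMENTS = Map.of(
        0x1680, " ", // OGHAM SPACE MARK (decimal 5760): replaced with a plain space
        0x200B, "",  // ZERO WIDTH SPACE (decimal 8203): removed
        0x2060, "",  // WORD JOINER (decimal 8288): removed
        0xFEFF, ""   // BOM / ZERO WIDTH NO-BREAK SPACE (decimal 65279): removed
    );

    // Returns a copy of input with the listed code points replaced; null stays null.
    static String sanitize(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            String replacement = REPLACEMENTS.get(cp);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }
}
```

The idea is to call `UnsafeCharSanitizer.sanitize(value)` on each cell value first and pass only the result on to Junidecode. Iterating by code point rather than by `char` keeps surrogate pairs intact.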
