More issues with Unicode identifiers #42

@gwselke

The module handles Unicode well in some respects, but some methods apparently produce erroneous results. The following code reproduces the problem in two different places, with results that vary from run to run:

#! /usr/bin/perl
use v5.32.0;
use Graph::Undirected;
use utf8;
use open qw( :std :encoding(UTF-8) );
my $g = Graph::Undirected->new( );

$g->add_weighted_edge( 'Frýdek-Mistek', 'Horní Domaslavice', 11);
$g->add_weighted_edge( 'Hnojník', 'Horní Domaslavice', 8);

# $g->add_weighted_edge( 'Frydek-Mistek', 'Horni Domaslavice', 11);
# $g->add_weighted_edge( 'Hnojnik', 'Horni Domaslavice', 8);

say 'All nodes: ', join( ' ', $g->vertices( ) );

say 'Components:';
for my $c ( $g->connected_components( ) ) {
  say '  ', join( ' ', @$c );
}

say 'Radius: ', $g->radius( );

This outputs:

All nodes: Frýdek-Mistek Horní Domaslavice Hnojník
Components:
  Horní Domaslavice Hnojník Frýdek-Mistek
  
Radius: Inf

The first line of output is correct and as expected.
The component line (under "Components:") shows the character "í" once correctly and once garbled, and the character "ý" garbled.
The last line should show 11 rather than Inf (Inf would be expected only if the graph were empty or disconnected, which it is not).
Interestingly, the result is not deterministic. Sometimes a single connected component of three nodes is reported; sometimes a single component with only two nodes (the third node is missing entirely); and sometimes two components (one with two nodes, one with one).
This suggests some form of data corruption.
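
To narrow down what kind of corruption this is, it may help to print the code points of each returned name instead of relying on how the terminal renders them. The following is only a sketch using core Perl; describe_string is a hypothetical helper name, not part of Graph. It shows how a correctly decoded name differs from the same name held as raw UTF-8 octets, which is one possible (unconfirmed) explanation for the garbled characters.

```perl
use strict;
use warnings;
use utf8;    # this source file is assumed to be saved as UTF-8

# Hypothetical helper (not part of Graph): render a string as a
# space-separated list of its code points, e.g. "U+0046 U+0072 ...".
sub describe_string {
    my ($s) = @_;
    return join ' ', map { sprintf( 'U+%04X', ord($_) ) } split( //, $s );
}

# A correctly decoded name: "ý" is the single code point U+00FD.
my $decoded = 'Frýdek';

# The same name as raw UTF-8 octets: "ý" becomes the two bytes C3 BD.
my $octets = $decoded;
utf8::encode($octets);

print describe_string($decoded), "\n";
print describe_string($octets),  "\n";
```

Applying the same helper to each element returned by $g->vertices( ) and to each @$c from $g->connected_components( ) would show whether a garbled name comes back as octets, as a doubly encoded string, or as something else entirely.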

Commenting out the two lines that add edges with non-ASCII node names, and uncommenting the two lines with plain ASCII names instead, makes the code run flawlessly.

These two apparent bugs may or may not be related to each other. Either or both may also be related to the Unicode issue raised in #38.

I have tested this under Windows 10, Perl 5.42.0, Graph 0.9735.
