More issues with Unicode identifiers #42
Description
The module seems to handle Unicode well in some respects, but some methods apparently produce erroneous results. The following script reproduces the problem in two places (connected_components and radius), and the results vary between runs:
#! /usr/bin/perl
use v5.32.0;
use Graph::Undirected;
use utf8;
use open qw( :std :encoding(UTF-8) );
my $g = Graph::Undirected->new( );
$g->add_weighted_edge( 'Frýdek-Mistek', 'Horní Domaslavice', 11);
$g->add_weighted_edge( 'Hnojník', 'Horní Domaslavice', 8);
# $g->add_weighted_edge( 'Frydek-Mistek', 'Horni Domaslavice', 11);
# $g->add_weighted_edge( 'Hnojnik', 'Horni Domaslavice', 8);
say 'All nodes: ', join( ' ', $g->vertices( ) );
say 'Components:';
for my $c ( $g->connected_components( ) ) {
    say ' ', join( ' ', @$c );
}
say 'Radius: ', $g->radius( );
This outputs:
All nodes: Frýdek-Mistek Horní Domaslavice Hnojník
Components:
Horní Domaslavice HnojnÃk Frýdek-Mistek
Radius: Inf
The first line of output is ok and as expected.
The components line has the character "í" once correct and once garbled, and the character "ý" is garbled.
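The garbling is consistent with UTF-8 octets being reinterpreted as Latin-1 characters at some point. That is only an assumption about the mechanism, not a confirmed diagnosis of Graph. A minimal sketch using the core Encode module, where `misread_as_latin1` is a hypothetical helper written for this illustration:

```perl
use strict;
use warnings;
use utf8;
use Encode qw( encode decode );

# Hypothetical helper (not part of Graph): simulate UTF-8 octets
# being misread as one Latin-1 character per octet.
sub misread_as_latin1 {
    my ($text) = @_;
    my $bytes = encode( 'UTF-8', $text );     # character string -> UTF-8 octets
    return decode( 'ISO-8859-1', $bytes );    # each octet becomes one character
}

my $garbled_i = misread_as_latin1('í');    # U+00ED is encoded as octets C3 AD
my $garbled_y = misread_as_latin1('ý');    # U+00FD is encoded as octets C3 BD

# Print the code points of the garbled strings (ASCII-only output).
for my $pair ( [ 'U+00ED', $garbled_i ], [ 'U+00FD', $garbled_y ] ) {
    my ( $label, $garbled ) = @$pair;
    my @points = map { sprintf 'U+%04X', ord($_) } split //, $garbled;
    print "$label -> @points\n";
}
# prints:
# U+00ED -> U+00C3 U+00AD
# U+00FD -> U+00C3 U+00BD
```

U+00C3 is "Ã", and U+00AD is a soft hyphen, which most terminals do not display. Under this assumption a misread "í" would show up as a lone "Ã", as in "HnojnÃk".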
The last line of the output should show the number 11 rather than Inf (Inf would be expected only if the graph were empty or disconnected, which it isn't).
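The expected radius can be checked independently of Graph. The sketch below is not the module's algorithm, just a plain Floyd-Warshall pass over the two edges from the script, taking the radius as the smallest eccentricity; all variable names are made up for this illustration:

```perl
use strict;
use warnings;
use utf8;
use List::Util qw( min max );

# Edge weights copied from the reproduction script.
my @edges = (
    [ 'Frýdek-Mistek', 'Horní Domaslavice', 11 ],
    [ 'Hnojník',       'Horní Domaslavice', 8 ],
);

my $INF = 9**9**9;    # common Perl idiom for positive infinity
my ( %dist, %seen );
for my $e (@edges) {
    my ( $u, $v, $w ) = @$e;
    $seen{$_} = 1 for $u, $v;
    $dist{$u}{$v} = $dist{$v}{$u} = $w;    # undirected: store both directions
}
my @nodes = sort keys %seen;

# Fill in the missing pairs: 0 on the diagonal, infinity elsewhere.
for my $u (@nodes) {
    for my $v (@nodes) {
        $dist{$u}{$v} //= $u eq $v ? 0 : $INF;
    }
}

# Floyd-Warshall: relax every pair through every intermediate node.
for my $k (@nodes) {
    for my $i (@nodes) {
        for my $j (@nodes) {
            my $via = $dist{$i}{$k} + $dist{$k}{$j};
            $dist{$i}{$j} = $via if $via < $dist{$i}{$j};
        }
    }
}

# Eccentricity of a node = its largest shortest-path distance.
my %ecc = map { my $u = $_; ( $u => max( map { $dist{$u}{$_} } @nodes ) ) } @nodes;
my $radius = min( values %ecc );
print "Radius: $radius\n";    # prints "Radius: 11"
```

The two end nodes have eccentricity 11 + 8 = 19, and the middle node 'Horní Domaslavice' has eccentricity max(11, 8) = 11, so the minimum is 11.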
Interestingly, the result is not deterministic: sometimes just one connected component of three nodes is reported; sometimes just one connected component with only two nodes (the third node is not shown at all); sometimes two connected components are reported (one with two nodes, one with one).
It would seem that there is some form of data corruption going on.
Commenting out the two code lines adding the edges with UTF-8 node names and uncommenting instead the two lines with plain ASCII names will make the code run flawlessly.
These two apparent bugs may or may not be related to each other. Either or both may also be related to the Unicode issue raised in #38.
I have tested this under Windows 10, Perl 5.42.0, Graph 0.9735.