Closed
Description
Hi, current Data 8 TA here!
During a TA meeting we realised that make_array() converts boolean elements to integers instead of following numpy behaviour. This is caused by the extra check for Windows machines (lines 42-45 in util.py):
if elements and all(isinstance(item, (int, np.integer)) for item in elements):
    # Specifically added for Windows machines where the default
    # integer is int32 - see GH issue #339.
    return np.array(elements, dtype="int64")

You can see this behaviour as follows:
In [1]: Table().with_columns("bool", make_array(True, False))
Out[1]:
bool
1
0
In [2]: Table().with_columns("bool", np.array([True, False]))
Out[2]:
bool
True
False

This is because bool is a subclass of int in Python, so isinstance(b, int) also returns True for booleans, and the extra check above therefore converts them to int64.
>>> isinstance(True, int)
True
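The same thing can be checked outside of Table. This is just a small sketch that assumes only numpy is installed; the variable names are mine and not from util.py:

```python
import numpy as np

# bool is a subclass of int, so the integer check also matches booleans.
assert issubclass(bool, int)
elements = (True, False)
matches_int_check = all(isinstance(item, (int, np.integer)) for item in elements)

# With no dtype given numpy keeps booleans; forcing int64 turns them into 1 and 0.
inferred = np.array(elements)
forced = np.array(elements, dtype="int64")
print(matches_int_check, inferred.dtype, forced.dtype, forced.tolist())
```

So the booleans pass the integer check and are then cast by the forced dtype.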
Unless this is intentional, we could fix this either by explicitly checking for bools before the integer check, or by changing the isinstance check itself.
I've created a pull request which adds an extra check for bools, but feel free to suggest another fix!
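For discussion, here is roughly what an explicit bool check could look like. It is a minimal sketch and not necessarily the code in the pull request: make_array_sketch is a hypothetical stand-in for make_array, and I'm assuming the function takes its elements as positional arguments, as in the calls above.

```python
import numpy as np

def make_array_sketch(*elements):
    """Hypothetical stand-in for make_array, not the library's actual code."""
    # Detect booleans first, since isinstance(True, int) is True.
    has_bool = any(isinstance(item, (bool, np.bool_)) for item in elements)
    all_int = all(isinstance(item, (int, np.integer)) for item in elements)
    if elements and all_int and not has_bool:
        # Keep the Windows workaround for genuine integers only.
        return np.array(elements, dtype="int64")
    # Everything else, including booleans, uses numpy's own dtype inference.
    return np.array(elements)

bools = make_array_sketch(True, False)
ints = make_array_sketch(1, 2, 3)
print(bools.dtype, ints.dtype)
```

One thing to decide with this approach is what should happen for a mix of bools and ints such as make_array(True, 2). In the sketch that case falls through to numpy's default inference, so it would skip the int64 workaround.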