Open
Labels: bug (Something isn't working)
Description
As of v0.6.3, I'm not sure if it's possible to serialize and deserialize NumPy arrays of nested record types.
Example model:
ChildRecord: !record
  fields:
    c: int
ParentRecord: !record
  fields:
    p: int
    child: ChildRecord
MyProtocol: !protocol
  sequence:
    records: ParentRecord[,]
Now, walking through the example as if I'm a new user of Yardl:
Step 1
If I attempt to write a NumPy array of ParentRecord, I get a Yardl error about its dtype:
child = issue.ChildRecord(c=42)
parent = issue.ParentRecord(p=7, child=child)
records = np.tile(parent, (3, 4))
with issue.BinaryMyProtocolWriter("data.bin") as w:
    w.write_records(records)
...
  File "/workspaces/yardl/joe/issue-#194/python/issue/_binary.py", line 1129, in _write_data
    raise ValueError(message)
ValueError: Expected dtype {'names': ['p', 'child'], 'formats': ['<i4', [('c', '<i4')]], 'offsets': [0, 4], 'itemsize': 8, 'aligned': True}, got object
This is documented behavior: https://microsoft.github.io/yardl/python/language.html#arrays.
Step 2
Unfortunately, we can't just "set" the correct dtype, e.g.
records = np.tile(parent, (3, 4)).astype(issue.get_dtype(issue.ParentRecord))
  File "/workspaces/yardl/joe/issue-#194/python/test.py", line 60, in main
    records = np.tile(parent, (3, 4)).astype(issue.get_dtype(issue.ParentRecord))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'ParentRecord'
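The behavior in Steps 1 and 2 can likely be reproduced without Yardl at all. The sketch below is only an illustration: the two dataclasses are hypothetical stand-ins for the generated `issue.ChildRecord` and `issue.ParentRecord`, and `parent_dtype` is written by hand to mirror the dtype quoted in the `ValueError`, not taken from `issue.get_dtype`.

```python
from dataclasses import dataclass

import numpy as np


# Hypothetical stand-ins for the generated Yardl record classes.
@dataclass
class ChildRecord:
    c: int


@dataclass
class ParentRecord:
    p: int
    child: ChildRecord


# Hand-written dtype mirroring the one quoted in the ValueError (an assumption,
# not the output of issue.get_dtype).
parent_dtype = np.dtype([("p", "<i4"), ("child", [("c", "<i4")])], align=True)

parent = ParentRecord(p=7, child=ChildRecord(c=42))

# Tiling an arbitrary Python object gives an object array, not a structured one.
records = np.tile(parent, (3, 4))
print(records.dtype)

# NumPy does not know how to unpack an arbitrary object into named fields,
# so the cast raises instead of producing a structured array. The issue reports
# TypeError; ValueError is also caught in case another NumPy version differs.
try:
    records.astype(parent_dtype)
    cast_failed = False
except (TypeError, ValueError) as exc:
    cast_failed = True
    print(type(exc).__name__, exc)
```

If this holds, the failure is a property of NumPy's object-to-structured casting and not something specific to the generated classes.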
Step 3
So I'll manually transform my data to a NumPy structured array (noting that this is not very user-friendly).
records = np.tile(
    np.array(
        (parent.p, (parent.child.c,)), dtype=issue.get_dtype(issue.ParentRecord)
    ),
    (3, 4),
)
This allows me to successfully write my array but now I get an error when reading the array!
with issue.BinaryMyProtocolReader("data.bin") as r:
    records_read = r.read_records()
...
  File "/workspaces/yardl/joe/issue-#194/python/test.py", line 72, in main
    records_read = r.read_records()
                   ^^^^^^^^^^^^^^^^
  File "/workspaces/yardl/joe/issue-#194/python/issue/protocols.py", line 113, in read_records
    value = self._read_records()
            ^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/yardl/joe/issue-#194/python/issue/binary.py", line 43, in _read_records
    return _binary.NDArraySerializer(ParentRecordSerializer(), 2).read(self._stream)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/yardl/joe/issue-#194/python/issue/_binary.py", line 1251, in read
    return self._read_data(stream, shape)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/yardl/joe/issue-#194/python/issue/_binary.py", line 1149, in _read_data
    result[i] = self.element_serializer.read_numpy(stream)
    ~~~~~~^^^
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'ChildRecord'
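For the write side, the manual conversion from Step 3 could be generalized so that callers do not have to spell out every field by hand. The following is a rough sketch under stated assumptions: `to_nested_tuple` is a hypothetical helper that is not part of Yardl, the dataclasses stand in for the generated record classes, and `parent_dtype` is hand-written to approximate what `issue.get_dtype(issue.ParentRecord)` reports. It does not address the read-side error above.

```python
from dataclasses import dataclass

import numpy as np


# Hypothetical stand-ins for the generated Yardl record classes.
@dataclass
class ChildRecord:
    c: int


@dataclass
class ParentRecord:
    p: int
    child: ChildRecord


# Hand-written approximation of the dtype from the error message in Step 1.
child_dtype = np.dtype([("c", "<i4")])
parent_dtype = np.dtype([("p", "<i4"), ("child", child_dtype)], align=True)


def to_nested_tuple(record, dtype):
    """Hypothetical helper: walk a structured dtype and build the nested
    tuple that NumPy accepts for one element of that dtype."""
    if dtype.names is None:
        # Not a structured dtype: the value is used as-is.
        return record
    return tuple(
        to_nested_tuple(getattr(record, name), dtype.fields[name][0])
        for name in dtype.names
    )


parent = ParentRecord(p=7, child=ChildRecord(c=42))

# One structured element, then tiled to the 2-D shape used in the issue.
element = np.array(to_nested_tuple(parent, parent_dtype), dtype=parent_dtype)
records = np.tile(element, (3, 4))
```

The helper relies on attribute names matching the dtype field names, which is true for the stand-ins here but is an assumption about the generated classes.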