b-tree read performance still not good enough

For stupidly large files we're still seeing slow b-tree reads.

On inspection, this is partly because we issue a lot of very small reads, which we could avoid by reading each b-tree node in one go.

The offending piece of code is this:

```python
    def _read_node(self, offset, node_level):
        """ Return a single node in the b-tree located at a given offset. """
        node = self._read_node_header(offset, node_level)
        keys = []
        addresses = []
        for _ in range(node['entries_used']):
            # Three separate small reads per entry: size/mask, offsets, address.
            chunk_size, filter_mask = struct.unpack('<II', self.fh.read(8))
            fmt = '<' + 'Q' * self.dims
            fmt_size = struct.calcsize(fmt)
            chunk_offset = struct.unpack(fmt, self.fh.read(fmt_size))
            chunk_address = struct.unpack('<Q', self.fh.read(8))[0]
```
There is an obvious optimisation for V1 b-trees, where we know the length of each entry: read all of a node's entries with a single call and unpack them from memory.
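
A rough sketch of what that could look like, assuming the entry layout shown in the snippet above (two `uint32`s, then `dims` `uint64` offsets, then one `uint64` address, all little-endian). The function name `read_chunk_entries` and its standalone signature are made up for illustration; in the real code this would presumably live inside `_read_node`.

```python
import io
import struct


def read_chunk_entries(fh, entries_used, dims):
    """Read `entries_used` fixed-size entries with one read() call.

    Hypothetical helper: assumes every entry is laid out as
    chunk_size (I), filter_mask (I), `dims` offsets (Q each), address (Q).
    """
    entry = struct.Struct('<II' + 'Q' * dims + 'Q')
    # One large read instead of three small reads per entry.
    buf = fh.read(entry.size * entries_used)

    keys = []
    addresses = []
    # iter_unpack raises struct.error if the buffer is not a whole
    # number of entries, e.g. after a short read.
    for fields in entry.iter_unpack(buf):
        keys.append({
            'chunk_size': fields[0],
            'filter_mask': fields[1],
            'chunk_offset': fields[2:2 + dims],
        })
        addresses.append(fields[-1])
    return keys, addresses


if __name__ == '__main__':
    # Two fake 2-D entries, packed the same way the reader expects them.
    raw = (struct.pack('<IIQQQ', 100, 0, 0, 0, 4096)
           + struct.pack('<IIQQQ', 200, 1, 10, 20, 8192))
    print(read_chunk_entries(io.BytesIO(raw), entries_used=2, dims=2))
```

Because the `'<'` prefix disables padding, the combined format has the same byte layout as the three separate `unpack` calls, so the parsed values should match the existing loop.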

b-tree read performance still not good enough #153