Handling larger clusters #13

@rjpower

I've noticed that performance drops significantly once an array has more than a few hundred tiles. This is almost certainly due to the extent operations needed to compute tiles. Moving the extent code to Cython should give us a big speedup.

Also, the vast majority of arrays have tiles that are all the same shape. We can leverage this to avoid scanning the tile list and instead compute the target tile directly from the tile shape, e.g.:

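def pos_to_tile(pos, tile_shape, array_shape):
  # integer division gives the tile coordinate along each axis
  tx = pos[0] // tile_shape[0]
  ty = pos[1] // tile_shape[1]
  ...
  # number of tiles along axis 0, rounded up to count a partial edge tile
  num_tiles_x = (array_shape[0] + tile_shape[0] - 1) // tile_shape[0]
  return ty * num_tiles_x + tx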

  • Run profiles to find bottlenecks for arrays with many tiles
  • Migrate extent.py to Cython
  • Special handling for regular tile shapes
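
To make the regular-tile idea concrete, here is a minimal sketch of the direct lookup next to a linear scan that it would replace. It is not taken from extent.py. The helper names (`num_tiles_per_axis`, `scan_for_tile`) and the `array_shape` parameter are hypothetical. It assumes every tile has the same shape except for possibly truncated tiles at the array edge, and that tiles are numbered with axis 0 varying fastest, as in the snippet above.

```python
from itertools import product


def num_tiles_per_axis(array_shape, tile_shape):
    """Tiles along each axis, rounding up so a partial edge tile is counted."""
    return [(n + t - 1) // t for n, t in zip(array_shape, tile_shape)]


def pos_to_tile(pos, tile_shape, array_shape):
    """Return the flat index of the tile containing `pos` in O(ndim) time.

    Axis 0 varies fastest, matching `ty * num_tiles_x + tx` for 2-D arrays.
    """
    counts = num_tiles_per_axis(array_shape, tile_shape)
    index, stride = 0, 1
    for p, t, n, c in zip(pos, tile_shape, array_shape, counts):
        if not 0 <= p < n:
            raise IndexError("position %r is outside the array" % (pos,))
        index += (p // t) * stride  # tile coordinate along this axis
        stride *= c                 # tiles skipped per step along the next axis
    return index


def scan_for_tile(pos, tile_shape, array_shape):
    """Reference version: linear scan over every tile's bounds (hypothetical)."""
    counts = num_tiles_per_axis(array_shape, tile_shape)
    # Reversing the axes makes axis 0 the fastest-varying one in product().
    ranges = [range(c) for c in reversed(counts)]
    for index, rev_coords in enumerate(product(*ranges)):
        coords = rev_coords[::-1]
        inside = all(
            c * t <= p < min((c + 1) * t, n)
            for c, t, p, n in zip(coords, tile_shape, pos, array_shape)
        )
        if inside:
            return index
    raise IndexError("position %r is outside the array" % (pos,))
```

The scan costs time proportional to the number of tiles, while the direct lookup depends only on the number of dimensions, which is where the saving for arrays with many tiles would come from. Arrays with irregular tiles would still need the scan (or some other index) as a fallback.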
