This code is really slow:
import os

import numpy as np
import tables
h5 = tables.openFile(os.path.join(*filename))
table = h5.root.data
times = (table.col('index') + 0.5) * dt # <<< READ ENTIRE COLUMN
row0, row1 = np.searchsorted(times, [tstart, tstop])
table_rows = table[row0:row1] # returns np.ndarray (structured array)
h5.close()
return (times[row0:row1], table_rows, row0, row1)
Instead, create an index on the index column of each 5min and daily h5 file using h5.root.data.cols.index.createIndex(). This is a one-time operation per file, but update_archive.py also needs a fix so that the index is created on the code path where it creates a stat file from scratch.
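A minimal sketch of the one-time indexing step, assuming PyTables 3 or later, where createIndex() is spelled create_index(). The helper name ensure_index is hypothetical and not part of the archive code; it takes an already-open table such as h5.root.data from a file opened in append mode.

```python
def ensure_index(table):
    """Create an index on the table's 'index' column if it has none yet.

    `table` is assumed to be an open, writable PyTables Table (for example
    h5.root.data). Returns True if an index was created, False if the
    column was already indexed.
    """
    col = table.cols.index
    if col.is_indexed:
        return False
    col.create_index()
    return True
```

Checking is_indexed first makes the helper safe to call both in a one-off migration over existing files and on the path that creates a new stat file.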
Once the indexes exist, invert the logic above: compute index_start and index_stop from tstart and tstop, then fetch only the required rows with readWhere(...). This appears to reduce read times for short queries to less than 1 microsec, vs. 225 microsec now.
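The inverted lookup could look roughly like the sketch below. It assumes PyTables 3 or later (read_where rather than readWhere) and the same times = (index + 0.5) * dt relation as the slow code; the function names index_bounds and read_interval are hypothetical. The bounds are chosen to select rows with tstart <= time < tstop, matching the default searchsorted behavior, although floating-point rounding exactly at a bin edge may differ.

```python
import math


def index_bounds(tstart, tstop, dt):
    """Return (index_start, index_stop) so that index_start <= index < index_stop
    selects the rows with tstart <= (index + 0.5) * dt < tstop."""
    index_start = math.ceil(tstart / dt - 0.5)
    index_stop = math.ceil(tstop / dt - 0.5)
    return index_start, index_stop


def read_interval(table, tstart, tstop, dt):
    """Read only the rows in [tstart, tstop) from an open PyTables table.

    With an index on the 'index' column, the query can avoid reading
    the whole column.
    """
    index_start, index_stop = index_bounds(tstart, tstop, dt)
    rows = table.read_where(
        "(index >= index_start) & (index < index_stop)",
        condvars={"index_start": index_start, "index_stop": index_stop},
    )
    # Reconstruct the bin-center times for the selected rows only.
    times = (rows["index"] + 0.5) * dt
    return times, rows
```

Unlike the original code, this does not return row0 and row1; if the caller needs row numbers, they would have to be obtained separately.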