Possible optimization of file handles #55

@AgentOxygen

Description

GenTS currently opens file handles per task (one task per time series file). At scale on parallel filesystems, this may add enough latency and overhead to cause significant slowdowns or hangs.

One possible solution is to cache file handles per Dask worker, so that repeated reads of the same history file reuse an existing handle instead of opening a redundant one. The cache would likely need some sort of blocking or cap that keeps the total number of open file handles below system limits.
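To make the idea concrete, here is a rough sketch of a bounded handle cache. It is not GenTS code: `HandleCache`, `max_open`, and `opener` are hypothetical names, and it swaps in least-recently-used eviction as one way to cap the handle count rather than true blocking. It assumes one cache instance lives in each worker process and that handles expose a `close()` method.

```python
import threading
from collections import OrderedDict


class HandleCache:
    """Bounded LRU cache of open file handles (hypothetical, not part of GenTS)."""

    def __init__(self, opener=open, max_open=64):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self._opener = opener          # callable that takes a path and returns a handle
        self._max_open = max_open      # cap on simultaneously open handles
        self._handles = OrderedDict()  # path -> handle, least recently used first
        self._lock = threading.Lock()  # workers may run several task threads
        self.opens = 0                 # number of real opens, for diagnostics

    def get(self, path):
        """Return a cached handle for path, opening it only on a cache miss."""
        with self._lock:
            handle = self._handles.get(path)
            if handle is not None:
                # Cache hit: mark as most recently used and reuse the handle.
                self._handles.move_to_end(path)
                return handle
            # Cache miss: close least recently used handles until there is room.
            while len(self._handles) >= self._max_open:
                _, oldest = self._handles.popitem(last=False)
                oldest.close()
            handle = self._opener(path)
            self.opens += 1
            self._handles[path] = handle
            return handle

    def close_all(self):
        """Close every cached handle, for example at worker shutdown."""
        with self._lock:
            while self._handles:
                _, handle = self._handles.popitem()
                handle.close()

    def __len__(self):
        return len(self._handles)
```

In practice `opener` would presumably be something like `netCDF4.Dataset` or `xarray.open_dataset` rather than the built-in `open`. One caveat of this sketch: eviction can close a handle that another thread on the same worker is still reading from, so a real implementation would need reference counting or per-handle locking on top of this.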

This adds complexity, but if the performance gain is significant, it could be a worthwhile enhancement.

Metadata

Labels

bug (Something isn't working), enhancement (New feature or request)

