improve caching allocator

The current caching allocator has a separate cache per instance of the class, which is templated on `<value_type, space>`. A separate cache per space is necessary, but a separate cache per type is not, so global per-space caches would be better. It may also be useful to expose pool allocations to clib/fortran. Currently the container types use caching allocators, but the clib routines call the underlying backend allocation directly. I don't think we should change the default for clib, but we should add an option to pool allocate.
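To make the per-space idea concrete, here is a minimal sketch, not the project's actual code. All names (`host_space`, `device_space`, `backend`, `space_cache`, `caching_allocator`, `sketch_malloc_device`, `sketch_malloc_device_pooled`, `sketch_free_device_pooled`) are hypothetical, and `std::malloc`/`std::free` stand in for the real backend calls so it runs anywhere. The cache is keyed on byte count only, so allocators for different value types in the same space reuse each other's freed blocks. It is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <unordered_map>
#include <vector>

// Hypothetical space tags; the real space types may differ.
struct host_space {};
struct device_space {};

// Stand-in backend (assumption): malloc/free for every space.
template <typename Space>
struct backend {
  static void* allocate(std::size_t nbytes) { return std::malloc(nbytes); }
  static void deallocate(void* p) { std::free(p); }
};

// One global cache per space, independent of value_type.
template <typename Space>
class space_cache {
public:
  static space_cache& instance() {
    static space_cache cache;
    return cache;
  }

  void* allocate(std::size_t nbytes) {
    void* p = nullptr;
    auto it = free_.find(nbytes);
    if (it != free_.end() && !it->second.empty()) {
      p = it->second.back();  // reuse a cached block of the same size
      it->second.pop_back();
    } else {
      p = backend<Space>::allocate(nbytes);
      if (p == nullptr) throw std::bad_alloc();
      ++backend_calls_;
    }
    used_[p] = nbytes;
    return p;
  }

  // Return the block to the cache instead of the backend.
  void deallocate(void* p) {
    auto it = used_.find(p);
    assert(it != used_.end());
    free_[it->second].push_back(p);
    used_.erase(it);
  }

  // Release all cached (currently unused) blocks to the backend.
  void clear() {
    for (auto& entry : free_)
      for (void* p : entry.second) backend<Space>::deallocate(p);
    free_.clear();
  }

  std::size_t backend_calls() const { return backend_calls_; }
  ~space_cache() { clear(); }

private:
  space_cache() = default;
  std::unordered_map<std::size_t, std::vector<void*>> free_;  // size -> blocks
  std::unordered_map<void*, std::size_t> used_;               // block -> size
  std::size_t backend_calls_ = 0;
};

// Typed front end: still templated on <T, Space>, but holds no cache itself.
template <typename T, typename Space>
struct caching_allocator {
  using value_type = T;
  T* allocate(std::size_t n) {
    return static_cast<T*>(
      space_cache<Space>::instance().allocate(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) {
    space_cache<Space>::instance().deallocate(p);
  }
};

// Hypothetical C entry points: the default stays a direct backend call,
// and pooling is opt-in through separate functions.
extern "C" void* sketch_malloc_device(std::size_t nbytes) {
  return backend<device_space>::allocate(nbytes);
}
extern "C" void* sketch_malloc_device_pooled(std::size_t nbytes) {
  return space_cache<device_space>::instance().allocate(nbytes);
}
extern "C" void sketch_free_device_pooled(void* p) {
  space_cache<device_space>::instance().deallocate(p);
}
```

With this layout, a block freed by `caching_allocator<double, host_space>` can satisfy a later same-size request from `caching_allocator<int, host_space>` without another backend call. A real implementation would likely also need locking and some size-rounding policy, neither of which is shown here.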

It may be worth adding an option to plug in RMM as an alternative to direct CUDA calls, to see if that improves things for NVIDIA. If it does, i.e. performance is equal or better and application runs use less memory, then it may be worth investing in a cross-backend implementation. We could also explore adapting Gator from YAKL.
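One possible shape for such a plug-in point is sketched below, under stated assumptions. Every name here (`memory_backend`, `direct_backend`, `tracking_backend`, `set_backend`, `current_backend`) is hypothetical, and `std::malloc`/`std::free` stand in for the direct CUDA calls. No RMM or YAKL code is shown. An adapter for either would be another `memory_backend` subclass that forwards to that library. The tracking wrapper records peak bytes, which is one way to compare memory use between backends on an application run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical interface that a direct, RMM, or Gator-style backend
// could each implement.
class memory_backend {
public:
  virtual ~memory_backend() = default;
  virtual void* allocate(std::size_t nbytes) = 0;
  virtual void deallocate(void* p, std::size_t nbytes) = 0;
};

// Stand-in for direct backend calls (assumption: malloc/free here).
class direct_backend : public memory_backend {
public:
  void* allocate(std::size_t nbytes) override {
    void* p = std::malloc(nbytes);
    if (p == nullptr) throw std::bad_alloc();
    return p;
  }
  void deallocate(void* p, std::size_t) override { std::free(p); }
};

// Wraps any backend and records current and peak bytes requested.
class tracking_backend : public memory_backend {
public:
  explicit tracking_backend(memory_backend& inner) : inner_(inner) {}

  void* allocate(std::size_t nbytes) override {
    void* p = inner_.allocate(nbytes);
    current_ += nbytes;
    if (current_ > peak_) peak_ = current_;
    return p;
  }
  void deallocate(void* p, std::size_t nbytes) override {
    assert(nbytes <= current_);
    inner_.deallocate(p, nbytes);
    current_ -= nbytes;
  }

  std::size_t current_bytes() const { return current_; }
  std::size_t peak_bytes() const { return peak_; }

private:
  memory_backend& inner_;
  std::size_t current_ = 0;
  std::size_t peak_ = 0;
};

// Process-wide selection; a null pointer means "use the direct backend".
inline memory_backend*& backend_slot() {
  static memory_backend* slot = nullptr;
  return slot;
}
inline void set_backend(memory_backend* b) { backend_slot() = b; }
inline memory_backend& current_backend() {
  static direct_backend fallback;
  memory_backend* b = backend_slot();
  return b != nullptr ? *b : fallback;
}
```

Note that the wrapper counts bytes requested by the caller, not bytes a pooling backend actually reserves from the device, so a real comparison would also need the backend's own reservation figures.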

improve caching allocator #255
