improve caching allocator #255

@bd4

Description

The current caching allocator has a separate cache per instance of the class, which is templated on <value_type, space>. A separate cache per space is necessary, but per-type is not; it would be better to have global per-space caches. It may also be useful to expose pool allocations to clib/fortran: currently the container types use caching allocators, but the clib routines call the underlying backend allocation. I don't think we should change the default for clib, but we could add an option to pool allocate.
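A minimal sketch of the global per-space cache idea, shared by all value_types: the names memory_cache, Space::backend_malloc, and the caching_allocator shim shown here are hypothetical, not the project's actual API.

```cpp
// Sketch only: one block cache per memory space, independent of value_type.
#include <cstddef>
#include <map>
#include <mutex>
#include <vector>

template <typename Space>
class memory_cache
{
public:
  // one cache per Space, shared by all allocator instances
  static memory_cache& instance()
  {
    static memory_cache cache;
    return cache;
  }

  void* allocate(std::size_t nbytes)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    auto& blocks = free_blocks_[nbytes];
    if (!blocks.empty()) {
      void* p = blocks.back();
      blocks.pop_back();
      return p;
    }
    // cache miss: fall through to the backend (hypothetical hook)
    return Space::backend_malloc(nbytes);
  }

  void deallocate(void* p, std::size_t nbytes)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    free_blocks_[nbytes].push_back(p); // keep block for reuse
  }

private:
  std::map<std::size_t, std::vector<void*>> free_blocks_;
  std::mutex mtx_;
};

// the typed allocator becomes a thin shim over the per-space cache
template <typename T, typename Space>
struct caching_allocator
{
  using value_type = T;

  T* allocate(std::size_t n)
  {
    void* p = memory_cache<Space>::instance().allocate(n * sizeof(T));
    return static_cast<T*>(p);
  }

  void deallocate(T* p, std::size_t n)
  {
    memory_cache<Space>::instance().deallocate(p, n * sizeof(T));
  }
};
```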

It may be worth adding an option to plug in RMM as an alternative to direct CUDA calls, to see if that improves things for NVIDIA. If it does, i.e. equal or better performance and less memory used for application runs, then it may be worth investing in a cross-backend implementation. We could also explore adapting Gator from YAKL.
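For reference, a rough sketch of what the RMM path could look like, using rmm::mr::pool_memory_resource over rmm::mr::cuda_memory_resource. The device_allocate/device_deallocate entry points and the initial pool size are assumptions for illustration, not existing code.

```cpp
// Sketch: route device allocations through an RMM pool resource.
#include <cstddef>
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>

// single process-wide pool on top of the plain CUDA resource
inline rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource>& rmm_pool()
{
  static rmm::mr::cuda_memory_resource cuda_mr;
  static rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool_mr{
    &cuda_mr, std::size_t{256} * 1024 * 1024}; // arbitrary initial pool size
  return pool_mr;
}

void* device_allocate(std::size_t nbytes)
{
  return rmm_pool().allocate(nbytes); // uses the default stream
}

void device_deallocate(void* p, std::size_t nbytes)
{
  rmm_pool().deallocate(p, nbytes);
}
```

This keeps the pool behind the same allocate/deallocate shape as the existing cache, so it could be swapped in behind a build or runtime option for comparison.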
