Implement the method from this [paper](https://arxiv.org/abs/2407.02490). It is similar to `class KVCacheFastGen` in that it involves a profiling step.