ik-llama

Here is 1 public repository matching this topic...

AIdevsmartdata / ramp-quant

RAMP: RL-guided Adaptive Mixed-Precision quantization for GGUF models. Data-free sensitivity analysis, evolutionary search, per-tensor type optimization. Produces hardware-optimized GGUF for consumer GPUs.

Topics: moe, quantization, sensitivity-analysis, ramp, mixed-precision, llm, llama-cpp, qwen, gguf, qwen3, consumer-gpu, imatrix, ik-llama

Updated Apr 16, 2026
Python
