A modular plugin library for vLLM.
📄 [Preprint] vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM
vLLM.hook is a plugin library designed to let developers and researchers inspect, analyze, and steer the internal operations of large language models running under the vLLM inference engine.
This includes dynamic analysis of:
- attention patterns
- attention heads
- activations
- custom intervention behaviors
- Model-agnostic plugin system for vLLM engines
- Extensible worker/analyzer abstraction
- Easy to define new hooks, analyzers, and behaviors
- Introspection of model internals
- Interventions (activation steering, attention control, etc.)
- Example applications:
  - Safety guardrails
  - Reranking
  - Enhanced instruction following
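To give a flavor of the "interventions" feature above, here is a toy, dependency-free sketch of the hook-and-steer idea: wrap a layer so its output can be inspected or shifted before the next layer sees it. This is only an illustration of the concept; vLLM.hook's actual mechanism runs inside the vLLM engine, and none of the class or function names below come from the library.

```python
# Toy illustration of activation steering via a hook (names are assumptions,
# not part of the vllm_hook_plugins API).
class Layer:
    """Stand-in for a model layer: doubles its inputs."""
    def forward(self, x):
        return [v * 2 for v in x]

class HookedLayer:
    """Wraps a layer and lets a hook observe or rewrite its output."""
    def __init__(self, layer, hook=None):
        self.layer = layer
        self.hook = hook  # called on the output; may return a modified output

    def forward(self, x):
        out = self.layer.forward(x)
        if self.hook is not None:
            out = self.hook(out)
        return out

# Steering hook: add a constant direction to every output activation.
steered = HookedLayer(Layer(), hook=lambda out: [v + 0.5 for v in out])
print(steered.forward([1.0, 2.0]))  # [2.5, 4.5]
```

The same wrapper with a hook that only records its input (instead of rewriting it) would correspond to the introspection use case.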
git clone https://github.com/IBM/vLLM-Hook.git
cd vLLM-Hook
conda create -n vllm_hook_env
conda activate vllm_hook_env
pip install -r requirements.txt
pip install -e vllm_hook_plugins
If you plan to use the notebooks under notebooks/, you may need to register your environment as a Jupyter kernel:
pip install ipykernel
python -m ipykernel install --user --name vllm_hook_env --display-name "vllm_hook_env"
Then inside Jupyter Lab:
Kernel → Change Kernel → vllm_hook_env
You can also use the included examples/ and/or notebooks/ directories to explore different functionalities.
Notebook 📓: notebooks/demo_attntracker.ipynb
CLI 🧰 :
python examples/demo_attntracker.py
Notebook 📓: notebooks/demo_corer.ipynb
CLI 🧰 :
python examples/demo_corer.py
Notebook 📓: notebooks/demo_actsteer.ipynb
CLI 🧰 :
python examples/demo_actsteer.py
You can customize model configurations in the model_configs/ folder, e.g.:
model_configs/<example_name>/<model_name>.json
For example model_configs/attention_tracker/granite-3.1-8b-instruct.json.
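As a rough sketch of what such a per-model config file might hold, the snippet below builds and round-trips a small JSON document. Every key shown is an assumption for illustration; the real schema used by model_configs/ files is defined by the library, not reproduced here.

```python
import json

# Illustrative only: the actual keys in
# model_configs/attention_tracker/granite-3.1-8b-instruct.json may differ.
example_config = {
    "model": "ibm-granite/granite-3.1-8b-instruct",  # assumed key
    "layers": [20, 21, 22],   # assumed: which layers to hook
    "top_k_heads": 8,         # assumed: how many heads to report
}

text = json.dumps(example_config, indent=2)
loaded = json.loads(text)
```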
The main package is structured as follows:
vllm_hook_plugins/
├── analyzers/
│   ├── attention_tracker_analyzer.py
│   └── core_reranker_analyzer.py
├── workers/
│   ├── probe_hookqk_worker.py
│   └── steer_activation_worker.py
├── hook_llm.py
└── registry.py
Each component handles a key stage of the plugin lifecycle:
- Registry — manages available hooks and extensions
- Workers — define execution behavior and orchestration
- Analyzers — optionally analyze the statistics saved by the workers
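The registry/worker split above can be sketched as a small plugin pattern. This is a hypothetical illustration, not the actual API of registry.py; the class names, the decorator, and the `run` method are all assumptions.

```python
# Minimal, hypothetical sketch of a hook registry and a registered worker;
# vllm_hook_plugins' real registry.py is not reproduced here.
from typing import Callable, Dict, List

class HookRegistry:
    """Maps worker names to worker classes (illustrative only)."""

    def __init__(self) -> None:
        self._workers: Dict[str, type] = {}

    def register(self, name: str) -> Callable[[type], type]:
        def decorator(cls: type) -> type:
            self._workers[name] = cls
            return cls
        return decorator

    def get(self, name: str) -> type:
        return self._workers[name]

registry = HookRegistry()

@registry.register("steer_activation")
class SteerActivationWorker:
    """Shifts activations by a fixed offset (toy stand-in for real steering)."""

    def __init__(self, delta: float) -> None:
        self.delta = delta

    def run(self, activations: List[float]) -> List[float]:
        return [a + self.delta for a in activations]

worker = registry.get("steer_activation")(delta=0.5)
print(worker.run([1.0, 2.0]))  # [1.5, 2.5]
```

In this pattern, new workers register themselves by name, so callers only need the registry to look them up, which matches the extensibility goal described above.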
We welcome contributions from the community!
- Fork this repository
- Create a branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to your branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Users are encouraged to define new workers and analyzers, but should not modify hook_llm.py
- Include examples and documentation for new features
- The registry will be updated by the admin
vLLM.hook was started by IBM Research.
- Built for the vLLM ecosystem
- Inspired by community efforts to make LLMs more interpretable and controllable