A modular plugin library for vLLM.
📄 [Preprint] vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM
vLLM.hook is a plugin library designed to let developers and researchers inspect, analyze, and steer the internal operations of large language models running under the vLLM inference engine.
This includes dynamic analysis of:
- attention patterns
- attention heads
- activations
- custom intervention behaviors
- Model-agnostic plugin system for vLLM engines
- Extensible worker/analyzer abstraction
- Easy to define new hooks, analyzers, and behaviors
- Introspection of model internals
- Interventions (activation steering, attention control, etc.)
- Example applications:
  - Safety guardrails
  - Reranking
  - Enhanced instruction following
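To give a flavor of the "interventions" feature above, here is a toy, dependency-free sketch of the hook-and-steer idea: wrap a layer so its output can be inspected or shifted before the next layer sees it. This is only an illustration of the concept; vLLM.hook's actual mechanism runs inside the vLLM engine, and none of the class or function names below come from the library.

```python
# Toy illustration of activation steering via a hook (names are assumptions,
# not part of the vllm_hook_plugins API).
class Layer:
    """Stand-in for a model layer: doubles its inputs."""
    def forward(self, x):
        return [v * 2 for v in x]

class HookedLayer:
    """Wraps a layer and lets a hook observe or rewrite its output."""
    def __init__(self, layer, hook=None):
        self.layer = layer
        self.hook = hook  # called on the output; may return a modified output

    def forward(self, x):
        out = self.layer.forward(x)
        if self.hook is not None:
            out = self.hook(out)
        return out

# Steering hook: add a constant direction to every output activation.
steered = HookedLayer(Layer(), hook=lambda out: [v + 0.5 for v in out])
print(steered.forward([1.0, 2.0]))  # [2.5, 4.5]
```

The same wrapper with a hook that only records its input (instead of rewriting it) would correspond to the introspection use case.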
git clone https://github.com/IBM/vLLM-Hook.git
cd vLLM-Hook
conda create -n vllm_hook_env
conda activate vllm_hook_env
pip install -r requirements.txt
pip install -e vllm_hook_plugins
If you plan to use the notebooks under notebooks/, you may need to register your environment as a Jupyter kernel:
pip install ipykernel
python -m ipykernel install --user --name vllm_hook_env --display-name "vllm_hook_env"
Then inside Jupyter Lab:
Kernel → Change Kernel → vllm_hook_env
You can also use the included examples/ and/or notebooks/ directories to explore different functionalities.
Notebook 📓: notebooks/demo_attntracker.ipynb
CLI 🧰 :
python examples/demo_attntracker.py
Notebook 📓: notebooks/demo_corer.ipynb
CLI 🧰 :
python examples/demo_corer.py
Notebook 📓: notebooks/demo_actsteer.ipynb
CLI 🧰 :
python examples/demo_actsteer.py
You can customize model configurations in the model_configs/ folder, e.g.:
model_configs/<example_name>/<model_name>.json
For example model_configs/attention_tracker/granite-3.1-8b-instruct.json.
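As a rough sketch of what such a per-model config file might hold, the snippet below builds and round-trips a small JSON document. Every key shown is an assumption for illustration; the real schema used by model_configs/ files is defined by the library, not reproduced here.

```python
import json

# Illustrative only: the actual keys in
# model_configs/attention_tracker/granite-3.1-8b-instruct.json may differ.
example_config = {
    "model": "ibm-granite/granite-3.1-8b-instruct",  # assumed key
    "layers": [20, 21, 22],   # assumed: which layers to hook
    "top_k_heads": 8,         # assumed: how many heads to report
}

text = json.dumps(example_config, indent=2)
loaded = json.loads(text)
```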
The main package is structured as follows:
vllm_hook_plugins/
├── analyzers/
│   ├── attention_tracker_analyzer.py
│   └── core_reranker_analyzer.py
├── workers/
│   ├── probe_hookqk_worker.py
│   └── steer_activation_worker.py
├── hook_llm.py
└── registry.py
Each component handles a key stage of the plugin lifecycle:
- Registry — manages available hooks and extensions
- Workers — define execution behavior and orchestration
- Analyzers — optionally analyze the statistics saved by the workers
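The registry/worker split above can be sketched as a small plugin pattern. This is a hypothetical illustration, not the actual API of registry.py; the class names, the decorator, and the `run` method are all assumptions.

```python
# Minimal, hypothetical sketch of a hook registry and a registered worker;
# vllm_hook_plugins' real registry.py is not reproduced here.
from typing import Callable, Dict, List

class HookRegistry:
    """Maps worker names to worker classes (illustrative only)."""

    def __init__(self) -> None:
        self._workers: Dict[str, type] = {}

    def register(self, name: str) -> Callable[[type], type]:
        def decorator(cls: type) -> type:
            self._workers[name] = cls
            return cls
        return decorator

    def get(self, name: str) -> type:
        return self._workers[name]

registry = HookRegistry()

@registry.register("steer_activation")
class SteerActivationWorker:
    """Shifts activations by a fixed offset (toy stand-in for real steering)."""

    def __init__(self, delta: float) -> None:
        self.delta = delta

    def run(self, activations: List[float]) -> List[float]:
        return [a + self.delta for a in activations]

worker = registry.get("steer_activation")(delta=0.5)
print(worker.run([1.0, 2.0]))  # [1.5, 2.5]
```

In this pattern, new workers register themselves by name, so callers only need the registry to look them up, which matches the extensibility goal described above.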
We welcome contributions from the community!
- Fork this repository
- Create a branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to your branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Users are encouraged to define new workers and analyzers, but should not modify hook_llm.py
- Include examples and documentation for new features
- The registry will be updated by the admin
vLLM.hook was started by IBM Research.
- Built for the vLLM ecosystem
- Inspired by community efforts to make LLMs more interpretable and controllable