This is a reimplementation of this project based on the paper Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors by Zeyu Yun, Yubei Chen, Bruno A Olshausen, and Yann LeCun.
8/30: Dev branch adding TransformerLens support merged to main. Inference updated for TransformerLens; testing with GPT-2 underway. Implemented for models supported by both TransformerLens and Hugging Face's AutoModel loading.
8/29: Dev branch for adding support for TransformerLens. Training updated. Inference not updated.
8/17: Dictionary training for the mid-layer residual stream added (a PyTorch hook directly on the attention output, before addition of the residual tensor). Module unstable; testing ongoing.
Inference for attention dictionary not yet implemented.
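The mid-layer capture described above can be sketched with a standard PyTorch forward hook. This is a minimal illustration, not the project's actual code: `Block` is a hypothetical stand-in for a transformer layer, with a linear map in place of real attention.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Hypothetical transformer-layer stand-in with a residual connection."""
    def __init__(self, d=8):
        super().__init__()
        self.attn = nn.Linear(d, d)  # placeholder for the attention sublayer

    def forward(self, x):
        return x + self.attn(x)  # residual addition happens here

captured = {}

def hook(module, inputs, output):
    # "output" is the sublayer output *before* it is added to the residual
    captured["attn_out"] = output.detach()

block = Block()
handle = block.attn.register_forward_hook(hook)
x = torch.randn(2, 8)
y = block(x)
handle.remove()

# Sanity check: captured output plus the residual input equals the block output
assert torch.allclose(captured["attn_out"] + x, y)
```

Hooking the submodule rather than the whole block is what lets the dictionary see the attention output in isolation from the residual stream.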
You must specify which CuPy version to install based on your CUDA version. To get this information, run:
nvcc --version
or
nvidia-smi
Edit the cupy entry in requirements.txt to match your system and the versions available here.
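For example, on a system where `nvidia-smi` reports CUDA 12.x, the CuPy line in requirements.txt would typically read as below (check the CuPy installation page for the exact wheel name matching your version):

```
cupy-cuda12x
```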
Create and activate a virtual environment before installing the requirements with the command:
pip install -r requirements.txt
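On Linux/macOS, the full setup might look like this (the directory name `venv` is arbitrary):

```shell
# Create an isolated environment, activate it, then install dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```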
Please also see the original repo for more training details and customizations. Here are the basics:
Create dataset:
python dataset.py
Train the dictionaries (by default, one is trained for each layer; in the paper they trained every other layer):
python train.py
For inference details and troubleshooting, please see the original repo. The methods have been adapted but remain untested.