Layer-wise Relevance Propagation eXplains Transformers (LXT)#
Welcome to the documentation! LXT provides an implementation of Layer-wise Relevance Propagation (LRP) extended to handle attention layers in Large Language Models (LLMs) and Vision Transformers (ViTs).
Highly Efficient & Faithful Attributions
Attention-aware LRP (AttnLRP) outperforms gradient-, decomposition-, and perturbation-based methods. It provides faithful attributions for the entirety of a black-box transformer model, while computational complexity scales as O(1) and memory requirements as O(√N) with respect to the number of layers.
Latent Feature Attribution & Visualization
Since we obtain relevance values for every neuron in the model as a by-product, we know how important each neuron is for the model's prediction. Combined with Activation Maximization, this lets us label neurons or sparse autoencoder (SAE) features in LLMs and even steer the generation process of the LLM by activating specialized knowledge neurons in latent space!
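To make the idea of per-neuron relevance concrete, here is a minimal sketch of the standard LRP ε-rule for a single linear layer, written in plain Python. It is not LXT's implementation and does not cover the attention-specific rules from the paper. The function name `lrp_epsilon_linear` and the toy weights are assumptions made purely for illustration.

```python
def lrp_epsilon_linear(x, W, b, relevance_out, eps=1e-6):
    """Redistribute relevance from the outputs of y = W x + b to its inputs.

    Illustrative epsilon-rule only; the name and signature are hypothetical
    and not part of the LXT API.
    """
    # Forward pass: z_j = sum_i W[j][i] * x[i] + b[j]
    z = [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]
    # Stabilize the denominator while keeping its sign
    z_stab = [zj + (eps if zj >= 0 else -eps) for zj in z]
    # R_i = sum_j (x_i * W[j][i] / z_j) * R_j
    return [
        sum(x[i] * W[j][i] / z_stab[j] * relevance_out[j] for j in range(len(W)))
        for i in range(len(x))
    ]


x = [1.0, 2.0, -1.0]                     # input activations
W = [[1.0, 0.5, 1.0], [0.5, -1.0, 0.0]]  # toy weights (2 outputs, 3 inputs)
b = [0.0, 0.0]                           # zero bias, so relevance is conserved
relevance_out = [1.0, 0.0]               # explain the first output neuron only

relevance_in = lrp_epsilon_linear(x, W, b, relevance_out)
print(relevance_in)       # one relevance value per input neuron
print(sum(relevance_in))  # approximately equal to sum(relevance_out)
```

With zero bias, the input relevances sum to (approximately) the relevance that was assigned at the output. This conservation property is what allows a relevance value to be read off for every intermediate neuron.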
Paper
For the mathematical details and foundational work, please take a look at our paper:
Achtibat, et al. "AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers." ICML 2024.
Important
Project is under active development!
Installation#
To install directly from PyPI using pip, run:
$ pip install lxt
or install from the cloned GitHub repository:
$ git clone https://github.com/rachtibat/LRP-for-Transformers
$ pip install ./LRP-for-Transformers
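After installing, one way to confirm that the package is visible to the current Python environment is to query its metadata using only the standard library. This is a small sketch: the helper name `installed_version` is hypothetical, and the distribution name `lxt` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


lxt_version = installed_version("lxt")
print("lxt version:", lxt_version if lxt_version else "not installed")
```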
License#
This project is licensed under the BSD 3-Clause License. Note that LRP is a patented technology that can only be used free of charge for personal and scientific purposes.
Citation#
@InProceedings{pmlr-v235-achtibat24a,
title = {{A}ttn{LRP}: Attention-Aware Layer-Wise Relevance Propagation for Transformers},
author = {Achtibat, Reduan and Hatefi, Sayed Mohammad Vakilzadeh and Dreyer, Maximilian and Jain, Aakriti and Wiegand, Thomas and Lapuschkin, Sebastian and Samek, Wojciech},
booktitle = {Proceedings of the 41st International Conference on Machine Learning},
pages = {135--168},
year = {2024},
editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
volume = {235},
series = {Proceedings of Machine Learning Research},
month = {21--27 Jul},
publisher = {PMLR}
}
Table of Contents#
Efficient: