
Rishit-dagli/Compositional-Attention

submitted by

Style Pass

2022-06-30 02:30:06

This repository is an implementation of Compositional Attention: Disentangling Search and Retrieval by MILA. The paper revisits standard multi-head attention through the lens of multiple parallel and independent search and retrieval mechanisms. Seen this way, each head ties one search to one retrieval, and these static pairings often lead to redundant parameters. The authors reframe the "heads" of multi-head attention as "searches"; once the searched values are aggregated, an extra retrieval step (itself using attention) selects among the searched results. The paper's experiments establish this as an easy drop-in replacement for multi-head attention.
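The search-then-retrieve idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not this repository's API: the function names (`init_params`, `compositional_attention`), the parameter layout, the scaling factors, and the choice of a single retrieval-key projection shared across searches are all hypothetical choices made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(rng, d_model, n_searches, n_retrievals, d_head, d_ret):
    # Hypothetical parameter layout, chosen only for this sketch.
    def w(*shape):
        return rng.normal(scale=0.1, size=shape)

    return {
        "Wq": w(n_searches, d_model, d_head),    # one query map per search
        "Wk": w(n_searches, d_model, d_head),    # one key map per search
        "Wv": w(n_retrievals, d_model, d_head),  # one value map per retrieval
        "Wrq": w(n_searches, d_model, d_ret),    # retrieval queries, per search
        "Wrk": w(d_head, d_ret),                 # retrieval keys (shared; an assumption)
        "Wo": w(n_searches * d_head, d_model),   # output projection
    }


def compositional_attention(x, p):
    # x has shape (n, d_model): a single sequence of n tokens, no batch axis.
    n = x.shape[0]
    q = np.einsum("nd,sdh->snh", x, p["Wq"])
    k = np.einsum("nd,sdh->snh", x, p["Wk"])
    v = np.einsum("nd,rdh->rnh", x, p["Wv"])

    # Search: one attention map per search, shape (S, n, n).
    scores = np.einsum("snh,smh->snm", q, k) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)

    # Pair every search with every retrieval: shape (S, n, R, d_head).
    candidates = np.einsum("snm,rmh->snrh", attn, v)

    # Retrieval selection: a second attention over the R candidates.
    rq = np.einsum("nd,sde->sne", x, p["Wrq"])
    rk = np.einsum("snrh,he->snre", candidates, p["Wrk"])
    r_scores = np.einsum("sne,snre->snr", rq, rk) / np.sqrt(rq.shape[-1])
    r_weights = softmax(r_scores, axis=-1)  # (S, n, R), sums to 1 over R

    # Mix the candidates per search, then concatenate searches and project.
    mixed = np.einsum("snr,snrh->snh", r_weights, candidates)
    concat = mixed.transpose(1, 0, 2).reshape(n, -1)
    return concat @ p["Wo"], r_weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 16))
    params = init_params(rng, d_model=16, n_searches=4, n_retrievals=2,
                         d_head=8, d_ret=8)
    out, weights = compositional_attention(x, params)
    print(out.shape, weights.shape)
```

In this sketch the number of searches and the number of retrievals are independent, so value projections can be shared across searches instead of being fixed one per head. The output keeps the input's `(n, d_model)` shape, which is what would let such a layer stand in for a multi-head attention layer.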

Awesome! If you want to contribute to this project, you're always welcome! See the Contributing Guidelines. You can also take a look at the open issues for more information about current or upcoming tasks.