Hopfield Networks is All You Need

Submitted by
Style Pass
2024-10-09 11:30:05

We introduce a new energy function and a corresponding new update rule which is guaranteed to converge to a local minimum of the energy function.
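As a sketch in the paper's notation (with stored patterns \(X = (x_1, \dots, x_N)\), state \(\xi\), and inverse temperature \(\beta\)), the energy and update rule take roughly the form

\[
E = -\frac{1}{\beta}\log\sum_{i=1}^{N}\exp\!\big(\beta\, x_i^{\mathsf T}\xi\big) \;+\; \frac{1}{2}\,\xi^{\mathsf T}\xi \;+\; \text{const},
\qquad
\xi^{\text{new}} = X\,\operatorname{softmax}\!\big(\beta\, X^{\mathsf T}\xi\big).
\]

The log-sum-exp term pulls the state toward stored patterns, while the quadratic term keeps it bounded; the update rule is the fixed-point iteration of this energy.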

The new energy function is a generalization (discrete states \(\Rightarrow\) continuous states) of modern Hopfield Networks, also known as Dense Associative Memories, introduced by Krotov and Hopfield and by Demircigil et al. The new modern Hopfield Network with continuous states keeps the characteristics of its discrete counterparts, in particular exponential storage capacity and retrieval after one update.

Due to its continuous states, this new modern Hopfield Network is differentiable and can be integrated into deep learning architectures. Typically, patterns are retrieved after one update, which is compatible with activating the layers of deep networks. This enables an abundance of new deep learning architectures. Three useful types of Hopfield layers are provided.

Surprisingly, the new update rule is the attention mechanism of transformer networks introduced in Attention Is All You Need. We use these new insights to analyze transformer models in the paper.
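The correspondence can be seen in a few lines: one Hopfield update is softmax attention with the state as query and the stored patterns as both keys and values. A minimal NumPy sketch (the function and variable names here are illustrative, not from the paper's code):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def hopfield_update(X, xi, beta=8.0):
    """One update of the continuous modern Hopfield Network.

    X  : (d, N) matrix whose columns are the stored patterns.
    xi : (d,) current state (the query).
    Returns X @ softmax(beta * X^T xi), i.e. attention with
    query xi and keys/values X.
    """
    return X @ softmax(beta * (X.T @ xi))

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 5))                # five stored patterns
xi = X[:, 2] + 0.1 * rng.standard_normal(64)    # noisy probe near pattern 2
retrieved = hopfield_update(X, xi)
# After a single update, `retrieved` lies very close to X[:, 2].
```

With well-separated random patterns and a moderately large \(\beta\), the softmax concentrates almost all weight on the nearest pattern, which is why one update suffices in practice.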
