Attention is all you need... but how much of it do you need? Announcing H3 - a new generative language model that outperforms GPT-Neo-2.7B with only *2* attention layers! Accepted as a *spotlight* at #ICLR2023! 📣 w/ @tri_dao 📜 https://arxiv.org/abs/2212.14052 1/n

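To make the headline claim concrete, here is a minimal sketch (not the authors' code) of the kind of hybrid decoder the tweet describes: a stack in which most blocks are attention-free sequence mixers and only 2 blocks use standard self-attention. The toy gated causal convolution standing in for the H3/SSM layer, the layer count, the model width, and the choice of which 2 layers get attention are all illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: a language-model block stack that is attention-free except
# for 2 layers. The ToySSMBlock below is a simple gated causal depthwise
# convolution used only as a stand-in for the H3/SSM mixer described in the
# paper; all sizes are made up for illustration.
import torch
import torch.nn as nn


class ToySSMBlock(nn.Module):
    """Attention-free token mixer: depthwise causal conv + multiplicative gate."""

    def __init__(self, d_model: int, kernel_size: int = 4):
        super().__init__()
        # padding=kernel_size-1 plus trimming below makes the conv causal
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size - 1, groups=d_model)
        self.gate = nn.Linear(d_model, d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):                       # x: (batch, seq, d_model)
        seq_len = x.size(1)
        h = self.conv(x.transpose(1, 2))[..., :seq_len].transpose(1, 2)
        return self.proj(h * torch.sigmoid(self.gate(x)))


class AttentionBlock(nn.Module):
    """Standard causal multi-head self-attention block."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        # upper-triangular mask blocks attention to future positions
        mask = torch.triu(torch.ones(x.size(1), x.size(1),
                                     device=x.device), diagonal=1).bool()
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out


def build_hybrid_stack(n_layers: int = 12, d_model: int = 256,
                       attn_layers=(1, 6)):
    """Mostly attention-free blocks; only the layers listed in `attn_layers`
    (2 of them here, mirroring the tweet's claim) use self-attention."""
    return nn.ModuleList([
        AttentionBlock(d_model) if i in attn_layers else ToySSMBlock(d_model)
        for i in range(n_layers)
    ])


# Usage: run a random batch through the stack with residual connections.
blocks = build_hybrid_stack()
x = torch.randn(2, 16, 256)                     # (batch, seq, d_model)
for blk in blocks:
    x = x + blk(x)
print(x.shape)                                  # torch.Size([2, 16, 256])
```

The point of the sketch is only the layer layout: compute in the stack scales with sequence length mostly through the attention-free mixers, while the 2 attention layers remain the only quadratic-cost components.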
