[D] Wide Attention Is The Way Forward For Transformers
Submitted by SuchOccasion457 t3_y2i7h1 on October 12, 2022 at 10:54 PM in MachineLearning
lostmsu t1_is3ge2v wrote on October 13, 2022 at 12:46 AM
Paper link: https://arxiv.org/abs/2210.00640