[R] Multimodal Chain-of-Thought Reasoning in Language Models - Amazon Web Services, Zhuosheng Zhang et al. - Outperforms GPT-3.5 by 16 percentage points (75% -> 91%) and surpasses human performance on ScienceQA with fewer than 1B parameters! Submitted by Singularian2501 t3_10svwch on February 3, 2023 at 9:31 PM in MachineLearning 56 comments 258
JClub t1_jabyhe8 wrote on February 28, 2023 at 9:30 AM GPT was never trained on image data, so why is this a fair comparison? The UnifiedQA model is from 2020, so that comparison doesn't seem fair either. Why don't we have comparisons with other SOTA multimodal models, such as OFA or UniT? 1