kermunnist t1_j9kpp3s wrote
Reply to comment by __lawless in [R] Multimodal Chain-of-Thought Reasoning in Language Models - Amazon Web Services Zhuosheng Zhang et al - Outperforms GPT-3.5 by 16% (75%->91%) and surpasses human performance on ScienceQA while having less than 1B params! by Singularian2501
I wonder how Flamingo would compare.
kermunnist t1_j9kpbtz wrote
Reply to comment by EndTimer in What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions] by Destiny_Knight
That would make sense; having an understanding of images probably leads to a more intuitive grasp of physics.
kermunnist t1_j8dm21f wrote
Reply to comment by Lawjarp2 in Bing Chat sending love messages and acting weird out of nowhere by BrownSimpKid
Does AGI necessarily need to be sentient? Could a very powerful and reliable LLM that can be accurately trained on any human task, without actually being sentient or self-aware, be considered AGI? To me that's not only AGI, but a better AGI, because then there are no ethical dilemmas.
kermunnist t1_j9kqsaw wrote
Reply to comment by IluvBsissa in A German AI startup just might have a GPT-4 competitor this year. It is 300 billion parameters model by Dr_Singularity
That's because the smaller models are less useful. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same data as GPT-3, they would perform 100+x worse on these metrics (maybe not exactly that much, because in this case the model was multimodal, which definitely gave it a performance advantage). The big reason this model performed so much better is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. This means that this model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.
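To make the specialization point concrete, here's a minimal sketch of task-specific fine-tuning in Python using Hugging Face `transformers`. The model choice (`t5-small`) and the exam-style example are my own illustration, not the paper's actual pipeline; the idea is just that gradient updates on exam-like question/answer pairs push a small model toward that narrow distribution at the expense of generality:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative small seq2seq model; the actual paper fine-tuned a
# different (also T5-based) model on ScienceQA.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# One hypothetical exam-style example; a real run would loop over
# the full fine-tuning dataset for many steps.
question = ("Which property of a mineral can be determined just by "
            "looking at it? (a) luster (b) mass (c) hardness")
answer = "(a) luster"

inputs = tokenizer(question, return_tensors="pt")
labels = tokenizer(answer, return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()

# Standard seq2seq language-modeling loss on the target answer.
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```

Repeat that over thousands of exam-style pairs and the model gets very good at exactly this format, which is the trade-off described above: strong benchmark numbers, weak general conversation.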