redpnd t1_j9je11m wrote

It's not hard for a multimodal model to outperform a text-only model on multimodal tasks.

Still impressive, though. Imagine what a scaled-up version will be able to accomplish!