MysteryInc152 t1_jdj8x5e wrote on March 24, 2023 at 7:59 PM

>they mentioned an image takes 30 seconds to "comprehend" by the model...

Wait, really? Can you link a source or something? There's no reason a native implementation should take that long.

Now I'm wondering if they're just doing something like this: https://github.com/microsoft/MM-REACT