ChocolateFit9026 t1_iqrrbt4 wrote

There’s a big misunderstanding that just because text-to-image is huge right now, everything will be done with text prompts. The only reason it works this way is that good image-text pairs exist everywhere on the internet. That’s not the case for audio and lots of other things.

3

Ohigetjokes OP t1_iqsn7xn wrote

It's a convenient lens for the larger issue

1

ChocolateFit9026 t1_iqvh2bk wrote

What’s the larger issue? It seems like if large ML models are trained on particular data and work a particular way (usually without text prompts), there isn’t any “speaking to machines” translation issue.

1