Submitted by Ohigetjokes t3_xsgzhl in singularity
ChocolateFit9026 t1_iqrrbt4 wrote
There’s a big misunderstanding that just because text-to-image is huge right now, everything will be done with text prompts. The only reason it works this way is that good image-text pairs exist everywhere on the internet. That’s not the case for audio and lots of other things.
Ohigetjokes OP t1_iqsn7xn wrote
It's a convenient lens for the larger issue
ChocolateFit9026 t1_iqvh2bk wrote
What’s the larger issue? It seems like if large ML models are trained on particular data and work a particular way (usually without text prompts), there isn’t any “speaking to machines” translation issue.