
ElvinRath t1_j3ghhko wrote

First things first: Yes, it will be impressive.

Even just text2video is impressive. We are taking this for granted, but 2 years ago, suggesting that we would have it would have sounded crazy.

If we had text2video + continuity (so that you can make a prompt, then make another that merges your first one with the second one to give it some kind of continuity), it would be amazing.

But a full movie from a prompt doesn't make much sense to me, for now.

First, the model will be video-only. So even if it were capable of making a movie, it's not something consumers would use as entertainment; it's more like a tool. And in a tool you probably want more control.

Even if they could create a movie from a prompt, the chances that all of it would be usable are slim, and the amount of computation needed would be HUGE. It would be expensive.

People are not gonna pay for that, for now.

It's not the time to make a 90-minute movie with one prompt; I think it's time to get, like, 0-2 minutes... I might be wrong, but I don't think that I will be 88 minutes wrong.

Anyway, to really get a movie you need, like... a very good multimodal AI that can create both the image and the sound, including music and voices, and we are very far away from that. (Now, "very far away" might be just 2 or 3 years, but certainly not this month.)

15

[deleted] t1_j3gmn4l wrote

Yes, I am aware this model is just about video, without sound and other elements, but even without those it will be very useful. A lot of animations can be made with this; many YouTube animators, for example, might see this as gold.

When it says movies, I don't think it necessarily means 90 minutes at once. I seriously doubt that it will generate a whole movie in one go; it might be able to generate 5-15 min animations that you can then join together, as you said (which will reduce the compute cost). The one-prompt thing won't be a thing (yet), I believe; having to give it several prompts will make the animation more aligned with the user's desires. I don't necessarily agree that it is just a tool; many fun things can be done with this from the average person's perspective. But that's beside the point. I'm more interested in what companies like DeepMind, OpenAI, Google, and others might release when it comes to movie generators this year.

>A very good multimodal AI that can create both the image and the sound, including music and voices, we are very far away from that. (Now, "Very far away" might be just 2 or 3 years, but certainly not this month)

LOL, I give it a year (1.5 years max). Remember, we need to think exponentially. This is just the beginning of 2023; we are yet to be blown away by the products that will be released this year.

9

ElvinRath t1_j3hx2ok wrote

My estimate of 2 or 3 years wasn't intended as a "realistic estimate", more like an "if everything goes very fast and we are very lucky" scenario. I think that it will probably take much more time to get there.

If you think that's too slow, well, we'll see, haha. I hope to be wrong and that you are right, it would be very cool :P

3

blueSGL t1_j3hf9vn wrote

> If we had text2video + continuity (so that you can make a prompt, then make another that merges your first one with the second one to give it some kind of continuity), it would be amazing.

I take it you are referring to a video from StabilityAI and not Google, because Google has already shown off 'prompt sequence' video gen:

https://phenaki.video/

3

ElvinRath t1_j3hsk47 wrote

Yep!

And about "we". They are supposed to release the model like they did with Stable Diffusion.

It probably won't have the same impact, because I guess that it might be a bit too much for most consumer GPUs, but it's very cool to have this kind of tech available.

3