
ElvinRath t1_j3gcuuc wrote

I doubt that you can input a prompt and get a movie.
Well, I don't doubt it, I'm sure it's not true. I would be happy to be wrong, but I don't think I will be.


[deleted] t1_j3gdtnk wrote

What makes you think so? And what if it takes several prompts, wouldn't that still be impressive? Also, you have to look at the next possible models that will be released by other companies; that's what creates the excitement, not this particular model itself. This event significantly raises the probability of getting public AI movie generators this year.


ElvinRath t1_j3ghhko wrote

First things first: Yes, it will be impressive.

Even just text2video is impressive. We are taking this for granted, but 2 years ago suggesting that we would have that would have been crazy.


If we had text2video + continuity (so that you can make a prompt, then make another that merges your first one with the second one to give it some kind of continuity), that would be amazing.


But a full movie from a prompt doesn't make much sense to me, for now.

First, the model will be a video-only thing. So, even if it were capable of making a movie, it's not something to be used by consumers as entertainment, it's more like a tool. And in a tool you probably want more control.

Even if they could create a movie from a prompt, the chances that all of it would be usable are slim, and the amount of computation needed would be HUGE. It would be expensive.

People are not gonna pay that for now.


It's not the time to make a 90-minute movie with one prompt; I think it's time to get, like, 0-2 minutes... I might be wrong, but I don't think that I will be 88 minutes wrong.


Anyway, to really get a movie you need like... a very good multimodal AI that can create both the image and the sound, including music and voices, and we are very far away from that. (Now, "very far away" might be just 2 or 3 years, but certainly not this month)


[deleted] t1_j3gmn4l wrote

Yes, I am aware this model is just about a video without sounds and other elements, but even without those other elements it will be very useful. A lot of animations can be made out of this, for example, many YouTube animators might see this as gold.

When it says movies I don't think it necessarily means 90 minutes at once. I seriously doubt that it will generate a whole movie in one go; it might be able to generate 5-15 min animations, which you can then join together as you said (which will reduce the compute cost). The one-prompt thing won't be a thing (yet), I believe; having to give it several prompts will make the animation more aligned with the user's desires. I don't necessarily agree that it is just a tool, since many fun things can be done with this from the average person's perspective. But that's beside the point; I'm more interested in what companies like DeepMind, OpenAI, Google, and others might release when it comes to movie generators this year.


>A very good multimodal AI that can create both the image and the sound, including music and voices, we are very far away from that. (Now, "Very far away" might be just 2 or 3 years, but certainly not this month)

LOL, I give it a year (1.5 years max). Remember we need to think exponentially: this is just the beginning of 2023, and we are yet to be blown away by the products that will be released this year.


ElvinRath t1_j3hx2ok wrote

My estimate of 2 or 3 years wasn't intended as a "realistic estimate", more like an "if everything goes very fast and we are very lucky" scenario. I think that it will probably take much more time to get there.

If you think that's too slow, well, we'll see, haha. I hope to be wrong and that you are right, it would be very cool :P


blueSGL t1_j3hf9vn wrote

> If we had text2video + continuity (so that you can make a prompt, then make another that merges your first one with the second one to give it some kind of continuity), that would be amazing.

I take it you are referring to a video model from StabilityAI and not Google, because Google has already shown off 'prompt sequence' video gen.


ElvinRath t1_j3hsk47 wrote


And about "we". They are supposed to release the model like they did with Stable Diffusion.


It probably won't have the same impact, because I guess that it might be a bit too much for most consumer GPUs, but it's very cool to have this kind of tech available.


overlordpotatoe t1_j3gjuat wrote

I think you have to look at the parts that go into it. If we don't have image generators that can make hands yet, presumably this can't either. If we don't have text bots that can create a coherent narrative, especially if it's lengthy, this probably can't either. This might include some impressive new tools, but we're just not at the point where you could put in a prompt and get a movie that would be anything but a complete fever dream.


[deleted] t1_j3gno2r wrote

You are right, but the AI image generators' flaws didn't stop them from being good tools and from being a threat to the artist industry. We do not have to focus on details but on the overall functionality. Besides that, remember that when these models are scaled, it is likely that those flaws will be corrected. So the point is that the movies don't have to be perfect for the AI to be released, they just need to be good enough, and hopefully such models will be publicly released this year.


TFenrir t1_j3gzyhm wrote

It's just the current state of the video generating models that exist. First - the best of the best are at Google, and we've seen what they currently can do. Even if Stability has been able to spend the last few months replicating the research out of Google, I can't imagine them being able to create a model that can output more than 1 minute of somewhat coherent video. The big challenge right now is the inefficiency of these models: the longer the context, the MUCH larger the memory and processing power required.

These are problems I would be very surprised to see solved first anywhere other than Google.

What I imagine is more likely is a sort of StyleGAN system that can be applied on a whole video, with some level of coherence.


maskedpaki t1_j3pxirp wrote

forget long format movies

if you can use this to make 10-20 minute porn videos that will probably decimate the entire industry and make 10s of billions.