They have the "Don't Stop Believing" chorus as one of the samples. This makes me distrust the rest of the results. I suspect they would have better luck adapting one of the newly released video-generating models to this task, rather than a model that's built for images.