
markhachman OP t1_j4a1fyq wrote

I think what I'm talking about would be an algorithm that understands the sounds of different instruments, their tonality, rhythm, and so on, in much the same way ChatGPT understands the relationships between words or, presumably, Vall-E understands phonemes -- and then understands how to put them together in the style of various artists.
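As a rough illustration of that analogy, here is a minimal sketch (PyTorch assumed) of treating musical events as tokens and modeling them autoregressively, the way a language model treats words. The vocabulary, tokenization, and model sizes are illustrative placeholders, not taken from any real music-generation system.

```python
import torch
import torch.nn as nn

# Hypothetical token vocabulary: each token stands for an (instrument, pitch,
# duration) event, flattened into a single integer id.
VOCAB_SIZE = 512   # e.g. a few instruments x pitches x note lengths
CONTEXT_LEN = 256  # how many past events the model can "see"

class TinyMusicLM(nn.Module):
    """GPT-style decoder: predicts the next musical event from the previous ones."""
    def __init__(self, vocab_size=VOCAB_SIZE, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(CONTEXT_LEN, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=256,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                     # tokens: (batch, seq)
        seq_len = tokens.size(1)
        pos = torch.arange(seq_len, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask so each event only attends to earlier events,
        # exactly as in a text language model.
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(tokens.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)                        # logits over the next event

# Toy usage: a random "score" of 32 events; real training data would come from
# tokenized MIDI or audio-derived events.
model = TinyMusicLM()
events = torch.randint(0, VOCAB_SIZE, (1, 32))
logits = model(events)
print(logits.shape)  # (1, 32, VOCAB_SIZE): a distribution over the next event at each step
```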

I'll have to check out Riffusion, though, as I'm unfamiliar with it, thanks.


Kafke t1_j4a1yik wrote

Yes. Look at Stable Diffusion and Riffusion for an example of this. Music isn't fundamentally different from images and text in terms of how modern AI works.


Ronny_Jotten t1_j4b5fqx wrote

Images and text are already quite different from each other, though, in terms of how their AI generators work. The image generators include a language model, but work on a diffusion principle that the text generators don't use. Riffusion's approach of running a diffusion image generator on spectrograms is interesting to some extent, but I sincerely doubt it will be the future direction of high-quality music generators.
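To make the spectrogram idea concrete, here is a minimal sketch of the image-to-audio step that Riffusion-style systems rely on, using torchaudio. A synthetic sine tone stands in for the spectrogram that, in Riffusion, would instead be painted by a diffusion model fine-tuned on spectrogram images; the FFT and sample-rate settings are illustrative, not Riffusion's actual parameters.

```python
import math
import torch
import torchaudio

SAMPLE_RATE = 22050
N_FFT = 512
HOP = 128

# Stand-in "audio": two seconds of a 440 Hz tone.
t = torch.linspace(0, 2.0, int(2.0 * SAMPLE_RATE))
wave = 0.5 * torch.sin(2 * math.pi * 440.0 * t).unsqueeze(0)  # (channels, samples)

# Forward: audio -> magnitude spectrogram (the "image" a diffusion model would generate).
to_spec = torchaudio.transforms.Spectrogram(n_fft=N_FFT, hop_length=HOP, power=2.0)
spec = to_spec(wave)                                # (channels, freq_bins, frames)

# Inverse: spectrogram -> audio. The image carries no phase information, so
# Griffin-Lim iteratively estimates a phase consistent with the magnitudes.
griffin_lim = torchaudio.transforms.GriffinLim(n_fft=N_FFT, hop_length=HOP,
                                               power=2.0, n_iter=64)
recon = griffin_lim(spec)                           # (channels, samples)

torchaudio.save("reconstructed.wav", recon, SAMPLE_RATE)
```

The lossy phase reconstruction and the fixed resolution of the spectrogram "image" are part of why the audio quality of this approach is limited, which is roughly the doubt being raised here.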
