Intelligent_Rough_21
Intelligent_Rough_21 OP t1_j3lkkbq wrote
Reply to comment by geneing in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
Thanks for the reference, I’ll look into it.
Intelligent_Rough_21 OP t1_j3g2vvp wrote
Reply to comment by geneing in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
Yeah, I was using neural Polly, which is equivalent to WaveNet. What I discovered is that it always says the same sentence the same way, and usually the same word used in the same way, regardless of context clues. “My gosh.” would always render exactly the same. It really needs paragraph- or dialogue-driven context, as well as a bit of randomization. In a book where the author has a repetitive go-to word or phrase, it’s a killer.
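To make the "bit of randomization" idea concrete, here is a rough sketch of what I mean, not anything Polly does for you. It wraps each sentence in an SSML `<prosody>` tag with a slightly different speaking rate, so a repeated phrase doesn't come out identical every time. The function names (`vary_prosody`, `to_ssml`) and the 90-110% rate range are my own assumptions for illustration, and I only vary `rate` because, as far as I know, the neural voices don't honor the `pitch` attribute.

```python
import random
from xml.sax.saxutils import escape


def vary_prosody(sentence, rng, rate_range=(90, 110)):
    """Wrap one sentence in an SSML <prosody> tag with a randomly chosen rate.

    rate_range is a hypothetical default (percent of normal speed).
    """
    rate = rng.randint(*rate_range)
    # escape() protects characters such as & and < that would break the XML.
    return f'<prosody rate="{rate}%">{escape(sentence)}</prosody>'


def to_ssml(sentences, seed=None):
    """Build one <speak> document with per-sentence rate variation."""
    rng = random.Random(seed)  # pass a seed to get reproducible output
    body = " ".join(vary_prosody(s, rng) for s in sentences)
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    # The same phrase twice: each copy gets its own independently drawn rate.
    print(to_ssml(["My gosh.", "My gosh."]))
```

The resulting string could then be sent to the TTS engine as SSML input. This only jitters the speed; it does nothing about the real problem, which is that the voice ignores paragraph and dialogue context.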
Intelligent_Rough_21 OP t1_j3frivp wrote
Reply to comment by geneing in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
OK, I’ll admit to only having used neural models, not trained them. AWS Polly was incredibly monotone the last time I used it.
Intelligent_Rough_21 OP t1_j3enes5 wrote
Reply to comment by geneing in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
I don’t think they take into account language context like completion models do. They just say words with limited memory. Hopefully research will unify them somehow.
Intelligent_Rough_21 t1_je6plhz wrote
Reply to [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
I knew this would happen. Programming as a profession is doomed to laziness lol.