Submitted by Randoms001 t3_10fdbjc in Futurology
As artificial intelligence advances at a rapid pace, it is not unusual for people to raise worries about the possible ramifications for human labour. The recent statement by a team of Microsoft researchers that they have built a new AI system capable of convincingly mimicking a human voice using only a three-second audio sample adds fuel to these fears. This technological accomplishment illustrates the potential for AI not only to automate a wide range of jobs, but also to reproduce human capabilities and skills with greater precision and efficiency. The consequences of this breakthrough are substantial, since it raises key questions about the future of labour and the role of AI in it.
Microsoft's recent announcement of Vall-E, a cutting-edge artificial intelligence technology for voice impersonation, has drawn both attention and alarm from the tech industry. The system, which employs discrete codes derived from a neural audio codec model and was trained on an astounding 60,000 hours of speech data from over 7,000 speakers, is capable of reproducing a human voice with astonishing precision and delicacy.
Vall-E works by analysing a speaker's speech, breaking it down into its component parts, and using this information to synthesise that voice saying other words. It is built on the foundation of a technology called EnCodec, which Meta unveiled in October 2022. This enables the system to reproduce not just the speaker's timbre and pitch, but also their emotional tone, using only a three-second audio sample.
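The pipeline described above can be sketched in miniature. This is a deliberately simplified, hypothetical illustration, not Microsoft's actual code: the real system uses EnCodec and large transformer models, whereas the stand-in functions below only mimic the shape of the process (waveform in, discrete codec tokens, token continuation conditioned on text, audio out).

```python
# Hypothetical sketch of a VALL-E-style pipeline. All three functions are
# toy stand-ins; only the overall data flow reflects the described design.

def encode_to_tokens(waveform, codebook_size=1024):
    """Stand-in for a neural codec encoder: map each sample to a discrete code."""
    return [hash(round(x, 2)) % codebook_size for x in waveform]

def continue_tokens(prompt_tokens, text, codebook_size=1024):
    """Stand-in for the language model: derive new codec tokens from the
    short voice prompt plus the target text (the real model is a transformer)."""
    seed = sum(prompt_tokens)
    return [(seed + ord(c)) % codebook_size for c in text]

def decode_tokens(tokens, codebook_size=1024):
    """Stand-in for the codec decoder: map discrete codes back to samples."""
    return [t / codebook_size for t in tokens]

prompt = [0.1, 0.25, 0.3]                    # pretend 3-second voice sample
prompt_tokens = encode_to_tokens(prompt)     # speech -> discrete codes
new_tokens = continue_tokens(prompt_tokens, "hello")
audio = decode_tokens(new_tokens)            # codes -> synthesised "audio"
print(len(audio))  # 5 (one output sample per character of target text)
```

The key design idea this mirrors is that speech synthesis is recast as language modelling over discrete codec tokens, which is what lets a short prompt condition the output voice.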
While Vall-E's powers are unquestionably astounding, they also present major ethical concerns. As AI technology advances at a rapid pace, it is critical that we as a society confront the potential negative effects on employment and other sectors as soon as possible. Furthermore, this technology emphasises the importance of constant communication and collaboration among corporate leaders, legislators, and the general public to ensure that the development and deployment of AI corresponds with societal values and interests.
Experiments on Microsoft's Vall-E AI speech mimicking technology have generated extremely promising results. According to a research paper hosted on Cornell University's arXiv, the system "substantially exceeds" current state-of-the-art systems in terms of speech naturalness and speaker likeness. The paper also highlights Vall-E's ability to preserve the speaker's emotional inflection and acoustic context in its synthesised speech.
Vall-E's capabilities are exhibited on GitHub, where the system is able to successfully recreate a speaker's voice with a high degree of resemblance, even from a three-second audio sample. While the voice sounds a little artificial in places, it is still impressive, and the potential for further progress is obvious.
Vall-E's potential uses are extensive, with Microsoft researchers picturing it as a powerful tool for text-to-speech conversion, speech editing, and even audio synthesis when combined with other generative AIs like GPT-3. This technology's release is expected to have a substantial influence on sectors that rely on voice imitation and text-to-speech technologies, and its continuing development will be closely watched.
As with any advanced technology, it is critical to understand the potential ramifications and hazards of using Vall-E, Microsoft's AI voice mimicking tool. One of the key worries is the prospect of abuse, such as impersonating public figures or duping people into handing over sensitive information by posing as someone they know or trust. Furthermore, the system's capacity to accurately mimic voices has the potential to defeat security systems that rely on voice identification.
Another source of concern is Vall-E's possible influence on job prospects, particularly in industries that rely on voice actors. Because of the system's capacity to mimic human voices at a substantially lower cost, demand for human voice actors may decline.
However, the Vall-E researchers have acknowledged these issues and said that precautions might be taken to reduce these hazards. It is feasible, for example, to create detection models that can determine whether or not an audio sample was synthesised using Vall-E. Furthermore, the researchers have committed to following Microsoft's AI Principles when further developing the system.
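One way such a detection model could work, sketched here purely as an assumption on my part and not as Microsoft's stated method, is watermarking: the synthesiser stamps a statistical pattern into its output that a detector can later check for. The period, the even-value nudge, and both functions below are invented for illustration.

```python
# Toy watermark-based synthetic-audio detector (hypothetical approach).

WATERMARK_PERIOD = 7  # assumed: every 7th sample is nudged to an even value

def synthesise(samples):
    """Pretend synthesiser that embeds a detectable pattern into the audio."""
    return [
        (s // 2) * 2 if i % WATERMARK_PERIOD == 0 else s
        for i, s in enumerate(samples)
    ]

def looks_synthesised(samples):
    """Detector: are all watermark positions suspiciously even-valued?"""
    marks = samples[::WATERMARK_PERIOD]
    return all(m % 2 == 0 for m in marks)

natural = [3, 8, 1, 5, 9, 2, 7, 11, 4, 6, 13, 2, 9, 5, 3]
print(looks_synthesised(synthesise(natural)))  # True
print(looks_synthesised(natural))              # False (marks here are odd)
```

A real detector would of course use learned features rather than a fixed arithmetic pattern, but the division of labour is the same: the generator cooperates by leaving a trace, and the detector verifies it.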
QuestionableAI t1_j4xd4yp wrote
Oh, I am so fucking sure that this shit will not be used for nefarious purposes on perceived criminals, liberals, anyone in a protected group, and especially those who are victimized by Republican turds.