Team Eela
Meta, the parent company of Facebook, recently unveiled Voicebox, a generative AI model that produces speech, turning text into spoken audio. Voicebox can perform speech generation tasks it wasn’t specifically trained on, setting it apart from other speech models.
However, Meta has decided not to release Voicebox for public use due to potential risks associated with its misuse.
“There are many exciting use cases for generative speech models, but because of the potential risks of misuse, we are not making the Voicebox model or code publicly available at this time,” Meta said.
The company added that it believes it is necessary to balance openness with responsibility.
Voicebox functions similarly to other generative systems that work with images and text, but its output is high-quality audio clips.
Meta claims that Voicebox can generate audio from scratch and modify existing samples provided to it, and that it works in six languages: English, French, German, Spanish, Polish, and Portuguese.
“The model can synthesize speech across six languages, as well as perform noise removal, content editing, style conversion, and diverse sample generation,” Meta said.
Meta further explains that Voicebox can create high-quality audio clips and edit pre-recorded audio, removing unwanted sounds like car horns or barking dogs while maintaining the original content and style.
Looking ahead, Meta envisions such AI models giving more natural-sounding voices to virtual assistants and to non-player characters in games or the metaverse. Visually impaired people could also use these models to have written messages read to them in familiar voices, and creators could use them to create and edit audio tracks for videos.