Meta AI Unveils ‘Voicebox’ Speech Generation Model

11 months ago

Meta AI researchers have introduced Voicebox, a speech generation model that can perform tasks it was not specifically trained on.

This innovative model showcases state-of-the-art performance in speech generation, marking a significant milestone in the advancement of AI technology.

Voicebox stands out among generative systems for its ability to create diverse audio outputs. Unlike models that generate images or text, Voicebox produces high-quality audio clips, making it a trailblazer in the realm of speech synthesis.

With proficiency in six languages, the model excels in tasks such as noise removal, content editing, style conversion, and the generation of various speech samples.

At the core of Voicebox is a method called Flow Matching, which is reported to improve on diffusion models. With it, Voicebox outperforms VALL-E, the current leading English model, in zero-shot text-to-speech tasks.

The development of Voicebox involved extensive training using over 50,000 hours of recorded speech, which has enabled it to predict speech segments based on the surrounding audio and segment transcripts.

Leveraging its contextual understanding, the model can seamlessly generate speech portions within an audio recording without the need to recreate the entire input, showcasing its remarkable versatility.
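To make the idea of regenerating only part of a recording concrete, here is a minimal Python sketch. It is not Voicebox's code, which has not been released. The names `infill` and `interpolate_fill` are hypothetical, audio frames are represented as plain floats, and the "model" is a stand-in that simply interpolates between the frames on either side of the gap and ignores the transcript. The only point it illustrates is that the audio outside the edited segment is passed through untouched.

```python
from typing import Callable, List, Sequence

Frame = float  # assumption: one number stands in for one audio feature frame

FillFn = Callable[[Sequence[Frame], Sequence[Frame], str, int], List[Frame]]


def interpolate_fill(left: Sequence[Frame], right: Sequence[Frame],
                     transcript: str, length: int) -> List[Frame]:
    """Hypothetical stand-in for a learned model.

    Linearly interpolates between the last frame before the gap and the
    first frame after it. A real model would also use the transcript;
    this placeholder ignores it.
    """
    start = left[-1] if left else 0.0
    end = right[0] if right else 0.0
    step = (end - start) / (length + 1)
    return [start + step * (i + 1) for i in range(length)]


def infill(frames: Sequence[Frame], start: int, end: int, transcript: str,
           fill: FillFn = interpolate_fill) -> List[Frame]:
    """Regenerate frames[start:end] from the surrounding context only."""
    if not 0 <= start <= end <= len(frames):
        raise ValueError("segment must lie inside the recording")
    left = list(frames[:start])    # context before the segment, kept as-is
    right = list(frames[end:])     # context after the segment, kept as-is
    generated = fill(left, right, transcript, end - start)
    return left + list(generated) + right


if __name__ == "__main__":
    recording = [0.0, 1.0, 9.0, 9.0, 4.0, 5.0]  # frames 2 and 3 are "noisy"
    print(infill(recording, 2, 4, "hello"))
```

In this toy run, only frames 2 and 3 are replaced; the other four frames come back unchanged, which mirrors the article's description of editing a segment without recreating the entire input.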

Although generative speech models like Voicebox present numerous exciting possibilities, Meta AI acknowledges the potential risks associated with their misuse.

As a result, the Voicebox model and its underlying code will not be made publicly available at this time.

For further details, Meta AI has published a research paper on Voicebox.
