Meta Unveils ‘AudioCraft’ AI Tool for Audio and Music Generation

9 months ago

Meta has just announced the launch of AudioCraft, an innovative AI tool capable of generating high-quality, realistic audio and music from text prompts.

With AudioCraft, professional musicians can explore new compositions without playing a single note, while small business owners can effortlessly add soundtracks to their video ads on platforms like Instagram.

The tool consists of the following three models:

MusicGen
AudioGen
EnCodec

MusicGen, which was trained on Meta-owned and specifically licensed music, generates music from text prompts, while AudioGen, which was trained on public sound effects, generates audio from text prompts.

Meta has recently released an enhanced version of the EnCodec decoder, which allows higher-quality music generation with fewer artifacts. Furthermore, the pre-trained AudioGen models allow users to generate environmental sounds, such as a dog barking, cars honking, or footsteps on a wooden floor.

The company is also open-sourcing all AudioCraft model weights and code, so that researchers and practitioners can train their own models on custom datasets.

🎵 Today we’re sharing details about AudioCraft, a family of generative AI models that lets you easily generate high-quality audio and music from text. https://t.co/04XAq4rlap pic.twitter.com/JreMIBGbTF

— Meta Newsroom (@MetaNewsroom)

August 2, 2023

According to Meta, generating music poses a significant challenge as it comprises local and long-range patterns, ranging from individual notes to complex musical structures with multiple instruments.

According to the company, the AudioCraft family of models addresses these challenges, producing high-quality audio with long-term consistency while remaining easy to use.

AudioCraft covers music generation, sound generation, and audio compression within a single framework. The company anticipates that MusicGen, with added controls, could evolve into a new type of instrument.

A couple of months back, Meta introduced Voicebox, a speech model capable of performing tasks that it was not specifically trained on.
