Meta Unveils ‘AudioCraft’ AI Tool for Audio and Music Generation

Meta has announced AudioCraft, an AI tool that generates high-quality, realistic audio and music from text prompts.


With AudioCraft, professional musicians can explore new compositions without playing a single note, while small business owners can effortlessly add soundtracks to their video ads on platforms like Instagram.

The tool consists of three models:

  1. MusicGen
  2. AudioGen
  3. EnCodec

MusicGen, which was trained on Meta-owned and specifically licensed music, generates music from text prompts, while AudioGen, which was trained on public sound effects, generates audio from text prompts. EnCodec is the audio codec that both models build on.

Meta has also released an improved version of the EnCodec decoder, which allows higher-quality music generation with fewer artifacts. In addition, the pre-trained AudioGen models let users generate environmental sounds and sound effects, such as a dog barking, cars honking, or footsteps on a wooden floor.

The company is also open-sourcing all AudioCraft model weights and code, so that researchers and practitioners can train their own models on custom datasets.

According to Meta, generating music poses a significant challenge as it comprises local and long-range patterns, ranging from individual notes to complex musical structures with multiple instruments.

Meta says the AudioCraft family of models addresses these challenges, producing high-quality audio with long-term consistency while remaining easy to use.

AudioCraft covers music generation, sound generation, and audio compression, all within the same framework. The company anticipates that MusicGen, with added controls, could evolve into a new type of instrument.

A couple of months back, Meta introduced Voicebox, a speech model capable of performing tasks that it was not specifically trained on.
