Meta Reveals AI Assistant Trained on Public Facebook, Instagram Posts
Meta, the parent company of Facebook and Instagram, has disclosed that it utilized publicly available Facebook and Instagram posts to train its latest virtual assistant, Meta AI.
In an exclusive interview with Reuters, Nick Clegg, the company’s top policy executive, emphasized that Meta sought to safeguard user privacy by excluding private posts shared only with friends and family from the training dataset.
Clegg further stated that private chats from Meta’s messaging services were not utilized as training data, and private information from the public datasets was also filtered out.
“We’ve tried to exclude datasets that have a heavy preponderance of personal information,” Clegg remarked, underscoring that the “vast majority” of the data used for training was publicly accessible.
Meta’s use of public social media posts raises concerns at a time when tech giants like Meta, OpenAI, and Google have faced criticism for training their AI models on internet-scraped information without proper authorization.
Meta AI, which is built on a custom model based on the Llama 2 large language model, is a significant product launch within the company’s portfolio of consumer-facing AI tools.
Meta AI also utilizes a novel model named Emu, designed for generating images from text prompts.
CEO Mark Zuckerberg unveiled Meta AI during Meta’s annual Connect conference, which this year shifted its focus from augmented and virtual reality to artificial intelligence.
This versatile product will be capable of generating text, audio, and imagery, with real-time information accessible through a partnership with Microsoft’s Bing search engine.
Meta says public Facebook and Instagram posts served as crucial training data for Meta AI, encompassing both text and images.
Emu’s image generation capabilities were honed using these posts, while chat functions drew from Llama 2, supplemented with publicly available and annotated datasets.
To ensure responsible usage, Meta has imposed safety restrictions on the content generated by Meta AI, including a prohibition on the creation of lifelike images of public figures.