Top Media Outlets Block Apple’s AI Data Collection
Several prominent news organizations and social media platforms, including Meta, have opted out of allowing Apple to use their data for AI training, according to Wired.

This decision comes just three months after Apple introduced a tool enabling publishers to block the tech giant’s web-crawling bot, Applebot-Extended, from using their content to train its AI models.
Applebot-Extended is an extension of Apple’s original web-crawling bot, Applebot, which was initially launched in 2015. While the original Applebot was designed to gather data for Apple’s search services like Siri and Spotlight, the extended version allows website owners to control whether their data can be used for AI training.
Among the organizations that have chosen to block Applebot-Extended are heavyweights like Facebook, Instagram, The New York Times, The Financial Times, and Condé Nast, the parent company of Wired.
According to Apple spokesperson Nadine Haija, Applebot-Extended respects publishers’ rights by offering them the choice to prevent their content from being used in AI training without affecting how their websites appear in Apple’s search products. Publishers opt out by updating robots.txt, a simple text file on their websites that implements the Robots Exclusion Protocol.
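To make the opt-out concrete, here is a minimal sketch of what such a robots.txt rule might look like and how it could be checked with Python's standard-library parser. The rule set, the `allowed` helper, and the example.com URL are illustrative assumptions, not any publisher's actual file, and Python's matcher only approximates how Apple interprets the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption for illustration): opt out of
# AI training via Applebot-Extended while leaving all other crawling open.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


URL = "https://example.com/news/story"  # placeholder URL, not a real article

search_ok = allowed(ROBOTS_TXT, "Applebot", URL)             # search crawler
training_ok = allowed(ROBOTS_TXT, "Applebot-Extended", URL)  # AI-training token
print(f"Applebot: {search_ok}, Applebot-Extended: {training_ok}")
```

Under these assumed rules, the check reports that Applebot is still permitted while Applebot-Extended is not, which mirrors the separation Apple describes between search visibility and AI training.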
Applebot-Extended is still new and has not yet been widely blocked. Analysis by Ontario-based AI-detection startup Originality AI and watchdog service Dark Visitors indicates that only about 6 to 7% of high-traffic websites have taken steps to block it, with news and media outlets leading the charge.
A separate study by data journalist Ben Welsh found that roughly 25% of the news websites he examined have blocked Applebot-Extended, compared to 53% that block OpenAI’s bot and 43% that block Google’s AI-specific bot, Google-Extended.

Welsh notes that the number of websites blocking these bots has been gradually increasing, indicating growing awareness and resistance among publishers.
Some organizations, however, may be withholding their data as a strategic move, possibly in anticipation of negotiating licensing deals with AI companies.