AI Usage at a Glance
Dec 5, 2018
Moderation | Practice documented: In December 2018, Tumblr deployed an AI image-scanning system to automatically flag and hide content showing nudity or sexual activity, as part of a complete ban on adult content. The system was deeply flawed from launch: it incorrectly flagged hundreds of thousands of harmless images, including cartoons, patent drawings, pet photos, and even images of former Vice President Joe Biden, while missing actual prohibited content.
Aug 5, 2019
Moderation | Practice documented: Tumblr uses a machine learning system to scan user accounts and posts for spam behavior such as mass-following, repetitive posting, or coordinated bot activity. The system scores accounts on how likely they are to be spam and flags suspicious ones for a human reviewer to investigate, rather than automatically banning them.
Dec 10, 2021
Data Analysis | Practice documented: Tumblr uses information about what users post, like, search, and browse to decide which ads to show them. Rather than using complex behavioral prediction models, Tumblr's advertising approach is based on stated interests and content tags. Ads are also served through automated third-party advertising networks that handle placement in real time.
Feb 25, 2022
Moderation | New evidence: Tumblr will review its moderation algorithms after a porn ban-related settlement
Aug 23, 2023
Customer Service | Practice documented: Tumblr does not use AI chatbots or automated support agents to handle user help requests. Tumblr CEO Matt Mullenweg has explicitly stated there will be no Tumblr AI chatbot, and Tumblr's advertising platform, Blaze Pro, emphasizes that its support is handled by human staff.
Feb 27, 2024
Other | Practice documented: Starting around early 2024, Tumblr's parent company Automattic began selling public Tumblr posts (years of user-generated writing, images, and art) to AI companies including OpenAI and Midjourney for use as training data for their AI models. Users were enrolled in this data sharing by default; opting out required switching on a toggle separately for each individual blog.
Feb 29, 2024
Other | Practice documented: Tumblr actively blocks known AI web crawlers (automated programs that AI companies use to collect text and images from across the internet for model training) from accessing its content. This is managed through a technical configuration file called robots.txt that instructs crawlers which pages they may not visit. Tumblr's own licensing deals represent a notable exception to this blocking.
Tumblr actively blocks known AI web crawlers (automated programs that AI companies use to collect text and images from across the internet for model training) from accessing its content. This is managed through a technical configuration file called robots.txt that instructs crawlers which pages they may not visit. Tumblr's own licensing deals represent a notable exception to this blocking.
Tumblr's robots.txt file blocks known AI data collection crawlers including GPTBot (OpenAI), ClaudeBot (Anthropic), and CCBot (Common Crawl), among others. Tumblr staff confirmed this policy in February 2024, stating they would continue to block AI crawlers "save for those with which we partner." In March 2024, Tumblr unblocked Bing's crawler after Microsoft changed its policy to no longer use indexed content for AI training without an explicit opt-in. Users who activate the "Prevent Third-Party Sharing" setting have their blogs added to an additional disallowed list for crawlers. The practice creates a visible tension: Tumblr blocks unauthorized scraping of user content while simultaneously licensing that same content to AI companies through paid deals.
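As a rough illustration of how robots.txt rules of this kind are evaluated, the sketch below uses Python's standard `urllib.robotparser` module. The rules shown are an illustrative assumption, not a copy of Tumblr's actual file; the blog URL and the `crawl_permissions` helper are likewise hypothetical. Note that robots.txt is advisory: it only works when a crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: NOT a copy of Tumblr's actual robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def crawl_permissions(robots_txt: str, user_agents: list[str], url: str) -> dict[str, bool]:
    """Return, for each user agent, whether the rules permit fetching url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


if __name__ == "__main__":
    # Hypothetical blog URL used purely for demonstration.
    url = "https://example.tumblr.com/post/123"
    agents = ["GPTBot", "ClaudeBot", "CCBot", "Bingbot"]
    for agent, allowed in crawl_permissions(EXAMPLE_ROBOTS_TXT, agents, url).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

In this sketch the three named AI crawlers are denied every path, while any other agent (here Bingbot) falls through to the wildcard group and is only kept out of the hypothetical `/private/` path.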
Starting around early 2024, Tumblr's parent company Automattic began selling public Tumblr posts (years of user-generated writing, images, and art) to AI companies including OpenAI and Midjourney for use as training data for their AI models. Users were enrolled in this data sharing by default; opting out required switching on a toggle separately for each individual blog.
On February 27, 2024, investigative outlet 404 Media reported that Automattic had compiled all of Tumblr's public post content from 2014 to 2023 and was in advanced negotiations with OpenAI and Midjourney. Internal documents showed the data dump erroneously included content that should have been excluded: posts from deleted or suspended accounts, private posts on public blogs, unanswered asks, and posts marked as explicit or mature. Automattic added a per-blog "Prevent Third-Party Sharing" toggle the following day — after the story broke — but sharing was enabled by default (opt-out, not opt-in). Tumblr staff posted a public acknowledgment of the deals on February 27, 2024, framing the practice as a response to AI companies already scraping the web; the post received over 52,000 reblogs, reflecting the scale of user concern. A Tumblr product manager who worked on the data preparation subsequently posted publicly that he was removing his own photos from the platform. A follow-up report in March 2024 revealed a separate "Firehose" pipeline selling approximately one million daily WordPress posts through a data intermediary; Automattic later deprecated this pipeline. Critics noted that once content is used to train an AI model, it cannot be meaningfully removed.
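The exclusion categories and the opt-out default described above amount to a filtering step applied before any export. The sketch below shows one way such a filter could look. It is purely hypothetical: the `Post` fields, the `is_shareable` and `filter_export` helpers, and the sample data are assumptions for illustration and do not reflect Automattic's actual pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """Hypothetical post record; field names are assumptions for illustration."""
    post_id: int
    blog: str
    account_active: bool = True      # False for deleted or suspended accounts
    is_private: bool = False         # private post on an otherwise public blog
    is_unanswered_ask: bool = False
    is_explicit: bool = False        # marked explicit or mature


def is_shareable(post: Post, opted_out_blogs: set[str]) -> bool:
    """Apply the per-blog opt-out and the exclusion categories named in the report."""
    if post.blog in opted_out_blogs:
        return False
    if not post.account_active:
        return False
    return not (post.is_private or post.is_unanswered_ask or post.is_explicit)


def filter_export(posts: list[Post], opted_out_blogs: set[str]) -> list[Post]:
    """Keep only posts that pass every exclusion check."""
    return [p for p in posts if is_shareable(p, opted_out_blogs)]


# Made-up sample data.
SAMPLE_POSTS = [
    Post(1, "cat-blog"),
    Post(2, "art-blog"),
    Post(3, "cat-blog", is_private=True),
    Post(4, "old-blog", account_active=False),
    Post(5, "cat-blog", is_unanswered_ask=True),
    Post(6, "cat-blog", is_explicit=True),
]

if __name__ == "__main__":
    # Opt-out default: with no blogs opted out, every eligible public post is shared.
    print([p.post_id for p in filter_export(SAMPLE_POSTS, set())])
    # After the owner of "art-blog" switches the toggle on, that blog drops out.
    print([p.post_id for p in filter_export(SAMPLE_POSTS, {"art-blog"})])
```

The empty opt-out set in the first call mirrors the default the report describes: a blog is shared unless its owner acts. Under an opt-in design the check would be inverted, with posts excluded unless the blog appeared in an explicit consent set.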
Tumblr does not use AI chatbots or automated support agents to handle user help requests. Tumblr CEO Matt Mullenweg has explicitly stated there will be no Tumblr AI chatbot, and Tumblr's advertising platform, Blaze Pro, emphasizes that its support is handled by human staff.
The absence of AI-powered customer service at Tumblr is notable given the platform's extremely limited headcount, described publicly as a "skeleton crew" in late 2023. Tumblr relies on a standard help center and a manual support ticket system. Blaze Pro's product page specifically states support is provided by "our in-house support team of real humans," suggesting a deliberate positioning against automated customer service. No evidence was found of any planned AI customer service deployment.