Content Moderation · Augments Human Labor · Verified

Tumblr uses a machine learning system to scan user accounts and posts for spam behavior — things like mass-following, repetitive posting, or coordinated bot activity. The system scores accounts on how likely they are to be spam and flags suspicious ones for a human reviewer to investigate, rather than automatically banning them.

Details

Tumblr's engineering team documented the spam detection pipeline in an August 2019 post on its engineering blog. The system uses semi-supervised machine learning — meaning it learns from both labeled examples of spam and unlabeled behavioral data — and achieves 98% accuracy in identifying spammers. It runs on Spark ML for model training and uses Redis to store spam probability scores for rapid lookup. Tumblr has acknowledged that AI-generated spam content is making the problem progressively harder, as AI tools allow bad actors to produce convincing fake posts at scale.
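The score-then-review flow described above can be sketched in a few lines. This is a minimal illustration, not Tumblr's implementation: the `SpamScoreStore` class is a hypothetical in-memory stand-in for the Redis lookup, the `REVIEW_THRESHOLD` value is an assumed cutoff that does not come from Tumblr's post, and the scores themselves would in practice come from the trained model rather than being set by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional

# Hypothetical cutoff for illustration; Tumblr's actual threshold is not published here.
REVIEW_THRESHOLD = 0.8


@dataclass
class SpamScoreStore:
    """In-memory stand-in for a key-value store such as Redis (assumption)."""

    _scores: Dict[str, float] = field(default_factory=dict)

    def set_score(self, account_id: str, probability: float) -> None:
        # Spam scores are probabilities, so reject anything outside [0, 1].
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self._scores[account_id] = probability

    def get_score(self, account_id: str) -> Optional[float]:
        # Returns None when the account has not been scored yet.
        return self._scores.get(account_id)


def accounts_for_review(
    store: SpamScoreStore,
    account_ids: Iterable[str],
    threshold: float = REVIEW_THRESHOLD,
) -> List[str]:
    """Return accounts whose score meets the threshold.

    Flagged accounts are queued for a human reviewer; nothing is banned here.
    """
    flagged = []
    for account_id in account_ids:
        score = store.get_score(account_id)
        if score is not None and score >= threshold:
            flagged.append(account_id)
    return flagged
```

The design point the sketch tries to capture is that the model's output only routes accounts to a review queue; the enforcement decision stays with a person.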

Products affected

Tumblr

Sources & Evidence

Company Disclosure

Company Disclosure · Verified

Hi, how have you been?

Tumblr Unwrapping Blog (Q&A)·May 2024

https://unwrapping.tumblr.com/post/750237966495039488/tumblr-spam

Company Disclosure · Verified

Advances in Spam Detection on Tumblr

Tumblr Engineering Blog·Aug 2019

https://engineering.tumblr.com/post/186792717134/advances-in-spam-detection-on-tumblr

Other practices by Tumblr

Other

Tumblr actively blocks known AI web crawlers from accessing its content. These crawlers are automated programs that AI companies use to collect text and images from across the internet for model training. The blocking is managed through a technical configuration file called robots.txt that instructs crawlers which pages they may not visit. Tumblr's own licensing deals represent a notable exception to this blocking.

Other

Starting around early 2024, Tumblr's parent company Automattic began selling public Tumblr posts (years of user-generated writing, images, and art) to AI companies including OpenAI and Midjourney, to use as training data for their AI models. Users were enrolled in this data sharing by default, with an opt-out setting that had to be changed separately for each individual blog.

Customer Service

Tumblr does not use AI chatbots or automated support agents to handle user help requests. Tumblr CEO Matt Mullenweg has explicitly stated there will be no Tumblr AI chatbot, and Tumblr's advertising platform, Blaze Pro, emphasizes that its support is handled by human staff.
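To illustrate the robots.txt mechanism mentioned in the crawler-blocking entry above, the following sketch parses a small example file with Python's standard `urllib.robotparser` module. The file contents and the blog URL are assumptions made for illustration and are not copied from Tumblr's actual robots.txt; `GPTBot` is used only as an example of an AI crawler user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration; not Tumblr's real file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(EXAMPLE_ROBOTS_TXT)
    page = "https://example.tumblr.com/post/1"  # placeholder URL
    print("GPTBot allowed:", is_allowed(rules, "GPTBot", page))
    print("Other agent allowed:", is_allowed(rules, "ExampleBrowser", page))
```

Note that robots.txt is advisory: it tells well-behaved crawlers what not to fetch, but it does not technically prevent access by a crawler that ignores it.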
