Tumblr | AI Trace
Content Moderation | Augments Human Labor | Verified
Tumblr uses a machine learning system to scan user accounts and posts for spam behavior such as mass-following, repetitive posting, or coordinated bot activity. The system scores each account on how likely it is to be spam and flags suspicious ones for a human reviewer to investigate, rather than banning them automatically.
Details
Tumblr's engineering team documented the spam detection pipeline in an August 2019 post on its engineering blog. The system uses semi-supervised machine learning, meaning it learns from both labeled examples of spam and unlabeled behavioral data, and has a reported accuracy of 98% in identifying spammers. Spark ML handles model training, and Redis stores the spam probability scores for rapid lookup. Tumblr has acknowledged that AI-generated spam is making the problem progressively harder, because AI tools let bad actors produce convincing fake posts at scale.
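To make the score-then-review flow concrete, here is a minimal Python sketch. It is not Tumblr's code: the class name `SpamTriage`, the two behavioral features, the hand-set logistic weights, the 0.8 review threshold, and the `spam_score:` key format are all hypothetical, and a plain dictionary stands in for the Redis score store. It only illustrates the idea of caching a spam probability per account and queuing high scorers for a human reviewer instead of banning them.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SpamTriage:
    """Illustrative only: scores accounts and queues suspicious ones for human review."""

    review_threshold: float = 0.8  # hypothetical cutoff, not a value from Tumblr
    score_store: dict = field(default_factory=dict)   # stand-in for a Redis key-value store
    review_queue: list = field(default_factory=list)  # accounts awaiting a human decision

    def spam_probability(self, follows_per_hour: float, duplicate_post_ratio: float) -> float:
        # Hypothetical hand-set logistic model; a real system would load trained weights.
        z = 0.02 * follows_per_hour + 4.0 * duplicate_post_ratio - 4.0
        return 1.0 / (1.0 + math.exp(-z))

    def process(self, account_id: str, follows_per_hour: float, duplicate_post_ratio: float) -> float:
        score = self.spam_probability(follows_per_hour, duplicate_post_ratio)
        # Cache the score under a per-account key so later lookups are cheap.
        self.score_store[f"spam_score:{account_id}"] = score
        if score >= self.review_threshold:
            # Flag for a reviewer; nothing here bans or suspends the account.
            self.review_queue.append(account_id)
        return score


triage = SpamTriage()
triage.process("casual_user", follows_per_hour=2, duplicate_post_ratio=0.05)
triage.process("mass_follower", follows_per_hour=300, duplicate_post_ratio=0.9)
print(triage.review_queue)  # ['mass_follower']
```

Keeping the final decision out of `process` mirrors the design described above: the model only ranks and flags accounts, and a person decides what happens to them.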