Content Moderation (Verified)

Bluesky deployed an automated model that detects replies judged to be toxic, spammy, off-topic, or posted in bad faith, and reduces their visibility in threads, search results, and notifications without removing them.

Details

Announced on October 31, 2025, Bluesky's updated toxicity detection model classifies replies and deprioritizes flagged ones so that most users see them only after an extra click. Bluesky's 2025 Transparency Report credits this system with a 79% decline in daily reports of anti-social behavior by October 2025. The system is designed to reduce harm without deleting content: replies from accounts the user follows appear above the fold, while flagged replies from other accounts require an additional click to view. Bluesky did not publicly disclose the model's architecture or training data.
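
The visibility rule described above can be sketched in a few lines of Python. This is an illustration only, not Bluesky's implementation: the `Reply` type, the `flagged` field (standing in for the undisclosed classifier's output), and the `arrange_replies` function are all hypothetical names. It also assumes that following an author overrides a flag, which the announcement does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    author: str
    text: str
    flagged: bool  # hypothetical stand-in for the classifier's verdict


def arrange_replies(replies, followed):
    """Split replies into (visible, collapsed) without deleting any of them."""
    visible, collapsed = [], []
    for reply in replies:
        # Assumption: a flag only hides replies from accounts the viewer does not follow.
        if reply.flagged and reply.author not in followed:
            collapsed.append(reply)  # shown only after an extra click
        else:
            visible.append(reply)
    # Followed accounts sort first; list.sort is stable, so the original
    # order is preserved within the followed and non-followed groups.
    visible.sort(key=lambda r: r.author not in followed)
    return visible, collapsed
```

Every reply ends up in exactly one of the two lists, which mirrors the stated design goal of reducing visibility rather than removing content.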