Details
According to Midjourney's AB2013 Documentation (a California regulatory disclosure), its models are trained on datasets comprising billions of images, text, and audiovisual content drawn from four source categories: publicly crawled web content, licensed data, public domain data, and potentially copyrighted data used under a claim of fair use. User-submitted prompts and generated outputs are also covered by Midjourney's Terms of Service, which grant the company a perpetual license to use them for service improvement. Training data undergoes processing steps that include safety filtering to remove CSAM and other sensitive content, and privacy processing to filter out personal information. As of 2025, Midjourney faces multiple active copyright infringement lawsuits, including a class action by artists (Andersen et al., filed January 2023) and a lawsuit filed by Disney and Universal in June 2025, that challenge whether its training on copyrighted works constitutes fair use.