OpenAI: When users ask OpenAI's image generation tools to create a picture, the system runs multiple safety checks — on the request and on the resulting image — before anything is shown. Requests for violent, explicit, or hateful images are declined automatically. The filters also block attempts to generate realistic likenesses of named public figures or images in specific living artists' styles. | AI Trace
Content Moderation | Internal Only | Verified
Details
OpenAI's image safety system operates at multiple layers. Training data is filtered to remove explicit content, including CSAM, before models are trained. Incoming prompts are screened by a classifier that rejects prohibited requests, and generated images are passed through a second classifier before being shown to the user. Additionally, the system automatically rewrites user prompts to introduce demographic diversity and adds safety constraints around sensitive topics. OpenAI also applies C2PA Content Credentials, a standard for signed provenance metadata, to images generated by DALL-E and GPT Image, so that tools and platforms can identify AI-generated content. Separately, OpenAI reports that an internal classifier it developed identifies DALL-E-generated images with over 99% accuracy.
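The layered flow described above (screen the prompt, rewrite it, generate, then screen the output) can be illustrated with a minimal Python sketch. This is not OpenAI's implementation. Every name here (`prompt_is_allowed`, `rewrite_prompt`, `image_is_allowed`, `generate_safely`, `BLOCKED_TERMS`) is hypothetical, and the keyword and byte-marker checks are toy stand-ins for what would be learned classifiers in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-in: a real prompt classifier is a trained model, not a keyword set.
BLOCKED_TERMS = {"gore", "explicit"}


@dataclass
class Result:
    image: Optional[bytes]
    refused_at: Optional[str]  # "prompt", "image", or None when the image is released


def prompt_is_allowed(prompt: str) -> bool:
    """Toy prompt check: reject prompts containing a blocked term."""
    words = set(prompt.lower().split())
    return not (words & BLOCKED_TERMS)


def rewrite_prompt(prompt: str) -> str:
    """Toy rewrite step: append a safety constraint before generation."""
    return f"{prompt.strip()} (safe for general audiences)"


def image_is_allowed(image: bytes) -> bool:
    """Toy output check: reject images carrying an assumed 'UNSAFE' marker."""
    return b"UNSAFE" not in image


def generate_safely(prompt: str, generate: Callable[[str], bytes]) -> Result:
    """Run the layers in order and stop at the first one that refuses."""
    if not prompt_is_allowed(prompt):
        return Result(image=None, refused_at="prompt")
    image = generate(rewrite_prompt(prompt))
    if not image_is_allowed(image):
        return Result(image=None, refused_at="image")
    return Result(image=image, refused_at=None)
```

Passing the generator in as a callable keeps the sketch independent of any particular image model. In this arrangement the output check still runs on every generated image, so a prompt that slips past the first layer can still be stopped before anything is shown.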
Products affected
DALL-E 2, DALL-E 3, ChatGPT image generation, GPT Image, OpenAI API