Details
The system uses a computer vision model (machine learning that analyzes images) to identify objects, scenes, and activities in each photo uploaded to Instagram, and generates a description such as "may contain: two people, smiling, outdoor, beach." Screen readers read this description aloud when a visually impaired user encounters the post. Creators can also write custom alternative text, which overrides the AI-generated version. The feature was announced on the Instagram Blog and confirmed in the Instagram Help Center. The AI descriptions are often generic and can miss context that a human-written description would include.
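The override behavior described above can be illustrated with a minimal sketch. This is not Instagram's code: the function name `compose_alt_text`, its parameters, and the `"Photo"` fallback are hypothetical and chosen only for illustration. It assumes the vision model has already produced a list of labels.

```python
from typing import Optional, Sequence


def compose_alt_text(labels: Sequence[str], custom_alt: Optional[str] = None) -> str:
    """Return the text a screen reader would announce for a photo.

    Hypothetical helper for illustration; not Instagram's implementation.
    """
    # Creator-written alt text takes priority over the generated description.
    if custom_alt and custom_alt.strip():
        return custom_alt.strip()
    # Assumption: with no detected labels, fall back to a neutral placeholder.
    if not labels:
        return "Photo"
    # Join the detected labels in the hedged "may contain" style.
    return "may contain: " + ", ".join(labels)


labels = ["two people", "smiling", "outdoor", "beach"]
print(compose_alt_text(labels))
print(compose_alt_text(labels, custom_alt="My sister and I at Cannon Beach"))
```

The first call prints the generic generated description; the second prints the creator's text, which shows how a human-written description can carry context (who, where) that a label list cannot.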