AI-Powered Child Safety Features on Social Media Platforms Fail to Protect Minors

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

A study by researchers at NYU and Northeastern University found that over half of AI-driven child safety features on Instagram, Snapchat, TikTok, and YouTube do not work as advertised. These failures expose minors to harmful content and unsafe interactions, leading to lawsuits and documented harm to children's wellbeing. [AI generated]

Why's our monitor labelling this an incident or hazard?

The social media platforms employ AI systems for content recommendation, moderation, and user interaction controls, which are explicitly mentioned or reasonably inferred given the nature of the platforms and described features. The study found that these AI-driven safety features are broken, missing, or easily bypassed, leading to direct harms such as exposure of children to adult strangers, harmful content, and ineffective time limits. These harms constitute violations of children's rights and harm to their health and well-being. Since the AI systems' malfunction or ineffective deployment directly led to these harms, this event meets the criteria for an AI Incident. The article does not merely discuss potential risks or responses but documents realized harms caused by AI system failures. [AI generated]
AI principles
Safety
Robustness & digital security

Industries
Media, social platforms, and marketing

Affected stakeholders
Children

Harm types
Psychological

Severity
AI incident

Business function
Monitoring and quality control

AI system task
Recognition/object detection


Articles about this incident or hazard

Many Child Safety Features on Social Apps Don't Work, Report Finds

2026-06-29
The New York Times

More than half of social media child safety features aren't working as advertised, new research finds

2026-06-29
CNN International
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems in the form of automated child safety features on social media platforms. The research demonstrates that these AI systems fail to function as intended, leading to direct harm or increased risk of harm to children, such as exposure to harmful content and unsafe interactions. The failure of these AI safety features to operate effectively is a malfunction or misuse of AI systems that has already caused or is causing harm, meeting the criteria for an AI Incident. The harms include violations of children's rights to safety and wellbeing, harm to communities (children and youth), and potential psychological injury. Therefore, this event is classified as an AI Incident rather than a hazard or complementary information.

Many child safety features on social apps don't work, report finds

2026-06-30
The Star
Why's our monitor labelling this an incident or hazard?
The social media platforms employ AI systems for content recommendation, search suggestions, and user connection algorithms. The study found that these AI-driven safety features are broken, missing, or easily bypassed, resulting in direct harm to minors through exposure to harmful content and unsafe interactions. This meets the definition of an AI Incident because the AI systems' malfunction or ineffective use has directly led to harm to health and communities. The article documents realized harm rather than potential harm, so it is not an AI Hazard. It is not merely complementary information because the main focus is on the failure of AI safety features causing harm, not on responses or updates. Hence, the classification is AI Incident.

Report claims that half of the social media child safety features don't work

2026-06-30
GameReactor
Why's our monitor labelling this an incident or hazard?
The child safety features on social media platforms are typically AI systems that monitor and filter content to protect children. The study shows that at least half of these features fail, implying malfunction or ineffective use of AI systems. This failure has led to harm or risk of harm to children, as indicated by lawsuits claiming harm caused by these platforms. The AI system's malfunction or ineffective performance is a contributing factor to the harm, meeting the criteria for an AI Incident.