Meta's AI Moderation Fails to Curb Anti-Trans Hate Speech, GLAAD Reports

Thumbnail Image

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

GLAAD reports that Meta's AI-driven content moderation systems are failing to remove extreme anti-trans hate speech across Facebook, Instagram, and Threads. Despite policies banning such content, harmful posts—including slurs and calls for violence—remain widespread, contributing to real-world harm against LGBTQ+ communities.[AI generated]

Why's our monitor labelling this an incident or hazard?

Meta's content moderation relies heavily on AI systems to detect and remove harmful content. The report highlights that these AI systems are allowing anti-trans hate speech and calls for violence to remain online, which has resulted in documented real-world harms to LGBTQ+ people. This constitutes an AI Incident because the AI system's malfunction or inadequacy in moderating content has directly or indirectly led to harm to communities and violations of rights.[AI generated]

AI principles

SafetyFairnessRespect of human rights

Industries

Media, social platforms, and marketing

Affected stakeholders

Other

Harm types

PsychologicalHuman or fundamental rights

Severity

AI incident

Business function:

Monitoring and quality control

AI system task:

Recognition/object detection

Articles about this incident or hazard

Thumbnail Image

GLAAD report says Meta allows anti-trans hate to "flourish" on its platforms

2024-03-27

The Verge

Why's our monitor labelling this an incident or hazard?

Meta's content moderation relies heavily on AI systems to detect and remove harmful content. The report highlights that these AI systems are allowing anti-trans hate speech and calls for violence to remain online, which has resulted in documented real-world harms to LGBTQ+ people. This constitutes an AI Incident because the AI system's malfunction or inadequacy in moderating content has directly or indirectly led to harm to communities and violations of rights.

Thumbnail Image

Meta Platforms Fail to Stop 'Extraordinary Amounts' of Anti-Trans Hate

2024-03-28

PC Magazine

Why's our monitor labelling this an incident or hazard?

The event involves the use of AI systems (content moderation classifiers and human reviewers supported by AI tools) in the development and use phases. The failure of these AI systems to detect and remove harmful anti-trans hate speech has directly led to violations of human rights and harm to communities (the LGBTQ and transgender communities). The report and Oversight Board findings indicate that the AI moderation tools are insufficient or ineffective, resulting in ongoing harm. Therefore, this qualifies as an AI Incident due to the direct link between AI system malfunction or inadequacy and realized harm to a vulnerable community.

Thumbnail Image

Meta enabling anti-transgender hate speech on its platforms, says GLAAD

2024-03-28

NewsBytes

Why's our monitor labelling this an incident or hazard?

The event explicitly involves AI systems in Meta's content moderation processes, including automated prioritization and human review influenced by AI outputs. The failure to remove harmful content despite reports has directly led to real-world harm, such as bomb threats against targeted individuals, fulfilling the criteria for an AI Incident. The AI system's malfunction or inadequate performance in moderating hate speech is a contributing factor to the harm. Therefore, this qualifies as an AI Incident due to direct harm caused by the AI system's use and malfunction.

Thumbnail Image

Meta on LGBTQ safety, hate speech is pattern of inaction: GLAAD - Fast Company

2024-03-28

Fast Company

Why's our monitor labelling this an incident or hazard?

Meta uses AI systems to enforce its Community Standards by detecting and removing hate speech and harmful content. The report and oversight board findings indicate that these AI systems are not effectively enforcing policies against anti-LGBTQ hate speech, resulting in ongoing harm to LGBTQ individuals and communities. This failure to act or insufficient enforcement by AI content moderation systems directly contributes to violations of rights and harm to communities, meeting the criteria for an AI Incident.

Thumbnail Image

Meta is failing to curb anti-trans hate, report says - The Boston Globe

2024-03-27

The Boston Globe

Why's our monitor labelling this an incident or hazard?

Meta uses AI-based content moderation systems to monitor and enforce its policies on hate speech. The report highlights that despite these systems and human oversight, anti-trans hate speech persists and is not adequately removed, contributing to offline harms. This indicates a malfunction or failure in the AI system's use for content moderation, leading to violations of rights and harm to communities. Therefore, this qualifies as an AI Incident due to the direct or indirect role of AI systems in the harm caused by the spread of hate speech on Meta's platforms.

Thumbnail Image

'Dehumanizing and disturbing': GLAAD report reveals Meta's 'ongoing failure' to enforce policies prohibiting transphobic content

2024-03-27

The Daily Dot

Why's our monitor labelling this an incident or hazard?

The event involves AI systems used by Meta for content moderation, which are failing to adequately enforce policies against transphobic hate speech. This failure indirectly leads to harm to the LGBTQ community by allowing harmful content to proliferate, constituting a violation of rights and harm to communities. Therefore, this qualifies as an AI Incident because the AI system's malfunction (ineffective moderation) is a contributing factor to ongoing harm.

Thumbnail Image

Meta failing to moderate 'extreme anti-trans hate' on its platforms, says report

2024-03-28

PinkNews | Latest lesbian, gay, bi and trans news | LGBTQ+ news

Why's our monitor labelling this an incident or hazard?

Meta's platforms use AI systems for content moderation to detect and remove hate speech. The report highlights that despite policies banning transphobic content, the AI moderation systems are failing to remove posts containing extreme anti-trans hate, including slurs and calls for violence. This failure has led to ongoing harm, including real-world violence against LGBTQ+ people, which is a violation of human rights and harm to communities. The AI system's malfunction or inadequacy in moderating such content directly contributes to these harms, qualifying this as an AI Incident.