AI Chatbots Exhibit Racial Bias Despite Anti-Racism Training

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

Multiple studies reveal that leading AI chatbots, including OpenAI's GPT-4, continue to display covert racial bias against African American English speakers, even after anti-racism training. This bias affects AI-generated judgments in areas such as criminal sentencing and employability, posing ongoing harm to affected communities. [AI generated]

Why's our monitor labelling this an incident or hazard?

The article explicitly describes AI systems (large language models) being used in tasks that simulate real-world decision-making, with discriminatory outcomes based on dialect, which serves as a proxy for race. The harms identified include violations of human rights and harm to communities through racial bias and discrimination. The study's findings indicate that these harms are already present in the AI models' behavior, even if demonstrated in experimental settings, and thus constitute an AI Incident due to the direct link between AI use and discriminatory harm. The potential for real-world deployment in business and judicial contexts further supports classification as an incident rather than a mere hazard or complementary information. [AI generated]
AI principles
Fairness
Respect of human rights
Accountability
Transparency & explainability
Robustness & digital security

Industries
Government, security, and defence
Business processes and support services

Affected stakeholders
General public

Harm types
Human or fundamental rights
Economic/Property

Severity
AI incident

Business function
Human resource management
Compliance and justice

AI system task
Interaction support/chatbots
Forecasting/prediction
Content generation


Articles about this incident or hazard

ChatGPT Is More Likely to Sentence People Who Speak African American English to Death, Researchers Say

2024-03-07
Gizmodo

AI model recommended Black defendants 'be sentenced to death'

2024-03-09
Euronews English
Why's our monitor labelling this an incident or hazard?
The article explicitly describes AI systems (large language models) whose use leads to discriminatory and racially biased outputs that can harm individuals by influencing judgments about criminality and employability. These harms fall under violations of human rights and discrimination, which are recognized AI harms. The study's findings demonstrate that these harms are occurring, not just potential. The AI system's outputs directly contribute to harmful stereotyping and potentially unjust treatment, qualifying this as an AI Incident rather than a hazard or complementary information.

Even after anti-racism training AI chatbots like ChatGPT still exhibit racial prejudice

2024-03-11
Notebookcheck
Why's our monitor labelling this an incident or hazard?
The article explicitly discusses AI systems (large language models like ChatGPT-4) producing racially biased and prejudiced outputs that have real-world implications, such as influencing sentencing recommendations and job matching. These outputs constitute violations of human rights and cause harm to communities, fulfilling the criteria for an AI Incident. The harm is realized, not just potential, as the biased outputs have been demonstrated in testing and could affect users relying on these systems. The involvement is through the AI system's use and failure to adequately mitigate bias despite safety training. Hence, the event is classified as an AI Incident.

Racist AI: ChatGPT, Copilot, more likely to sentence African-American defendants to death, finds Cornell study

2024-03-11
Firstpost
Why's our monitor labelling this an incident or hazard?
The study explicitly involves AI systems (LLMs) and demonstrates that their use leads to discriminatory outcomes based on language dialect, which is closely tied to racial identity. This bias can directly or indirectly cause harm to individuals by influencing decisions in critical domains such as criminal justice and employment, thus meeting the criteria for an AI Incident due to violations of human rights and harm to communities. The harm is realized or ongoing as the models are currently in use and exhibit these biases, not merely a potential risk.

AI chatbots use racist stereotypes even after anti-racism training

2024-03-07
New Scientist
Why's our monitor labelling this an incident or hazard?
The article explicitly identifies AI systems (large language models powering commercial chatbots) as producing outputs that demonstrate racial prejudice, which can influence harmful decisions about employability and criminality. This constitutes a violation of human rights and causes harm to communities, fulfilling the criteria for an AI Incident. The harm is realized and ongoing, not merely potential, as the biased outputs are demonstrated and the models are in active use by millions. Therefore, this event is classified as an AI Incident.

AI chatbots found to use racist stereotypes even after anti-racism training

2024-03-08
Tech Xplore
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems (large language models) and their use in generating biased, racist outputs. The harm is realized and direct, as the AI systems produce stereotypical and discriminatory content that can perpetuate social harm and violate rights. This fits the definition of an AI Incident because the AI systems' use has directly led to harm to communities and violations of rights. The study's findings confirm the presence of these harms despite mitigation efforts, indicating ongoing issues rather than potential future harm or mere complementary information.

Uncovering Language Bias: AI Models Implicated in Covert Racism Study

2024-03-09
Cryptopolitan
Why's our monitor labelling this an incident or hazard?
The event involves the use of AI systems (LLMs) whose biased outputs have been demonstrated to potentially cause harm to individuals and groups, particularly in critical areas like law enforcement and hiring. The AI's biased recommendations could lead to violations of human rights and harm to communities, fulfilling the criteria for an AI Incident. The harm is indirect but clearly linked to the AI system's outputs and their real-world implications. Therefore, this qualifies as an AI Incident rather than a hazard or complementary information.

Chatbots persist in using racial stereotypes despite anti-bias training

2024-03-12
Knowridge Science Report
Why's our monitor labelling this an incident or hazard?
The AI systems involved are large language models explicitly mentioned in the study. Their use in generating biased outputs that reinforce racial stereotypes constitutes indirect harm to communities and violations of rights, as these biases can perpetuate discrimination and social harm. The harm is realized through the AI's outputs influencing perceptions and potentially decisions about individuals based on race-related language use. Therefore, this qualifies as an AI Incident under the framework, as the AI's use has directly led to harm in terms of bias and stereotyping.

Chatbot AI makes racist judgements based on dialect

2024-03-13
Nature
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (LLMs) explicitly mentioned and tested for bias. The study demonstrates that these AI systems' outputs directly lead to discriminatory harms, including racial bias in legal and employment contexts, which constitute violations of human rights and harm to communities. The harms are realized and documented, not merely potential. Therefore, this qualifies as an AI Incident because the AI systems' use has directly led to significant harms related to racial discrimination and stereotyping.

ChatGPT can be kind of racist based on how people speak, researchers say

2024-03-11
Quartz
Why's our monitor labelling this an incident or hazard?
The article explicitly discusses how AI systems (LLMs) produce racially biased outputs that disadvantage African American English speakers in job matching and criminal sentencing scenarios. These biases represent violations of human rights and cause harm to communities by reinforcing racial prejudice and discrimination. The AI systems' development and use have directly led to these harms, qualifying this event as an AI Incident under the OECD framework.

LLMs become more covertly racist with human intervention

2024-03-11
MIT Technology Review
Why's our monitor labelling this an incident or hazard?
The article explicitly discusses AI systems (LLMs) and their biased outputs, which constitute violations of human rights and harm to communities by perpetuating racial prejudice. The research highlights that these harms are occurring due to the AI systems' outputs and that current mitigation methods are insufficient, indicating ongoing harm. Therefore, this qualifies as an AI Incident: the use and development of these AI systems has directly led to realized harm in the form of racial bias and stereotyping, a violation of rights and a harm to communities.

AI models exhibit racism based on written dialect

2024-03-11
TheRegister.com
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (LLMs) whose use leads to violations of human rights and harms communities through racial bias and discrimination. The research demonstrates that these AI models systematically produce outputs that reflect and reinforce racist stereotypes, which can directly or indirectly harm individuals by influencing decisions about employability and criminality. This constitutes an AI Incident because the AI systems' use has directly led to discriminatory harm, a violation of fundamental rights. The article reports on realized bias and harm, not just potential risk, and thus it is not merely a hazard or complementary information.

Deplorable, 'Racist' AI Resists Woke Indoctrination Programs, Does Hate Speech

2024-03-15
Lew Rockwell
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems (large language models) and their use in generating biased outputs that reinforce racist stereotypes. The harm is indirect but clear: the AI's biased responses contribute to violations of human rights and harm to communities by perpetuating racial discrimination. The research confirms that despite mitigation efforts, these harms persist, fulfilling the criteria for an AI Incident. The article does not merely discuss potential or future harm but documents realized bias in AI outputs, which is a form of harm under the framework.

AI Study Reveals Persistent Racial Bias in Language Models

2024-03-13
Cryptopolitan
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (LLMs) whose outputs have directly led to discriminatory and harmful stereotypes against a racial group, which constitutes harm to communities and a violation of rights. The bias is demonstrated through experiments showing negative stereotypes and prejudicial associations in AI responses, indicating realized harm rather than just potential. The involvement is through the use of AI systems generating biased outputs, fulfilling the criteria for an AI Incident. The article does not merely discuss potential risks or responses but documents actual biased behavior of deployed AI models causing harm.

Researchers Find AI Chatbots Are Racist Despite Multiple Anti-Racism Training

2024-03-11
The Tech Report
Why's our monitor labelling this an incident or hazard?
The article explicitly discusses AI chatbots (large language models) exhibiting racist behavior, which is a direct harm to communities and a violation of rights. The harm is realized, as the chatbots produce biased and prejudiced outputs affecting perceptions and potentially real-world outcomes. The involvement of AI systems is clear, and the harm stems from their use and development, including training data selection. This fits the definition of an AI Incident because the AI systems have directly led to harm through biased outputs.

The dangers of "covertly" racist AI models

2024-03-13
Tortoise
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (large language models) whose use has led to discriminatory biases that can harm individuals by perpetuating racial prejudice, particularly in sensitive applications such as job screening. Although the article does not describe a specific incident of harm occurring, it highlights ongoing and systemic bias in deployed AI models that directly contributes to violations of rights and discrimination, which qualifies as an AI Incident under the framework. The harm is realized in the models' outputs and in their capacity to cause real-world discriminatory effects, not merely a potential future risk.

Unmasking the Covert Prejudice in AI: A Dive into Dialect Discrimination

2024-03-14
MarkTechPost
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (language models) whose use has directly led to discriminatory harms against a marginalized group, fulfilling the criteria for an AI Incident. The bias is embedded in the AI's outputs, causing real-world negative impacts such as reinforcing stereotypes and unequal treatment. The article describes realized harms rather than potential ones, and the AI system's role is pivotal in causing these harms. Therefore, this is classified as an AI Incident.