Generative AI systems deliver misleading answers, risking trust and safety

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

A Purdue University study found that ChatGPT gave wrong answers to programming questions 52% of the time, errors that often went undetected by developers. Separately, Google's new AI Overview search feature has repeatedly hallucinated absurd, unsafe advice, from putting glue on pizza to eating stones, undermining user trust. Both incidents highlight the risks of unchecked generative AI errors.[AI generated]

Why's our monitor labelling this an incident or hazard?

ChatGPT is an AI system that generates programming answers. The research shows that a majority of these answers contain errors, which can mislead users and cause harm indirectly through reliance on incorrect information. The harm is to users' work quality and efficiency, which can be considered harm to communities or professional groups. Since the AI system's use has directly led to the dissemination of incorrect information, causing potential or actual harm, this qualifies as an AI Incident under the framework.[AI generated]
AI principles
Accountability · Robustness & digital security · Safety · Transparency & explainability · Human wellbeing

Industries
Consumer services · IT infrastructure and hosting

Affected stakeholders
Workers · General public

Harm types
Physical (injury) · Economic/Property · Reputational · Psychological · Public interest

Severity
AI incident

Business function
Research and development

AI system task
Content generation


Articles about this incident or hazard

Study finds 52% of ChatGPT's answers to programming questions are wrong

2024-05-26
chinaz.com
Why's our monitor labelling this an incident or hazard?
ChatGPT is an AI system that generates programming answers. The research shows that a majority of these answers contain errors, which can mislead users and cause harm indirectly through reliance on incorrect information. The harm is to users' work quality and efficiency, which can be considered harm to communities or professional groups. Since the AI system's use has directly led to the dissemination of incorrect information, causing potential or actual harm, this qualifies as an AI Incident under the framework.
Glue to stick pizza together? Google admits its AI gave badly wrong answers and is manually deleting the erroneous results | NOWnews今日新聞

2024-05-25
NOWnews 今日新聞
Why's our monitor labelling this an incident or hazard?
An AI system (Google's AI Overviews) is explicitly involved and is malfunctioning by generating incorrect and misleading answers. This malfunction has directly led to harm in the form of misinformation, which can mislead users and potentially cause harm if followed (e.g., eating stones or glue). Therefore, this qualifies as an AI Incident because the AI's malfunction has directly led to harm through misinformation dissemination. The article also discusses Google's response to mitigate the harm, but the primary event is the AI Incident itself.
Morning report | HarmonyOS NEXT native apps surpass 4,000, may launch alongside the Mate 70 / Google responds to its AI search generating erroneous content / Tianya is selling domain names to raise funds

2024-05-27
凤凰网(凤凰新媒体)
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Google's AI search summary feature) that generates content based on multiple sources. The system has produced significant factual errors, misleading users with false information. This misinformation can harm users and communities by spreading incorrect knowledge. Google's response confirms the AI system's role and the harm caused. Therefore, this is an AI Incident due to realized harm from the AI system's outputs.
HarmonyOS NEXT native apps surpass 4,000, may launch alongside the Mate 70 / Google responds to its AI search generating erroneous content / Tianya is selling domain names to raise funds

2024-05-27
凤凰网(凤凰新媒体)
Why's our monitor labelling this an incident or hazard?
The event involves an AI system explicitly described as generating search summaries using AI. The AI system's use has directly led to the dissemination of false and misleading factual information to users, which is a form of harm to communities and individuals relying on accurate information. Google's response confirms the issue and ongoing remediation but does not negate the fact that harm has occurred. Hence, this is an AI Incident due to realized harm caused by the AI system's outputs.
Telling users to eat stones and put glue on pizza: has Google AI Search gone mad?

2024-05-27
爱范儿
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Google's AI Overview) that generates summaries and answers by combining large language model outputs with internet sources. The AI system's malfunction (hallucinations and factual errors) has directly led to the dissemination of false and potentially harmful information to users, such as unsafe cooking advice and misleading health recommendations. This constitutes harm to people (health and safety risks) and harm to communities (misinformation). The article documents actual occurrences of these harms, not just potential risks. Hence, it meets the criteria for an AI Incident rather than a hazard or complementary information. The lack of an easy opt-out mechanism exacerbates the risk and impact of these harms.
Telling people to eat stones and put glue on pizza: has Google AI Search gone mad?

2024-05-28
TechNews 科技新報
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Google's AI Overview, a generative AI-enhanced search feature) whose malfunction (hallucinations and factual errors) has directly led to the dissemination of false and potentially harmful information. The harms include misinformation to communities and potential health risks (e.g., advice to ingest glue or stones). The AI system's outputs have eroded trust and could lead to injury or harm to persons if followed. The article documents actual occurrences of these harms, not just potential risks, thus classifying it as an AI Incident rather than a hazard or complementary information.
AI search gives absurd answers; Google urgently removes them

2024-05-28
iThome Online
Why's our monitor labelling this an incident or hazard?
The AI Overview is an AI system integrated into Google's search that generates answers based on AI models. The event involves the AI system malfunctioning by producing incorrect and harmful outputs, such as recommending adding glue to pizza or falsely stating that Obama is Muslim. These outputs can cause harm to users by spreading misinformation and potentially dangerous advice. Google had to intervene and remove these answers, indicating the harm was realized and the AI system's malfunction was the direct cause. Therefore, this qualifies as an AI Incident under the framework.
Beware! Google AI Overview keeps producing absurd replies, suspected of citing netizens' "jokes" as answers

2024-05-27
數位時代
Why's our monitor labelling this an incident or hazard?
The AI system (Google AI Overview) is explicitly involved as it generates the problematic answers. The harm is realized because users are receiving false and misleading information, which constitutes harm to communities through misinformation. This fits the definition of an AI Incident since the AI system's use has directly led to harm (misinformation and erosion of trust). Although Google is working on mitigation, the incident is ongoing. Therefore, this event qualifies as an AI Incident rather than a hazard or complementary information.
Doubts, sell-offs, and price wars: is AI facing a human "siege"? | 钛媒体AGI

2024-05-26
搜狐
Why's our monitor labelling this an incident or hazard?
The article explicitly describes AI systems (ChatGPT, Google's AI Overview) producing incorrect and misleading information, including dangerous advice (e.g., suggesting putting glue on pizza or eating stones), which can directly harm users' health or mislead them. This meets the definition of an AI Incident as the AI systems' use has directly led to harm (misinformation and potential health risks). The financial and market aspects, while important, do not themselves constitute harm but provide context. Therefore, the event is classified as an AI Incident due to the realized harms caused by AI outputs.
New study says ChatGPT gives wrong programming answers as much as 52% of the time - cnBeta.COM

2024-05-24
cnBeta.COM
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (ChatGPT) providing incorrect outputs (programming answers) that are often not detected by users, which could plausibly lead to harm such as faulty software or wasted developer time. However, the article does not report any direct or indirect harm occurring yet. The focus is on the potential for harm due to errors and user overreliance, fitting the definition of an AI Hazard rather than an AI Incident. It is not merely complementary information because the study highlights a credible risk of harm from the AI system's use.
Google AI Overviews feature outputs answers riddled with errors; the fix turns out to be manual deletion - cnBeta.COM

2024-05-25
cnBeta.COM
Why's our monitor labelling this an incident or hazard?
The AI Overviews feature is an AI system generating answers to user queries. The system has produced factually incorrect and dangerous advice, which can harm users' health or misinform them, fulfilling the harm criteria (a) injury or harm to health and (d) harm to communities through misinformation. Google's manual deletion of these answers and related social media posts shows an attempt to manage the harm but also indicates the presence of an incident. The event involves the use and malfunction of the AI system leading to direct harm, thus qualifying as an AI Incident.
Glue on pizza, advice to eat stones, poisonous mushrooms... Google gets led astray by its large model again - cnBeta.COM

2024-05-25
cnBeta.COM
Why's our monitor labelling this an incident or hazard?
The AI system (Google's AI Overview) is explicitly involved as it generates answers to user queries. The event details multiple cases where the AI's outputs are factually incorrect and potentially harmful, such as recommending eating stones or suggesting harmful medical advice. These outputs have already been delivered to users, indicating realized harm (health and safety risks, misinformation). Google's response to disable and fix the feature confirms the AI's role in causing harm. Therefore, this qualifies as an AI Incident due to direct harm caused by the AI system's malfunction and use.
Google AI blunders | AI Overview teaches cooking spaghetti with gasoline; Google argues such cases are rare and says improvements have been made

2024-05-28
EJ Tech
Why's our monitor labelling this an incident or hazard?
The AI system (Google's AI Overview) is explicitly involved and has produced misleading and incorrect information, which can harm users by spreading misinformation and reducing trust in information sources. This constitutes harm to communities (a form of harm under the framework). Since the harm is occurring (misinformation dissemination), this qualifies as an AI Incident rather than a hazard or complementary information. The article also discusses responses and improvements, but the primary focus is on the AI system's erroneous outputs causing harm, meeting the criteria for an AI Incident.
Google AI search summaries riddled with errors: "it's safe to leave a dog in a hot car" among other absurd answers | udn科技玩家

2024-05-28
udn科技玩家
Why's our monitor labelling this an incident or hazard?
The AI system (Google's AI Overviews feature) is explicitly involved as it generates the misleading and factually incorrect summaries. The harms include misinformation that could lead to health risks (e.g., unsafe advice about dogs in hot cars or smoking during pregnancy), which constitutes indirect harm to health and communities. Since these harms are occurring through the AI's use and dissemination of false information, this qualifies as an AI Incident. The article does not describe potential future harm but actual realized harm through misinformation dissemination. The manual removal of errors is a response but does not negate the incident classification.
Doubts, sell-offs, and price wars: is AI facing a human "siege"? | 钛媒体AGI

2024-05-26
tmtpost.com
Why's our monitor labelling this an incident or hazard?
The article explicitly mentions AI systems such as ChatGPT and Google's AI Overview, both of which have produced erroneous outputs that mislead users, constituting harm to communities and individuals relying on this information. This meets the criteria for an AI Incident because the AI systems' use has directly led to harm through misinformation. The discussion of company mergers and price wars, while important, does not itself constitute harm or hazard but provides context. Hence, the classification is AI Incident due to the realized harms from AI-generated misinformation and errors.