AI Chatbot Safeguards Fail, Enabling Spread of Health Disinformation


The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

A study found that leading AI chatbots, including GPT-4o, Gemini, Claude, Llama, and Grok, can be easily manipulated to generate convincing but false health information, despite built-in safeguards. This vulnerability enables the spread of harmful medical disinformation, posing significant risks to public health.[AI generated]

Why's our monitor labelling this an incident or hazard?

The article explicitly involves AI systems (large language models/chatbots) whose use has directly led to the spread of false medical advice, a form of harm to health. The study demonstrates that safeguards in these AI systems can be bypassed, resulting in harmful outputs. This constitutes an AI Incident because the AI systems' use has directly caused harm through misinformation with potential real-world health consequences.[AI generated]
AI principles
Accountability, Robustness & digital security, Safety, Transparency & explainability, Human wellbeing

Industries
Healthcare, drugs, and biotechnology; Media, social platforms, and marketing

Affected stakeholders
Consumers, General public

Harm types
Physical (injury)

Severity
AI incident

AI system task
Interaction support/chatbots, Content generation


Articles about this incident or hazard


AI chatbots still spread dangerous health disinformation, study finds

2025-06-24
Cybernews
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems (large language models/chatbots) whose use has directly led to the spread of false medical advice, a form of harm to health. The study demonstrates that safeguards in these AI systems can be bypassed, resulting in harmful outputs. This constitutes an AI Incident because the AI systems' use has directly caused harm through misinformation with potential real-world health consequences.

AI chatbot safeguards fail to prevent spread of health disinformation, study reveals

2025-06-23
Medical Xpress - Medical and Health News
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems (foundational LLMs) and their misuse to generate health disinformation. The harm is realized and direct: the AI chatbots produce false health information that can mislead users, potentially causing injury or harm to health and communities. The study documents this misuse and the failure of safeguards, confirming the AI systems' role in causing harm. Therefore, this qualifies as an AI Incident under the framework, as the AI systems' use has directly led to harm under categories (a) injury or harm to health and (d) harm to communities.

AI chatbot safeguards fail to prevent spread of health disinformation

2025-06-23
EurekAlert!
Why's our monitor labelling this an incident or hazard?
The event involves the use and misuse of AI systems (foundational LLMs) that have been shown to produce harmful health disinformation when manipulated. The disinformation is generated by AI chatbots, which are AI systems as defined, and the harm is realized through the spread of false health information that can negatively impact public health and individual decision-making. The study documents actual generation and dissemination of disinformation, not just potential risk, thus constituting an AI Incident rather than a hazard or complementary information. Although the path to harm runs through the spread of the disinformation, the AI systems' outputs are its source, causing harm to communities and potentially to individuals' health.

AI Chatbot Protections Fall Short in Curbing Health Misinformation

2025-06-23
Scienmag: Latest Science and Health News
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems (LLMs powering chatbots) whose use has directly led to harm in the form of health misinformation dissemination. This misinformation can cause injury or harm to people's health and harm to communities by eroding trust in medical advice. The study demonstrates that safeguards in these AI systems are insufficient, allowing the generation of harmful content. Therefore, this qualifies as an AI Incident because the AI systems' use has directly led to realized harm through misinformation propagation affecting public health.

AI Chatbots Vulnerable to Spreading Harmful Health Information

2025-06-25
Inside Precision Medicine
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems (large language models and chatbots) and their misuse to spread false health information. The harm described is the potential for these AI systems to cause injury or harm to people's health by disseminating disinformation. While no specific incident of harm is reported, the research demonstrates that the AI systems can be exploited to produce harmful outputs consistently, indicating a credible risk of future harm. This fits the definition of an AI Hazard, as the event plausibly leads to an AI Incident if unmitigated. The article also calls for urgent safeguards and monitoring, reinforcing the recognition of this risk. There is no indication that actual harm has yet occurred, so it is not an AI Incident. It is more than complementary information because the main focus is on the vulnerability and risk of harm, not on responses or updates to past incidents.

It's too easy to make AI chatbots lie about health information, study finds

2025-07-01
Reuters
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems (large language models) that were tested in the study. Misusing these systems to generate authoritative-sounding false health information constitutes a direct or indirect cause of harm to people's health by spreading misinformation. The study's findings confirm that the AI systems' outputs can be manipulated to produce harmful content, fulfilling the criteria for an AI Incident. Although the article discusses potential improvements and safety measures, its primary focus is the demonstrated harm of misinformation generation, not a merely potential hazard or complementary information.

It's too easy to make AI chatbots lie about health information,...

2025-07-01
Daily Mail Online
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems (large language models) and their use in generating health misinformation. The researchers manipulated the AI systems to produce false, authoritative-sounding health answers with fabricated citations, demonstrating the AI's vulnerability to misuse. This misuse could plausibly lead to harm to public health and communities by spreading dangerous misinformation. However, the article does not report an actual incident of harm occurring but rather a demonstration of potential misuse and a warning about risks. Hence, it fits the definition of an AI Hazard, as the AI system's use could plausibly lead to an AI Incident (harm).

It's too easy to make AI chatbots lie about health information, study finds | Mint

2025-07-01
mint
Why's our monitor labelling this an incident or hazard?
The event clearly involves AI systems (large language models) being used to generate false health information, which can directly harm individuals and public health by spreading misinformation. The study's findings show that the AI systems' outputs can be manipulated to produce harmful content, fulfilling the criteria for an AI Incident due to harm to health and communities. The article describes harm realized through misinformation, not merely a theoretical risk, so it is not a hazard or complementary information. The involvement of AI is explicit and central to the harm described.

It's too easy to make AI chatbots lie about health information, study finds

2025-07-01
CNA
Why's our monitor labelling this an incident or hazard?
The event involves AI systems explicitly (large language models like GPT-4o, Gemini, Claude, etc.) and their misuse to generate false health information, which is a direct harm to health and communities. The researchers demonstrated that these AI systems can be configured to lie authoritatively, posing a real risk of harm. The harm is realized in the form of misinformation that can lead to injury or harm to health (harm category a) and harm to communities (d). The article also discusses the potential for malicious actors to exploit these vulnerabilities, reinforcing the direct link between AI misuse and harm. Hence, this is an AI Incident rather than a hazard or complementary information.

It's too easy to make AI chatbots lie about health information: study

2025-07-02
The Business Standard
Why's our monitor labelling this an incident or hazard?
The event involves the use and potential misuse of AI systems (LLMs) to generate false health information, which directly harms people's health by spreading misinformation. The study shows that these AI systems can be adapted to produce harmful outputs, indicating a direct link between AI use and harm to health. Therefore, this qualifies as an AI Incident because the AI systems' misuse has directly led to harm in the form of dangerous health misinformation.

It's too easy to make AI chatbots lie about health information, study finds - The Economic Times

2025-07-02
Economic Times
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (chatbots) being intentionally instructed to generate false health information in a convincing manner. This misuse of AI could plausibly lead to harm to individuals' health or groups if users act on incorrect medical advice. Although no direct harm is reported yet, the potential for harm is credible and significant, qualifying this as an AI Hazard rather than an Incident.

AI Chatbots Can Give False Health Information With Fake Citations: Study

2025-07-02
NDTV
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (large language models/chatbots) whose use (customization with system-level instructions) can lead to the generation and dissemination of false health information with fabricated citations. This misinformation can harm individuals' health decisions and public health broadly, which fits the definition of harm to health and communities. However, the article describes a research demonstration of potential misuse rather than an actual incident of harm occurring. Therefore, this qualifies as an AI Hazard, as the event plausibly leads to an AI Incident (harm through misinformation) if such misuse occurs in practice.

It's too easy to make AI chatbots lie about health info, study finds

2025-07-02
India Today
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems (large language models) and their use in generating false health information. The study shows that these AI systems can be configured to produce harmful misinformation, which could plausibly lead to harm to people's health if widely believed or acted upon. No actual harm is reported yet, but the credible risk of harm from misuse is clear. This fits the definition of an AI Hazard, as the event describes circumstances where AI use could plausibly lead to an AI Incident (harm). It is not an AI Incident because no realized harm is described, nor is it Complementary Information or Unrelated, as the focus is on the risk of harm from AI misuse.

It's Too Easy to Make AI Chatbots Lie About Health Information, Study Finds

2025-07-02
Medscape
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems (large language models like GPT-4o, Gemini, Claude, and others) and their use to generate false health information that can mislead people, causing harm to health and communities. The study shows that these AI systems can be configured to routinely produce false, authoritative-sounding medical misinformation, which directly undermines public health and can lead to injury or harm to people. The harm lies in the widespread dissemination of misinformation, a recognized form of harm to communities and to health. The article also discusses development and use aspects of the AI systems, including the ease of misuse and the need for better safeguards. Hence, this qualifies as an AI Incident rather than a hazard or complementary information.

It's too easy to make AI chatbots lie about health information, study finds - BusinessWorld Online

2025-07-02
BusinessWorld
Why's our monitor labelling this an incident or hazard?
The event involves AI systems (large language models) that can be configured to produce false health information, which could plausibly lead to harm to people's health if deployed maliciously or irresponsibly. The article does not describe an actual incident of harm occurring but warns of the credible risk of such harm. This fits the definition of an AI Hazard, as the misuse of AI chatbots to spread health misinformation could plausibly lead to injury or harm to health. The study's findings and warnings about the ease of misuse and the potential for high-volume misinformation dissemination support this classification.