FDA's Elsa AI Tool Fabricates Studies, Raising Drug Approval Safety Concerns

The FDA's generative AI tool, Elsa, designed to expedite drug approval processes, has been found to hallucinate—fabricating non-existent studies and misinterpreting real research. Employees report that Elsa's unreliable outputs require extensive human verification, raising concerns about potential risks to drug safety if its outputs are trusted without oversight.[AI generated]
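The verification burden described above is partly automatable. As an illustration only (this is not part of the FDA's workflow), the sketch below checks whether a cited title matches any record in PubMed via the public NCBI E-utilities API; citations returning zero hits can be routed to a human reviewer. The function name, example title, and zero-hit heuristic are assumptions made for the example.

```python
import requests

EUTILS_SEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_hits(cited_title: str) -> int:
    """Count PubMed records whose title matches a cited title.

    Zero hits do not prove fabrication (titles may be paraphrased or
    indexed elsewhere), but they flag the citation for manual review.
    """
    resp = requests.get(
        EUTILS_SEARCH,
        params={
            "db": "pubmed",
            "term": f'"{cited_title}"[Title]',
            "retmode": "json",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return int(resp.json()["esearchresult"]["count"])

if __name__ == "__main__":
    # Hypothetical citation a reviewer might want to check.
    title = "Effect of example-drug on systolic blood pressure in adults"
    print("PubMed matches:", pubmed_hits(title))
```

A zero count is a signal rather than proof of fabrication, since a legitimate title may be paraphrased by the model or indexed outside PubMed; the point is to concentrate human verification effort where it is most likely needed.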

Why's our monitor labelling this an incident or hazard?

Elsa is an AI system explicitly mentioned as being used in the FDA's drug approval workflow. Its malfunction—hallucinating nonexistent studies and misrepresenting research—directly affects the reliability of critical health regulatory work. This can lead to harm to public health if erroneous information influences drug approval decisions. Although the AI is currently used with human oversight and is optional, the presence of hallucinations and misrepresentations in a high-stakes health context meets the criteria for an AI Incident due to the direct link to potential harm to health. The article details realized issues with the AI's outputs, not just potential risks, so it is not merely a hazard or complementary information.[AI generated]
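Every rationale on this page applies the same triage rule: an event counts as an AI incident when harm linked to an AI system has already materialized, as an AI hazard when harm is plausible but not yet realized, and as complementary information otherwise. A minimal sketch of that rule, with hypothetical field and function names rather than the monitor's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Hypothetical event record; the field names are illustrative."""
    ai_system_involved: bool  # is an AI system explicitly implicated?
    harm_realized: bool       # has harm (e.g., to health) already occurred?
    harm_plausible: bool      # could the malfunction plausibly cause harm?

def classify(event: Event) -> str:
    """Simplified version of the triage rule used in the rationales."""
    if not event.ai_system_involved:
        return "Complementary information"
    if event.harm_realized:
        return "AI incident"
    if event.harm_plausible:
        return "AI hazard"
    return "Complementary information"

# As assessed by sources that treat the harm as already realized:
print(classify(Event(True, harm_realized=True, harm_plausible=True)))   # AI incident
# As assessed by sources that treat the harm as merely plausible:
print(classify(Event(True, harm_realized=False, harm_plausible=True)))  # AI hazard
```

The divergent labels across the articles below come down to the middle two branches: sources disagree on whether the harm from Elsa's hallucinations is already realized or still only plausible.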
AI principles
Accountability; Safety; Robustness & digital security; Transparency & explainability

Industries
Healthcare, drugs, and biotechnology; Government, security, and defence

Affected stakeholders
Consumers

Harm types
Physical (injury); Physical (death)

Severity
AI incident

Business function
Compliance and justice

AI system task
Content generation


Articles about this incident or hazard

FDA's artificial intelligence is supposed to revolutionize drug approvals. It's making up nonexistent studies.

2025-07-23
CNN International

FDA's artificial intelligence is supposed to revolutionize drug approvals. It's making up nonexistent studies. | CNN Politics

2025-07-23
CNN
Why's our monitor labelling this an incident or hazard?
The AI system Elsa is explicitly mentioned and is used in the FDA's drug approval process. The article details its malfunction in hallucinating studies and misrepresenting research, which undermines its reliability. While the AI is currently used mainly for organizational tasks and not for final review decisions, the potential for it to influence critical regulatory decisions exists. This creates a credible risk of harm to public health if erroneous AI outputs are relied upon. Since no actual harm or violation has been reported yet, but plausible future harm is evident, the event fits the definition of an AI Hazard rather than an AI Incident. The article also discusses governance and oversight gaps, reinforcing the potential risk context.

FDA employees say the agency's Elsa generative AI hallucinates entire studies

2025-07-23
engadget
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system used to aid clinical review processes at the FDA, thus its outputs influence decisions impacting patient health. The hallucination of nonexistent studies or misrepresentation of real research constitutes a malfunction or misuse of the AI system. This unreliability can directly or indirectly lead to harm to patients if decisions are made based on false information. The article describes realized issues with the AI's outputs, not just potential risks, and the harm is related to health outcomes, fitting the definition of an AI Incident.

The FDA's new drug-approving AI chatbot is not helping

2025-07-24
Mashable ME
Why's our monitor labelling this an incident or hazard?
The AI system (Elsa) is explicitly mentioned and is involved in the FDA's drug approval process. The reported hallucinations and fabrications represent a malfunction or misuse of the AI system. However, the AI tool is not currently used in actual clinical review or decision-making, and no harm to health, rights, or property has been reported. The event thus represents a plausible risk that the AI system could lead to harm if used improperly or prematurely in critical decisions, but no incident has occurred yet. Therefore, this qualifies as an AI Hazard, as the AI system's malfunction could plausibly lead to harm in the future if integrated into drug approval decisions without adequate safeguards.

The FDA's new drug-approving AI chatbot is not helping

2025-07-23
Mashable SEA
Why's our monitor labelling this an incident or hazard?
The AI system (Elsa) is explicitly mentioned and is involved in the FDA's processes. However, its use is limited to non-critical organizational tasks, and it is not currently used in drug approval decisions. The hallucinations and fabrications are recognized problems but have not led to any realized harm or disruption. The article focuses on the AI's malfunction risks and the need for further development and oversight, which aligns with an AI Hazard scenario. However, since no harm or disruption has occurred or is imminent, and the AI is not yet deployed in a way that could plausibly lead to harm, the event is best classified as Complementary Information, providing context on AI deployment challenges and governance concerns in a critical sector.

FDA's New Drug Approval AI Is Generating Fake Studies: Report

2025-07-23
Gizmodo
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system explicitly mentioned as being used by FDA employees for tasks related to drug approval and research summarization. Its hallucination of fake studies and misrepresentation of research has directly introduced misinformation into a critical public health agency, posing a risk of harm to public health and safety. The harm is treated as realized because fabricated studies have already been generated and may have informed decision-making, which fits the definition of injury or harm to health (category a). Since the event involves the AI system's use and malfunction (hallucination) leading to direct harm, it qualifies as an AI Incident rather than a hazard or complementary information.

FDA's Elsa AI Hallucinates Studies, Says Employees: Double-Checking Facts Is a Must

2025-07-24
Tech Times
Why's our monitor labelling this an incident or hazard?
Elsa AI is a generative AI system used by the FDA to assist in drug approval processes. The reported hallucinations and misinterpretations indicate a malfunction or failure in the AI's outputs. While the article does not confirm actual harm (e.g., incorrect drug approvals causing injury), the potential for such harm is credible given the critical nature of drug approval. Therefore, this situation fits the definition of an AI Hazard, as the AI's malfunction could plausibly lead to harm to people's health if relied upon without sufficient human oversight.

FDA's AI tool used for drug approvals generates fake studies

2025-07-24
NewsBytes
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system used by the FDA for tasks related to drug approvals. The reported hallucination—fabrication of non-existent studies—constitutes a malfunction or misuse of the AI system's outputs. Since drug approval decisions rely on accurate scientific evidence, the use of fabricated studies could lead to injury or harm to people if unsafe or ineffective drugs are approved based on false information. Therefore, this event involves an AI system whose malfunction has directly or indirectly led to potential harm, qualifying it as an AI Incident.

Can AI Accelerate Clinical Review at FDA?

2025-07-23
pharmexec.com
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system (a large language model) used by the FDA to expedite clinical review processes. The article highlights that Elsa has hallucinated, producing false citations and inaccurate summaries, which could undermine the safety and efficacy assessments critical to drug approval. Although the AI is currently used optionally and with human oversight, the risk of erroneous outputs influencing regulatory decisions is credible. No actual harm or injury has been reported yet, but the potential for such harm exists if the AI's hallucinations are not properly managed. Thus, this situation fits the definition of an AI Hazard, where the AI system's malfunction could plausibly lead to an AI Incident in the future.

FDA's 'Elsa' AI For Faster Drug Approvals Under Fire for Hallucinating Studies, Highlighting Widespread Reliability Risks - WinBuzzer

2025-07-24
WinBuzzer
Why's our monitor labelling this an incident or hazard?
The event involves an AI system ('Elsa') used in a high-stakes government role for drug approvals, where its malfunction (hallucinating studies and providing false information) has directly led to increased human oversight and potential risks to public health. The AI's unreliable outputs in critical decision-making processes constitute harm to the health of people (harm category a). Therefore, this qualifies as an AI Incident due to the realized harm and direct involvement of the AI system's malfunction in a critical health-related regulatory context.

FDA's Elsa AI Tool Hallucinates Studies, Sparks Concern Over Lack Of Oversight In High-Stakes Healthcare Decisions Where Accuracy Directly Impacts Public Safety And Trust

2025-07-24
Wccftech
Why's our monitor labelling this an incident or hazard?
The article explicitly mentions an AI system (Elsa) used in drug and device approval workflows, which is a high-stakes environment where accuracy directly impacts public safety. The AI system is reported to hallucinate and fabricate research, which is a malfunction. While no direct harm is reported, the potential for serious harm to health and trust is clear if such outputs influence regulatory decisions. The lack of mandatory use and insufficient oversight increases the risk. Since harm is plausible but not yet realized, this fits the definition of an AI Hazard rather than an AI Incident.

The FDA Is Using an AI to "Speed Up" Drug Approvals and Insiders Say It's Making Horrible Mistakes

2025-07-24
Futurism
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system used by the FDA to assist in drug approval processes. Insiders report that Elsa hallucinates nonexistent studies, which could lead to the approval of harmful drugs and thus poses a direct risk of injury or harm to people's health. The AI's malfunction and flawed outputs are central to the risk described. Therefore, this qualifies as an AI Incident due to the direct link between the AI system's use and potential harm to human health.

FDA AI Tool Elsa Hallucinates Fake Data in Drug Reviews

2025-07-24
WebProNews
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system explicitly described as a generative AI tool used in drug reviews and inspections. The article details its malfunction in hallucinating false data, which could plausibly lead to serious harm in the FDA's regulatory decisions affecting public health. While no actual harm has been reported yet, the risk is credible and significant given the FDA's role. The article focuses on the AI's problematic behavior and the potential consequences rather than reporting a realized harm event. Thus, it fits the definition of an AI Hazard rather than an AI Incident or Complementary Information.

FDA is using an AI system that staff say frequently invents or misrepresents drug research

2025-07-24
THE DECODER
Why's our monitor labelling this an incident or hazard?
Elsa is a generative AI system actively used in evaluating new drugs, and it is reported to hallucinate or invent studies, which can lead to incorrect assessments of drug safety and efficacy. This directly implicates the AI system's use in causing or risking harm to people's health, fulfilling the criteria for an AI Incident. The harm is not just potential but ongoing, as the system is already in use despite known reliability issues, thus posing a direct risk to public health through regulatory decisions based on flawed AI outputs.

The FDA's AI Is Busy Approving Drugs -- and Hallucinating Fake Studies

2025-07-25
VICE
Why's our monitor labelling this an incident or hazard?
The FDA's AI system Elsa is explicitly mentioned and is used in the drug approval process, a critical health-related function. The AI system is reported to hallucinate and fabricate information confidently, which has been incorporated into drug approval reports, directly risking harm to people's health. This constitutes a malfunction and misuse of the AI system leading to potential or actual harm. The involvement of AI in producing false data that could lead to unsafe drug approvals is a direct link to harm under the definition of AI Incident (harm to health). Therefore, this event qualifies as an AI Incident.

Oops: FDA's revolutionary AI tool is inventing non-existent data

2025-07-24
Cybernews
Why's our monitor labelling this an incident or hazard?
Elsa is a generative AI tool used by the FDA. It is reported to hallucinate, producing fabricated medical studies and incorrect data, which is a malfunction of the AI system. Although it is not currently used in clinical review protocols, the generation of false data related to drug approvals and medical research has already caused inefficiencies and could lead to harm if relied upon. The event describes realized harm in the form of misinformation and operational disruption within a critical health regulatory agency, fitting the definition of an AI Incident (harm to health and disruption of critical infrastructure): the hallucinations led to the tool being sidelined and to increased workload, confirming the AI system's role in causing harm.

United States: the AI used by the health department relies on scientific studies that do not exist

2025-07-25
BFMTV
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system used by the FDA to assist in evaluating clinical data and accelerating drug approvals. The AI's hallucinations—fabricating studies and distorting research—constitute a malfunction that directly impairs the agency's management and operation, a form of harm under category (b) (disruption of critical infrastructure management). The harm is indirect but real, as it slows down the agency's work and risks misinformation in a critical public health context. Therefore, this qualifies as an AI Incident rather than a mere hazard or complementary information, since harm is already occurring due to the AI's outputs.

This AI fabricates bogus studies... and the health agency...

2025-07-25
Futura
Why's our monitor labelling this an incident or hazard?
An AI system (Elsa) is explicitly mentioned as being used in the FDA's drug approval process. The AI's hallucinations and fabrications of scientific studies directly undermine the reliability of safety assessments, which could lead to the approval of unsafe drugs, thus posing a direct risk of harm to people's health. This constitutes an AI Incident because the AI system's malfunction and use have directly led to a significant potential harm to public health and safety.

This chatbot was supposed to modernize the US administration, but all it does is hallucinate

2025-07-24
Le Huffington Post
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system deployed for use within the FDA. The article reports that Elsa's hallucinations and inaccurate outputs have directly impaired employees' ability to rely on it for accurate information, causing operational inefficiencies and misinformation. This constitutes a malfunction of the AI system leading to disruption in the management and operation of a critical infrastructure agency (the FDA). Therefore, this event qualifies as an AI Incident due to the realized harm of operational disruption and misinformation within a critical public health institution.

Drugs approved on the basis of fictitious studies? The FDA's AI under fire

2025-07-28
Trust My Science
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system used in drug approval processes, a critical area affecting public health. The reported hallucinations—fabricating or distorting scientific studies—constitute a malfunction of the AI system. This malfunction could lead to the approval of unsafe drugs, causing injury or harm to people. The article indicates that this risk is ongoing and significant, with multiple insiders raising concerns. Hence, the event meets the criteria for an AI Incident due to the direct or indirect link between the AI system's malfunction and potential harm to health.

USA: the AI that was supposed to revolutionize drug approvals is hallucinating studies - Next

2025-07-25
Next
Why's our monitor labelling this an incident or hazard?
Elsa is an AI system (a generative AI based on Claude) used by the FDA for drug approval-related tasks. The AI's hallucinations (fabricating or distorting scientific studies) represent a malfunction in its use. Although no direct harm has yet occurred, the potential for erroneous drug approvals or regulatory mistakes due to reliance on inaccurate AI outputs is a credible risk. The article does not report actual harm or regulatory failures but highlights the AI's unreliability and the need for caution. Hence, this is an AI Hazard rather than an AI Incident or Complementary Information.