Study Finds ChatGPT Inconsistent and Inaccurate for Medical Self-Diagnosis


The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

A research team led by Tokyo Medical and Dental University found that ChatGPT provides inconsistent and sometimes inaccurate diagnoses and recommendations for common orthopedic conditions. The study highlights potential risks if patients rely on the AI for self-diagnosis, as its outputs may lead to misdiagnosis or delayed medical care.[AI generated]

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT) used in a healthcare context for self-diagnosis. The study highlights inconsistent accuracy and recommendations, which could plausibly lead to harm such as misdiagnosis or delayed medical care. However, no actual injury, health harm, or rights violation has been reported as having occurred. Therefore, this situation represents a plausible risk of harm from the AI system's use, fitting the definition of an AI Hazard rather than an AI Incident. The article is focused on the potential risks and limitations of ChatGPT's diagnostic capabilities, not on a realized incident or harm.[AI generated]
AI principles
Safety; Robustness & digital security; Transparency & explainability; Accountability; Human wellbeing

Industries
Healthcare, drugs, and biotechnology

Affected stakeholders
Consumers

Harm types
Physical (injury)

Severity
AI hazard

AI system task
Interaction support/chatbots; Content generation; Reasoning with knowledge structures/planning


Articles about this incident or hazard


Quest for ChatGPT's Consistency in Healthcare Assessment

2023-10-19
Medindia
Why's our monitor labelling this an incident or hazard?
The event involves the use of an AI system (ChatGPT) in a healthcare context, with the study assessing its diagnostic precision and recommendation quality. However, the article reports on the evaluation results without indicating any actual harm or incidents caused by ChatGPT's use. There is no mention of injury, rights violations, or other harms resulting from ChatGPT's outputs. The focus is on research findings about the AI's current performance and the need for improvement, which constitutes complementary information about AI capabilities and limitations rather than an incident or hazard.

Can ChatGPT diagnose your condition? Not yet, say researchers

2023-10-16
Medical Xpress - Medical and Health News
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (ChatGPT) used in a healthcare context for self-diagnosis. The study highlights inconsistent accuracy and recommendations, which could plausibly lead to harm such as misdiagnosis or delayed medical care. However, no actual injury, health harm, or rights violation has been reported as having occurred. Therefore, this situation represents a plausible risk of harm from the AI system's use, fitting the definition of an AI Hazard rather than an AI Incident. The article is focused on the potential risks and limitations of ChatGPT's diagnostic capabilities, not on a realized incident or harm.

AI chatbot ChatGPT may not be accurate enough for self-diagnosis

2023-10-17
News-Medical.net
Why's our monitor labelling this an incident or hazard?
The event involves the use of an AI system (ChatGPT) in a healthcare context for self-diagnosis. The study identifies significant inconsistencies and inaccuracies in the AI's diagnostic outputs and recommendations, which could plausibly harm patients' health if they rely on the tool without proper medical consultation. Since the article discusses potential harm based on the AI's current limitations and does not report an actual incident of harm, this qualifies as an AI Hazard rather than an AI Incident. The article does not focus on responses, governance, or updates but on the potential risk posed by the AI system's current performance.

ChatGPT Unfit for Medical Diagnosis, Says Eureka

2023-10-16
Mirage News
Why's our monitor labelling this an incident or hazard?
The article involves an AI system (ChatGPT) used for medical diagnosis, which is a use case with potential for harm if inaccurate. The study finds inconsistent accuracy and recommendations, implying a plausible risk of harm (misdiagnosis leading to health harm). However, the article does not report any actual incidents of harm or injury caused by ChatGPT's use, only the potential for such harm. Therefore, this qualifies as an AI Hazard, as the AI system's use could plausibly lead to harm but no harm has yet been documented.

Can ChatGPT diagnose your condition? Not yet

2023-10-16
EurekAlert!
Why's our monitor labelling this an incident or hazard?
The event involves the use of an AI system (ChatGPT) in a healthcare context for self-diagnosis. The study highlights that the AI's inconsistent and sometimes inaccurate outputs could plausibly lead to harm (e.g., misdiagnosis, delayed medical consultation), which fits the definition of an AI Hazard. There is no indication that harm has already occurred, so it is not an AI Incident. The article is focused on the evaluation of the AI system's current limitations and the potential risks, rather than reporting on a realized harm or a governance response, so it is not Complementary Information. Therefore, the classification is AI Hazard.