ChatGPT Health Delivers Inaccurate and Alarmist Health Assessments from Apple Health Data

OpenAI's ChatGPT Health, designed to analyze personal health data from apps like Apple Health, has produced inconsistent and alarmist health assessments, as revealed by The Washington Post's testing. Medical experts found these AI-generated diagnoses to be unfounded, raising concerns about user confusion, anxiety, and potential health risks from misleading outputs.[AI generated]

Why's our monitor labelling this an incident or hazard?

The article explicitly involves an AI system (ChatGPT Health) processing personal health data to generate health assessments. The system's malfunction, providing inaccurate and inconsistent diagnoses, directly risks harm to users' health by potentially misleading them about their medical condition. The harm relates to injury or harm to health (a), as users might rely on incorrect AI-generated diagnoses. The event is not merely a potential risk: it documents actual erroneous outputs and misleading conclusions, fulfilling the criteria for an AI Incident rather than an AI Hazard or Complementary Information. The lack of regulation and the possibility of user reliance on these faulty outputs further support the classification as an AI Incident.[AI generated]
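
In effect, the monitor's labelling across this page reduces to one decision rule: if an AI system is explicitly involved and harm has already occurred or is occurring, the event is an AI Incident; if harm is only plausible, it is an AI Hazard; otherwise it is Complementary Information. The sketch below is a minimal, hypothetical illustration of that rule; the Event fields and classify function are assumptions made for clarity, not the monitor's actual implementation.

```python
# Hypothetical sketch of the incident/hazard triage rule described above.
# Field names and the classify() helper are illustrative assumptions,
# not the monitor's real code.
from dataclasses import dataclass


@dataclass
class Event:
    involves_ai_system: bool  # an AI system is explicitly part of the event
    harm_realized: bool       # harm (e.g. to health) has occurred or is occurring
    harm_plausible: bool      # the AI's use could plausibly lead to future harm


def classify(event: Event) -> str:
    """Return the label implied by the rationales on this page."""
    if not event.involves_ai_system:
        return "Complementary information"
    if event.harm_realized:
        return "AI incident"   # e.g. users already misled or alarmed
    if event.harm_plausible:
        return "AI hazard"     # e.g. unreliable outputs not yet acted on
    return "Complementary information"


# The Washington Post tests documented erroneous, alarmist outputs:
print(classify(Event(True, True, True)))   # -> AI incident
# Coverage flagging unreliability with no realized harm described:
print(classify(Event(True, False, True)))  # -> AI hazard
```

This is also why the iPadizate and 20 minutos write-ups below are labelled AI Hazard (harm plausible but not yet realized), while the remaining articles are labelled AI Incident.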
AI principles
Safety
Transparency & explainability

Industries
Healthcare, drugs, and biotechnology

Affected stakeholders
Consumers

Harm types
Psychological
Physical (injury)

Severity
AI incident

Business function
Citizen/customer service

AI system task
Forecasting/prediction


Articles about this incident or hazard

Thumbnail Image

Beware of AI medical diagnoses: failures reported in ChatGPT Salud with Apple Watch data

2026-01-28
infobae

Before giving your health data to ChatGPT, you might want to think twice

2026-01-27
La Razón
Why's our monitor labelling this an incident or hazard?
The event explicitly involves AI systems (ChatGPT Health and Claude for Healthcare) interpreting health data. The systems' outputs were inconsistent and alarmist, contradicting medical expert opinions, and can harm users' health and wellbeing by misleading them. This harm is indirect but real, as erroneous interpretations can lead to confusion, anxiety, or poor health decisions. The article also highlights privacy concerns, but the primary harm relates to inaccurate health assessments. Hence, the event meets the criteria for an AI Incident because the AI systems' use directly led to harm (confusion and potential health risks).
Thumbnail Image

ChatGPT's new integration with Apple 'Health' promised to be the best answer to AI medical queries. It has just been thoroughly debunked

2026-01-28
Applesfera
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (ChatGPT Salud) that processes personal health data to provide health assessments. The AI's erroneous and inconsistent outputs could mislead users about their health risks, constituting indirect harm to health by potentially prompting harmful decisions or causing anxiety. The article explicitly mentions the AI's misdiagnoses and the risk of users placing too much trust in AI-generated medical advice, which aligns with harm to health (a). Although no direct injury is reported, the misleading diagnoses and inconsistent results represent a realized harm scenario. Hence, the event meets the criteria for an AI Incident rather than a hazard or complementary information.
Thumbnail Image

ChatGPT Health is causing unnecessary panic

2026-01-27
Digital Trends Español
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems designed to analyze health data and provide personalized health insights. The AI systems' malfunction or limitations in accurately interpreting health metrics have led to inconsistent and potentially misleading health assessments. This can cause psychological harm (panic or false reassurance), which qualifies as harm to health under the AI Incident definition. Since the harm is occurring or has occurred (users being unnecessarily scared or misled), this is an AI Incident rather than a hazard or complementary information.
Thumbnail Image

Giving your health data to ChatGPT might not be such a good idea

2026-01-27
iPadizate
Why's our monitor labelling this an incident or hazard?
The article explicitly mentions AI systems analyzing health data and producing inconsistent, unreliable results, indicating AI system involvement. However, the harm is not realized; medical professionals confirmed the AI assessments were incorrect, and no injury or health harm occurred. The article warns about potential privacy risks and the unreliability of AI health diagnostics, which could plausibly lead to harm if relied upon improperly, but no such harm is described as having occurred. Thus, this fits the definition of an AI Hazard, as the AI's use could plausibly lead to harm in the future, but no incident has yet materialized.
Thumbnail Image

Should you trust ChatGPT Salud? A report reveals that its assessments are unreliable

2026-01-29
20 minutos
Why's our monitor labelling this an incident or hazard?
ChatGPT Salud is an AI system designed to interpret health data and provide contextual information. The article highlights that its evaluations are unreliable and could mislead users about their health status. Although no actual injury or harm has been reported, the AI's use could plausibly lead to harm if users act on incorrect information, such as unnecessary anxiety or neglect of proper medical advice. Therefore, this situation fits the definition of an AI Hazard, as the AI system's use could plausibly lead to harm, but no confirmed harm has yet occurred.