Purdue Study: ChatGPT Delivers Incorrect Programming Answers 52% of the Time

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

A Purdue University study of ChatGPT’s answers to 517 Stack Overflow programming questions found that the answers were incorrect 52% of the time, overly verbose 77% of the time, and misinterpreted the question's context 54% of the time. Despite this, about one-third of developers still trusted its confident responses, risking flawed code and wasted development effort. [AI generated]

Why's our monitor labelling this an incident or hazard?

ChatGPT is an AI system providing programming answers. The study shows that over half of its answers contain inaccuracies, and users often overlook these errors, which can indirectly lead to harm in software development or learning. This constitutes an AI Incident because the AI system's use has directly or indirectly led to harm through misinformation and its consequences. The harm is non-physical but significant, affecting users' work and potentially broader software quality. [AI generated]

AI principles

Robustness & digital security, Transparency & explainability, Safety, Human wellbeing, Accountability, Democracy & human autonomy

Industries

IT infrastructure and hosting

Affected stakeholders

Consumers

Harm types

Economic/Property

Business function

Research and development

AI system task

Content generation, Interaction support/chatbots

Articles about this incident or hazard

How often does ChatGPT answer programming questions incorrectly?

ChatGPT Answers Programming Questions Incorrectly 52% of the Time: Study

Study Finds That 52 Percent of ChatGPT Answers to Programming Questions Are Wrong

New study claims ChatGPT offers wrong programming answers 52 percent of the time

Scientists find ChatGPT is inaccurate when answering computer programming questions

ChatGPT For Coding? AI Algorithm Gets 52% of Programming Answers Wrong, Study Finds

ChatGPT Gets 52% Of Programming Answers Wrong, Study Finds
