ChatGPT Search Vulnerable to Manipulation via Hidden Text

Thumbnail Image

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

Research reveals that ChatGPT Search can be manipulated through hidden text on websites, leading to biased or misleading outputs. This vulnerability, known as 'prompt injection,' allows for the generation of false positive reviews and malicious code. OpenAI is aware and working on security improvements, but the issue raises concerns about misinformation risks.[AI generated]

Why's our monitor labelling this an incident or hazard?

An AI system (ChatGPT Search) was exploited through hidden text to produce harmful outputs, including code that led to a financial scam. The harm has materialized, making this an AI Incident rather than a potential hazard or complementary update.[AI generated]

AI principles

Robustness & digital securitySafetyTransparency & explainabilityAccountabilityFairnessDemocracy & human autonomy

Industries

Media, social platforms, and marketingDigital securityIT infrastructure and hostingConsumer services

Affected stakeholders

ConsumersGeneral public

Harm types

ReputationalEconomic/PropertyPublic interest

Severity

AI incident

AI system task:

Content generationInteraction support/chatbots

Articles about this incident or hazard

Thumbnail Image

ChatGPT search tool vulnerable to manipulation and deception, tests show

2024-12-24

Yahoo! Finance

Why's our monitor labelling this an incident or hazard?

This article describes a security vulnerability in an AI system that could plausibly lead to harmful outcomes—misleading recommendations, malware delivery, or other attacks. No actual harm has yet occurred, but the demonstrated weaknesses create credible future risks. Therefore, it constitutes an AI Hazard.

Thumbnail Image

ChatGPT search tool vulnerable to manipulation and deception, tests show

2024-12-24

The Guardian

Why's our monitor labelling this an incident or hazard?

The report examines tests revealing that hidden webpage content can manipulate ChatGPT’s search summaries or inject malicious code. These findings highlight a credible risk of future harms (deception, malware distribution, credential theft) rather than documenting a single, fully realized AI-driven incident. This fits the definition of an AI Hazard.

Thumbnail Image

ChatGPT Search can be tricked into misleading users, new research reveals | TechCrunch

2024-12-26

TechCrunch

Why's our monitor labelling this an incident or hazard?

The report reveals a concrete method for exploiting an AI system already in production, posing a credible risk of misinformation or malicious code injection. Since no incident of realized harm is described but the hazard is clearly actionable, this qualifies as an AI Hazard.

Thumbnail Image

OpenAI's ChatGPT Search Can Be Manipulated With Prompt Injection & Hidden Text To Produce Favourable Results?

2024-12-28

english

Why's our monitor labelling this an incident or hazard?

The report details how hidden HTML/CSS-based text and injected prompts can alter ChatGPT Search’s behavior to present favorable but deceptive content. This is a risk that could plausibly lead to user deception (harm to users’ decision-making) or distribution of malicious code. No actual incident of harm has occurred yet, but the described manipulation is a credible future threat, fitting the definition of an AI Hazard.

Thumbnail Image

ChatGPT Search Can Be Manipulated To Spread False Information, Warns Study

2024-12-27

TimesNow

Why's our monitor labelling this an incident or hazard?

No actual harm or incident is reported; rather, the study reveals a vulnerability in ChatGPT Search that could plausibly lead to misinformation, harmful code generation, and related harms. This potential risk classifies the event as an AI Hazard.

Thumbnail Image

ChatGPT Search Vulnerable To Manipulation; Can Be Used To Spread Malicious Code - Lowyat.NET

2024-12-27

Lowyat.NET

Why's our monitor labelling this an incident or hazard?

An AI system (ChatGPT Search) was exploited through hidden text to produce harmful outputs, including code that led to a financial scam. The harm has materialized, making this an AI Incident rather than a potential hazard or complementary update.

Thumbnail Image

ChatGPT search tool vulnerable to manipulation and deception, tests show

2024-12-24

AOL.com

Why's our monitor labelling this an incident or hazard?

This article primarily describes how vulnerabilities in ChatGPT’s search tool could be exploited—via hidden text or prompt injection—to deceive users or deliver malicious code. While it references a past code-theft incident, the focus is on plausible future harms if these weaknesses are not fixed. Therefore, it is best classified as an AI Hazard rather than a realized AI Incident.

Thumbnail Image

ChatGPT Search : de graves failles de s├⌐curit├⌐ ont ├⌐t├⌐ d├⌐voil├⌐es par une enqu├¬te

2024-12-25

Les Num├⌐riques

Why's our monitor labelling this an incident or hazard?

The report uncovers and analyzes serious security flaws in an AI system currently under test, highlighting how adversaries could manipulate it to produce harmful or misleading outputs. No widespread incident stemming from these specific vulnerabilities is described; instead, the article warns that they could plausibly lead to real harm. This aligns with the definition of an AI Hazard.

Thumbnail Image

Is ChatGPT Search safe? OpenAI's new tool under fire for this reason

2024-12-27

The Financial Express

Why's our monitor labelling this an incident or hazard?

No actual user harm or incident has yet been documented; rather, the piece highlights a concrete security weakness that could plausibly be exploited to produce false reviews or distribute harmful code in the future. This matches the definition of an AI Hazard, where misuse or manipulation of an AI system could lead to harm.

Thumbnail Image

Hidden content tricks ChatGPT into rewriting search results, Guardian shows

2024-12-25

Mashable

Why's our monitor labelling this an incident or hazard?

An AI system (ChatGPT Search) is involved, and the issue arises from its use and susceptibility to prompt injection attacks. Although no direct harm has occurred yet, the described vulnerability could plausibly lead to significant harm such as misinformation or manipulation of public opinion, which affects communities and potentially violates rights to accurate information. Therefore, this event qualifies as an AI Hazard rather than an Incident, as the harm is potential and not realized.

Thumbnail Image

ChatGPT search tool vulnerable to manipulation and deception, tests show | ChatGPT - News Directory 3

2024-12-24

News Directory 3

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT with its search function) whose use has directly led to harm in the form of misinformation and potential user deception. The manipulation of ChatGPT's outputs via prompt injection and hidden content causes biased and misleading information dissemination, which harms users' ability to make informed decisions and undermines trust. This fits the definition of an AI Incident because the AI system's use has directly led to harm to communities (misinformation) and potentially to individuals (e.g., financial or security risks). The article describes realized harm, not just potential, and thus it is not merely a hazard or complementary information.

Thumbnail Image

ChatGPT Search Tool Vulnerable to Manipulation - News Directory 3

2024-12-26

News Directory 3

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT search tool) whose use and malfunction (susceptibility to hidden text manipulation) have directly led to harm, including misinformation and security risks exemplified by a cryptocurrency scam. The harm affects users' trust and safety, fitting the definition of an AI Incident. The article details realized harm rather than potential harm, and the AI system's role is pivotal in causing these harms. Hence, the classification as AI Incident is appropriate.

Thumbnail Image

Guardian - Το ChatGPT είναι ευάλωτο σε χειραγώγηση και εξαπάτηση, δείχνουν δοκιμές

2024-12-24

Liberal.gr

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT with search capabilities) whose use has directly led to harm: users receiving manipulated or malicious outputs, including a concrete case of financial loss due to malicious code generated by the AI. The article details how the AI system can be manipulated via hidden content (prompt injection), which is a malfunction or misuse leading to harm. The harms include deception of users and financial damage, fitting the definition of an AI Incident. The article does not merely warn of potential harm but documents actual harm and vulnerabilities exploited in practice.

Thumbnail Image

ChatGPT: "Ευάλωτο σε χειραγώγηση" το εργαλείο αναζήτησης | Η ΚΑΘΗΜΕΡΙΝΗ

2024-12-24

H Kαθημερινή

Why's our monitor labelling this an incident or hazard?

The event involves the use of an AI system (ChatGPT with search capabilities) whose outputs can be influenced by hidden content on websites, leading to misleading or harmful responses. The article provides evidence of actual harm, such as financial loss from malicious code generated by the AI and the risk of users being deceived by manipulated AI outputs. The AI system's development and use are central to these harms, fulfilling the criteria for an AI Incident as the AI's malfunction or exploitation has directly led to harm to individuals (financial loss) and harm to communities (misinformation).

Thumbnail Image

Πόσο εύκολο είναι να μας χειραγωγήσει το ChatGPT;

2024-12-24

NEWS 24/7

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT) whose use has directly led to harm: a user lost money due to malicious code generated by the AI, and the AI's outputs can be manipulated to mislead users, constituting harm to individuals and communities. This meets the criteria for an AI Incident because the AI system's use has directly caused realized harm (financial loss) and misinformation risks. The article also includes expert warnings about future risks, but the presence of actual harm takes precedence.

Thumbnail Image

ChatGPT: Οι κίνδυνοι χειραγώγησης και οι επιπτώσεις για τους χρήστες

2024-12-26

Newpost.gr

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT) whose use has directly led to harm, including financial loss and the spread of manipulated or malicious content. The manipulation of AI outputs through hidden content and the resulting misleading or harmful responses constitute violations that harm users and communities. Therefore, this qualifies as an AI Incident because the AI system's use has directly caused harm.

Thumbnail Image

ChatGPT: "Ευάλωτο σε χειραγώγηση" το εργαλείο αναζήτησης

2024-12-24

www.kathimerini.com.cy

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT with search capabilities) whose use has directly led to harm: users receiving manipulated, misleading, or malicious outputs due to hidden content on websites designed to influence the AI's responses. The article provides concrete examples, including a case where malicious code generated by ChatGPT led to theft of $2,500 from a user. This constitutes harm to persons (financial injury) and harm to communities (misinformation and deception). The AI system's vulnerability to manipulation and the resulting harms meet the criteria for an AI Incident rather than a mere hazard or complementary information.

Thumbnail Image

Κακά νέα για το ChatGPT: Έρευνα αποκαλύπτει πως επιστρέφει παραπλανητικό περιεχόμενο - Dnews

2024-12-24

dnews.gr

Why's our monitor labelling this an incident or hazard?

The event involves the use of an AI system (ChatGPT) whose outputs are manipulated through prompt injection techniques, leading to the AI returning misleading and potentially harmful content. This manipulation directly results in harm by deceiving users with false positive reviews and possibly malicious code, which affects user trust and safety. The article describes actual occurrences of this manipulation and its effects, not just potential risks. Hence, it meets the criteria for an AI Incident as the AI system's use has directly led to harm to communities and users.

Thumbnail Image

Vous faites vos recherches web avec ChatGPT ? Attention, les résultats peuvent être corrompus

2024-12-27

Clubic.com

Why's our monitor labelling this an incident or hazard?

The event involves the use of an AI system (ChatGPT Search) and discusses how malicious actors can manipulate web content to deceive the AI, potentially causing it to generate incorrect or misleading information. While no specific harm has been reported as having occurred, the article clearly indicates a credible risk that such misuse could lead to harm, such as misinformation dissemination. Therefore, this situation qualifies as an AI Hazard, as the AI system's use could plausibly lead to harm through corrupted outputs.

Thumbnail Image

ChatGPT Search peut être trompé pour tromper les utilisateurs, révèle une nouvelle recherche

2024-12-27

News 24

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT Search) whose use has resulted in misleading summaries and the potential generation of malicious code, directly causing harm to users by spreading false information and security threats. The AI system's vulnerability to hidden text attacks is a malfunction or misuse leading to these harms. Since the harm is realized (misleading summaries are generated and could deceive users), this qualifies as an AI Incident rather than a hazard or complementary information.

Thumbnail Image

ChatGPT Search : crédule et peu sécurisé, le moteur de recherche inquiète

2024-12-26

L'Éclaireur Fnac

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT Search) whose use has directly led to harms: misinformation through manipulated search results and financial theft via malicious AI-generated code. The AI system's vulnerability to hidden prompts and the resulting misleading outputs have caused harm to users' trust and security. The concrete example of stolen tokens confirms realized harm. Hence, this is an AI Incident as the AI system's use has directly caused harm to individuals and communities.

Thumbnail Image

ChatGPT Search peut être amené à fournir de fausses informations~? car ChatGPT peut être influencé par le contenu caché des pages web, une tactique connue sous le nom de \

2024-12-27

Developpez.com

Why's our monitor labelling this an incident or hazard?

The article explicitly involves an AI system, ChatGPT Search, which uses large language models and real-time web data to generate responses. The study demonstrates that the AI system's outputs can be manipulated through prompt injection, causing it to provide false or misleading information. This constitutes an AI Incident because the AI's use has directly led to harm—specifically misinformation and potential deception of users. The harm to communities and users' right to accurate information is a recognized form of harm under the framework. The article also references real cases of incorrect source citations and manipulated outputs, confirming realized harm rather than just potential risk. Therefore, this event qualifies as an AI Incident rather than a hazard or complementary information.

Thumbnail Image

ChatGPT sous le feu des critiques en raison de sa vulnérabilité à la manipulation et à la manipulation SEO

2024-12-29

Business AM

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT's search tool) whose use is vulnerable to manipulation via prompt injection, leading to misleading outputs and potential security risks. These outcomes constitute harm to communities through misinformation and possible distribution of harmful code, fulfilling the criteria for an AI Incident. The harm is occurring or highly likely given the demonstrated manipulations and the potential for malicious exploitation, not merely a theoretical risk. Therefore, this qualifies as an AI Incident rather than a hazard or complementary information.

Thumbnail Image

Experiment: Motorul de căutare al ChatGPT este vulnerabil la manipulare și înșelăciune - HotNews.ro

2024-12-24

HotNews.ro

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT's AI-powered search) whose use is shown to be vulnerable to manipulation via hidden text on web pages. This manipulation can cause the AI to produce misleading or false outputs, which can deceive users and cause harm to individuals and communities relying on the information. The article includes expert warnings about the high risk of such misuse if the system is widely deployed without fixes. Since the harm is not yet realized but plausibly could occur, this fits the definition of an AI Hazard rather than an AI Incident. The article focuses on the potential for harm due to the AI system's vulnerabilities, not on a realized harm event or a governance response, so it is not Complementary Information.

Thumbnail Image

ChatGPT Search poate genera rezumate eronate prin manipulare

2024-12-27

Mediafax.ro

Why's our monitor labelling this an incident or hazard?

The article explicitly involves an AI system (ChatGPT Search) and shows how its use can be manipulated to produce erroneous and potentially harmful outputs, such as misleading summaries and malicious code generation. This manipulation directly leads to harm in the form of misinformation and security threats, fulfilling the criteria for an AI Incident. Although no specific harm event is described as having occurred yet, the demonstrated ability to cause such harm through manipulation constitutes realized harm potential and risk, especially given the AI system's deployment in a functional product. Therefore, this qualifies as an AI Incident rather than a mere hazard or complementary information.

Thumbnail Image

Instrumentul de căutare ChatGPT "este vulnerabil la manipulare și inducere în eroare", arată o investigație The Guardian

2024-12-24

G4Media.ro

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT search tool using large language models) whose use has directly led to harm by returning manipulated, misleading, or malicious content to users. The investigation shows that hidden content on web pages can cause the AI to produce false positive evaluations and even malicious code, which can mislead users and pose security risks. This meets the criteria for an AI Incident because the AI system's outputs have caused harm to users and communities through misinformation and potential security threats. The involvement is through the AI system's use and its vulnerability to manipulation, leading to realized harm rather than just potential harm.

Thumbnail Image

Investigație The Guardian. Funcția de căutare din modelele de AI ale ChatGPT este vulnerabilă la tehnici de manipulare. Prin introducerea de text ascuns pe paginile web, modelul a fost "păcălit" să returneze rezultate false și cod malițios, expunând utilizatorii la riscuri semnificative de securitate. - Biziday

2024-12-25

Biziday

Why's our monitor labelling this an incident or hazard?

The article explicitly involves an AI system (ChatGPT's search function) whose use has directly led to harms: false information dissemination and a financial loss caused by malicious code generated by the AI. The manipulation via hidden text (prompt injection) is a misuse of the AI system's use, causing harm to users' security and trust. The financial loss example confirms realized harm. Hence, this is an AI Incident rather than a hazard or complementary information.

Thumbnail Image

ChatGPT-ul este vulnerabil la manipulare și înșelăciuni, arată noi teste: "Modelul produce conținut care a fost practic virusat"

2024-12-24

Libertatea

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (ChatGPT) whose use and vulnerability to prompt injection attacks have directly led to harm, including financial loss and the potential spread of malicious or misleading content. The article details actual incidents and expert warnings about the AI producing harmful outputs due to manipulation, fulfilling the criteria for an AI Incident. The harm includes financial injury to a person and risks of misinformation, both recognized harms under the framework. Hence, the classification is AI Incident.