Google’s Gemini 2.5 Flash AI Shows Safety Regression Before Kids Rollout

Google’s internal tests reveal that its new Gemini 2.5 Flash generates 4.1% more unsafe text and 9.6% more unsafe image captions than its predecessor. Despite these safety regressions, Google plans to open Gemini to children under 13 with parental controls, warning that the model can make mistakes and may expose children to inappropriate content.[AI generated]
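The quoted figures come from automated benchmark comparisons between model versions (text-to-text and image-to-text safety). As a rough illustration of how such a regression number can be produced, here is a minimal Python sketch; all names, prompts, and rates in it are hypothetical stand-ins, not Google's actual evaluation pipeline:

# Illustrative sketch only: how a "model X generates N% more unsafe output"
# figure might be computed. run_old / run_new / judge / PROMPTS are stand-ins,
# not Google's evaluation harness. Public reporting also does not specify
# whether the 4.1% / 9.6% figures are absolute (percentage-point) or relative
# changes, so both are computed below.

from typing import Callable, Iterable


def violation_rate(model: Callable[[str], str],
                   prompts: Iterable[str],
                   judge: Callable[[str], bool]) -> float:
    """Fraction of model outputs that the automated safety judge flags."""
    outputs = [model(p) for p in prompts]
    return sum(judge(o) for o in outputs) / len(outputs)


# Stand-in components so the sketch runs end to end.
PROMPTS = [f"prompt {i}" for i in range(1000)]
run_old = lambda p: "BAD" if int(p.split()[-1]) % 40 == 0 else "ok"  # ~2.5% unsafe
run_new = lambda p: "BAD" if int(p.split()[-1]) % 25 == 0 else "ok"  # ~4.0% unsafe
judge = lambda text: "BAD" in text

old = violation_rate(run_old, PROMPTS, judge)
new = violation_rate(run_new, PROMPTS, judge)
print(f"absolute change: {new - old:+.1%} points")  # +1.5% points here
print(f"relative change: {(new - old) / old:+.1%}")  # +60.0% here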

Why's our monitor labelling this an incident or hazard?

The event involves an AI system (Google's Gemini 2.5 Flash) and concerns its development and evaluation regarding safety compliance. Although the model shows a regression in safety metrics and a higher chance of producing unsafe content, the article does not report any realized harm or incidents caused by the AI system. The focus is on potential risks and internal testing results, which could plausibly lead to harm if the model is deployed without mitigation, but no direct or indirect harm has occurred yet. Therefore, this qualifies as an AI Hazard rather than an AI Incident or Complementary Information.[AI generated]
AI principles
Safety; Robustness & digital security; Accountability; Human wellbeing; Transparency & explainability

Industries
Consumer services; Media, social platforms, and marketing

Affected stakeholders
Children

Harm types
Psychological; Reputational; Human or fundamental rights

Severity
AI hazard

Business function:
Citizen/customer service

AI system task:
Content generation; Interaction support/chatbots; Recognition/object detection


Articles about this incident or hazard

Google's AI receives a low score in safety evaluation

2025-05-03
ISNA

Google opens Gemini AI to children under 13

2025-05-03
ILNA
Why's our monitor labelling this an incident or hazard?
The article explicitly involves an AI system (Google's Gemini chatbot) being made accessible to children under 13 under parental supervision. While no actual harm is reported, the article discusses potential risks, such as children encountering inappropriate content or misunderstanding the AI, that could plausibly lead to psychological or informational harm. The involvement is in the use phase of the AI system. Since the harm is potential and not realized, this fits the definition of an AI Hazard. The article also references similar issues with other chatbots and calls for regulation, reinforcing the plausible risk. It is not Complementary Information, because it is not primarily about responses or updates to past incidents, and it is not unrelated, since it clearly concerns AI and potential harm.

Test result: Google's new AI has low safety

2025-05-03
Zoomit
Why's our monitor labelling this an incident or hazard?
The event involves an AI system (Gemini 2.5 Flash) and its development and use, specifically its safety performance in generating content. Although the model produces content that may violate safety policies and raise ethical concerns, the article does not report any realized harm or incidents resulting from this behavior. The concerns are about potential future harms or risks if such unsafe content is disseminated or used. Therefore, this qualifies as an AI Hazard, as the AI system's behavior could plausibly lead to harms such as misinformation, ethical violations, or other safety issues, but no direct or indirect harm has yet occurred according to the article.