Anthropic's Claude Source Code Leak Reveals Ambitious AI Agent Capabilities

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

Anthropic accidentally leaked 512,000 lines of Claude's source code, exposing a new platform, Conway, designed for persistent, autonomous AI agents capable of running background tasks across devices. The leak highlights potential future risks, such as security breaches or loss of user control, arising from the system's advanced, always-on capabilities. [AI generated]

Why's our monitor labelling this an incident or hazard?

The article explicitly involves AI systems (large language models like Claude, GPT-5.5, and DeepSeek's models) and discusses their development, use, and strategic withholding of capabilities. While no direct harm or incident is reported, the narrative centers on the plausible risks and competitive pressures that could lead to harm if these powerful AI capabilities were exposed or misused. The discussion of safety concerns, capability overhang, and strategic restraint indicates a credible risk of future AI incidents. There is no indication of actual injury, rights violations, or other harms having occurred yet, so it does not meet the criteria for an AI Incident. It is not merely complementary information because the main focus is on the plausible risks and strategic behaviors that could lead to harm, not on responses or ecosystem updates. Hence, the classification as AI Hazard is appropriate. [AI generated]
AI principles
Robustness & digital security; Accountability

Industries
Digital security; IT infrastructure and hosting

Affected stakeholders
Business; General public

Harm types
Reputational; Economic/Property

Severity
AI hazard

AI system task
Goal-driven organisation


Articles about this incident or hazard

Claude Code getting dumber is no illusion: Anthropic admits product-layer changes caused the quality decline

2026-04-25
iThome Online
Why's our monitor labelling this an incident or hazard?
The event explicitly involves an AI system (Claude Code) and changes to its operation that affected its performance. However, the reported issues relate to quality degradation (e.g., weaker reasoning, repetitive output, forgetfulness) without any mention of resulting harm such as injury, rights violations, or disruption. The company fixed the problems and communicated transparently about the causes and remediation. Since no harm occurred or is plausibly expected from these quality issues, this does not meet the criteria for an AI Incident or AI Hazard. Instead, it is an update on the AI system's performance and the company's response, which fits the definition of Complementary Information.

AI Giants Walk Into the Dark Forest

2026-04-25
tmtpost.com
Why's our monitor labelling this an incident or hazard?
The article explicitly involves AI systems (large language models like Claude, GPT-5.5, and DeepSeek's models) and discusses their development, use, and strategic withholding of capabilities. While no direct harm or incident is reported, the narrative centers on the plausible risks and competitive pressures that could lead to harm if these powerful AI capabilities were exposed or misused. The discussion of safety concerns, capability overhang, and strategic restraint indicates a credible risk of future AI incidents. There is no indication of actual injury, rights violations, or other harms having occurred yet, so it does not meet the criteria for an AI Incident. It is not merely complementary information because the main focus is on the plausible risks and strategic behaviors that could lead to harm, not on responses or ecosystem updates. Hence, the classification as AI Hazard is appropriate.

3 a.m., and Silicon Valley erupts: Anthropic accidentally leaks its source code, with 510,000 lines revealing Claude's staggering ambitions

2026-04-25
ifeng.com (Phoenix New Media)
Why's our monitor labelling this an incident or hazard?
The article explicitly involves an AI system (Claude and its new Conway platform) and discusses its development and use. The accidental leak of the source code is a malfunction in the development process. Although no actual harm has been reported, the described capabilities of the AI system to autonomously run tasks in the background, across devices, and potentially control workflows imply plausible future harms such as security breaches, privacy violations, or loss of user control. Therefore, this event fits the definition of an AI Hazard, as it plausibly could lead to an AI Incident in the future. It is not an AI Incident because no realized harm is described, nor is it Complementary Information or Unrelated.

Anthropic responds to the model "getting dumber" with an official explanation of the cause

2026-04-25
China.com Tech
Why's our monitor labelling this an incident or hazard?
The article discusses user complaints about the AI model's performance decline and the company's explanation attributing the issue to the software framework rather than the AI model's core capabilities. There is no mention or implication of any harm caused by the AI system's development, use, or malfunction. Therefore, this event does not meet the criteria for an AI Incident or AI Hazard. It is an update on an AI system's status and company response, fitting the definition of Complementary Information.