Anthropic Removes Hard Safety Limits from AI Scaling Policy, Raising Catastrophic Risk Concerns

The information displayed in the AIM should not be reported as representing the official views of the OECD or of its member countries.

Anthropic, a leading AI company, has revised its Responsible Scaling Policy by removing a key safety commitment that previously barred training advanced AI models without proven safeguards. Experts warn this policy shift increases the plausible risk of catastrophic AI incidents, as safety measures may not keep pace with rapidly advancing capabilities. [AI generated]

Why's our monitor labelling this an incident or hazard?

The article explicitly discusses changes in AI safety policy by Anthropic, an AI company, which is directly related to the development and use of AI systems. The removal of safety guardrails increases the risk that more powerful AI models could be developed and deployed without sufficient controls, potentially leading to catastrophic risks. However, the article does not report any actual harm or incident resulting from this policy change. The focus is on the potential for future harm due to reduced safety measures amid competitive pressures and regulatory gaps. Thus, the event fits the definition of an AI Hazard rather than an AI Incident or Complementary Information. [AI generated]

AI principles
Safety
Robustness & digital security

Industries
Digital security

Affected stakeholders
General public

Harm types
Public interest

Severity
AI hazard

Business function
Research and development

AI system task
Content generation

