The OECD.AI Policy Navigator

Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST AI 100-2e2025)


Added by:   National contact point
Added on:   06 May 2026
Updated by:   OECD analyst
Updated on:   17 May 2026

Published by the National Institute of Standards and Technology (NIST) in March 2025, NIST AI 100-2e2025 provides a taxonomy of concepts and terminology in the field of adversarial machine learning (AML). It covers key types of ML methods, life cycle stages of attack, and attacker goals, capabilities, and knowledge, alongside mitigation methods, aiming to establish a common language to inform standards and practice guides for assessing and managing the security of AI systems.

Initiative overview

The report distinguishes between two broad classes of AI systems: predictive AI (PredAI), which still dominates industrial applications, and generative AI (GenAI), whose adoption in business and consumer contexts has increased rapidly. Real-world failures documented in the report include autonomous vehicles being induced to swerve into oncoming lanes and stop signs being misclassified as speed limit signs, which illustrate the tangible stakes of AML vulnerabilities.

The taxonomy classifies attacks across five dimensions: the type of AI system targeted; the stage of the machine learning life cycle at which the attack occurs (from design and training through to deployment); the attacker's goals in terms of which system properties they seek to violate (availability, integrity, or privacy); the attacker's capabilities and access; and the attacker's knowledge of the learning process. For predictive AI, the taxonomy covers evasion, poisoning, and privacy attacks. For generative AI, it extends to supply chain attacks, direct and indirect prompt injection, misuse violations, and security of AI agents. Each category of attack is paired with corresponding mitigation strategies, along with an assessment of the limitations of those techniques.
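As a rough illustration of how the five dimensions fit together, the sketch below records a single attack as a structured profile. It is a minimal sketch based only on the summary above, not a schema defined by NIST: the class names, field names, and the free-text values used for capabilities and knowledge are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SystemType(Enum):
    PREDICTIVE = "PredAI"
    GENERATIVE = "GenAI"


class LifecycleStage(Enum):
    # Stages as named in the summary; the report itself may be more granular.
    DESIGN = "design"
    TRAINING = "training"
    DEPLOYMENT = "deployment"


class Goal(Enum):
    # The system property the attacker seeks to violate.
    AVAILABILITY = "availability"
    INTEGRITY = "integrity"
    PRIVACY = "privacy"


# Attack categories the summary lists for each class of system.
# The string labels are illustrative, not official identifiers.
ATTACK_CATEGORIES = {
    SystemType.PREDICTIVE: {"evasion", "poisoning", "privacy"},
    SystemType.GENERATIVE: {
        "supply chain",
        "direct prompt injection",
        "indirect prompt injection",
        "misuse",
    },
}


@dataclass(frozen=True)
class AttackProfile:
    """One attack described along the five dimensions (hypothetical schema)."""

    system_type: SystemType
    stage: LifecycleStage
    goal: Goal
    capabilities: tuple  # free-text labels; the summary does not enumerate them
    knowledge: str       # free-text label, e.g. "black-box" (assumed wording)
    category: str

    def __post_init__(self):
        allowed = ATTACK_CATEGORIES[self.system_type]
        if self.category not in allowed:
            raise ValueError(
                f"{self.category!r} is not listed for {self.system_type.value}"
            )


# Example: the stop-sign misclassification described earlier, read as an
# evasion attack at deployment time. Capability and knowledge values are
# assumptions chosen for illustration.
stop_sign_attack = AttackProfile(
    system_type=SystemType.PREDICTIVE,
    stage=LifecycleStage.DEPLOYMENT,
    goal=Goal.INTEGRITY,
    capabilities=("modify test-time input",),
    knowledge="black-box",
    category="evasion",
)
```

Validating the category against the system type mirrors the split in the summary, where prompt injection applies to generative AI and evasion to predictive AI. A real catalogue would likely need to allow overlap between the two.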

The report is the product of an extensive literature review, expert consultation in the field of adversarial machine learning, and original research by its authors, who are drawn from NIST's Computer Security Division, Northeastern University, the U.S. AI Safety Institute, Cisco, and the U.K. AI Security Institute. It adopts the security, resilience, and robustness concepts from the NIST AI Risk Management Framework and aligns with the Machine Learning Principles of the U.K. National Cyber Security Centre (NCSC). While the guidance is voluntary and not intended to supersede existing regulations or law, it is explicitly designed to inform future standards and practice guides. Its primary audience includes those responsible for designing, developing, deploying, evaluating, and governing AI systems.