Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

AI red team service



The AI red team service applies an adversarial mindset to expose hidden safety and security threats across the entire lifecycle of artificial intelligence (AI) systems, assessing them during:

  • design
  • development
  • deployment
  • operations

The assessments focus on model evaluations, identity and infrastructure assessments, and production application testing. This approach is intended to ensure that security principles are applied effectively to emerging technologies such as machine learning, large language models, generative AI, and agentic workflows.

The service decomposes AI systems into individual components and evaluates them, individually and as a whole, for attack vectors and vulnerabilities, covering both AI-specific and traditional risks. It simulates advanced threat actors using both novel and well-known adversary tactics, giving insight into worst-case scenarios without exposing the organisation to a real attack. Deliverables provide a realistic understanding of potential attack vectors and actionable recommendations for remediation and defence against adversaries.

Across lifecycle phases, the service includes:

  • design: threat modelling to anticipate attacks and inform secure controls
  • development: evaluation of models and pipelines for safety, security, privacy, and alignment risks
  • deployment: assessment of deployed systems from a compromised-user perspective
  • operations: traditional red team exercises to generate telemetry and realistic adversary simulations that improve detection and response capabilities
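
As an illustration only, the phase-by-phase scope described above could be captured in a small planning structure. The sketch below is a hypothetical Python representation, not part of the service itself: the names `LifecyclePhase`, `ENGAGEMENT_ACTIVITIES`, and `plan_engagement` are assumptions made for this example, and the activity text simply paraphrases the description.

```python
from enum import Enum


class LifecyclePhase(Enum):
    """Lifecycle phases named in the service description."""
    DESIGN = "design"
    DEVELOPMENT = "development"
    DEPLOYMENT = "deployment"
    OPERATIONS = "operations"


# Hypothetical mapping; each activity paraphrases the description above.
ENGAGEMENT_ACTIVITIES = {
    LifecyclePhase.DESIGN:
        "threat modelling to anticipate attacks and inform secure controls",
    LifecyclePhase.DEVELOPMENT:
        "evaluation of models and pipelines for safety, security, "
        "privacy, and alignment risks",
    LifecyclePhase.DEPLOYMENT:
        "assessment of deployed systems from a compromised-user perspective",
    LifecyclePhase.OPERATIONS:
        "traditional red team exercises to generate telemetry and "
        "realistic adversary simulations",
}


def plan_engagement(phases):
    """Return (phase name, activity) pairs in lifecycle order.

    `phases` may be given in any order; the result always follows the
    order design, development, deployment, operations.
    """
    requested = set(phases)
    return [
        (phase.value, ENGAGEMENT_ACTIVITIES[phase])
        for phase in LifecyclePhase  # Enum iterates in definition order
        if phase in requested
    ]


if __name__ == "__main__":
    selected = [LifecyclePhase.OPERATIONS, LifecyclePhase.DESIGN]
    for name, activity in plan_engagement(selected):
        print(f"{name}: {activity}")
```

A structure like this would let an organisation scope an engagement to only the phases its system has reached, while keeping the activities listed in lifecycle order.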

About the tool


Tags:

  • ai risks
  • ai vulnerabilities
  • safety
  • ai evaluation

Use Cases

There are no use cases for this tool yet.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.