Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.


AI red team service


The AI red team service exposes hidden safety and security threats across the entire lifecycle of artificial intelligence (AI) systems by applying an adversarial mindset to assess AI systems during:

design
development
deployment
operations

Assessments focus on model evaluations, identity and infrastructure assessments, and production application testing. The approach is intended to apply established security principles to emerging technologies such as machine learning, large language models, generative AI, and agentic workflows.

The service decomposes AI systems into individual components and holistically evaluates them for attack vectors and vulnerabilities, including both AI-specific and traditional risks. It simulates advanced threat actors using novel and well-known adversary tactics to provide insight into worst-case scenarios without incurring actual security risks. Deliverables provide a realistic understanding of potential attack vectors and actionable approaches for resolution and defence against adversaries.

Across lifecycle phases, the service includes:

design: threat modelling to anticipate attacks and inform secure controls
development: evaluation of models and pipelines for safety, security, privacy, and alignment risks
deployment: assessment of deployed systems from a compromised-user perspective
operations: traditional red team exercises that generate telemetry and realistic adversary simulations to improve detection and response capabilities
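
The phase-to-activity breakdown above can be sketched as a small lookup structure. This is only an illustrative sketch: the names `Stage`, `ACTIVITIES`, and `plan_engagement` are hypothetical and are not part of any SpecterOps tooling or API; the activity descriptions are paraphrased from the text above.

```python
from enum import Enum


class Stage(Enum):
    """Lifecycle stages named in the service description (hypothetical type)."""
    DESIGN = "design"
    DEVELOPMENT = "development"
    DEPLOYMENT = "deployment"
    OPERATIONS = "operations"


# Activity per stage, paraphrased from the description; the mapping itself is an assumption.
ACTIVITIES = {
    Stage.DESIGN: "threat modelling to anticipate attacks and inform secure controls",
    Stage.DEVELOPMENT: "evaluation of models and pipelines for safety, security, privacy, and alignment risks",
    Stage.DEPLOYMENT: "assessment of deployed systems from a compromised-user perspective",
    Stage.OPERATIONS: "traditional red team exercises that generate telemetry and adversary simulations",
}


def plan_engagement(stages):
    """Return (stage name, activity) pairs in lifecycle order for the requested stages."""
    requested = set(stages)
    # Iterating over the Enum preserves definition order, i.e. lifecycle order.
    return [(stage.value, ACTIVITIES[stage]) for stage in Stage if stage in requested]


if __name__ == "__main__":
    for name, activity in plan_engagement([Stage.OPERATIONS, Stage.DESIGN]):
        print(f"{name}: {activity}")
```

Whatever order the stages are requested in, the plan comes back in lifecycle order, so an engagement scoped to design and operations lists threat modelling before the red team exercise.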

About the tool


Developing organisation(s):

SpecterOps

Tool type(s):

Technical validation
Risk management framework

Objective(s):

Privacy
Safety
Digital Security

Impacted stakeholders:

Employees
Management

Purpose(s):

Risk management

Country/Territory of origin:

International

Lifecycle stage(s):

Operate & monitor
Deploy
Build & interpret model
Plan & design

Type of approach:

Technical

Usage rights:

Restricted access

Target groups:

Private sector
Technical community

Target users:

Developer
System operators

Stakeholder group:

Business
Technical community

Benefits:

Reduction in risk of failure
Responsible implementation

Geographical scope:

International

People involved:

IT employees
Operations employees

Risk management stage(s):

Define
Assess
Treat

Tags:

ai risks
ai vulnerabilities
safety
ai evaluation


Use Cases

There are no use cases for this tool yet.



Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.