Tools for Trustworthy AI

Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Overview Tools Metrics About the catalogue

Show tools Show use cases

Type

Technical

Educational

Procedural

Origin

Scope

SUBMIT A TOOL

If you have a tool that you think should be featured in the Catalogue of AI Tools & Metrics, we would love to hear from you!

Submit

CAST (Constructive Approach to Smart Technologies)

TechnicalProceduralPolandUploaded on Jul 23, 2025

CAST is an open framework for responsible AI design and engineering. It offers design heuristics and patterns, and RAI recommendations through generative features and online content.

Objective(s)

Robustness Transparency

Related lifecycle stage(s)

Build & interpret model Plan & design

Automated Decision-Making Implications Tool (ADMIT)

ProceduralItalyUploaded on Jun 19, 2025

ADMIT is a research tool within a broader methodological framework combining quantitative and qualitative strategies to identify, analyse, and mitigate social implications associated with automated decision-making systems while enhancing their potential benefits. It supports comprehensive assessments of sociotechnical impacts to inform responsible design, deployment, and governance of automation technologies.

Objective(s)

Human Agency & Control Safety Digital Security

Related lifecycle stage(s)

Deploy Build & interpret model Plan & design

HiddenLayer AISec Platform

TechnicalUnited StatesUploaded on May 19, 2025

HiddenLayer’s AISec Platform is a GenAI Protection Suite purpose-built to ensure the integrity of AI models throughout the MLOps pipeline. The platform provides detection and response for GenAI and traditional AI models to detect prompt injections, adversarial AI attacks, and digital supply chain vulnerabilities.

Objective(s)

Safety Digital Security

Related lifecycle stage(s)

Deploy Verify & validate Build & interpret model

A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management

ProceduralFranceUploaded on Apr 2, 2025

This tool provides a comprehensive risk management framework for frontier AI development, integrating established risk management principles with AI-specific practices. It combines four key components: risk identification through systematic methods, quantitative risk analysis, targeted risk treatment measures, and clear governance structures.

Objective(s)

Safety

Related lifecycle stage(s)

Build & interpret model

Eticas Bias

TechnicalUnited StatesUploaded on Mar 24, 2025

An open-source Python library designed for developers to calculate fairness metrics and assess bias in machine learning models. This library provides a comprehensive set of tools to ensure transparency, accountability, and ethical AI development.

Objective(s)

Fairness Robustness

Related lifecycle stage(s)

Operate & monitor Build & interpret model Collect & process data

Behavior Elicitation Tool

TechnicalFranceEuropean UnionUploaded on Mar 24, 2025

Behavior Elicitation Tool (BET) is a complex-AI system that systematically probes and elicits specific behaviors from cutting-edge LLMs. Whether for red-teaming or targeted behavioral analysis, this automated solution is Dynamic Optimized and Adversarial (DAO) and can be configured to test the robustness precisely and help to have a better control of the AI system.

Objective(s)

Robustness Safety

Related lifecycle stage(s)

Deploy Verify & validate Build & interpret model

MLPerf Client

TechnicalUnited StatesUploaded on Jan 8, 2025

MLPerf Client is a benchmark for Windows and macOS, focusing on client form factors in ML inference scenarios like AI chatbots, image classification, etc. The benchmark evaluates performance across different hardware and software configurations, providing command line interface.

Objective(s)

Performance Robustness

Related lifecycle stage(s)

Operate & monitor Deploy Build & interpret model

Adversa: AI Red Teaming Platform

TechnicalUploaded on Dec 6, 2024

Continuous proactive AI red teaming platform for AI and GenAI models, applications and agents.

Objective(s)

Privacy Robustness

Related lifecycle stage(s)

Verify & validate Build & interpret model Plan & design

Vectice

TechnicalProceduralUnited StatesUploaded on Dec 6, 2024

Vectice is a regulatory MLOps platform for AI/ML developers and validators that streamlines documentation, governance, and collaborative reviewing of AI/ML models. Designed to enhance audit readiness and ensure regulatory compliance, Vectice automates model documentation, from development to validation. With features like automated lineage tracking and documentation co-pilot, Vectice empowers AI/ML developers and validators to work in their favorite environment while focusing on impactful work, accelerating productivity, and reducing risk.

Objective(s)

Robustness Transparency

Related lifecycle stage(s)

Deploy Verify & validate Build & interpret model

Dioptra

TechnicalUnited StatesUploaded on Sep 9, 2024

Dioptra is an open source software test platform for assessing the trustworthy characteristics of artificial intelligence (AI). It helps developers on determining which types of attacks may impact negatively their model's performance.

Objective(s)

Robustness Safety

Related lifecycle stage(s)

Operate & monitor Verify & validate Build & interpret model

Adhere AI

TechnicalUploaded on Aug 2, 2024

Responsible AI (RAI) Repairing Assistant

Objective(s)

Fairness Safety Transparency

Related lifecycle stage(s)

Verify & validate Build & interpret model Plan & design

Algorithm Impact Assessment Toolkit

ProceduralNew ZealandUploaded on Jul 11, 2024

The Algorithm Charter for Aotearoa New Zealand is a set of voluntary commitments developed by Stats NZ in 2020 to increase public confidence and visibility around the use of algorithms within Aotearoa New Zealand’s public sector. In 2023, Stats NZ commissioned Simply Privacy to develop the Algorithm Impact Assessment Toolkit (AIA Toolkit) to help government agencies meet the Charter commitments. The AIA Toolkit is designed to facilitate informed decision-making about the benefits and risks of government use of algorithms.

Objective(s)

Fairness Privacy Transparency

Related lifecycle stage(s)

Build & interpret model Collect & process data Plan & design

ISACA: Digital Trust Ecosystem Framework (DTEF) application to assure AI environments

United StatesUploaded on Jun 17, 2024

ISACA’s Digital Trust Ecosystem Framework (DTEF) offers enterprises a holistic framework that applies systems thinking – the notion that a change in one area can have an impact on another area – across an entire organisation.

Objective(s)

Performance

Related lifecycle stage(s)

Build & interpret model Plan & design

NayaOne’s AI Sandbox

TechnicalUnited KingdomUploaded on Jun 14, 2024

NayaOne is a Sandbox-as-a-Service provider to tier 1 financial services institutions, world-leading regulators, and governments. This sandbox is designed to address key concerns in AI deployment by providing a single environment where AI can be evaluated and procured while also enabling collaboration and access to world-leading tools.

Objective(s)

Performance

Related lifecycle stage(s)

Verify & validate Build & interpret model Plan & design

LangBiTe

TechnicalProceduralSpainUploaded on May 21, 2024

LangBiTe is a framework for testing biases in large language models. It includes a library of prompts to test sexism / misogyny, racism, xenophobia, ageism, political bias, lgtbiq+phobia and religious discrimination. Any contributor may add new ethical concerns to assess.

Objective(s)

Fairness

Related lifecycle stage(s)

Operate & monitor Verify & validate Build & interpret model

Careful AI: PRIDAR Assurance Framework

ProceduralUploaded on May 27, 2024

PRIDAR (Prioritization, Research, Innovation, Development, Analysis, and Review) A Risk Management Framework

Related lifecycle stage(s)

Deploy Build & interpret model Collect & process data

zamba

TechnicalUploaded on Apr 22, 2024

A Python package for identifying 42 kinds of animals, training custom models, and estimating distance from camera trap videos

Objective(s)

Robustness

Related lifecycle stage(s)

Build & interpret model

yopo-you-only-propagate-once

TechnicalUnited StatesUploaded on Apr 22, 2024

Code for our nips19 paper: You Only Propagate Once: Accelerating Adversarial Training Via Maximal Principle

Objective(s)

Privacy Robustness

Related lifecycle stage(s)

Operate & monitor Deploy Verify & validate Build & interpret model Collect & process data Plan & design

wildbook

TechnicalUnited StatesUploaded on Apr 22, 2024

Wild Me's first product, Wildbook supports researchers by allowing collaboration across the globe and automation of photo ID matching

Objective(s)

Fairness Privacy Human Agency & Control Robustness Safety Transparency

Related lifecycle stage(s)

Operate & monitor Deploy Verify & validate Build & interpret model Collect & process data Plan & design

vit-explain

TechnicalIsraelUploaded on Apr 22, 2024

Explainability for Vision Transformers

Objective(s)

Robustness

Related lifecycle stage(s)

Verify & validate Build & interpret model Collect & process data Plan & design

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.