Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Type

Safety

Clear all

Origin

Scope

SUBMIT A TOOL

If you have a tool that you think should be featured in the Catalogue of AI Tools & Metrics, we would love to hear from you!

SUBMIT
Objective Safety

TechnicalProceduralFranceEuropean UnionUploaded on Mar 24, 2025
Behavior Elicitation Tool (BET) is a complex-AI system that systematically probes and elicits specific behaviors from cutting-edge LLMs. Whether for red-teaming or targeted behavioral analysis, this automated solution is Dynamic Optimized and Adversarial (DAO) and can be configured to test the robustness precisely and help to have a better control of the AI system.

TechnicalSwitzerlandEuropean UnionUploaded on Jan 24, 2025
COMPL-AI is an open-source compliance-centered evaluation framework for Generative AI models

ProceduralUploaded on Jan 6, 2025
This document addresses how artificial intelligence machine learning can impact the safety of machinery and machinery systems.

Objective(s)


TechnicalFranceUploaded on Dec 6, 2024
AIxploit is a tool designed to evaluate and enhance the robustness of Large Language Models (LLMs) through adversarial testing. This tool simulates various attack scenarios to identify vulnerabilities and weaknesses in LLMs, ensuring they are more resilient and reliable in real-world applications.

Objective(s)

Related lifecycle stage(s)

Operate & monitorVerify & validate

TechnicalUnited StatesUploaded on Nov 8, 2024
The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.

Objective(s)

Related lifecycle stage(s)

Operate & monitorVerify & validate

ProceduralUploaded on Nov 7, 2024
Trustworthy AI Procurement CardTM is a non-exhaustive list of information which can accompany acquisition decisions. The Card is similar to the DataSheets or Model Cards, in the sense that the objective is to promote transparency and better due diligence during AI procurement process.

TechnicalUploaded on Nov 5, 2024
garak, Generative AI Red-teaming & Assessment Kit, is an LLM vulnerability scanner. Garak checks if an LLM can be made to fail.

Objective(s)

Related lifecycle stage(s)

Operate & monitorVerify & validate

TechnicalUnited StatesUploaded on Sep 9, 2024
Harms Modeling is a practice designed to help you anticipate the potential for harm, identify gaps in product that could put people at risk, and ultimately create approaches that proactively address harm.

Objective(s)


TechnicalUnited StatesUploaded on Sep 9, 2024
Dioptra is an open source software test platform for assessing the trustworthy characteristics of artificial intelligence (AI). It helps developers on determining which types of attacks may impact negatively their model's performance.

TechnicalFranceUploaded on Aug 2, 2024
Evaluate input-output safeguards for LLM systems such as jailbreak and hallucination detectors, to understand how good they are and on which type of inputs they fail.

TechnicalUnited StatesUploaded on Aug 2, 2024
AI Security Platform for GenAI and Conversational AI applications. Probe enables security officers and developers identify, mitigate, and monitor AI system security.

TechnicalUploaded on Aug 2, 2024
Responsible AI (RAI) Repairing Assistant

ProceduralUploaded on Jul 2, 2024
This document reviews current aerospace software, hardware, and system development standards used in the certification/approval process of safety-critical airborne and ground-based systems, and assesses whether these standards are compatible with a typical Artificial Intelligence (AI) and Machine Learning (ML) development approach.

Objective(s)


ProceduralUploaded on Jul 2, 2024
First full revision of PAS 1881:2020 to reflect learning from recent automated vehicle trials and input from stakeholders. It deals with how to build operational safety cases for trialling and testing of AVs so that ultimately connected and automated vehicles can be deployed safely, and the public will be confident about their safety.

Objective(s)


ProceduralUploaded on Jul 2, 2024
PAS 11281 is the international standard on road vehicles that gives recommendations for managing security risks that might lead to a compromise of safety in a connected automotive ecosystem.

Objective(s)


ProceduralUploaded on Jul 2, 2024
The scope of the expert recommendation is the design and layout of autonomous systems from the perspective of human reliability.

Objective(s)


ProceduralUploaded on Jul 2, 2024
PAS 1882:2021 details how information should be handled during automated vehicle trials to ensure it’s collected consistently and improves the safety of UK trials

Objective(s)


ProceduralUploaded on Jul 2, 2024
This Recommendation provides a framework of civilian unmanned aerial vehicle flight control using artificial intelligence (AI), including the flight navigation control of a civilian unmanned aerial vehicle (CUAV) and the specific flight control according to the vertical industry application requirements.

Objective(s)


ProceduralUploaded on Jul 2, 2024
Defining a minimum set of reasonable assumptions and foreseeable scenarios that shall be considered in the development of safety related models that are part of an automated driving system (ADS).

Objective(s)


ProceduralUploaded on Jul 2, 2024
This document specifies requirements for the collection, curation, storage and sharing of information during automated vehicle trials and advanced trials in the UK in relation to information collected or received by the system. It covers all automated and co-operative automated driving vehicle trials on any land with public access.

Objective(s)


catalogue Logos

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.