Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Type

Safety

Clear all

Origin

Scope

SUBMIT A TOOL

If you have a tool that you think should be featured in the Catalogue of AI Tools & Metrics, we would love to hear from you!

Submit

TechnicalEducationalUploaded on Aug 27, 2025
AI Screener to enable universal early screening for all children.

Objective(s)

Related lifecycle stage(s)

Plan & design

TechnicalEducationalUploaded on Aug 1, 2025
Dytective by Change Dyslexia is an innovative AI-powered tool designed to detect the risk of dyslexia in children quickly and reliably. Developed in collaboration with researchers, Dytective combines language exercises with machine learning to screen for dyslexia in just 15 minutes. Backed by scientific validation and used by schools and families worldwide, it empowers early intervention and promotes equal opportunities in education.

Related lifecycle stage(s)

Operate & monitorDeploy

AustraliaUploaded on May 22, 2025
FloodMapp is a technology company that specialises in rapid real-time flood forecasting and flood inundation mapping to provide greater warning time and situational awareness.

Objective(s)


TechnicalUnited StatesUploaded on May 22, 2025
Pano - Actionable Intelligence for Wildfire Management is an advanced, connected platform designed for fire professionals, enabling them to detect threats, verify fires, and share critical information with response teams faster than ever before.

TechnicalEducationalMexicoUnited StatesIsraelUploaded on May 19, 2025
SeismicAI is a provider of innovative Earthquake Early Warning Systems (EEW) ensuring earthquake preparedness. SeismicAI's algorithms utilise local sensors to issue high-precision alerts for earthquake preparedness. The system covers the full early warning cycle - from monitoring and reporting, through alerts, to optionally triggering automated preventive actions.

Objective(s)

Related lifecycle stage(s)

Operate & monitor

EducationalIndonesiaUploaded on May 14, 2025
PetaBencana.id leverages AI to provide residents, government agencies, and first responders with a real-time disaster mapping platform for Indonesia.

TechnicalIrelandUploaded on May 2, 2025
Risk Atlas Nexus provides tooling to connect fragmented AI governance resources through a community-driven approach to curation of linkages between risks, datasets, benchmarks, and mitigations. It transforms abstract risk definitions into actionable AI governance workflows.

ProceduralFranceUploaded on Apr 2, 2025
This tool provides a comprehensive risk management framework for frontier AI development, integrating established risk management principles with AI-specific practices. It combines four key components: risk identification through systematic methods, quantitative risk analysis, targeted risk treatment measures, and clear governance structures.

Related lifecycle stage(s)

Build & interpret model

TechnicalFranceEuropean UnionUploaded on Mar 24, 2025
Behavior Elicitation Tool (BET) is a complex-AI system that systematically probes and elicits specific behaviors from cutting-edge LLMs. Whether for red-teaming or targeted behavioral analysis, this automated solution is Dynamic Optimized and Adversarial (DAO) and can be configured to test the robustness precisely and help to have a better control of the AI system.

TechnicalSwitzerlandEuropean UnionUploaded on Jan 24, 2025
COMPL-AI is an open-source compliance-centered evaluation framework for Generative AI models

ProceduralUploaded on Jan 6, 2025
This document addresses how artificial intelligence machine learning can impact the safety of machinery and machinery systems.

Objective(s)


TechnicalFranceUploaded on Dec 6, 2024
AIxploit is a tool designed to evaluate and enhance the robustness of Large Language Models (LLMs) through adversarial testing. This tool simulates various attack scenarios to identify vulnerabilities and weaknesses in LLMs, ensuring they are more resilient and reliable in real-world applications.

Objective(s)

Related lifecycle stage(s)

Operate & monitorVerify & validate

TechnicalUploaded on Dec 6, 2024
Continuous proactive AI red teaming platform for AI and GenAI models, applications and agents.

TechnicalUploaded on Nov 5, 2024
garak, Generative AI Red-teaming & Assessment Kit, is an LLM vulnerability scanner. Garak checks if an LLM can be made to fail.

Related lifecycle stage(s)

Operate & monitorVerify & validate

TechnicalInternationalUploaded on Nov 5, 2024
A fast, scalable, and open-source framework for evaluating automated red teaming methods and LLM attacks/defenses. HarmBench has out-of-the-box support for transformers-compatible LLMs, numerous closed-source APIs, and several multimodal models.

Objective(s)


TechnicalUnited StatesUploaded on Sep 9, 2024
Harms Modeling is a practice designed to help you anticipate the potential for harm, identify gaps in product that could put people at risk, and ultimately create approaches that proactively address harm.

TechnicalUnited StatesUploaded on Sep 9, 2024
Dioptra is an open source software test platform for assessing the trustworthy characteristics of artificial intelligence (AI). It helps developers on determining which types of attacks may impact negatively their model's performance.

TechnicalFranceUploaded on Aug 2, 2024
Evaluate input-output safeguards for LLM systems such as jailbreak and hallucination detectors, to understand how good they are and on which type of inputs they fail.

Objective(s)

Related lifecycle stage(s)

Operate & monitorVerify & validate

ProceduralUploaded on Jul 3, 2024
Defining quality measures for quantitatively evaluating system and software product quality in terms of characteristics and subcharacteristics defined in ISO/IEC 25010 and is intended to be used together with ISO/IEC 25010.

ProceduralUploaded on Jul 2, 2024
The DIN SPEC series describes a number of AI quality requirements which are structured using an AI quality meta model. The DIN SPEC series applies to all phases of the life cycle of an AI module.

Objective(s)


Partnership on AI

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.