Tools for Trustworthy AI

Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

AI Screener

Technical, Educational | Uploaded on Aug 27, 2025

AI Screener aims to enable universal early screening for all children.

Objective(s)

Robustness, Safety

Related lifecycle stage(s)

Plan & design

BESPECIAL

Procedural | Uploaded on Aug 1, 2025

BeSpecial is an AI-driven platform designed to support university students with dyslexia by providing personalized digital tools and tailored learning strategies. Developed within the European VRAILEXIA project, BeSpecial combines clinical data, self-assessments, and psychometric tests to recommend customized resources like audiobooks and concept maps, as well as inclusive academic practices. The platform also raises awareness and trains educators to foster inclusive higher education environments.

Objective(s)

Human Agency & Control, Robustness

Related lifecycle stage(s)

Operate & monitor, Deploy

FloodMapp - ForeCast, NowCast and PostCast

Australia | Uploaded on May 22, 2025

FloodMapp is a technology company that specialises in rapid real-time flood forecasting and flood inundation mapping to provide greater warning time and situational awareness.

Objective(s)

Robustness, Safety

SeismicAI's Earthquake Early Warning Systems

Technical, Educational | Mexico, United States, Israel | Uploaded on May 19, 2025

SeismicAI is a provider of innovative Earthquake Early Warning Systems (EEW) ensuring earthquake preparedness. SeismicAI's algorithms utilise local sensors to issue high-precision alerts for earthquake preparedness. The system covers the full early warning cycle - from monitoring and reporting, through alerts, to optionally triggering automated preventive actions.

Objective(s)

Robustness, Safety

Related lifecycle stage(s)

Operate & monitor

Artificial Intelligence Forecasting System (AIFS)

Technical | Europe | Uploaded on May 19, 2025

The AIFS is the first fully operational open weather prediction model based on machine learning.

Objective(s)

Robustness, Data Governance & Traceability

Geospatial Damage Assessments (GDA) model

Technical | United States | Uploaded on May 15, 2025

The GDA leverages aerial imagery, satellite data, and machine learning techniques to evaluate the damage in areas impacted by natural disasters. This tool greatly enhances the efficiency and precision of disaster response operations.

Objective(s)

Robustness, Explainability

MITRE Atlas

Technical | United States | Uploaded on May 2, 2025

ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a globally accessible, living knowledge base of adversary tactics and techniques against AI-enabled systems based on real-world attack observations and realistic demonstrations from AI red teams and security groups.

Objective(s)

Robustness, Digital Security

AI Ready Validation and Verification Program

Procedural | Canada | Uploaded on Mar 31, 2025

This program provides organisations with a comprehensive, independent review of their AI approaches, ensuring alignment with consensus standards and enhancing trust among stakeholders and the public in their AI practices.

Objective(s)

Robustness, Data Governance & Traceability

Related lifecycle stage(s)

Verify & validate

MLPerf Client

Technical | United States | Uploaded on Jan 8, 2025

MLPerf Client is a benchmark for Windows and macOS that focuses on client form factors in ML inference scenarios such as AI chatbots and image classification. The benchmark evaluates performance across different hardware and software configurations and provides a command-line interface.

Objective(s)

Robustness, Transparency

Related lifecycle stage(s)

Operate & monitor, Deploy, Build & interpret model

CEN ISO/TR 22100-5:2022 - Safety of machinery - Relationship with ISO 12100 - Part 5: Implications of artificial intelligence machine learning

Procedural | Uploaded on Jan 6, 2025

This document addresses how artificial intelligence machine learning can impact the safety of machinery and machinery systems.

Objective(s)

Robustness, Safety

ISO/IEC 25023:2016 - Systems and software engineering. Systems and software Quality Requirements and Evaluation (SQuaRE). Measurement of system and software product quality

Procedural | Uploaded on Jan 6, 2025

ISO/IEC 25023:2016 defines quality measures for quantitatively evaluating system and software product quality in terms of characteristics and subcharacteristics defined in ISO/IEC 25010 and is intended to be used together with ISO/IEC 25010.

Objective(s)

Robustness, Digital Security

CarefulAI: Prompt-LLM Improvement Method (PLIM)

Educational | United Kingdom | Uploaded on Dec 9, 2024

PLIM is designed to make benchmarking and continuous monitoring of LLMs safer and more fit for purpose. This is particularly important in high-risk environments, e.g. healthcare, finance, insurance and defence. Having community-based prompts to validate models as fit for purpose is safer in a world where LLMs are not static.

Objective(s)

Human Agency & Control, Robustness

AIxploit

Technical | France | Uploaded on Dec 6, 2024

AIxploit is a tool designed to evaluate and enhance the robustness of Large Language Models (LLMs) through adversarial testing. This tool simulates various attack scenarios to identify vulnerabilities and weaknesses in LLMs, ensuring they are more resilient and reliable in real-world applications.

Objective(s)

Robustness, Safety

Related lifecycle stage(s)

Operate & monitor, Verify & validate

Adversa: AI Red Teaming Platform

Technical | Uploaded on Dec 6, 2024

Continuous proactive AI red teaming platform for AI and GenAI models, applications and agents.

Objective(s)

Robustness, Safety

Related lifecycle stage(s)

Verify & validate, Build & interpret model, Plan & design

Mindgard

Technical | United Kingdom | Uploaded on Dec 6, 2024

Continuous automated red teaming for AI, to minimize security threats to AI models and applications.

Objective(s)

Robustness, Digital Security

Related lifecycle stage(s)

Operate & monitor, Deploy, Verify & validate

MLE-bench

Technical | International | Uploaded on Dec 6, 2024

Evaluating machine learning agents on machine learning engineering.

Objective(s)

Robustness, Transparency

Resaro

Procedural | Singapore | Uploaded on Oct 2, 2024

Resaro offers independent, third-party assurance of mission-critical AI systems. It promotes responsible, safe and robust AI adoption for enterprises, through technical advisory and evaluation of AI systems against emerging regulatory requirements.

Objective(s)

Robustness, Data Governance & Traceability

BELLS - Benchmarks for the Evaluation of LLM Safeguards

Technical | France | Uploaded on Aug 2, 2024

Evaluate input-output safeguards for LLM systems such as jailbreak and hallucination detectors, to understand how good they are and on which type of inputs they fail.

Objective(s)

Robustness, Safety

Related lifecycle stage(s)

Operate & monitor, Verify & validate

SPLX Platform: End-to-end AI Security Platform

Technical | United States | Uploaded on Aug 2, 2024

AI Security Platform for GenAI and Conversational AI applications. Probe enables security officers and developers to identify, mitigate, and monitor AI system security risks.

Objective(s)

Robustness, Digital Security

Related lifecycle stage(s)

Operate & monitor, Verify & validate

IEEE 3302-2022 - Adoption of Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) Technical Specification Context-based Audio Enhancement (CAE) Version 1.2

Procedural | Uploaded on Jul 1, 2024

MPAI-CAE V1.4 is a collection of four use cases to improve the user audio experience in a variety of situations.

Objective(s)

Robustness, Transparency

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.