Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.


Technical · Educational · Uploaded on Aug 27, 2025
AI Screener to enable universal early screening for all children.

Related lifecycle stage(s)

Plan & design

Procedural · Uploaded on Aug 1, 2025
BeSpecial is an AI-driven platform designed to support university students with dyslexia by providing personalized digital tools and tailored learning strategies. Developed within the European VRAILEXIA project, BeSpecial combines clinical data, self-assessments, and psychometric tests to recommend customized resources like audiobooks and concept maps, as well as inclusive academic practices. The platform also raises awareness and trains educators to foster inclusive higher education environments.

Related lifecycle stage(s)

Operate & monitor · Deploy

Australia · Uploaded on May 22, 2025
FloodMapp is a technology company that specialises in rapid real-time flood forecasting and flood inundation mapping to provide greater warning time and situational awareness.

Technical · Educational · Mexico · United States · Israel · Uploaded on May 19, 2025
SeismicAI is a provider of innovative Earthquake Early Warning (EEW) systems for earthquake preparedness. SeismicAI's algorithms utilise local sensors to issue high-precision alerts. The system covers the full early warning cycle, from monitoring and reporting, through alerts, to optionally triggering automated preventive actions.

Related lifecycle stage(s)

Operate & monitor

Technical · Europe · Uploaded on May 19, 2025
The AIFS is the first fully operational open weather-forecasting model built on machine learning.

Technical · United States · Uploaded on May 15, 2025
The GDA leverages aerial imagery, satellite data, and machine learning techniques to evaluate the damage in areas impacted by natural disasters. This tool greatly enhances the efficiency and precision of disaster response operations.

Technical · United States · Uploaded on May 2, 2025
ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is a globally accessible, living knowledge base of adversary tactics and techniques against AI-enabled systems, based on real-world attack observations and realistic demonstrations from AI red teams and security groups.

Procedural · Canada · Uploaded on Mar 31, 2025
This program provides organisations with a comprehensive, independent review of their AI approaches, ensuring alignment with consensus standards and enhancing trust among stakeholders and the public in their AI practices.

Related lifecycle stage(s)

Verify & validate

Technical · United States · Uploaded on Jan 8, 2025
MLPerf Client is a benchmark for Windows and macOS, focusing on client form factors in ML inference scenarios such as AI chatbots and image classification. The benchmark evaluates performance across different hardware and software configurations and provides a command-line interface.

Procedural · Uploaded on Jan 6, 2025
This document addresses how artificial intelligence and machine learning can impact the safety of machinery and machinery systems.

Procedural · Uploaded on Jan 6, 2025
ISO/IEC 25023:2016 defines quality measures for quantitatively evaluating system and software product quality in terms of characteristics and subcharacteristics defined in ISO/IEC 25010 and is intended to be used together with ISO/IEC 25010.

Educational · United Kingdom · Uploaded on Dec 9, 2024
PLIM is designed to make benchmarking and continuous monitoring of LLMs safer and more fit for purpose. This is particularly important in high-risk environments, e.g. healthcare, finance, insurance and defence. Having community-based prompts to validate models as fit for purpose is safer in a world where LLMs are not static.

Technical · France · Uploaded on Dec 6, 2024
AIxploit is a tool designed to evaluate and enhance the robustness of Large Language Models (LLMs) through adversarial testing. This tool simulates various attack scenarios to identify vulnerabilities and weaknesses in LLMs, ensuring they are more resilient and reliable in real-world applications.

Related lifecycle stage(s)

Operate & monitor · Verify & validate

Technical · Uploaded on Dec 6, 2024
A continuous, proactive AI red-teaming platform for AI and GenAI models, applications and agents.

Technical · United Kingdom · Uploaded on Dec 6, 2024
Continuous automated red teaming for AI that minimises security threats to AI models and applications.

Technical · International · Uploaded on Dec 6, 2024
A benchmark for evaluating machine learning agents on machine learning engineering tasks.

Procedural · Singapore · Uploaded on Oct 2, 2024
Resaro offers independent, third-party assurance of mission-critical AI systems. It promotes responsible, safe and robust AI adoption for enterprises, through technical advisory and evaluation of AI systems against emerging regulatory requirements.

Technical · France · Uploaded on Aug 2, 2024
Evaluate input-output safeguards for LLM systems, such as jailbreak and hallucination detectors, to understand how effective they are and on which types of inputs they fail.

Related lifecycle stage(s)

Operate & monitor · Verify & validate

Technical · United States · Uploaded on Aug 2, 2024
An AI security platform for GenAI and conversational AI applications. Probe enables security officers and developers to identify, mitigate, and monitor AI system security.

Related lifecycle stage(s)

Operate & monitor · Verify & validate

Procedural · Uploaded on Jul 1, 2024
MPAI-CAE V1.4 is a collection of four use cases to improve the user audio experience in a variety of situations.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.