Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Type

Origin

Scope

SUBMIT A TOOL

If you have a tool that you think should be featured in the Catalogue of Tools & Metrics for Trustworthy AI, we would love to hear from you!

Submit

ProceduralUploaded on Mar 20, 2026
Judgment Assurance is a decision-governance discipline that reframes human judgment as a governed institutional asset. It provides a structured framework and practical instruments, including the Underwriting Questionnaire (JA-UQ) and Maturity Model (JAMM-PS), to ensure that consequential AI-mediated decisions are reconstructible and defensible. By defining minimum governance controls for human oversight, it closes the "accountability gap," allowing institutions to define, record, own, and guard the reasoning behind consequential AI-supported outcomes.

TechnicalProceduralUploaded on Mar 20, 2026
The Approved Intelligence Platform (AIP) provides modular, scenario-based testing workflows to evaluate mission-critical AI systems in defence, public safety, and critical civil use cases. It delivers a comprehensive, end-to-end testing environment based on a proprietary AI trust ontology with measurable AI Solutions Quality Indicators (ASQI) for the testing, evaluation, validation and verification of software solutions with different AI modalities.

ProceduralUploaded on Mar 20, 2026
The AI Governance Playbook from the Council on AI Governance helps organizations align people, processes and tools to achieve responsible AI outcomes.

ProceduralCanadaUploaded on Apr 1, 2025
An artificial intelligence (AI) impact assessment tool to provide organisations a method to assess AI systems for compliance with Canadian human rights law. The purpose of this human rights AI impact assessment is to assist developers and administrators of AI systems to identify, assess, minimise or avoid discrimination and uphold human rights obligations throughout the lifecycle of an AI system.

TechnicalUnited StatesUploaded on Mar 24, 2025
An open-source Python library designed for developers to calculate fairness metrics and assess bias in machine learning models. This library provides a comprehensive set of tools to ensure transparency, accountability, and ethical AI development.

TechnicalUnited StatesUploaded on Nov 8, 2024
The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.

Related lifecycle stage(s)

Operate & monitorVerify & validate

EducationalUnited StatesUploaded on Nov 6, 2024<1 day
The deck of 50 Trustworthy AI Cards™ correspond to 50 most relevant concepts under the 5 categories of: Data, AI, Generative AI, Governance, and Society. The Cards are used to create awareness and literacy on opportunities and risks about AI, and how to govern these technologies.

ProceduralUnited KingdomUploaded on Oct 2, 2024
Warden AI provides independent, tech-led AI bias auditing, designed for both HR Tech platforms and enterprises deploying AI solutions in HR. As the adoption of AI in recruitment and HR processes grows, concerns around fairness have intensified. With the advent of regulations such as NYC Local Law 144 and the EU AI Act, organisations are under increasing pressure to demonstrate compliance and fairness.

ProceduralUploaded on Oct 2, 2024
FairNow is an AI governance software tool that simplifies and centralises AI risk management at scale. To build and maintain trust with customers, organisations must conduct thorough risk assessments on their AI models, ensuring compliance, fairness, and security. Risk assessments also ensure organisations know where to prioritise their AI governance efforts, beginning with high-risk models and use cases.

TechnicalUnited StatesUploaded on Aug 2, 2024
AI Security Platform for GenAI and Conversational AI applications. Probe enables security officers and developers identify, mitigate, and monitor AI system security.

Related lifecycle stage(s)

Operate & monitorVerify & validate

ProceduralNew ZealandUploaded on Jul 11, 2024
The Algorithm Charter for Aotearoa New Zealand is a set of voluntary commitments developed by Stats NZ in 2020 to increase public confidence and visibility around the use of algorithms within Aotearoa New Zealand’s public sector. In 2023, Stats NZ commissioned Simply Privacy to develop the Algorithm Impact Assessment Toolkit (AIA Toolkit) to help government agencies meet the Charter commitments. The AIA Toolkit is designed to facilitate informed decision-making about the benefits and risks of government use of algorithms.

ProceduralUploaded on Jul 2, 2024
The purpose of the present document is to provide information on different types of AI mechanisms that can be used for cognitive networking and decision making in modern system design, including natural language processing.

ProceduralUploaded on Jul 3, 2024
This document addresses bias in relation to AI systems, especially with regards to AI-aided decision-making.

TechnicalBrazilUploaded on Jun 26, 2024
Privacy compliance platform, based on AI/Blockchain, which helps global companies to keep compliant with the data protection requirements.

Related lifecycle stage(s)

Deploy

TechnicalProceduralUnited StatesUnited KingdomEuropean UnionUploaded on Jul 11, 2024
Enzai’s EU AI Act Compliance Framework makes achieving compliance with the world’s first comprehensive AI governance legislation as easy as possible. The framework breaks hundreds of pages of regulations down into easy-to-follow steps allowing teams to enable organisations to seamlessly and confidently assess the compliance of their AI Systems with the Act and complete the requisite conformity assessments.

TechnicalUnited KingdomUploaded on Jun 14, 2024
NayaOne is a Sandbox-as-a-Service provider to tier 1 financial services institutions, world-leading regulators, and governments. This sandbox is designed to address key concerns in AI deployment by providing a single environment where AI can be evaluated and procured while also enabling collaboration and access to world-leading tools.

TechnicalSingaporeUploaded on Jun 25, 2024
Developed by the AI Verify Foundation, Moonshot is one of the first tools to bring Benchmarking and Red-Teaming together to help AI developers, compliance teams and AI system owners evaluate LLMs and LLM applications.

Objective(s)

Related lifecycle stage(s)

Verify & validate

TechnicalUnited KingdomUploaded on Jun 5, 2024
Advai Versus is a tool for developers to test and evaluate a company's AI systems. Integrated within the MLOps architecture, Advai Versus can be used to test for biases, security, and other critical aspects, ensuring that the AI models are robust and fit for purpose.

TechnicalProceduralSpainUploaded on May 21, 2024
LangBiTe is a framework for testing biases in large language models. It includes a library of prompts to test sexism / misogyny, racism, xenophobia, ageism, political bias, lgtbiq+phobia and religious discrimination. Any contributor may add new ethical concerns to assess.

TechnicalProceduralUnited StatesJapanUploaded on Apr 19, 2024
Diagnose bias in LLMs (Large Language Models) from various points of views, allowing users to choose the most appropriate LLM.

Related lifecycle stage(s)

Plan & design

Partnership on AI

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.