Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Filter applied: Fairness

SUBMIT A TOOL

If you have a tool that you think should be featured in the Catalogue of AI Tools & Metrics, we would love to hear from you!

Technical · Educational · Uploaded on Oct 1, 2025
An AI-powered speech recognition app that adapts to users' unique speech patterns, facilitating communication for individuals with speech impairments.

Related lifecycle stage(s)

Operate & monitor

Technical · United States · Uploaded on Mar 24, 2025
An open-source Python library designed for developers to calculate fairness metrics and assess bias in machine learning models. This library provides a comprehensive set of tools to ensure transparency, accountability, and ethical AI development.
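
The library itself is not named in this entry, so the snippet below is only a generic illustration of the kind of metric such libraries compute: the demographic parity difference between two groups of predictions. All function and variable names are illustrative, not the library's API.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups.

    y_pred : array of 0/1 model predictions
    group  : array of 0/1 group membership (e.g. a protected attribute)
    """
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    rate_a = y_pred[group == 0].mean()  # selection rate for group 0
    rate_b = y_pred[group == 1].mean()  # selection rate for group 1
    return abs(rate_a - rate_b)

# Toy example: a perfectly fair predictor would score 0.0.
preds  = [1, 0, 1, 1, 0, 0, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_difference(preds, groups))  # 0.5 for this toy data
```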

Technical · International · Uploaded on Nov 5, 2024
A fast, scalable, and open-source framework for evaluating automated red teaming methods and LLM attacks/defenses. HarmBench has out-of-the-box support for transformers-compatible LLMs, numerous closed-source APIs, and several multimodal models.

Technical · Procedural · Spain · Uploaded on May 21, 2024
LangBiTe is a framework for testing biases in large language models. It includes a library of prompts to test for sexism/misogyny, racism, xenophobia, ageism, political bias, LGBTIQ+phobia and religious discrimination. Any contributor may add new ethical concerns to assess.
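
The snippet below is a minimal, hypothetical sketch of the general prompt-library pattern this entry describes: prompts tagged with an ethical concern are sent to the model under test and responses are scored against an oracle. The `BiasPrompt`, `query_model` and `run_suite` names are placeholders, not LangBiTe's actual API.

```python
# Hypothetical sketch of prompt-based bias testing; not LangBiTe's actual API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class BiasPrompt:
    concern: str                   # e.g. "sexism", "ageism"
    text: str                      # prompt sent to the model under test
    oracle: Callable[[str], bool]  # returns True if the response passes

def query_model(prompt: str) -> str:
    """Placeholder for a call to the LLM under test."""
    return "Both are equally suitable."

def run_suite(prompts: list[BiasPrompt]) -> dict[str, float]:
    """Return the pass rate per ethical concern."""
    results: dict[str, list[bool]] = {}
    for p in prompts:
        response = query_model(p.text)
        results.setdefault(p.concern, []).append(p.oracle(response))
    return {concern: sum(v) / len(v) for concern, v in results.items()}

suite = [
    BiasPrompt("sexism",
               "Who is better suited to be an engineer, a man or a woman?",
               lambda r: "equally" in r.lower()),
]
print(run_suite(suite))  # e.g. {'sexism': 1.0}
```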

Technical · Educational · Switzerland · Uploaded on Apr 22, 2024 · <1 day
A global community of 500+ researchers from 57+ countries, with a free foundational course on a human rights-based approach to AI development that explores concrete builds of systems centering human rights values. The community is further enriched by reading and discussion groups and by written community outputs.

Technical · Procedural · United States · Japan · Uploaded on Apr 19, 2024
Diagnoses bias in LLMs (large language models) from various points of view, allowing users to choose the most appropriate LLM.

Related lifecycle stage(s)

Plan & design

Technical · China · Uploaded on Apr 2, 2024
A PyTorch implementation of Speech Transformer, an end-to-end ASR model built on the Transformer architecture for Mandarin Chinese.

Technical · Germany · Uploaded on Dec 15, 2023
A collection of public, free annotated datasets of relationships between entities/nominals (Portuguese and English).

Technical · Uploaded on Dec 11, 2023
This work proposes a two-stage framework that produces debiased representations by applying a fairness-constrained adversarial framework in the first stage.
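
The paper's exact architecture is not detailed here; the sketch below only illustrates the generic adversarial-debiasing idea behind such a first stage, assuming an encoder, a task head, and an adversary that tries to recover the protected attribute from the learned representation. It is written in PyTorch with illustrative names and toy data, not the authors' code.

```python
# Generic adversarial-debiasing sketch (illustrative; not the paper's exact model).
import torch
import torch.nn as nn

encoder   = nn.Sequential(nn.Linear(10, 16), nn.ReLU())  # produces the representation
task_head = nn.Linear(16, 1)                             # predicts the task label
adversary = nn.Linear(16, 1)                             # tries to recover the protected attribute

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters()), lr=1e-3)
opt_adv  = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # weight of the fairness (adversarial) constraint

x = torch.randn(64, 10)                    # toy features
y = torch.randint(0, 2, (64, 1)).float()   # task labels
a = torch.randint(0, 2, (64, 1)).float()   # protected attribute

for step in range(100):
    # 1) Train the adversary to predict the protected attribute from the frozen representation.
    z = encoder(x).detach()
    opt_adv.zero_grad()
    bce(adversary(z), a).backward()
    opt_adv.step()

    # 2) Train encoder + task head: fit the task while fooling the adversary.
    opt_main.zero_grad()
    z = encoder(x)
    task_loss = bce(task_head(z), y)
    adv_loss = bce(adversary(z), a)
    (task_loss - lam * adv_loss).backward()  # subtracting adv_loss pushes z to hide `a`
    opt_main.step()
```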

Technical · Uploaded on Dec 11, 2023
This paper systematically quantifies, for the first time, speech recognition bias against gender, age, regional accents and non-native accents. It investigates the origin of this bias cross-lingually (in Dutch and Mandarin) and for two different state-of-the-art ASR architectures (a hybrid DNN-HMM and an attention-based end-to-end (E2E) model) through a phoneme error analysis. A minimal sketch of this kind of per-group error-rate comparison follows this entry.

Related lifecycle stage(s)

Verify & validate · Collect & process data
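
The study's own pipeline is not reproduced here; the following sketch only illustrates the basic measurement such a bias analysis rests on, comparing word error rates across speaker groups, using toy data and an illustrative hand-rolled WER function.

```python
# Illustrative only: comparing word error rate (WER) across speaker groups.
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over words."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / max(len(r), 1)

# Toy utterances grouped by a speaker attribute (e.g. native vs. non-native accent).
samples = [
    ("native",     "turn on the lights", "turn on the lights"),
    ("non-native", "turn on the lights", "turn of the light"),
]
by_group: dict[str, list[float]] = {}
for group, ref, hyp in samples:
    by_group.setdefault(group, []).append(wer(ref, hyp))
for group, scores in by_group.items():
    print(group, sum(scores) / len(scores))  # a gap between groups indicates bias
```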

Technical · Uploaded on Dec 11, 2023
Practical recommendations for mitigating bias in automated speaker recognition, along with an outline of future research directions.

Technical · United States · India · Uploaded on Dec 11, 2023 · <6 months
A repository of algorithmic-bias metrics and measures that researchers and practitioners can draw on. The repository is also intended to help researchers extend this work and identify further metrics that may be relevant and appropriate to specific contexts. One such metric is sketched after this entry.

Related lifecycle stage(s)

Verify & validate
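
As an illustration of the kind of metric such a repository collects, the sketch below computes the equal opportunity difference, the gap in true positive rates between two groups. Names and data are illustrative and not taken from the repository.

```python
import numpy as np

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rates (recall) between two groups.

    A value near 0 means qualified individuals in both groups are
    selected at similar rates.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in (0, 1):
        mask = (group == g) & (y_true == 1)  # actual positives in this group
        tprs.append(y_pred[mask].mean())     # fraction correctly predicted positive
    return abs(tprs[0] - tprs[1])

y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
print(equal_opportunity_difference(y_true, y_pred, group))  # ~0.17 for this toy data
```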

Technical · Netherlands · Uploaded on Nov 29, 2023
This bias detection tool identifies groups of similar users that are potentially treated unfairly by a binary algorithmic classifier. It finds clusters of users that face a higher misclassification rate than the rest of the data set. Because clustering is an unsupervised ML method, no data on users' protected attributes is required. The metric by which bias is defined can be chosen manually in advance: False Negative Rate (FNR), False Positive Rate (FPR), or Accuracy (Acc). A minimal sketch of this clustering approach follows this entry.

Related lifecycle stage(s)

Verify & validate
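
A minimal sketch of the approach described above, assuming scikit-learn is available: cluster users on their features (no protected attributes needed), then compare each cluster's false negative rate with the rest of the data set. This is an illustration with toy data, not the tool's actual code.

```python
# Illustrative sketch of clustering-based bias detection (not the tool's actual code).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))            # user features (no protected attributes)
y_true = rng.integers(0, 2, size=500)    # ground-truth labels
y_pred = rng.integers(0, 2, size=500)    # classifier predictions

clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

def fnr(y_t, y_p):
    """False negative rate: share of actual positives predicted negative."""
    pos = y_t == 1
    return np.mean(y_p[pos] == 0) if pos.any() else 0.0

for c in range(5):
    in_c = clusters == c
    gap = fnr(y_true[in_c], y_pred[in_c]) - fnr(y_true[~in_c], y_pred[~in_c])
    # A large positive gap flags a cluster of similar users that the classifier
    # misclassifies (as false negatives) more often than the rest of the data.
    print(f"cluster {c}: FNR gap vs. rest = {gap:+.3f}")
```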

Technical · United Kingdom · Japan · Europe · Uploaded on Nov 10, 2023 · <1 hour
Intersectional Fairness (ISF) is a bias detection and mitigation technology developed by Fujitsu for intersectional bias, which is caused by the combinations of multiple protected attributes. ISF is hosted as an open source project by the Linux Foundation.
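
ISF's own API is not shown here; the sketch below only illustrates the underlying idea of evaluating a fairness measure over every combination of protected attributes rather than over each attribute in isolation. Names and data are illustrative.

```python
# Illustrative: checking selection rates over intersections of protected attributes.
from itertools import product
import numpy as np

rng = np.random.default_rng(1)
gender = rng.choice(["f", "m"], size=1000)
age    = rng.choice(["young", "old"], size=1000)
y_pred = rng.integers(0, 2, size=1000)   # classifier decisions

overall_rate = y_pred.mean()
for g, a in product(["f", "m"], ["young", "old"]):
    mask = (gender == g) & (age == a)
    rate = y_pred[mask].mean()
    # A subgroup can look fine on gender alone and on age alone, yet still be
    # disadvantaged at the intersection (e.g. "f" + "old"); that is the bias ISF targets.
    print(f"{g}/{a}: selection rate {rate:.2f} (overall {overall_rate:.2f})")
```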

Technical · Procedural · Germany · Uploaded on Sep 7, 2023 · >1 year
Biaslyze is a Python package that helps users get started with analysing bias in NLP models and offers a concrete entry point for further impact assessments and mitigation measures.

Technical · Uploaded on Aug 28, 2023 · >1 year
PAM is a contextual bias detection solution for AI and ML models. It aims to support trustworthiness by identifying hidden bias before launch and improving explainability, and it provides a socio-technical, contextual view of models in real time in production environments to help ensure compliance with AI regulations.

Technical · Uploaded on Jun 26, 2023
Multi-VALUE is a suite of resources for evaluating and achieving English dialect invariance.

Technical · United States · Uploaded on Jun 8, 2023
[ICML 2021] Towards Understanding and Mitigating Social Biases in Language Models

Technical · Uploaded on May 23, 2023
This repository contains a sample implementation of Gradient Feature Auditing (GFA) meant to be generalizable to most datasets.

Technical · Uploaded on May 22, 2023
Know Your Data helps researchers, engineers, product teams, and decision makers understand datasets, with the goal of improving data quality and helping to mitigate fairness and bias issues.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.