Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Technical · United States · Uploaded on Mar 24, 2025
An open-source Python library designed for developers to calculate fairness metrics and assess bias in machine learning models. This library provides a comprehensive set of tools to ensure transparency, accountability, and ethical AI development.
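The library in this entry is not named, so as an illustration only, here is a minimal sketch of the kind of metric such libraries compute: the demographic parity difference, i.e. the gap in positive-prediction rates between two groups. All names are hypothetical.

    import numpy as np

    def demographic_parity_difference(y_pred, group):
        """Gap in positive-prediction rates between group 1 and group 0."""
        y_pred, group = np.asarray(y_pred), np.asarray(group)
        return y_pred[group == 1].mean() - y_pred[group == 0].mean()

    y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])  # model decisions
    group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # group membership
    # Prints -0.5: group 1 receives positive decisions far less often.
    print(demographic_parity_difference(y_pred, group))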

Technical · Netherlands · Uploaded on Nov 29, 2023
This bias detection tool identifies groups of similar users that may be treated unfairly by a binary algorithmic classifier. It finds clusters of users that face a higher misclassification rate than the rest of the data set. Because clustering is an unsupervised ML method, no data on users' protected attributes is required. The metric by which bias is defined can be chosen manually in advance: False Negative Rate (FNR), False Positive Rate (FPR), or Accuracy (Acc). A minimal sketch of the approach follows below.

Related lifecycle stage(s): Verify & validate
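This entry does not include the tool's own code, so the following is a minimal sketch of the clustering idea it describes, assuming scikit-learn and synthetic data: cluster users on their features alone, then compare each cluster's chosen bias metric (here FNR) against the rest of the data set.

    import numpy as np
    from sklearn.cluster import KMeans

    def fnr(y_true, y_pred):
        """False negative rate: share of true positives predicted negative."""
        positives = y_true == 1
        return np.mean(y_pred[positives] == 0) if positives.any() else np.nan

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))          # user features; no protected attributes
    y_true = rng.integers(0, 2, size=500)  # ground-truth labels
    y_pred = rng.integers(0, 2, size=500)  # classifier decisions

    clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
    for c in range(5):
        in_c = clusters == c
        gap = fnr(y_true[in_c], y_pred[in_c]) - fnr(y_true[~in_c], y_pred[~in_c])
        print(f"cluster {c}: FNR gap vs rest = {gap:+.3f}")

Clusters with a large positive gap are candidates for the kind of manual review the tool is meant to trigger.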

Technical · Procedural · Germany · Uploaded on Sep 7, 2023 · >1 year
Biaslyze is a Python package that helps you get started with bias analysis in NLP models and offers a concrete entry point for further impact assessments and mitigation measures.

Technical · Uploaded on Aug 28, 2023 · >1 year
PAM is a contextual bias detection solution for AI and ML models. It helps achieve trustworthiness by identifying hidden bias before launch and by improving explainability, and it provides a real-time, socio-technical, contextual view of your production environment to support compliance with AI regulations.

Technical · Uploaded on May 23, 2023
This repository contains a sample implementation of Gradient Feature Auditing (GFA) meant to be generalizable to most datasets.

Technical · Uploaded on Apr 20, 2023 · <1 day
CounterGen is a framework for auditing and reducing bias in natural language processing (NLP) models, such as generative models (e.g. ChatGPT, GPT-J, GPT-3) or classification models (e.g. BERT). It does so by generating balanced datasets, evaluating the behavior of NLP models, and directly editing the internals of the model to reduce bias.
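As a rough sketch of the counterfactual-generation idea only (this is not CounterGen's actual API, and sentiment_model is a hypothetical stand-in for the model under audit): produce a balanced counterpart of each input by swapping gendered tokens, then compare the model's scores on the pair.

    # Hypothetical illustration of counterfactual data augmentation,
    # not CounterGen's API.
    SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his"}

    def counterfactual(text: str) -> str:
        """Swap gendered tokens to produce a balanced counterpart."""
        return " ".join(SWAPS.get(tok, tok) for tok in text.split())

    def bias_gap(sentiment_model, texts):
        """Mean score difference between originals and their counterfactuals."""
        return sum(
            sentiment_model(t) - sentiment_model(counterfactual(t)) for t in texts
        ) / len(texts)

    # e.g. bias_gap(my_model, ["he is a doctor", "she is a nurse"]),
    # where my_model is any callable mapping a string to a score.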

Technical · Netherlands · Sweden · Uploaded on Mar 27, 2023 · >1 year
A tool that uses statistical analysis to identify groups that may be subject to unfair treatment by an algorithm-driven decision-support system. The tool informs the qualitative doctrine of law and ethics about which disparities need to be scrutinised manually by human experts.
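The entry does not specify which statistical test the tool applies; as one illustration, a two-proportion z-test on false positive rates can flag group disparities large enough to hand over to human experts for scrutiny.

    import numpy as np
    from scipy.stats import norm

    def fpr_z_test(y_true_a, y_pred_a, y_true_b, y_pred_b):
        """Two-proportion z-test comparing false positive rates of groups A and B."""
        fp_a, n_a = np.sum((y_true_a == 0) & (y_pred_a == 1)), np.sum(y_true_a == 0)
        fp_b, n_b = np.sum((y_true_b == 0) & (y_pred_b == 1)), np.sum(y_true_b == 0)
        p_pool = (fp_a + fp_b) / (n_a + n_b)
        se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
        z = (fp_a / n_a - fp_b / n_b) / se
        return z, 2 * (1 - norm.cdf(abs(z)))  # z statistic, two-sided p-value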

Technical · Procedural · Uploaded on Mar 27, 2023 · <1 day
A fully independent and impartial audit to ensure compliance with the bias audit requirements for New York City Local Law 144 and upcoming regulations.

Technical · Procedural · Uploaded on Mar 27, 2023 · <1 day
A bespoke AI risk audit solution tailor-made to identify your enterprise project’s AI risks, comprising deep technical, quantitative analysis.

Technical · United States · Uploaded on Sep 20, 2022

A library for fair auditing and learning of classifiers with respect to rich subgroup fairness.


Technical · United States · Uploaded on Sep 9, 2022

A curated list of awesome Fairness in AI resources


Technical · United States · Uploaded on Mar 17, 2022

FairVis is a visual analytics system that allows users to audit their classification models for intersectional bias. Users can generate subgroups of their data and investigate whether a model is underperforming for certain populations.
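FairVis itself is a visual system, but the audit it supports reduces to comparing performance across intersectional subgroups; a minimal pandas sketch with made-up data:

    import pandas as pd

    df = pd.DataFrame({
        "sex":    ["F", "F", "M", "M", "F", "M", "F", "M"],
        "race":   ["A", "B", "A", "B", "A", "A", "B", "B"],
        "y_true": [1, 0, 1, 1, 0, 0, 1, 0],
        "y_pred": [1, 1, 1, 0, 0, 0, 0, 0],
    })
    df["correct"] = df["y_true"] == df["y_pred"]
    # Accuracy and size per intersectional subgroup (sex x race):
    print(df.groupby(["sex", "race"])["correct"].agg(["mean", "size"]))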


Technical · United States · Uploaded on Mar 17, 2022

FairTest enables developers or auditing entities to discover and test for unwarranted associations between an algorithm’s outputs and certain user subpopulations identified by protected features. FairTest works by learning a special decision tree that splits a user population into smaller subgroups in which the association between protected features and algorithm outputs is maximized. FairTest supports and makes use of a variety of different fairness metrics each […]
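FairTest's tree uses a dedicated association-maximizing split criterion; as a loose approximation only, the sketch below fits an ordinary scikit-learn decision tree on context features and then measures the protected-attribute/output association within each leaf.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(1)
    context   = rng.normal(size=(1000, 3))     # non-protected user features
    protected = rng.integers(0, 2, size=1000)  # protected attribute
    output    = rng.integers(0, 2, size=1000)  # the algorithm's outputs

    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(context, output)
    leaves = tree.apply(context)               # leaf id per user = subgroup
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        # Association between protected attribute and output within the subgroup:
        assoc = abs(np.corrcoef(protected[mask], output[mask])[0, 1])
        print(f"leaf {leaf}: n={mask.sum():4d}, |corr(protected, output)|={assoc:.3f}")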


Technical · Uploaded on Feb 23, 2022

audit-AI is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms, providing open-sourced bias testing for generalized machine learning applications. audit-AI was developed by the Data Science team at pymetrics.
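One standard bias test in this space is the "4/5ths rule" from US employment guidelines: a group whose selection rate falls below 80% of the highest group's rate shows potential adverse impact. A minimal pandas sketch (not audit-AI's actual API):

    import pandas as pd

    def adverse_impact_ratios(df, group_col, passed_col):
        """Each group's pass rate divided by the highest group's pass rate."""
        rates = df.groupby(group_col)[passed_col].mean()
        return rates / rates.max()

    df = pd.DataFrame({"group": ["a"] * 50 + ["b"] * 50,
                       "passed": [1] * 30 + [0] * 20 + [1] * 20 + [0] * 30})
    ratios = adverse_impact_ratios(df, "group", "passed")
    print(ratios)        # group b: 0.40 / 0.60 ≈ 0.67
    print(ratios < 0.8)  # True flags potential adverse impact for group b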


Technical · Procedural · Uploaded on Feb 23, 2022

Aequitas is an open-source bias and fairness audit toolkit that is an intuitive and easy-to-use addition to the machine learning workflow, enabling users to seamlessly test models for several bias and fairness metrics in relation to multiple population sub-groups. Aequitas facilitates informed and equitable decisions around developing and deploying algorithmic decision making […]
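A minimal usage sketch assuming Aequitas' classic Python API (a DataFrame with score and label_value columns plus string-typed attribute columns); verify the details against the current documentation.

    import pandas as pd
    from aequitas.group import Group  # assumes the classic Aequitas API

    df = pd.DataFrame({
        "score":       [1, 0, 1, 1, 0, 1],  # model decisions
        "label_value": [1, 0, 0, 1, 1, 1],  # ground truth
        "sex":         ["F", "F", "M", "M", "F", "M"],
    })
    xtab, _ = Group().get_crosstabs(df)     # per-group counts and metrics
    print(xtab[["attribute_name", "attribute_value", "fpr", "fnr"]])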



Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.