These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

## Equal performance

If a model systematically makes errors disproportionately for patients in the protected group, it is likely to lead to unequal outcomes. *Equal performance* refers to the assurance that a model is equally accurate for patients in the protected and non-protected groups.
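A minimal sketch of how equal performance might be checked in practice: compute the model's accuracy separately for each value of the protected attribute and compare. The labels, predictions, and group coding below are illustrative, not from any specific catalogue use case.

```python
import numpy as np

def group_accuracies(y_true, y_pred, group):
    """Accuracy computed separately for each value of a protected attribute."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    return {g: float(np.mean(y_pred[group == g] == y_true[group == g]))
            for g in np.unique(group)}

# Illustrative data: group 1 = protected group, group 0 = non-protected group
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
group  = [1, 1, 1, 1, 0, 0, 0, 0]

acc = group_accuracies(y_true, y_pred, group)
print(acc)  # a large accuracy gap between groups signals unequal performance
```

Here the protected group's accuracy is 0.75 against 0.5 for the other group, the kind of gap equal-performance checks are meant to surface.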


## Equality of Opportunity Difference (EOD)

We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group...
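Assuming the common formulation of this criterion — the gap in true-positive rate between the protected group and the rest — EOD can be sketched as below. The example data is illustrative.

```python
import numpy as np

def equal_opportunity_difference(y_true, y_pred, group, protected=1):
    """EOD: true-positive-rate gap between the protected group and the others."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    def tpr(mask):
        # Fraction of actual positives in `mask` that the model predicts positive
        pos = (y_true == 1) & mask
        return float(np.mean(y_pred[pos] == 1))
    return tpr(group == protected) - tpr(group != protected)

y_true = [1, 1, 0, 1, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
group  = [1, 1, 1, 1, 0, 0, 0, 0]
eod = equal_opportunity_difference(y_true, y_pred, group)
print(eod)  # TPR 2/3 for the protected group vs 1/3 for the rest
```

An EOD of zero means qualified individuals in both groups have the same chance of a positive prediction; values far from zero indicate a violation.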


## Statistical Parity Difference (SPD)

We study fairness in classification, where individuals are classified, e.g., admitted to a university, and the goal is to prevent discrimination against individuals based on their membership in some group, while maintaining utility for the classifier (the ...
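Using the standard definition of SPD — the difference in positive-outcome rates between the protected group and the rest — the metric can be sketched as follows. The admissions data is illustrative.

```python
import numpy as np

def statistical_parity_difference(y_pred, group, protected=1):
    """SPD: positive-outcome rate of the protected group minus the others'."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate = lambda mask: float(np.mean(y_pred[mask] == 1))
    return rate(group == protected) - rate(group != protected)

# Illustrative admissions decisions (1 = admitted)
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group  = [1, 1, 1, 1, 0, 0, 0, 0]
print(statistical_parity_difference(y_pred, group))  # 0.25 - 0.75 = -0.5
```

Note that SPD looks only at decision rates, not correctness, which is why it can conflict with utility-preserving criteria such as equality of opportunity.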


## Hellinger Distance

The Hellinger distance (sometimes called the Jeffreys distance) is a metric on the space of probability distributions. It can be used to quantify the degree of similarity between two probability distributions.
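For two discrete distributions $p$ and $q$ the Hellinger distance is $H(p, q) = \frac{1}{\sqrt{2}}\,\lVert\sqrt{p} - \sqrt{q}\rVert_2$, which lies in $[0, 1]$. A minimal sketch with illustrative distributions:

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(hellinger(p, p))  # 0.0: identical distributions
print(hellinger(p, q))  # small positive value: similar distributions
```

The distance reaches its maximum of 1 only for distributions with disjoint support, e.g. `hellinger([1, 0], [0, 1])`.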


## Gender-based Illicit Proximity Estimate (GIPE)

This paper proposes a new bias evaluation metric – Gender-based Illicit Proximity Estimate (GIPE), which measures the extent of undue proximity in word vectors resulting from the presence of gender-based predilections. Experiments based on a suite of...


## Conditional Demographic Disparity (CDD)

The demographic disparity metric (DD) determines whether a facet has a larger proportion of the rejected outcomes in the dataset than of the accepted outcomes. In the binary case where there are two facets, men and women for example, that constitute the dat...
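The idea above can be sketched in code: DD is the facet's share of rejections minus its share of acceptances, and the conditional variant (CDD) averages DD within strata (e.g. departments), weighted by stratum size. The weighting scheme is one common choice and the data is illustrative; the catalogued metric's exact aggregation may differ.

```python
import numpy as np

def demographic_disparity(accepted, facet):
    """DD: facet's share of the rejections minus its share of the acceptances."""
    accepted, facet = np.asarray(accepted, bool), np.asarray(facet, bool)
    rejected = ~accepted
    return float(np.mean(facet[rejected]) - np.mean(facet[accepted]))

def conditional_demographic_disparity(accepted, facet, strata):
    """CDD: per-stratum DD averaged with weights proportional to stratum size."""
    accepted, facet, strata = map(np.asarray, (accepted, facet, strata))
    dds, weights = [], []
    for s in np.unique(strata):
        m = strata == s
        dds.append(demographic_disparity(accepted[m], facet[m]))
        weights.append(m.sum())
    return float(np.average(dds, weights=weights))

# Illustrative data: 1 = accepted; facet flags membership in the group of interest
accepted = [1, 1, 0, 0, 1, 0, 0, 0]
facet    = [0, 0, 1, 1, 0, 1, 1, 0]
print(demographic_disparity(accepted, facet))  # 0.8: facet over-represented among rejections
```

A positive DD means the facet contributes more to rejections than to acceptances; conditioning on strata guards against Simpson's-paradox effects where the aggregate disparity disappears (or reverses) within each stratum.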


## Rank-Aware Divergence (RADio)

## Contextual Outlier Interpretation (COIN)

## Local Explanation Method using Nonlinear Approximation (LEMNA)

## Shapley Additive Explanation (SHAP)

## Local Interpretable Model-agnostic Explanation (LIME)

## Shapley Variable Importance Cloud (ShapleyVIC)

## Variable Importance Cloud (VIC)

## Data Banzhaf

## Data Shapley

## Predictions Groups Contrast (PGC)

The PGC metric compares the top-K feature-importance ranking drawn from the entire dataset with the top-K ranking induced from specific subgroups of predictions. It can be applied to both classification and regression problems, being useful for quantifying...
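The comparison can be sketched as follows, using one minus the Jaccard overlap of the two top-K feature sets as an illustrative contrast score. The published PGC metric may aggregate rankings differently; the importance scores below are made up.

```python
import numpy as np

def top_k(importances, k):
    """Indices of the k largest importance scores, most important first."""
    return list(np.argsort(importances)[::-1][:k])

def pgc_contrast(global_imp, subgroup_imp, k=3):
    """Illustrative contrast: 1 - Jaccard overlap of the two top-k feature sets.

    Note: this is a sketch of the idea of comparing global vs. subgroup
    top-k rankings, not the exact aggregation from the PGC paper.
    """
    g, s = set(top_k(global_imp, k)), set(top_k(subgroup_imp, k))
    return 1.0 - len(g & s) / len(g | s)

global_imp   = [0.40, 0.25, 0.15, 0.10, 0.10]  # importance over the whole dataset
subgroup_imp = [0.05, 0.30, 0.35, 0.20, 0.10]  # importance within one subgroup
print(pgc_contrast(global_imp, subgroup_imp, k=3))  # 0.5: rankings half-overlap
```

A score of 0 means the subgroup's explanation matches the global one; scores near 1 flag subgroups whose predictions are driven by different features.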


## Global Feature Importance Spread (GFIS)

The GFIS metric is based on the concept of entropy; more precisely, on the entropy of the normalized feature-importance measure, which represents the concentration of information within a set of features. Lower entropy values indicate that the majority of the explanat...
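A minimal sketch of the entropy computation, assuming Shannon entropy in bits over importances normalized to sum to one (the original GFIS may use a different log base or normalization). The importance vectors are illustrative.

```python
import numpy as np

def gfis_entropy(importances):
    """Shannon entropy of normalized feature importances (lower = more concentrated)."""
    p = np.asarray(importances, float)
    p = p / p.sum()      # normalize to a probability distribution
    p = p[p > 0]         # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log2(p)))

concentrated = [0.90, 0.05, 0.03, 0.02]  # explanation dominated by one feature
spread       = [0.25, 0.25, 0.25, 0.25]  # importance spread evenly
print(gfis_entropy(concentrated))  # low entropy: concentrated explanation
print(gfis_entropy(spread))        # 2.0 bits, the maximum for 4 features
```

The maximum entropy for $n$ features is $\log_2 n$, reached when every feature matters equally, so the score is easy to normalize across models with different feature counts.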


## SAFE (Sustainable, Accurate, Fair and Explainable)

Machine learning models, at the core of AI applications, typically achieve high accuracy at the expense of explainability. Moreover, according to the proposed regulations, AI applications based on machine learning must be "trus...


## Kendall rank correlation coefficient (KRCC)

In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall's τ coefficient, is a statistic used to measure the ordinal association between two measured quantities. A τ test is a non-parametric hypothesis test for statistical de...
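Concretely, τ counts concordant versus discordant pairs: a sketch of the tie-free tau-a variant (τ = (C − D) / (n(n−1)/2)) on illustrative rankings. Libraries such as `scipy.stats.kendalltau` implement tie-corrected variants.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs (no tie correction)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:            # pair ordered the same way in both rankings
            concordant += 1
        elif s < 0:          # pair ordered oppositely
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 5, 4]           # one swapped pair
print(kendall_tau(x, x))      # 1.0: identical rankings
print(kendall_tau(x, y))      # 0.8: 9 concordant, 1 discordant out of 10 pairs
```

τ ranges from −1 (perfectly reversed order) through 0 (no association) to +1 (identical order).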


## Equal outcomes

In the field of health, equal patient outcomes refers to the assurance that protected groups have equal benefit in terms of patient outcomes from the deployment of machine-learning models. A weak form of equal outcomes is ensuring that both the protect...
