Catalogue of Tools & Metrics for Trustworthy AI
These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Hellinger Distance
The Hellinger distance (sometimes called the Jeffreys distance) is a metric in the space of probability distributions. The Hellinger distance can be used to quantify the degree of similarity between two probability distributions.
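For two discrete distributions p and q over the same outcomes, the Hellinger distance is H(p, q) = (1/√2)·‖√p − √q‖₂, ranging from 0 (identical distributions) to 1 (disjoint support). A minimal sketch in Python (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    p, q : arrays of non-negative probabilities over the same outcomes,
           each summing to 1. Returns a value in [0, 1].
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # H(p, q) = (1 / sqrt(2)) * ||sqrt(p) - sqrt(q)||_2
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2) / 2)

print(hellinger_distance([0.5, 0.5], [0.6, 0.4]))  # ~0.071, similar distributions
print(hellinger_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0, disjoint support
```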
Conditional Demographic Disparity (CDD)
The demographic disparity metric (DD) determines whether a facet has a larger proportion of the rejected outcomes in the dataset than of the accepted outcomes. In the binary case, two facets, men and women for example, constitute the dataset. Conditional demographic disparity (CDD) conditions this comparison on an additional attribute, averaging the disparity over the resulting subgroups.
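A minimal sketch of DD and CDD, assuming the formulation in which DD is the facet's share of rejections minus its share of acceptances, and CDD is the size-weighted average of DD over strata of a conditioning attribute (function names illustrative):

```python
import numpy as np

def demographic_disparity(facet, accepted):
    """DD for one facet: its share of rejections minus its share of acceptances.

    facet    : boolean array, True where the observation belongs to the facet
    accepted : boolean array, True for accepted (favorable) outcomes
    Positive values mean the facet is over-represented among rejections.
    """
    facet = np.asarray(facet, dtype=bool)
    accepted = np.asarray(accepted, dtype=bool)
    rejected = ~accepted
    share_of_rejected = facet[rejected].mean()  # P(facet | rejected)
    share_of_accepted = facet[accepted].mean()  # P(facet | accepted)
    return share_of_rejected - share_of_accepted

def conditional_demographic_disparity(facet, accepted, strata):
    """CDD: per-stratum DD values, averaged with weights proportional to stratum size.

    Assumes every stratum contains both accepted and rejected outcomes.
    """
    facet = np.asarray(facet, dtype=bool)
    accepted = np.asarray(accepted, dtype=bool)
    strata = np.asarray(strata)
    n = len(strata)
    cdd = 0.0
    for s in np.unique(strata):
        mask = strata == s
        cdd += mask.sum() / n * demographic_disparity(facet[mask], accepted[mask])
    return cdd

# Example: facet membership and outcomes, conditioned on strata A and B.
facet    = [True, True, False, False, True, False]
accepted = [False, True, True, False, True, True]
strata   = ["A", "A", "A", "B", "B", "B"]
print(conditional_demographic_disparity(facet, accepted, strata))  # 0.0
```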
Rank-Aware Divergence (RADio)
Contextual Outlier Interpretation (COIN)
Local Explanation Method using Nonlinear Approximation (LEMNA)
Shapley Additive Explanation (SHAP)
Local Interpretable Model-agnostic Explanation (LIME)
Shapley Variable Importance Cloud (ShapleyVIC)
Variable Importance Cloud (VIC)
CLIPSBERTScore
Data Banzhaf
Beta Shapley
Data Shapley
Surrogacy Efficacy Score (SESc)
The Surrogacy Efficacy Score is a technique for gaining a better understanding of the inner workings of complex "black box" models. For example, by using a tree-based model, this method provides a more interpretable representation of the black box's behavior.
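As an illustration of the idea (scikit-learn assumed; the dataset and model are placeholders, and the metric's exact scoring rule may differ), the sketch below fits a shallow decision tree to a black-box model's predictions and scores how faithfully the tree reproduces them:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Illustrative black-box model and data (any fitted classifier works here).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Fit a shallow, interpretable tree to mimic the black box's predictions.
y_bb = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_bb)

# Efficacy score: agreement between surrogate and black box. High agreement
# means the simple tree is a faithful, readable summary of the black box.
efficacy = accuracy_score(y_bb, surrogate.predict(X))
print(f"Surrogacy efficacy: {efficacy:.3f}")
```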
Partial Dependence Complexity (PDC)
The Partial Dependence Complexity metric uses the concept of the partial dependence curve to evaluate how simply the curve can be represented. The partial dependence curve shows how model predictions are affected, on average, by each feature. Curves that can be represented by simpler functions correspond to model behavior that is easier to interpret.
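The sketch below (scikit-learn ≥ 1.3 assumed for the grid_values key; the linear-fit score is an illustrative proxy, not necessarily the catalogue metric's exact formula) computes a partial dependence curve and measures how poorly a straight line represents it:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Partial dependence of feature 0: average prediction as the feature varies.
pd_result = partial_dependence(model, X, features=[0], grid_resolution=50)
grid = pd_result["grid_values"][0]
curve = pd_result["average"][0]

# Illustrative complexity proxy: residual variance of a straight-line fit,
# as a fraction of the curve's total variance. Near 0 means the curve is
# almost linear (simple); near 1 means a highly non-linear effect.
coeffs = np.polyfit(grid, curve, deg=1)
residual = curve - np.polyval(coeffs, grid)
complexity = np.sum(residual**2) / np.sum((curve - curve.mean())**2)
print(f"PD complexity for feature 0: {complexity:.3f}")
```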
α-Feature Importance (αFI)
The α-Feature Importance metric quantifies the minimum proportion of features required to represent α of the total importance. In other words, this metric seeks the minimum number of features necessary to capture no less than α × 100% of the total feature importance.
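A minimal sketch, assuming the metric operates on a vector of non-negative global importance scores (function name illustrative):

```python
import numpy as np

def alpha_feature_importance(importances, alpha=0.8):
    """Minimum fraction of features needed to reach alpha of total importance.

    importances : array of non-negative feature importance scores
    Returns a value in (0, 1]; lower values mean the explanation is
    concentrated in a handful of features.
    """
    imp = np.sort(np.asarray(importances, dtype=float))[::-1]  # descending
    cumulative = np.cumsum(imp) / imp.sum()
    # Smallest k such that the top-k features carry at least alpha of the total.
    k = int(np.searchsorted(cumulative, alpha) + 1)
    return k / len(imp)

# Example: two features already carry 80% of the total importance.
print(alpha_feature_importance([0.5, 0.3, 0.1, 0.05, 0.05], alpha=0.8))  # 0.4
```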
Predictions Groups Contrast (PGC)
The PGC metric compares the top-K ranking of feature importances drawn from the entire dataset with the top-K rankings induced from specific subgroups of predictions. It can be applied to both classification and regression problems, and is useful for quantifying how much the features that drive the model's predictions differ across groups.
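As one illustrative contrast function (the metric's exact definition may differ), the sketch below scores each prediction subgroup by the fraction of the global top-K features that drop out of the subgroup's top-K:

```python
import numpy as np

def predictions_groups_contrast(global_importance, group_importances, k=5):
    """Contrast between global top-k features and each group's top-k features.

    global_importance : (n_features,) importance scores on the full dataset
    group_importances : dict mapping group label -> (n_features,) scores
    Returns, per group, the fraction of the global top-k features that are
    NOT in the group's top-k (0 = identical top-k sets, 1 = fully disjoint).
    """
    top_global = set(np.argsort(global_importance)[::-1][:k])
    contrast = {}
    for label, imp in group_importances.items():
        top_group = set(np.argsort(imp)[::-1][:k])
        contrast[label] = 1 - len(top_global & top_group) / k
    return contrast

# Example: global importances vs. importances within two prediction groups.
rng = np.random.default_rng(0)
global_imp = rng.random(10)
groups = {"high_pred": rng.random(10), "low_pred": rng.random(10)}
print(predictions_groups_contrast(global_imp, groups, k=3))
```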
Local Feature Importance Spread Stability (LFISS)
Local feature importance refers to the assignment of normalized feature importance to different regions of the input data space. For a given dataset D with N samples, a vector of feature importances can be computed for each individual observation in D.
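One illustrative way to measure the spread of those per-observation vectors (the metric's exact stability formula may differ; the function name is a placeholder): normalize each local importance vector, then average the per-feature standard deviation across observations:

```python
import numpy as np

def local_importance_spread(local_importances):
    """Spread of normalized local feature-importance vectors across a dataset.

    local_importances : (n_samples, n_features) matrix; row i is the
    importance vector for observation i (e.g., from SHAP or LIME).
    Rows are normalized to sum to 1, then per-feature standard deviations
    are averaged: 0 means every observation is explained the same way;
    larger values signal unstable, sample-dependent explanations.
    """
    imp = np.abs(np.asarray(local_importances, dtype=float))
    imp = imp / imp.sum(axis=1, keepdims=True)  # normalize each row
    return imp.std(axis=0).mean()

# Example: identical explanations vs. explanations that shift per sample.
stable = [[0.5, 0.3, 0.2], [0.5, 0.3, 0.2], [0.5, 0.3, 0.2]]
shifting = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
print(local_importance_spread(stable))    # 0.0
print(local_importance_spread(shifting))  # ~0.40
```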
Global Feature Importance Spread (GFIS)
The GFIS metric is based on the concept of entropy, more precisely the entropy of the normalized feature importance measure, which represents the concentration of information within a set of features. Lower entropy values indicate that the majority of the explanation is concentrated in a small subset of features.
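A minimal sketch, assuming GFIS is the Shannon entropy of the normalized importance vector divided by its maximum, log(n_features), so the score lies in [0, 1] (function name illustrative):

```python
import numpy as np

def global_feature_importance_spread(importances):
    """Normalized entropy of a global feature-importance vector.

    Importances are normalized into a probability vector whose Shannon
    entropy is divided by log(n_features), giving a value in [0, 1]:
    near 0 means the explanation concentrates in a few features,
    near 1 means importance is spread uniformly across all features.
    """
    p = np.abs(np.asarray(importances, dtype=float))
    p = p / p.sum()
    p = p[p > 0]  # 0 * log(0) is taken as 0
    entropy = -np.sum(p * np.log(p))
    return entropy / np.log(len(importances))

print(global_feature_importance_spread([0.97, 0.01, 0.01, 0.01]))  # ~0.12
print(global_feature_importance_spread([0.25, 0.25, 0.25, 0.25]))  # 1.0
```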
SAFE (Sustainable, Accurate, Fair and Explainable)
Machine learning models, at the core of AI applications, typically achieve high accuracy at the expense of insufficient explainability. Moreover, according to proposed regulations, AI applications based on machine learning must be "trustworthy".
