Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

This page includes metrics and methodologies for measuring and evaluating AI trustworthiness and AI risks. These metrics are often represented through mathematical formulas that assess the technical requirements for achieving trustworthy AI in a particular context. They can help to ensure that a system is fair, accurate, explainable, transparent, robust, safe, or secure.

Normalized Detection Score (NDS) evaluates the performance of 3D object detection systems.

The NDS metric is calculated as the average of the detection scores over different distance ranges and object sizes. Specifically, for each detected object, the...
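
As a point of reference, the nuScenes benchmark combines mean average precision with five true-positive error terms into a single detection score. A minimal sketch, assuming that definition (the values below are illustrative only):

```python
def nds(mean_ap, tp_errors):
    """nuScenes-style detection score:
    NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10,
    where the five TP errors are translation (ATE), scale (ASE),
    orientation (AOE), velocity (AVE) and attribute (AAE) errors."""
    tp_score = sum(1.0 - min(1.0, err) for err in tp_errors.values())
    return (5.0 * mean_ap + tp_score) / 10.0

# Hypothetical error values, for illustration only
print(nds(0.45, {"ATE": 0.3, "ASE": 0.25, "AOE": 0.4, "AVE": 0.8, "AAE": 0.2}))
```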



The Fréchet inception distance (FID) is a metric used to assess the quality of images created by a generative model, like a generative adversarial network (GAN). Unlike the earlier inception score (IS), which evaluates only the distribution of generated ima...
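
FID fits a Gaussian to the Inception-v3 activations of real images and of generated images and measures the Fréchet distance between the two. A minimal sketch, assuming the activation statistics (mean vector and covariance matrix) have already been estimated:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the two fitted Gaussians:
    FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * sqrt(S1 @ S2))."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical-noise imaginary parts
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Synthetic 2-D statistics: identical covariances, shifted means
mu_r, mu_g = np.zeros(2), np.ones(2)
print(fid(mu_r, np.eye(2), mu_g, np.eye(2)))  # -> 2.0 (pure mean shift)
```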



The Adjusted Rand Index (ARI) is a measure of the similarity between two data clusterings. It is a correction of the Rand Index, a basic measure of similarity between two clusterings that has the disadvantage of being sensitive to chance. The Ad...
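
In practice the metric is readily available in scikit-learn; for example:

```python
from sklearn.metrics import adjusted_rand_score

labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 2, 2, 2]

# 1.0 for identical clusterings, ~0.0 for random labelings; can be negative
print(adjusted_rand_score(labels_true, labels_pred))
```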


The anonymity set for an individual u, denoted AS_u, is the set of users that the adversary cannot distinguish from u. It can be seen as the crowd into which the target u can blend, and the metric is the size of that crowd:

priv_AS ≡ |AS_u|

...
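
A minimal sketch of the count, assuming a hypothetical predicate `indistinguishable(a, b)` that encodes what the adversary can observe:

```python
def anonymity_set_size(target, users, indistinguishable):
    """priv_AS = |AS_u|: number of users (including the target itself)
    that the adversary cannot tell apart from the target."""
    return sum(1 for v in users if indistinguishable(target, v))

# e.g., users whose observed attribute matches the target's are indistinguishable
users = ["u1", "u2", "u3", "u4"]
obs = {"u1": "A", "u2": "A", "u3": "B", "u4": "A"}
print(anonymity_set_size("u1", users, lambda a, b: obs[a] == obs[b]))  # -> 3
```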

This metric counts the information items S disclosed by a system, e.g., the number of compromised users. However, this metric does not indicate the severity of a leak because it does not account for the sensitivity of the leaked information.

...


The most general time-based metric measures the time until the adversary’s success. It assumes that the adversary will succeed eventually, and is therefore an example of a pessimistic metric. This metric relies on a definition of success, and varies depend...


We study fairness in classification, where individuals are classified, e.g., admitted to a university, and the goal is to prevent discrimination against individuals based on their membership in some group, while maintaining utility for the classifier (the ...



We discuss information-theoretic anonymity metrics, which use entropy over the distribution of all possible recipients to quantify anonymity. We identify a common misconception: the entropy of the distribution describing the potential receivers does not alw...
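
A minimal sketch of the underlying entropy computation, where `probs` is the adversary's probability distribution over candidate recipients:

```python
import numpy as np

def shannon_entropy(probs):
    """Entropy (in bits) of the adversary's distribution over possible
    recipients; higher entropy means a more uniform, harder-to-single-out
    crowd."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

# Uniform over 8 candidates -> 3 bits, the maximum for this set size
print(shannon_entropy([1 / 8] * 8))
```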


Robustness Metrics provides lightweight modules for evaluating the robustness of classification models. Stability is defined as, e.g., the stability of the prediction and predicted probabilities under natural perturbations of the input.

The l...
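
A generic sketch of such a stability score (not the library's API): `predict` and `perturb` are hypothetical stand-ins for the model's probability output and a natural-perturbation sampler:

```python
import numpy as np

def prediction_stability(predict, x, perturb, n_trials=100):
    """Fraction of naturally perturbed copies of `x` for which the
    predicted class is unchanged."""
    base = int(np.argmax(predict(x)))
    same = sum(int(np.argmax(predict(perturb(x)))) == base
               for _ in range(n_trials))
    return same / n_trials
```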


Robustness Metrics provides lightweight modules for evaluating the robustness of classification models. OOD generalization is defined over inputs that, e.g., a non-expert human would still be able to classify: similar objects, but with a possibly changed viewpoint, scene setting...


We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group...
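
The criterion in question is commonly formalized as equalized odds: true-positive and false-positive rates should match across groups. A minimal sketch for binary labels, predictions, and a binary group attribute coded 0/1:

```python
import numpy as np

def equalized_odds_gap(y_true, y_pred, group):
    """Largest across-group difference in TPR and FPR; 0 means the
    predictor satisfies equalized odds for the two groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    gaps = []
    for y in (1, 0):  # positives give the TPR gap, negatives the FPR gap
        mask = y_true == y
        rates = [y_pred[mask & (group == g)].mean() for g in (0, 1)]
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
group  = [0, 0, 0, 1, 1, 1]
print(equalized_odds_gap(y_true, y_pred, group))  # -> 0.5
```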



If a model systematically makes errors disproportionately for patients in the protected group, it is likely to lead to unequal outcomes. Equal performance refers to the assurance that a model is equally accurate for patients in the protec...
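
Equal performance can be checked directly by computing accuracy separately per group; a minimal sketch with a hypothetical helper:

```python
import numpy as np

def per_group_accuracy(y_true, y_pred, group):
    """Accuracy computed separately for each group; equal performance asks
    these values to be (approximately) the same across groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    return {g: float((y_true[group == g] == y_pred[group == g]).mean())
            for g in np.unique(group)}
```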



The demographic disparity metric (DD) determines whether a facet has a larger proportion of the rejected outcomes in the dataset than of the accepted outcomes. In the binary case where there are two facets, men and women for example, that constitute the dat...
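
A minimal sketch, assuming the formulation in which DD is the facet's share of all rejected outcomes minus its share of all accepted outcomes:

```python
def demographic_disparity(accepted_flags, facet_flags):
    """Positive DD means the facet is over-represented among rejections."""
    acc_total = rej_total = acc_d = rej_d = 0
    for accepted, in_facet in zip(accepted_flags, facet_flags):
        if accepted:
            acc_total += 1
            acc_d += in_facet
        else:
            rej_total += 1
            rej_d += in_facet
    return rej_d / rej_total - acc_d / acc_total

accepted = [True, True, True, False, False, False]
in_facet = [False, False, True, True, True, False]
print(demographic_disparity(accepted, in_facet))  # 2/3 - 1/3 = 0.333...
```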



RADio introduces a rank-aware Jensen-Shannon (JS) divergence. Combining rank-awareness with a distributional divergence accounts for (i) a user's decreasing propensity to observe items further down a list and (ii) full distributional shifts as opposed to point estimates.
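
A heavily simplified sketch: rank positions are discounted with a hypothetical logarithmic weight (RADio's exact discount may differ) before taking the JS divergence between the category distributions of a recommendation list and a reference list:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def rank_discounted_distribution(category_ids, n_categories):
    """Category distribution of a ranked list, weighting position i by
    1 / log2(i + 2) so top-of-list items count more."""
    weights = 1.0 / np.log2(np.arange(len(category_ids)) + 2)
    dist = np.zeros(n_categories)
    for cat, w in zip(category_ids, weights):
        dist[cat] += w
    return dist / dist.sum()

p = rank_discounted_distribution([0, 1, 0, 2], n_categories=3)  # recommendation
q = rank_discounted_distribution([2, 2, 1, 0], n_categories=3)  # reference pool
print(jensenshannon(p, q) ** 2)  # scipy returns the square root of JS divergence
```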



Contextual Outlier INterpretation (COIN) is a method designed to explain the abnormality of existing outliers spotted by detectors. The interpretability for an outlier is achieved from three aspects: outlierness score, attributes that contribute to the abnormality, a...

Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. The...

SHAP (SHapley Additive exPlanations) assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is...
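
A short usage sketch with the `shap` Python package, here on a tree-based regressor, where exact Shapley values are tractable:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)   # efficient Shapley values for tree models
shap_values = explainer.shap_values(X)  # one importance value per feature, per sample
shap.summary_plot(shap_values, X)       # global view of feature contributions
```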

LIME is a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
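
A short usage sketch with the `lime` Python package's tabular explainer:

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(data.data, feature_names=data.feature_names,
                                 class_names=list(data.target_names),
                                 mode="classification")
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(exp.as_list())  # (feature condition, local weight) pairs for this prediction
```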

Following the VIC framework, our proposed ShapleyVIC extends the widely used Shapley-based variable importance measures beyond final models for a comprehensive assessment and has important practical implications.
