Metrics for Trustworthy AI

Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

This page includes technical metrics and methodologies for measuring and evaluating AI trustworthiness and AI risks. These metrics are often represented through mathematical formulas that assess the technical requirements for achieving trustworthy AI in a particular context. They can help to ensure that a system is fair, accurate, explainable, transparent, robust, safe, or secure.

Objective: Transparency

Recall-Oriented Understudy for Gisting Evaluation (ROUGE) 10 related use cases

ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, is a set of metrics and a software package used for evaluating automatic summarization and machine translation software in natural language processing. The metrics compare an automatically produce...

Objectives:

Transparency Explainability
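The recall-oriented comparison that ROUGE performs can be illustrated with a small sketch. This is not the ROUGE software package itself. It is a minimal, assumption-laden version of ROUGE-N recall that tokenizes on whitespace and lowercases text; the function name `rouge_n_recall` is hypothetical.

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Share of reference n-grams that also occur in the candidate.

    Assumption: whitespace tokenization and lowercasing; real ROUGE
    implementations offer stemming and other preprocessing options.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match so a repeated candidate n-gram cannot be counted
    # more often than it appears in the reference.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    system = "the cat sat on the mat"
    human = "the cat is on the mat"
    print(rouge_n_recall(system, human, n=1))
    print(rouge_n_recall(system, human, n=2))
```

A higher value means the generated text covers more of the reference wording; it says nothing about fluency or factual accuracy.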

Sparsity 1 related use case

While smoothness and spatial locality capture spatial properties, the individual values shall also be sparse, since few highly important regions are more indicative of a good explanation than several mildly relevant ones. This is why a sparsity metric shoul...

Objectives:

Transparency Explainability

SAGED (Holistic Bias-Benchmarking Pipeline for Language Models with Customizable Fairness Calibration)

SAGED is a pioneering pipeline for bias detection in large language models (LLMs). It introduces an integrated framework for building benchmarks, running diagnostic tests, and calibrating fairness baselines, addressing key challenges such as contamination, ...

Objectives:

Fairness Transparency

JobFair (A Framework for Benchmarking Gender Hiring Bias in Large Language Models)

JobFair is a robust framework designed to benchmark and evaluate hierarchical gender hiring biases in Large Language Models (LLMs) used for resume scoring. It identifies and quantifies two primary types of bias: Level Bias (differences in a...

Objectives:

Fairness Transparency

Prometheus

Prometheus is an open-source evaluator model fine-tuned on feedback data to perform fine-grained evaluations of AI-generated text. Designed to offer transparency, cost-efficiency, and reproducibility, it matches proprietary models like GPT-4 in evalu...

Objectives:

Fairness Transparency

G-Eval

G-Eval is a novel framework designed to evaluate the outputs of large language models (LLMs) using the interpretive and reasoning capabilities of the models themselves. Introduced in the paper “G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment”,...

Objectives:

Transparency

SAFE Artificial Intelligence in finance

We propose a set of interrelated metrics, all based on the notion of AI output concentration, and the related Lorenz curve/Lorenz area under the curve, able to measure the Sustainability/robustness, Accuracy, Fairness/privacy, Explainability/accountability ...

Objectives:

Fairness Robustness Transparency

Contextual Outlier Interpretation (COIN)

Contextual Outlier INterpretation (COIN) is a method designed to explain the abnormality of existing outliers spotted by detectors. The interpretability for an outlier is achieved from three aspects: outlierness score, attributes that contribute to the abnormality, a...

Objectives:

Transparency Explainability

Local Explanation Method using Nonlinear Approximation (LEMNA)

Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. ...

Objectives:

Transparency Explainability

Shapley Additive Explanation (SHAP)

Shapley Additive Explanations (SHAP) is a method that quantifies the contribution of each feature to the output of a predictive model. Rooted in cooperative game theory, SHAP values provide a theoretically sound approach for interpreting complex models by d...

Objectives:

Transparency Explainability
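To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by brute force for a toy three-feature model. It does not use the `shap` library or its approximation algorithms. The function `shapley_values`, the toy model, and the choice of replacing absent features with a zero baseline are all illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for value_fn, a function of a frozenset of feature indices.

    Cost grows exponentially with n_features, so this is only for tiny examples.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model with one interaction term.
def model(x):
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


x = (1.0, 2.0, 3.0)
baseline = (0.0, 0.0, 0.0)  # assumption: absent features take the baseline value


def value_fn(present):
    masked = tuple(x[j] if j in present else baseline[j] for j in range(len(x)))
    return model(masked)


if __name__ == "__main__":
    phi = shapley_values(value_fn, len(x))
    print(phi)
    # The contributions add up to model(x) - model(baseline).
    print(sum(phi), model(x) - model(baseline))
```

The interaction term `x[0] * x[2]` is split evenly between features 0 and 2, which is the behaviour the Shapley axioms prescribe for symmetric contributions.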

Local Interpretable Model-agnostic Explanation (LIME)

Local Interpretable Model-agnostic Explanations (LIME) is a method developed to enhance the explainability and transparency of machine learning models, particularly those that are complex and difficult to interpret. It is designed to provide clear, localize...

Objectives:

Transparency Explainability

Shapley Variable Importance Cloud (ShapleyVIC)

Following the VIC framework, our proposed ShapleyVIC extends the widely used Shapley-based variable importance measures beyond final models for a comprehensive assessment and has important practical implications.

Objectives:

Transparency Explainability

Variable Importance Cloud (VIC)

Ideally we would like to obtain a more complete understanding of variable importance for the set of models that predict almost equally well. This set of almost-equally-accurate predictive models is called the Rashomon set; it is the set of models with training...

Objectives:

Transparency Explainability

Surrogacy Efficacy Score (SESc)

The Surrogacy Efficacy Score is a technique for gaining a better understanding of the inner workings of complex "black box" models. For example, by using a tree-based model, this method provides a more interpretable representation of the model’s behavior by...

Objectives:

Transparency Explainability

Partial Dependence Complexity (PDC)

The Partial Dependence Complexity metric uses the partial dependence curve to evaluate how simply that curve can be represented. The partial dependence curve shows how model predictions are affected, on average, by each feature. Curves repres...

Objectives:

Transparency Explainability

α-Feature Importance (αFI)

The α-Feature Importance metric quantifies the minimum proportion of features required to represent α of the total importance. In other words, this metric focuses on finding the minimum number of features needed to account for no less than α × 100% of th...

Objectives:

Transparency Explainability
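The definition of α-Feature Importance translates almost directly into code. The sketch below assumes non-negative importance scores supplied as a plain list; the function name `alpha_feature_importance` is hypothetical and not taken from any particular library.

```python
def alpha_feature_importance(importances, alpha=0.8):
    """Smallest proportion of features whose importances sum to at least alpha of the total.

    Assumption: importances are non-negative and not all zero.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    total = sum(importances)
    if total <= 0:
        raise ValueError("total importance must be positive")

    # Greedily take the most important features first; this yields the
    # minimum count needed to reach the target share.
    ranked = sorted(importances, reverse=True)
    cumulative = 0.0
    for k, value in enumerate(ranked, start=1):
        cumulative += value
        if cumulative >= alpha * total:
            return k / len(ranked)
    return 1.0  # only reached through floating-point rounding


if __name__ == "__main__":
    scores = [50, 25, 15, 10]  # illustrative importance scores
    print(alpha_feature_importance(scores, alpha=0.75))
```

A low value indicates that a handful of features carries most of the importance, which generally makes the model easier to explain.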

Predictions Groups Contrast (PGC)

The PGC metric compares the top-K ranking of feature importance drawn from the entire dataset with the top-K ranking induced from specific subgroups of predictions. It can be applied to both classification and regression problems, being useful for quantifying...

Objectives:

Fairness Transparency

Local Feature Importance Spread Stability (LFISS)

Local Feature Importance refers to the assignment of normalized feature importance to different regions of the input data space. For a given dataset D with N samples, it is possible to compute a vector of feature importance for each individual observation d...

Objectives:

Transparency

Global Feature Importance Spread (GFIS)

The GFIS metric is based on the concept of entropy, more precisely on the entropy of the normalized feature importance measure, which represents the concentration of information within a set of features. Lower entropy values indicate that the majority of the explanat...

Objectives:

Transparency Explainability
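The entropy idea behind GFIS can be sketched as follows. The catalogue teaser does not show the exact formula, so this is only an illustration of entropy over normalized importances. Dividing by log(n) to obtain a value between 0 and 1 is an assumption made here, and `importance_entropy` is a hypothetical name.

```python
from math import log


def importance_entropy(importances):
    """Shannon entropy of normalized feature importances, scaled to [0, 1].

    Assumptions: importances are non-negative with a positive sum, and
    the result is divided by log(n) so that a uniform spread gives 1.
    """
    total = sum(importances)
    if total <= 0:
        raise ValueError("total importance must be positive")
    n = len(importances)
    if n < 2:
        return 0.0
    # Zero-importance features contribute nothing to the entropy.
    probs = [v / total for v in importances if v > 0]
    entropy = -sum(p * log(p) for p in probs)
    return entropy / log(n)


if __name__ == "__main__":
    print(importance_entropy([1, 1, 1, 1]))     # importance spread evenly
    print(importance_entropy([97, 1, 1, 1]))    # importance concentrated in one feature
```

Under these assumptions, values near 0 mean the explanation rests on very few features, while values near 1 mean importance is spread almost evenly.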

Pearson correlation coefficient (PCC)

In statistics, the Pearson correlation coefficient (PCC) ― also known as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC), the bivariate correlation, or colloquially simply as the correlation coefficient ― is a measure of linear corre...

Objectives:

Transparency Explainability
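For reference, Pearson's r is the covariance of two variables divided by the product of their standard deviations. A minimal standard-library sketch is given below; the function name `pearson_r` is illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # perfectly linear, increasing
    print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # perfectly linear, decreasing
```

The coefficient only captures linear association: a strong nonlinear relationship can still yield a value near 0.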

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.