These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Shapley Additive Explanations (SHAP)
Shapley Additive Explanations (SHAP) is a method that quantifies the contribution of each feature to the output of a predictive model. Rooted in cooperative game theory, SHAP values provide a theoretically sound approach for interpreting complex models by d...
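Below is a minimal usage sketch with the open-source `shap` package; the scikit-learn dataset and tree model are illustrative assumptions, not part of the catalogue entry.

```python
# A minimal SHAP sketch: per-feature contributions for a tree ensemble.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)   # exact SHAP values for tree ensembles
shap_values = explainer.shap_values(X)  # one contribution per feature per prediction
shap.summary_plot(shap_values, X)       # global overview of feature contributions
```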
Log odds-ratio
WinoST
The scientific community is increasingly aware of the necessity to embrace pluralism and consistently represent major and minor social groups. Currently, there are no standard evaluation techniques for different types of biases. Accordingly, there is an urg...
SVEva Fair
Despite the success of deep neural networks (DNNs) in enabling on-device voice assistants, increasing evidence of bias and discrimination in machine learning is raising the urgency of investigating the fairness of these systems. Speaker verification is a fo...
SAFE Artificial Intelligence in finance
We propose a set of interrelated metrics, all based on the notion of AI output concentration and the related Lorenz curve and Lorenz area under the curve, that measure the Sustainability/robustness, Accuracy, Fairness/privacy, and Explainability/accountability ...
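A hedged sketch of the common building block follows: a Lorenz-curve concentration measure over model outputs. This is an illustrative Gini-style computation, not the authors' exact Sustainability, Accuracy, Fairness, or Explainability estimators.

```python
# Illustrative Lorenz-curve concentration of non-negative model outputs.
import numpy as np

def lorenz_area(outputs) -> float:
    """Area under the Lorenz curve of sorted, non-negative outputs."""
    y = np.sort(np.asarray(outputs, dtype=float))
    cum = np.cumsum(y) / y.sum()           # cumulative output share
    cum = np.insert(cum, 0, 0.0)           # Lorenz curve starts at the origin
    return np.trapz(cum, dx=1.0 / len(y))  # numerical area under the curve

preds = np.random.default_rng(0).gamma(2.0, size=1000)  # illustrative predictions
gini_like = 1.0 - 2.0 * lorenz_area(preds)  # 0 = evenly spread, 1 = concentrated
print(round(gini_like, 3))
```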
Hellinger Distance
The Hellinger distance is a metric used to measure the similarity between two probability distributions. It is related to the Euclidean distance but applied in the space of probability distributions. The Hellinger distance ranges between 0 and 1, where 0 in...
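Since the definition pins the metric down completely, a short self-contained sketch follows; only the example distributions are illustrative.

```python
# Hellinger distance between two discrete probability distributions.
import numpy as np

def hellinger(p, q) -> float:
    """H(P, Q) = (1/sqrt(2)) * ||sqrt(P) - sqrt(Q)||_2, ranging over [0, 1]."""
    p = np.asarray(p, dtype=float) / np.sum(p)  # normalise to probabilities
    q = np.asarray(q, dtype=float) / np.sum(q)
    return float(np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0))

print(hellinger([0.5, 0.5], [0.5, 0.5]))  # 0.0: identical distributions
print(hellinger([1.0, 0.0], [0.0, 1.0]))  # 1.0: disjoint supports
```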
Conditional Demographic Disparity (CDD)
The demographic disparity metric (DD) determines whether a facet has a larger proportion of the rejected outcomes in the dataset than of the accepted outcomes. In the binary case where there are two facets, men and women for example, that constitute the dat...
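A hedged sketch of the computation as described: DD compares a facet's share of rejected versus accepted outcomes, and CDD averages DD over strata weighted by stratum size. The column names, group value, and strata variable are hypothetical, and each stratum is assumed to contain both outcomes.

```python
# Illustrative demographic disparity (DD) and conditional DD (CDD) on a
# DataFrame with a binary outcome column (0 = rejected, 1 = accepted).
import pandas as pd

def dd(df: pd.DataFrame, facet: str, group: str, outcome: str) -> float:
    """Share of the facet among rejected minus its share among accepted."""
    rejected = df[df[outcome] == 0]
    accepted = df[df[outcome] == 1]
    return (rejected[facet] == group).mean() - (accepted[facet] == group).mean()

def cdd(df: pd.DataFrame, facet: str, group: str,
        outcome: str, strata: str) -> float:
    """Size-weighted average of DD within each stratum (e.g., department)."""
    return sum((len(g) / len(df)) * dd(g, facet, group, outcome)
               for _, g in df.groupby(strata))
```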
Rank-Aware Divergence (RADio)
Contextual Outlier Interpretation (COIN)
Local Explanation Method using Nonlinear Approximation (LEMNA)
Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. ...
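An illustrative sketch of the local-approximation step only: LEMNA itself fits a mixture regression model with fused-lasso regularisation, which a short snippet cannot reproduce; a ridge-regression surrogate stands in here purely to show the perturb-and-fit idea.

```python
# Simplified stand-in for LEMNA's local approximation: perturb the input,
# query the black box, and fit a simple linear surrogate around it.
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(predict_fn, x, n_samples=500, scale=0.1, seed=0):
    """Rank feature contributions via a linear fit in a neighbourhood of x."""
    rng = np.random.default_rng(seed)
    X_local = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    y_local = predict_fn(X_local)            # black-box scores for perturbations
    surrogate = Ridge(alpha=1.0).fit(X_local, y_local)
    return surrogate.coef_                   # local feature importances
```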
α-Feature Importance (αFI)
The α-Feature Importance metric quantifies the minimum proportion of features required to represent α of the total importance. In other words, this metric is focused on obtaining the minimum number of features necessary to obtain no less than α × 100% of th...
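The definition translates directly into a cumulative-sum computation; a minimal sketch with illustrative importance values:

```python
# α-Feature Importance: smallest fraction of features whose normalised
# importances sum to at least α of the total importance.
import numpy as np

def alpha_feature_importance(importances, alpha=0.8) -> float:
    """Minimum proportion of features covering alpha of total importance."""
    w = np.sort(np.abs(np.asarray(importances, dtype=float)))[::-1]  # descending
    cum = np.cumsum(w) / w.sum()              # cumulative importance share
    k = int(np.searchsorted(cum, alpha) + 1)  # smallest k with cum share >= alpha
    return k / len(w)

# Half the features carry 70% of the importance in this toy example.
print(alpha_feature_importance([4, 3, 2, 1], alpha=0.7))  # 0.5
```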
Local Interpretable Model-agnostic Explanation (LIME)
Local Interpretable Model-agnostic Explanations (LIME) is a method developed to enhance the explainability and transparency of machine learning models, particularly those that are complex and difficult to interpret. It is designed to provide clear, localize...
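Below is a minimal usage sketch with the open-source `lime` package; the scikit-learn dataset and classifier are illustrative assumptions.

```python
# Explaining one prediction with LIME on tabular data.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    discretize_continuous=True,
)
# Fit a local interpretable model around a single instance.
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(exp.as_list())  # (feature condition, local weight) pairs
```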
Shapley Variable Importance Cloud (ShapleyVIC)
Variable Importance Cloud (VIC)
CLIPBERTScore
CLIPBERTScore is a simple weighted combination of CLIPScore (Hessel et al., 2021) and BERTScore (Zhang et al., 2020) that leverages their robustness and strong factuality detection performance on image-summary and document-summary pairs, respectively.
CLI...
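A hedged sketch of the weighted combination itself; it assumes the CLIPScore (image vs. summary) and BERTScore (document vs. summary) values have already been computed with their respective packages, and the weight w is a free parameter.

```python
# Weighted combination of image-summary and document-summary faithfulness.
def clipbertscore(clip_score: float, bert_score_f1: float, w: float = 0.5) -> float:
    """Combine precomputed CLIPScore and BERTScore F1 with weight w."""
    return w * clip_score + (1.0 - w) * bert_score_f1

print(round(clipbertscore(0.72, 0.85, w=0.5), 3))  # illustrative inputs: 0.785
```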
Data Banzhaf
Beta Shapley
Data Shapley
Surrogacy Efficacy Score (SESc)
The Surrogacy Efficacy Score is a technique for gaining a better understanding of the inner workings of complex "black box" models. For example, by using a tree-based model, this method provides a more interpretable representation of the model's behavior by...
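A hedged sketch of the surrogacy idea: fit an interpretable tree to mimic a black-box model's predictions and score how faithfully it reproduces them. The fidelity measure used here (R² against the black-box outputs) is an illustrative choice, not necessarily the exact published score.

```python
# Fit a shallow tree surrogate to a black-box regressor and measure fidelity.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=10, random_state=0)
black_box = GradientBoostingRegressor(random_state=0).fit(X, y)
y_bb = black_box.predict(X)                      # black-box outputs to imitate

surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y_bb)
fidelity = r2_score(y_bb, surrogate.predict(X))  # how well the tree mimics it
print(round(fidelity, 3))
```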
Partial Dependence Complexity (PDC)
The Partial Dependence Complexity metric uses the concept of the partial dependence curve to evaluate how simply the curve can be represented. The partial dependence curve is used to show how model predictions are affected on average by each feature. Curves represen...
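A hedged sketch follows: it computes a partial dependence curve with scikit-learn (assuming version 1.3 or later, where the result exposes the grid_values key) and scores the curve's complexity by how poorly a straight line fits it. That linear-residual proxy is an illustrative assumption; the published metric may define complexity differently.

```python
# Partial dependence curve plus an illustrative "distance from linear" score.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

pd_result = partial_dependence(model, X, features=[0], grid_resolution=50)
grid, curve = pd_result["grid_values"][0], pd_result["average"][0]

slope, intercept = np.polyfit(grid, curve, deg=1)     # best straight-line fit
residual = curve - (slope * grid + intercept)
complexity = float(np.var(residual) / np.var(curve))  # 0 = perfectly linear
print(round(complexity, 3))
```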
