These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Metric for Evaluation of Translation with Explicit ORdering (METEOR)
Metric for Evaluation of Translation with Explicit ORdering (METEOR) is a machine translation evaluation metric computed as the harmonic mean of unigram precision and recall, with recall weighted more heavily than precision.
METEOR is based on a gen...
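As a minimal sketch of METEOR's core score, the weighted harmonic mean can be written as F = P·R / (α·P + (1−α)·R); the original METEOR paper uses α = 0.9, i.e. Fmean = 10PR / (R + 9P). The full metric additionally applies a fragmentation penalty and stemming/synonym matching, which are omitted here.

```python
def meteor_fmean(precision: float, recall: float, alpha: float = 0.9) -> float:
    """Weighted harmonic mean of unigram precision and recall.

    With alpha = 0.9 this reduces to METEOR's Fmean = 10PR / (R + 9P),
    weighting recall much more heavily than precision.
    """
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return (precision * recall) / (alpha * precision + (1 - alpha) * recall)
```

For example, with precision 0.5 and perfect recall, the score is 0.5 / 0.55 ≈ 0.91, much higher than the unweighted harmonic mean of 0.67.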
Data Shapley
Context Entities Recall
Context Entities Recall measures the recall of entities in the retrieved contexts: the number of entities present in both the reference and the retrieved contexts, relative to the number of entities in the reference alone. This metric evaluates what fraction of entities in the...
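Under the definition above, the metric is a simple set-based recall over extracted entities. A minimal sketch (assuming entities have already been extracted as strings):

```python
def context_entities_recall(reference_entities, context_entities) -> float:
    """Fraction of the reference's entities that also appear in the
    retrieved context: |reference ∩ context| / |reference|."""
    ref = set(reference_entities)
    if not ref:
        return 0.0  # no reference entities -> recall is undefined; report 0
    return len(ref & set(context_entities)) / len(ref)
```

For instance, if the reference mentions {"Paris", "France"} and the retrieved context contains only "Paris", the recall is 0.5.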
Log odds-ratio
Trustworthy AI Relevance
This metric addresses Explainability and Fairness
Contextual Outlier Interpretation (COIN)
Local Explanation Method using Nonlinear Approximation (LEMNA)
Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. ...
Shapley Additive Explanation (SHAP)
Shapley Additive Explanations (SHAP) is a method that quantifies the contribution of each feature to the output of a predictive model. Rooted in cooperative game theory, SHAP values provide a theoretically sound approach for interpreting complex models by d...
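The cooperative-game idea can be illustrated with a brute-force sketch: each feature's Shapley value is its average marginal contribution over all coalitions, where features outside the coalition are "removed" by substituting a baseline value. This is an illustrative exact computation, exponential in the number of features; the SHAP library itself relies on efficient approximations (e.g. KernelSHAP, TreeSHAP).

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for model(x) with n features.

    Features outside a coalition S are set to their baseline value.
    Cost grows as 2^n, so this is only feasible for small n.
    """
    n = len(x)
    phi = [0.0] * n
    players = list(range(n))
    for i in players:
        others = [j for j in players if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in players]
                without_i = [x[j] if j in S else baseline[j] for j in players]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi
```

For a linear model f(x) = 2x₀ + 3x₁ with a zero baseline, the attributions are exactly [2, 3] at x = (1, 1), matching the additive decomposition SHAP guarantees.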
Local Interpretable Model-agnostic Explanation (LIME)
Local Interpretable Model-agnostic Explanations (LIME) is a method developed to enhance the explainability and transparency of machine learning models, particularly those that are complex and difficult to interpret. It is designed to provide clear, localize...
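The core mechanics can be sketched as: perturb the input, query the black-box model, weight samples by proximity to the original point, and fit a weighted linear surrogate whose coefficients serve as local attributions. This is a simplified illustration with assumed Gaussian perturbations and an RBF proximity kernel, not the `lime` package's implementation (which also handles discretization and feature selection).

```python
import numpy as np

def lime_explain(predict_fn, x, n_samples=500, kernel_width=0.75, rng=None):
    """Fit a locally weighted linear surrogate around x.

    Returns the surrogate's per-feature slopes as local attributions.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    d = len(x)
    # 1. Perturb the input with Gaussian noise around x
    Z = x + rng.normal(scale=0.5, size=(n_samples, d))
    y = np.array([predict_fn(z) for z in Z])
    # 2. Proximity weights: samples closer to x count more
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # 3. Weighted least squares with an intercept column
    A = np.hstack([Z, np.ones((n_samples, 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:d]  # drop the intercept
```

On an already-linear model such as f(z) = 3z₀ − 2z₁, the surrogate recovers the true coefficients, which is a useful sanity check for any local explainer.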
Shapley Variable Importance Cloud (ShapleyVIC)
Trustworthy AI Relevance
This metr...
Variable Importance Cloud (VIC)
Beta Shapley
System output Against References and against the Input sentence (SARI)
SARI (system output against references and against the input sentence) is a metric used for evaluating automatic text simplification systems.
The metric compares the predicted simplified sentences against the reference and the source sentences. It exp...
Partial Dependence Complexity (PDC)
The Partial Dependence Complexity metric uses the partial dependence curve to evaluate how simply the curve can be represented. The partial dependence curve shows how model predictions are affected, on average, by each feature. Curves repres...
α-Feature Importance (αFI)
The α-Feature Importance metric quantifies the minimum proportion of features required to represent α of the total importance. In other words, this metric focuses on obtaining the minimum number of features necessary to capture no less than α × 100% of th...
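The computation reduces to sorting absolute importances in descending order and counting how many are needed before the cumulative sum reaches α of the total. A minimal sketch (the proportion is then this count divided by the number of features):

```python
def alpha_feature_importance(importances, alpha: float = 0.8) -> int:
    """Minimum number of features whose (absolute) importance sums to
    at least alpha of the total importance."""
    total = sum(abs(v) for v in importances)
    acc, count = 0.0, 0
    for v in sorted((abs(v) for v in importances), reverse=True):
        acc += v
        count += 1
        if acc >= alpha * total:
            break
    return count
```

For importances [5, 3, 1, 1] and α = 0.8, the two largest features already cover 8 of 10 importance units, so the metric reports 2 (a proportion of 0.5).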
Local Feature Importance Spread Stability (LFISS)
Local Feature Importance refers to the assignment of normalized feature importance to different regions of the input data space. For a given dataset D with N samples, it is possible to compute a vector of feature importances for each individual observation d...
Global Feature Importance Spread (GFIS)
The GFIS metric is based on the concept of entropy; more precisely, on the entropy of the normalized feature importance measure, which represents the concentration of information within a set of features. Lower entropy values indicate that the majority of the explanat...
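Under this description, the metric amounts to the Shannon entropy of the importance distribution after normalizing absolute importances to sum to one. A minimal sketch (the exact normalization and any rescaling by the maximum entropy are assumptions here):

```python
import math

def gfis(importances) -> float:
    """Shannon entropy of the normalized feature-importance distribution.

    Lower values mean the explanation is concentrated in a few features;
    the maximum, log(n), corresponds to uniformly spread importance.
    """
    total = sum(abs(v) for v in importances)
    probs = [abs(v) / total for v in importances if v != 0]
    return -sum(p * math.log(p) for p in probs)
```

Uniform importances over four features give log 4 ≈ 1.386, while importance concentrated entirely in one feature gives 0.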
SAFE (Sustainable, Accurate, Fair and Explainable)
Machine learning models, at the core of AI applications, typically achieve high accuracy at the expense of insufficient explainability. Moreover, according to the proposed regulations, AI applications based on machine learning must be "trus...
Tree Edit Distance (TED)
Tree Edit Distance (TED) is a metric that calculates the similarity between syntactic n-grams, used to detect soft similarity between texts.
Trustworthy AI Relevance
This metric addresses Robustness and Explainability.
Normalized Scanpath Saliency (NSS)
The Normalized Scanpath Saliency was introduced to the saliency community as a simple correspondence measure between saliency maps and ground truth, computed as the average normalized saliency at fixated locations. Unlike AUC, the absolute saliency value...
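Concretely, NSS z-scores the saliency map (zero mean, unit standard deviation) and averages the resulting values at the fixated pixels. A minimal sketch with NumPy arrays:

```python
import numpy as np

def nss(saliency_map: np.ndarray, fixation_mask: np.ndarray) -> float:
    """Normalized Scanpath Saliency: mean z-scored saliency at fixations.

    saliency_map:  2-D array of saliency values.
    fixation_mask: same-shape array, nonzero where fixations occurred.
    """
    s = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return float(s[fixation_mask.astype(bool)].mean())
```

Positive values mean fixations fall on above-average saliency; an NSS of 0 corresponds to chance-level correspondence.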
Spearman's rank correlation coefficient (SRCC)
In statistics, Spearman's rank correlation coefficient or Spearman's ρ is a non-parametric measure of rank correlation (statistical dependence between the rankings of two variables). It assesses how well the relationship between two variables can be describ...
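For samples without ties, Spearman's ρ has the closed form ρ = 1 − 6·Σdᵢ² / (n(n² − 1)), where dᵢ is the difference between the two ranks of observation i. A minimal sketch (production code, e.g. `scipy.stats.spearmanr`, also handles ties via average ranks):

```python
def spearman_rho(x, y) -> float:
    """Spearman's rank correlation for equal-length samples without ties:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)

    def ranks(v):
        order = sorted(range(n), key=lambda i: v[i])
        r = [0] * n
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

Any strictly increasing relationship yields ρ = 1 and any strictly decreasing one ρ = −1, even when the relationship is nonlinear.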