Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

The exact match score of a predicted string is 1 if the prediction is identical to its reference string, and 0 otherwise.

  • Example 1: The exact match score of prediction “Happy Birthday!” is 0, given its reference is “Happy New Year!”.
  • Example 2: The exact match score of prediction “The Colour of Magic (1983)” is 1, given its reference is also “The Colour of Magic (1983)”.

The exact match score of a set of predictions is the sum of all of the individual exact match scores in the set, divided by the total number of predictions in the set.

  • Example: The exact match score of the set {Example 1, Example 2} (above) is (0 + 1) / 2 = 0.5.
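The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `exact_match` and `exact_match_score` are assumptions made for this example, and the comparison is a plain string equality with no normalisation (case, whitespace and punctuation all count).

```python
from typing import Sequence


def exact_match(prediction: str, reference: str) -> int:
    """Return 1 if the prediction is identical to the reference, else 0.

    Hypothetical helper: uses strict string equality, no normalisation.
    """
    return int(prediction == reference)


def exact_match_score(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Average the per-item exact match scores over a set of predictions."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    total = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return total / len(predictions)


# The two examples from the text above.
predictions = ["Happy Birthday!", "The Colour of Magic (1983)"]
references = ["Happy New Year!", "The Colour of Magic (1983)"]

set_score = exact_match_score(predictions, references)
print(set_score)  # 0.5
```

Because the comparison is strict, a prediction that differs from its reference only in case or trailing whitespace scores 0. Whether to normalise strings before comparing is a design choice that the definition above does not settle.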

Trustworthy AI Relevance

This metric addresses Robustness and Transparency by quantifying relevant system properties. Robustness: exact match (EM) measures how often a model's output is exactly correct, so comparing it across conditions (e.g., dataset shifts, noisy inputs, model variants) is useful for assessing reliability. Tracking EM across test slices, perturbations, or out-of-distribution (OOD) examples helps identify brittleness, regressions, and failure modes, all of which are central to robustness evaluation.
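One way to track EM across test slices is to group evaluation records by slice and compute the score per group. The sketch below assumes each record is a `(slice_name, prediction, reference)` tuple. The function name `exact_match_by_slice`, the slice labels "clean" and "noisy", and the sample strings are all hypothetical and exist only to illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def exact_match_by_slice(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute the exact match score separately for each named slice.

    Each record is (slice_name, prediction, reference). Hypothetical helper;
    strings are compared with strict equality.
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for slice_name, prediction, reference in records:
        totals[slice_name] += 1
        hits[slice_name] += int(prediction == reference)
    return {name: hits[name] / totals[name] for name in totals}


# Illustrative data only: a "clean" slice and a "noisy" (perturbed) slice.
records = [
    ("clean", "Paris", "Paris"),
    ("clean", "Berlin", "Berlin"),
    ("noisy", "paris", "Paris"),    # case differs, so this is not an exact match
    ("noisy", "Berlin", "Berlin"),
]

slice_scores = exact_match_by_slice(records)
print(slice_scores)  # {'clean': 1.0, 'noisy': 0.5}
```

A gap between slices, such as the drop on the "noisy" slice here, is the kind of signal that can point to brittleness worth investigating.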

Related use cases:

Uploaded on Nov 1, 2023
This paper introduces a new concept called "transferable visual words" (TransVW), aiming to achieve annotation efficiency for deep learning in medical image analysis. Medical imagi...

Uploaded on Nov 1, 2023
Vision-language modeling has enabled open-vocabulary tasks where predictions can be queried using any text prompt in a zero-shot manner. Existing open-vocabulary tasks focus on obj...



Partnership on AI

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.