Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Mean reciprocal rank (MRR) measures how highly a system ranks the first correct answer, averaged over a set of queries (in knowledge-graph link prediction, over a set of test triples). If the first predicted item is correct, 1 is added; if the correct item appears second, 1/2 is added, and so on. The sum is then divided by the number of queries, i.e. MRR = (1/|Q|) * sum_i (1/rank_i), where rank_i is the position of the first correct result for query i. MRR is widely used to evaluate search and ranking algorithms.
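As a minimal sketch, assuming each query has exactly one relevant item, MRR can be computed as follows; the function name and data layout are illustrative rather than drawn from any particular library.

```python
from typing import Sequence

def mean_reciprocal_rank(rankings: Sequence[Sequence[str]],
                         targets: Sequence[str]) -> float:
    """Average the reciprocal rank of the first correct item per query.

    rankings[i] is the ranked candidate list returned for query i;
    targets[i] is the single relevant item for that query. Queries whose
    relevant item never appears contribute 0 to the sum.
    """
    total = 0.0
    for ranking, target in zip(rankings, targets):
        for rank, candidate in enumerate(ranking, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(rankings) if rankings else 0.0

# Example: hit at rank 1, hit at rank 2, and one miss.
rankings = [["a", "b"], ["x", "y", "z"], ["p", "q"]]
targets = ["a", "y", "missing"]
print(mean_reciprocal_rank(rankings, targets))  # (1 + 0.5 + 0) / 3 = 0.5
```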

MRR can indirectly support Robustness by surfacing drops in ranking quality under distribution shift or adversarial probing: a sudden decline in mean reciprocal rank is an early warning that the system may be faltering on new or noisy inputs.

Trustworthy AI Relevance

This metric addresses Robustness and Fairness by quantifying relevant system properties. Robustness: MRR quantifies ranking quality (the reciprocal of the rank of the first correct item, averaged across queries). Monitoring MRR over time, across input perturbations, or under distribution shift and adversarial probes provides an early signal of performance degradation or fragility, making it useful for robustness evaluation and regression detection. Fairness: disaggregating MRR by user or content subgroup shows whether ranking quality is distributed evenly, since gaps that an aggregate score masks become visible when the metric is computed per group. A sketch of the robustness check follows below.
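To make the robustness use concrete, the sketch below compares MRR on clean queries with MRR on lightly perturbed variants and flags any drop beyond a tolerance. The `retrieve` callable, the character-swap noise model, and the `max_drop` threshold are all assumptions for illustration, not a standard interface.

```python
import random
from typing import Callable, List, Sequence

def mrr(rankings: Sequence[Sequence[str]], targets: Sequence[str]) -> float:
    """Mean reciprocal rank over queries; absent targets contribute 0."""
    total = 0.0
    for ranking, target in zip(rankings, targets):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(rankings) if rankings else 0.0

def swap_chars(query: str) -> str:
    """Cheap noise model: swap two adjacent characters (a stand-in for
    realistic perturbations such as typos or paraphrases)."""
    if len(query) < 2:
        return query
    i = random.randrange(len(query) - 1)
    return query[:i] + query[i + 1] + query[i] + query[i + 2:]

def robustness_check(retrieve: Callable[[str], List[str]],
                     queries: Sequence[str],
                     targets: Sequence[str],
                     max_drop: float = 0.05) -> bool:
    """Return True if MRR on perturbed queries stays within max_drop of
    MRR on clean queries. `retrieve` is a placeholder that maps a query
    string to a ranked candidate list."""
    clean = mrr([retrieve(q) for q in queries], targets)
    noisy = mrr([retrieve(swap_chars(q)) for q in queries], targets)
    return (clean - noisy) <= max_drop
```

The same comparison, computed per demographic or content subgroup rather than over all queries at once, doubles as the fairness check described above.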

Partnership on AI

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.