Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Context Recall assesses how effectively a model retrieves all relevant pieces of information necessary to generate a comprehensive and accurate response. Unlike precision, which focuses on relevance, recall emphasizes completeness, ensuring that no critical details are omitted. Higher recall indicates that fewer pertinent details were missed during retrieval, which is vital in Retrieval-Augmented Generation (RAG) systems. This metric is particularly useful in applications like customer support, knowledge systems, and educational tools where full information retrieval is essential.


Formula:

Context Recall = (Number of Ground Truth (GT) Claims Attributable to Retrieved Context) / (Total Number of Claims in Ground Truth)


GT Claims Attributable to Retrieved Context: The number of claims from the ground truth reference that can be matched or found in the retrieved contexts.

Total Number of Claims in Ground Truth: The total number of claims or relevant information pieces in the reference answer.
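The formula above can be sketched in a few lines of Python. Note that the attribution check here is a deliberately simple case-insensitive substring match, chosen only for illustration; in practice this step is performed by an LLM judge or a similarity measure, and the function names are hypothetical.

```python
def context_recall(gt_claims, retrieved_contexts):
    """Fraction of ground-truth claims attributable to the retrieved contexts.

    A claim counts as attributable here if it appears verbatim
    (case-insensitive) in any retrieved chunk -- a simple stand-in for the
    claim-matching step described in the text.
    """
    if not gt_claims:
        raise ValueError("ground truth must contain at least one claim")
    attributable = sum(
        any(claim.lower() in ctx.lower() for ctx in retrieved_contexts)
        for claim in gt_claims
    )
    return attributable / len(gt_claims)


claims = [
    "The Eiffel Tower is in Paris.",
    "It was completed in 1889.",
]
contexts = ["The Eiffel Tower is in Paris. It opened to the public in 1889."]
print(context_recall(claims, contexts))  # 0.5: only the first claim matches verbatim
```

A score of 0.5 here shows why naive matching understates recall: the second claim is supported by the context but phrased differently, which is exactly the gap the LLM-based approach below addresses.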


Types of Context Recall Approaches:


1. LLM-Based Context Recall: Uses a language model to evaluate if each retrieved context supports the reference answer. It breaks down the reference into individual claims and checks if these can be attributed to retrieved chunks, providing a recall score between 0 and 1.

2. Non-LLM-Based Context Recall: Employs traditional, non-LLM comparison metrics (e.g., string similarity or distance measures) to assess whether the retrieved contexts contain relevant information as per the reference contexts.
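The non-LLM approach can be sketched with the standard library's `difflib`. This is a minimal illustration, assuming a reference context counts as recalled when its best string similarity against any retrieved context meets a threshold; the 0.8 threshold and function name are illustrative choices, not a standard.

```python
from difflib import SequenceMatcher


def non_llm_context_recall(reference_contexts, retrieved_contexts, threshold=0.8):
    """Similarity-based context recall: the fraction of reference contexts
    whose best similarity against any retrieved context meets `threshold`.

    SequenceMatcher.ratio() serves as a simple stand-in for the string
    similarity or distance measure mentioned in the text.
    """
    if not reference_contexts:
        raise ValueError("need at least one reference context")
    recalled = sum(
        max(SequenceMatcher(None, ref, ret).ratio() for ret in retrieved_contexts)
        >= threshold
        for ref in reference_contexts
    )
    return recalled / len(reference_contexts)


refs = ["Paris is the capital of France.", "France uses the euro."]
retrieved = ["Paris is the capital of France.", "Berlin is in Germany."]
print(non_llm_context_recall(refs, retrieved))  # 0.5: only the first reference is matched
```

Because it needs no model calls, this variant is cheap and deterministic, at the cost of missing paraphrased matches that an LLM-based check would catch.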


This metric is essential in high-stakes applications, ensuring that all necessary information is retrieved to avoid incomplete or misleading responses in AI-driven interactions.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.