Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Context Precision is a metric that quantifies the proportion of relevant chunks among the contexts retrieved for a language model. It is particularly significant in Retrieval-Augmented Generation (RAG) settings, where precise retrieval of context is essential for generating responses that are both accurate and contextually appropriate.

Context Precision@K is calculated as the weighted average of precision at rank k (Precision@k) over the top K retrieved contexts, where a rank contributes only if the chunk at that rank is relevant. Precision@k is the fraction of relevant chunks among the top k retrieved chunks, so the metric rewards rankings that place relevant results first.

Formula:

Context Precision@K =
(Sum over k = 1..K of (Precision@k × Relevance at rank k)) / (Total number of relevant items in the top K results)

where:

• Precision@k = True Positives at rank k / (True Positives at rank k + False Positives at rank k)

• True Positives at rank k are the retrieved chunks within the top k results that are actually relevant.

• Relevance at rank k is a binary indicator: 1 if the chunk at rank k is relevant, 0 if it is irrelevant.
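The definitions above can be sketched in a few lines of Python. This is a minimal illustration of the formula, not code from the Ragas library; the function name and the toy relevance judgments are assumptions made for the example.

```python
def context_precision_at_k(relevance):
    """Compute Context Precision@K from binary relevance judgments.

    relevance: list of 0/1 flags for the retrieved chunks, in rank order
    (index 0 is rank 1). Returns the sum of Precision@k over ranks that
    hold a relevant chunk, divided by the total number of relevant items.
    """
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    score = 0.0
    hits = 0  # relevant chunks seen so far (true positives at rank k)
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / k  # Precision@k = relevant chunks in top k / k
    return score / total_relevant

# Example: relevant chunks at ranks 1 and 3 out of 4 retrieved.
# Precision@1 = 1/1, Precision@3 = 2/3, so the score is (1 + 2/3) / 2.
print(context_precision_at_k([1, 0, 1, 0]))  # ≈ 0.833
```

Note how a relevant chunk buried at a low rank contributes a smaller Precision@k term, which is why the metric favors retrievers that surface relevant context early.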

 

Types of Context Precision Approaches:

 

1. LLM-Based Context Precision (Without Reference): Uses an LLM to evaluate if the retrieved context is relevant based on the user input and response.

2. LLM-Based Context Precision (With Reference): Utilizes an LLM to compare each retrieved context against a reference answer to determine relevance.

3. Non-LLM-Based Context Precision (With Reference Contexts): Applies traditional, non-LLM distance measures to assess the relevance of retrieved contexts against provided reference contexts.
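As a rough illustration of the third, non-LLM approach, the sketch below marks a retrieved context as relevant when its token-level Jaccard similarity to any reference context clears a threshold, then scores the resulting binary flags with the Context Precision@K formula above. The Jaccard measure and the 0.5 threshold are illustrative assumptions; the actual Ragas implementation may use a different distance measure.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two strings (assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def non_llm_context_precision(retrieved, references, threshold=0.5):
    """Score retrieved contexts against reference contexts without an LLM.

    A retrieved context counts as relevant if its similarity to any
    reference context exceeds the (illustrative) threshold.
    """
    relevance = [
        1 if any(jaccard(ctx, ref) > threshold for ref in references) else 0
        for ctx in retrieved
    ]
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    score, hits = 0.0, 0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / k  # Precision@k at each relevant rank
    return score / total_relevant

retrieved = ["The Eiffel Tower is in Paris", "Bananas are yellow"]
references = ["The Eiffel Tower is located in Paris"]
print(non_llm_context_precision(retrieved, references))  # 1.0
```

Swapping the two retrieved contexts drops the score to 0.5: the single relevant chunk then sits at rank 2, where Precision@2 is 1/2.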

 

This metric is critical for ensuring that retrieval-augmented models ground their responses in precise, relevant information, improving the quality of interactions across applications such as chatbots, educational tools, and customer support systems.

References

Ragas Documentation: Context Precision


Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.