Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.


Noise Sensitivity measures how susceptible a language model is to making errors when irrelevant or noisy information appears in its context. Specifically, it evaluates how likely the model is to generate incorrect responses when conditioning on both relevant and irrelevant retrieved documents. This metric is vital for assessing the robustness of a Retrieval-Augmented Generation (RAG) system, as it helps verify that the model’s performance is not degraded by extraneous or misleading information.

 

Formula:

Noise Sensitivity (relevant) = (Total number of incorrect claims in the response) / (Total number of claims in the response)

 

Total number of incorrect claims in the response: claims in the model’s generated answer that are not supported by the relevant context or that deviate from the ground truth.

Total number of claims in the response: all claims made in the response, both correct and incorrect.
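
As a concrete illustration, once each claim in a generated answer has been labelled as correct or incorrect by some verification step, the score reduces to a simple ratio. The Python sketch below assumes the claim verdicts are already available; the function name and example verdicts are hypothetical.

# Minimal sketch: compute noise sensitivity from claim-level verdicts.
# The verdicts here are hypothetical; in practice they come from a
# claim-extraction and verification step over the retrieved context.
def noise_sensitivity(claim_is_incorrect: list[bool]) -> float:
    """Fraction of claims in the response that are incorrect."""
    if not claim_is_incorrect:
        return 0.0  # no claims extracted, nothing to score
    return sum(claim_is_incorrect) / len(claim_is_incorrect)

verdicts = [False, False, True, False]  # True = incorrect / unsupported claim
print(noise_sensitivity(verdicts))      # 0.25

A score of 0 means no claims were corrupted by noise, while higher values indicate that a larger share of the response is unsupported.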

 

Types of Noise Sensitivity Approaches:

 

1. LLM-Based Noise Sensitivity: Uses a language model to assess whether each claim in the response is correct and supported by the retrieved context. This approach can differentiate between errors due to relevant and irrelevant information.

2. Non-LLM-Based Noise Sensitivity: Employs traditional comparison methods, using distance or similarity measures to evaluate discrepancies between the response and the retrieved contexts (a rough sketch follows this list).
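
As a rough sketch of the non-LLM-based approach, each claim can be flagged as unsupported when its best similarity against the retrieved contexts falls below a threshold. The tokenisation, the token-overlap measure, and the 0.8 threshold below are illustrative assumptions, not a prescribed implementation.

import string

def _tokens(text: str) -> set[str]:
    # Lowercase, strip punctuation, split on whitespace.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())

def claim_supported(claim: str, contexts: list[str], threshold: float = 0.8) -> bool:
    claim_toks = _tokens(claim)
    if not claim_toks:
        return True   # empty claim: nothing to contradict
    if not contexts:
        return False  # no context at all: nothing can support the claim
    # A claim counts as supported if some context covers enough of its tokens.
    best = max(len(claim_toks & _tokens(ctx)) / len(claim_toks) for ctx in contexts)
    return best >= threshold

def noise_sensitivity_non_llm(claims: list[str], contexts: list[str]) -> float:
    incorrect = [not claim_supported(c, contexts) for c in claims]
    return sum(incorrect) / len(incorrect) if incorrect else 0.0

contexts = ["Paris is the capital of France.", "France uses the euro."]
claims = ["Paris is the capital of France.", "France uses the yen."]
print(noise_sensitivity_non_llm(claims, contexts))  # 0.5

An LLM-based variant would replace claim_supported with a call to a judge model, which also makes it possible to attribute each error to a relevant or an irrelevant retrieved document.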

 

This metric is essential for ensuring robust performance in applications where models interact with varied or noisy data sources, such as customer support, educational content, and high-stakes decision-making systems.



Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.