Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Response Relevancy evaluates how closely a generated answer aligns with the input query. The metric assigns higher scores to answers that directly and completely address the question, and penalizes answers that are incomplete or contain redundant information. It is particularly useful in retrieval-augmented generation (RAG) tasks, where it helps verify that the output is pertinent to the user's intent.

Formula:

Response Relevancy is computed as the mean cosine similarity between the embedding of the original question and the embeddings of N artificial questions generated from the model's response:

Response Relevancy = (1/N) Σ_{i=1}^{N} cos(E_gi, E_o)

Where:

• E_gi is the embedding of the i-th generated question.

• E_o is the embedding of the original question.

• N is the number of generated questions (the default is 3).

This formula measures the alignment between the original query and the questions reconstructed from the model's response, with higher cosine similarity indicating better relevance.
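
Example (Python):

A minimal sketch of the scoring step, assuming the artificial questions have already been generated and embedded (in practice an LLM generates them from the response and an embedding model encodes them). The function name response_relevancy and the toy vectors below are illustrative, not taken from any particular library.

import numpy as np

def response_relevancy(original_emb, generated_embs):
    """Mean cosine similarity between the original-question embedding
    and each generated-question embedding."""
    e_o = np.asarray(original_emb, dtype=float)
    sims = []
    for g in generated_embs:
        e_g = np.asarray(g, dtype=float)
        sims.append(float(np.dot(e_o, e_g) / (np.linalg.norm(e_o) * np.linalg.norm(e_g))))
    return sum(sims) / len(sims)

# Toy example with N = 3 questions generated from the response.
E_o = [0.9, 0.1, 0.3]
E_g = [[0.8, 0.2, 0.3], [0.9, 0.0, 0.4], [0.7, 0.3, 0.2]]
print(round(response_relevancy(E_o, E_g), 3))  # near 1.0, i.e. highly relevant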

Low Relevance: Indicates that the response lacks critical details or is only partially relevant to the original question.

High Relevance: Indicates that the response is directly aligned with the question and fully addresses it.
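
To make this interpretation concrete, the sketch below scores an on-topic and an off-topic set of generated questions against the same original question. It substitutes a deliberately simple bag-of-words embedding for a real embedding model, and all question texts and function names are hypothetical.

from collections import Counter
import math

def bow_embed(text):
    # Toy bag-of-words "embedding" (word counts); a real implementation
    # would use a sentence-embedding model here.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

original = bow_embed("what is the capital of france")

# Questions regenerated from a complete, on-topic answer ...
on_topic = ["what is the capital city of france",
            "which city is the capital of france",
            "what is france's capital"]
# ... and from an incomplete or off-topic answer.
off_topic = ["what is the population of japan",
             "when was the eiffel tower built",
             "who wrote les miserables"]

for label, questions in (("high relevance", on_topic), ("low relevance", off_topic)):
    score = sum(cosine(original, bow_embed(q)) for q in questions) / len(questions)
    print(label, round(score, 3))
# The on-topic set scores well above the off-topic set.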

