Response Relevancy evaluates how closely the generated answer aligns with the input query. The metric assigns a higher score to answers that directly and completely address the question, and penalizes answers that are incomplete or padded with redundant information. The score is particularly useful in retrieval-augmented generation (RAG) tasks, where it helps verify that the output stays pertinent to the user’s intent.
Formula:
Response Relevancy is computed as the mean cosine similarity between the embedding of the original question and the embeddings of several artificial questions generated (typically by an LLM) from the model’s response.
Response Relevancy = (1/N) Σ_{i=1}^{N} cos(E_gi, E_o)
Where:
• E_gi is the embedding of the i-th generated question.
• E_o is the embedding of the original question.
• N is the number of generated questions (default is 3).
This formula measures the alignment between the original query and the reconstructed questions derived from the model’s response: if the response fully addresses the question, questions generated from it should closely resemble the original, so higher cosine similarity indicates better relevance.
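For illustration, here is a minimal sketch of the computation in Python with NumPy. It assumes the artificial questions have already been generated and embedded upstream; the function names are hypothetical and not part of any particular library.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def response_relevancy(original_emb: np.ndarray,
                       generated_embs: list[np.ndarray]) -> float:
    """Mean cosine similarity between the original question embedding (E_o)
    and the embeddings of the N questions generated from the response (E_gi)."""
    sims = [cosine_similarity(g, original_emb) for g in generated_embs]
    return sum(sims) / len(sims)
```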
• Low relevance: the response lacks critical details or is only partially relevant to the original question.
• High relevance: the response is directly aligned with the question and fully addresses it.
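The two regimes can be seen with a toy example that reuses the response_relevancy function from the sketch above. The 3-dimensional embeddings are hand-picked for illustration; real embeddings would come from a sentence-embedding model.

```python
original = np.array([1.0, 0.0, 0.0])

# Generated questions close to the original in embedding space -> high relevancy
close = [np.array([0.90, 0.10, 0.0]), np.array([0.95, 0.05, 0.0])]
# Generated questions pointing elsewhere in embedding space -> low relevancy
far = [np.array([0.10, 0.90, 0.0]), np.array([0.0, 0.20, 0.98])]

print(response_relevancy(original, close))  # ~0.996 (high relevance)
print(response_relevancy(original, far))    # ~0.055 (low relevance)
```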