Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Aspect Critic is an evaluation metric used to assess responses based on predefined criteria, called “aspects,” written in natural language. This metric produces a binary output—either ‘Yes’ (1) or ‘No’ (0)—indicating whether the response meets the specified criteria. It is valuable for evaluating attributes like harmfulness, coherence, malicious intent, or factual accuracy in generated content, helping maintain quality and ethical standards in AI applications.

How It Works:

1. Aspect Definition: A specific aspect is defined for evaluation, such as “maliciousness,” with a description of what the metric should assess (e.g., “Is the response intended to harm or deceive?”).

2. Multiple Evaluations: The defined aspect prompts the evaluation system to assess the response multiple times, typically using a large language model (LLM) as the judge.

3. Majority Voting: After collecting several verdicts (e.g., three responses indicating ‘Yes’ or ‘No’), the final decision is based on the majority vote.

For instance, when evaluating “harmfulness,” the system prompts with a question like, “Does the response have the potential to cause harm?” If most evaluations return ‘Yes,’ the output is recorded as ‘Yes’ (1), signifying the response may be harmful.
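The three steps above can be sketched as a short Python function. This is a minimal illustration, not any specific library's implementation: `judge` is a hypothetical stand-in for a call to an LLM evaluator, and `stub_judge` is a toy deterministic judge used only for the demo.

```python
from collections import Counter

def aspect_critic(response: str, judge, n_votes: int = 3) -> int:
    """Score a response against one aspect by majority vote.

    `judge` is a hypothetical callable standing in for an LLM call;
    given the response text, it returns the verdict 'Yes' or 'No'.
    Returns 1 if the majority verdict is 'Yes', otherwise 0.
    """
    # Step 2: collect several independent verdicts for the same response.
    verdicts = [judge(response) for _ in range(n_votes)]
    # Step 3: take the most common verdict as the final decision.
    majority, _ = Counter(verdicts).most_common(1)[0]
    return 1 if majority == "Yes" else 0

# Demo with a toy judge for a "harmfulness"-style aspect that flags
# responses asking for a password (purely illustrative):
def stub_judge(text: str) -> str:
    return "Yes" if "password" in text.lower() else "No"

print(aspect_critic("Please send me your password.", stub_judge))  # 1
print(aspect_critic("The weather is sunny today.", stub_judge))    # 0
```

In practice each call to the judge would hit an LLM with the aspect's definition in the prompt, so individual verdicts can vary; using an odd number of votes avoids ties.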

Formula:

Aspect Critic Score = 1 if the majority of the collected verdicts are 'Yes', else 0

Aggregating multiple verdicts per criterion makes the evaluation more robust, since no single noisy judgment determines the outcome, which improves reliability when assessing complex aspects.

Applications and Impact:

Aspect Critic plays a critical role in ensuring AI-generated content adheres to ethical, factual, and coherence standards. It allows AI developers to identify and rectify issues related to specific aspects, such as harmfulness or maliciousness, thus enhancing content reliability and user safety.

Trustworthy AI Relevance

This metric addresses Explainability and Robustness by quantifying relevant system properties.

Explainability: Aspect Critic decomposes model outputs into named aspects and produces explicit critiques or scores per aspect, which directly improves the ability to explain why an output is good or bad and helps users and developers understand failure modes.

Robustness: By measuring consistency across many aspects and identifying specific weaknesses (e.g., factual errors, hallucination, inconsistency under perturbation), Aspect Critic helps detect reliability issues, compare performance under distributional shift, and guide robustness improvements.

About the metric

GitHub stars:

  • 7100

GitHub forks:

  • 720


Partnership on AI

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.