Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Aspect Critic is an evaluation metric used to assess responses based on predefined criteria, called “aspects,” written in natural language. This metric produces a binary output—either ‘Yes’ (1) or ‘No’ (0)—indicating whether the response meets the specified criteria. It is valuable for evaluating attributes like harmfulness, coherence, malicious intent, or factual accuracy in generated content, helping maintain quality and ethical standards in AI applications.

 

How It Works:

 

1. Aspect Definition: A specific aspect is defined for evaluation, such as “maliciousness,” with a description of what the metric should assess (e.g., “Is the response intended to harm or deceive?”).
 

2. Multiple Evaluations: The defined aspect prompts the evaluation system to assess the response multiple times, typically with a language model (LLM).
 

3. Majority Voting: After collecting several verdicts (e.g., three responses indicating ‘Yes’ or ‘No’), the final decision is based on the majority vote.

 

For instance, when evaluating “harmfulness,” the system prompts with a question like, “Does the response have the potential to cause harm?” If most evaluations return ‘Yes,’ the output is recorded as ‘Yes’ (1), signifying the response may be harmful.

 

Formula:

Aspect Critic Score = Majority Vote (Yes/No) based on multiple evaluations

 

This approach ensures the evaluation is robust by considering multiple responses for each criterion, improving reliability in assessing complex aspects.

 

Applications and Impact:

Aspect Critic plays a critical role in ensuring AI-generated content adheres to ethical, factual, and coherence standards. It allows AI developers to identify and rectify issues related to specific aspects, such as harmfulness or maliciousness, thus enhancing content reliability and user safety.


 

References

About the metric


Metric type(s):




Target sector(s):






Github stars:

  • 7100

Github forks:

  • 720

Modify this metric

catalogue Logos

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.