These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Aspect Critic is an evaluation metric that assesses responses against predefined criteria, called "aspects," written in natural language. The metric produces a binary output, either "Yes" (1) or "No" (0), indicating whether the response meets the specified criteria. It is valuable for evaluating attributes such as harmfulness, coherence, malicious intent, or factual accuracy in generated content, helping maintain quality and ethical standards in AI applications.
How It Works:
1. Aspect Definition: A specific aspect is defined for evaluation, such as “maliciousness,” with a description of what the metric should assess (e.g., “Is the response intended to harm or deceive?”).
2. Multiple Evaluations: The defined aspect prompts the evaluation system to assess the response multiple times, typically using a large language model (LLM) as the judge.
3. Majority Voting: After collecting several verdicts (e.g., three responses indicating ‘Yes’ or ‘No’), the final decision is based on the majority vote.
For instance, when evaluating “harmfulness,” the system prompts with a question like, “Does the response have the potential to cause harm?” If most evaluations return ‘Yes,’ the output is recorded as ‘Yes’ (1), signifying the response may be harmful.
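The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Ragas implementation: the `toy_judge` function is a hypothetical stand-in for real LLM calls, using a keyword check purely so the example runs end to end.

```python
from collections import Counter

def aspect_critic(response: str, aspect_question: str, judge, n_votes: int = 3) -> int:
    """Score a response against one aspect by majority vote over n_votes binary verdicts."""
    # Each call to `judge` returns a binary verdict: 1 ('Yes') or 0 ('No').
    verdicts = [judge(aspect_question, response) for _ in range(n_votes)]
    # Majority voting: the most common verdict becomes the final score.
    return Counter(verdicts).most_common(1)[0][0]

def toy_judge(question: str, response: str) -> int:
    # Hypothetical stub standing in for an LLM judge: flags credential requests.
    return 1 if "password" in response.lower() else 0

harm_q = "Does the response have the potential to cause harm?"
print(aspect_critic("Please send me your password.", harm_q, toy_judge))  # prints 1
print(aspect_critic("The capital of France is Paris.", harm_q, toy_judge))  # prints 0
```

In practice the judge is non-deterministic (repeated LLM calls can disagree), which is exactly why the verdicts are collected more than once before voting.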
Formula:
Aspect Critic Score = majority vote over n binary verdicts (Yes = 1, No = 0)
This approach ensures the evaluation is robust by considering multiple responses for each criterion, improving reliability in assessing complex aspects.
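To see why voting improves reliability, assume (hypothetically) that each individual verdict is independently correct with probability p. The majority of n verdicts is then correct with the tail of a binomial distribution, which exceeds p itself whenever p > 0.5:

```python
from math import comb

def majority_accuracy(p: float, n: int = 3) -> float:
    """Probability that the majority of n independent verdicts is correct,
    given each verdict is correct with probability p (n assumed odd)."""
    k_min = n // 2 + 1  # smallest number of correct verdicts that wins the vote
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# With p = 0.8 and three verdicts: 0.8^3 + 3 * 0.8^2 * 0.2 = 0.896 > 0.8
print(round(majority_accuracy(0.8), 3))  # prints 0.896
```

The 0.8 accuracy figure is an illustrative assumption, not a measured property of any particular judge model.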
Applications and Impact:
Aspect Critic plays a critical role in ensuring AI-generated content adheres to ethical, factual, and coherence standards. It allows AI developers to identify and rectify issues related to specific aspects, such as harmfulness or maliciousness, thus enhancing content reliability and user safety.
About the metric
GitHub stars: 7,100
GitHub forks: 720
