CHAIR (Caption Hallucination Assessment with Image Relevance) is a metric designed to measure object hallucination in image captioning models by checking generated captions against the actual image content. It evaluates how often models “hallucinate” objects not present in the image, and it measures caption quality using veridical visual labels, that is, ground-truth object labels for the image.
Applicable Models
CHAIR applies to image captioning models, including attention-based models (e.g., TopDown, NBT) and non-attention-based models (e.g., FC, LRCN), that are trained and evaluated on datasets such as MSCOCO.
Background
Standard sentence metrics such as CIDEr, METEOR, and SPICE do not sufficiently penalize hallucinated objects, which makes them less reliable for assessing image relevance. CHAIR addresses this gap by checking the objects mentioned in a generated caption against both the ground-truth object annotations and the ground-truth captions for the image.
Formulae
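• CHAIRi (per-instance) = (Number of hallucinated objects) / (Total number of objects mentioned in captions)
• CHAIRs (per-sentence) = (Number of sentences with at least one hallucinated object) / (Total number of sentences)
Both scores range from 0 to 1, and lower values indicate less hallucination.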
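The two ratios above can be illustrated with a minimal Python sketch. It is not the reference implementation: the function name `chair_scores` and its input layout are hypothetical, and it assumes that object extraction (mapping caption words and synonyms to object categories) has already been done, so each caption and each image is represented as a set of object category names.

```python
from typing import Sequence, Set, Tuple


def chair_scores(
    mentioned_per_caption: Sequence[Set[str]],
    present_per_image: Sequence[Set[str]],
) -> Tuple[float, float]:
    """Return (CHAIRi, CHAIRs) for aligned captions and images.

    Assumption: mentioned_per_caption[k] holds the object categories
    mentioned in caption k, and present_per_image[k] holds the
    ground-truth object categories of the image that caption describes.
    """
    if len(mentioned_per_caption) != len(present_per_image):
        raise ValueError("captions and images must be aligned one-to-one")

    total_mentions = 0          # denominator of CHAIRi
    hallucinated_mentions = 0   # numerator of CHAIRi
    hallucinated_captions = 0   # numerator of CHAIRs

    for mentioned, present in zip(mentioned_per_caption, present_per_image):
        # An object is hallucinated if it is mentioned but not in the image.
        hallucinated = mentioned - present
        total_mentions += len(mentioned)
        hallucinated_mentions += len(hallucinated)
        if hallucinated:
            hallucinated_captions += 1

    num_captions = len(mentioned_per_caption)
    # Guard against empty input or captions that mention no objects.
    chair_i = hallucinated_mentions / total_mentions if total_mentions else 0.0
    chair_s = hallucinated_captions / num_captions if num_captions else 0.0
    return chair_i, chair_s


# Toy data (made up for illustration): the second caption mentions a
# "table" that is not among its image's ground-truth objects.
captions = [{"dog", "frisbee"}, {"cat", "table"}, {"person"}]
images = [{"dog", "frisbee", "person"}, {"cat", "couch"}, {"person", "bicycle"}]
chair_i, chair_s = chair_scores(captions, images)
print(f"CHAIRi = {chair_i:.3f}, CHAIRs = {chair_s:.3f}")
```

In this toy example, one of the five mentioned objects is hallucinated and one of the three captions contains a hallucination, so CHAIRi is 1/5 and CHAIRs is 1/3. Counting each object category once per caption is a simplification made here; a full pipeline may count repeated mentions differently.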
Applications
CHAIR is used to evaluate hallucination tendencies in image captioning models, aiding in model comparison and optimization for tasks requiring high image-caption fidelity. It also helps identify how much a model relies on language priors versus visual input.
Impact
CHAIR promotes responsible AI development by encouraging models to produce captions that are more closely aligned with image content, which reduces misleading outputs. It helps mitigate risks associated with hallucinations, especially in sensitive applications such as assistive tools for visually impaired users or automated systems that act on captions.
References
Rohrbach, A., Hendricks, L. A., Burns, K., Darrell, T., & Saenko, K. (2018). Object hallucination in image captioning. arXiv preprint arXiv:1809.02156.
About the metric
GitHub stars: 63
GitHub forks: 8
