The CIDEr (Consensus-based Image Description Evaluation) metric is a way of evaluating the quality of generated textual descriptions of images. The CIDEr metric measures the similarity between a generated caption and the reference captions, and it is based on the concept of consensus: the idea that good captions should not only be similar to the reference captions in terms of word choice and grammar, but also in terms of meaning and content.
The CIDEr metric is computed as follows (a minimal code sketch is given after this list):
1. First, a set of reference captions is provided for each image. These captions serve as the ground truth for the evaluation.
2. The generated caption and each reference caption are represented as sets of n-grams (typically of length 1 to 4).
3. Each n-gram is given a TF-IDF weight: its frequency within the caption, multiplied by an inverse document frequency computed over the reference captions of the whole dataset, so that n-grams appearing in the references of only a few images carry more weight than common ones.
4. For each n-gram length, the cosine similarity between the TF-IDF vector of the generated caption and that of each reference caption is computed and averaged over the reference captions.
5. Finally, these per-length scores are averaged to produce the final CIDEr score. The widely used CIDEr-D variant additionally applies a length penalty and count clipping, and scales the result by a factor of 10.
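To make the steps above concrete, the following is a minimal, self-contained Python sketch of the computation, not the reference implementation: it assumes captions are already tokenized into lowercase word lists and omits stemming as well as the CIDEr-D refinements (length penalty, count clipping, scaling by 10).

```python
# Illustrative sketch only, not the reference CIDEr implementation.
from collections import Counter
from math import log, sqrt

def ngram_counts(tokens, n):
    """Count the n-grams (as tuples of words) in one tokenized caption."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def cosine(u, v):
    """Cosine similarity between two sparse TF-IDF vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cider(candidates, references, max_n=4):
    """candidates: one tokenized caption per image.
    references: for each image, a list of tokenized reference captions."""
    num_images = len(references)
    score = 0.0
    for n in range(1, max_n + 1):
        # Document frequency: in how many images' references does each n-gram occur?
        df = Counter()
        for refs in references:
            seen = set()
            for ref in refs:
                seen.update(ngram_counts(ref, n))
            df.update(seen)

        def tfidf(tokens):
            counts = ngram_counts(tokens, n)
            length = sum(counts.values()) or 1
            # Rare n-grams (low document frequency) receive a higher IDF weight.
            return {g: (c / length) * log(num_images / max(df[g], 1.0))
                    for g, c in counts.items()}

        # Average cosine similarity to the references, then over all images.
        per_n = 0.0
        for cand, refs in zip(candidates, references):
            cand_vec = tfidf(cand)
            per_n += sum(cosine(cand_vec, tfidf(ref)) for ref in refs) / len(refs)
        score += per_n / num_images
    return score / max_n
```

Note that the IDF weights are computed over the whole evaluation corpus, so the score is only meaningful when calculated across a full set of images rather than a single caption in isolation.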
The CIDEr metric has become a standard in the field of image captioning and has been used in several benchmark datasets and competitions. It is widely adopted because it evaluates both the language and the content of generated captions, rewarding agreement with the consensus of the human references.
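In practice, evaluations typically rely on an existing implementation rather than re-deriving the metric. The snippet below is a hedged usage sketch that assumes the Cider scorer shipped with the pycocoevalcap (COCO caption evaluation) package and its dictionary-of-captions interface, which may differ between versions.

```python
# Usage sketch assuming the pycocoevalcap package; its interface may vary by
# version, and captions are normally preprocessed with the toolkit's
# PTBTokenizer (plain lowercase strings are used here for brevity).
from pycocoevalcap.cider.cider import Cider

references = {"img1": ["a dog runs across the grass", "a brown dog running in a field"],
              "img2": ["two people ride bicycles down a street"]}
candidates = {"img1": ["a dog running through a field"],
              "img2": ["people riding bikes on a road"]}

corpus_score, per_image_scores = Cider().compute_score(references, candidates)
print(f"CIDEr: {corpus_score:.3f}")
```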
Related use cases:
- 3D Hand Reconstruction via Aggregating Intra and Inter Graphs Guided by Prior Knowledge for Hand-Object Interaction Scenario (uploaded Mar 15, 2024)
- EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning (uploaded Mar 15, 2024)
- HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields (uploaded Mar 15, 2024)