Recall-Oriented Understudy for Gisting Evaluation (ROUGE)

Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, is a set of metrics and a software package for evaluating automatic summarization and machine translation software in natural language processing. The metrics compare an automatically produced summary or translation against one or more human-produced reference summaries or translations.

Note that ROUGE is case insensitive, meaning that upper case letters are treated the same way as lower case letters.
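
As a rough illustration of the comparison described above, the following Python sketch computes a simplified n-gram overlap score in the style of ROUGE-N for one candidate and one reference. It is not the official ROUGE package: the function names `ngrams` and `rouge_n`, the whitespace tokenization, and the absence of stemming or multi-reference handling are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N style score against a single reference.

    Assumptions (not from the catalogue entry): whitespace tokenization,
    no stemming, no stopword removal, one reference only.
    """
    # Lowercase both texts, mirroring the case insensitivity noted above.
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)

    # Clipped overlap: an n-gram counts only as often as it occurs in both texts.
    overlap = sum((cand & ref).values())

    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example texts, chosen only to exercise the function.
scores = rouge_n("The cat sat on the mat", "the cat is on the mat", n=1)
print(scores)
```

In this example five of the six reference unigrams are matched, so recall is 5/6. Established implementations typically add stemming, support for several references, and further variants, so scores from this sketch should not be expected to match theirs exactly.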

Related use cases:

Better Summarization Evaluation with Word Embeddings for ROUGE

Uploaded on Nov 1, 2022

ROUGE is a widely adopted, automatic evaluation measure for text summarization. While it has been shown to correlate well with human judgements, it is biased towards surface le...

BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations

Uploaded on Nov 1, 2023

In this paper, we propose a novel benchmark called the StarCraft Multi-Agent Challenges+, where agents learn to perform multi-stage tasks and to use environmental factors without p...

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

Uploaded on Nov 1, 2023

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced mu...

MAST: Multimodal Abstractive Summarization with Trimodal Hierarchical Attention

Uploaded on Nov 1, 2023

Large-scale deployment of autonomous vehicles has been continually delayed due to safety concerns. On the one hand, comprehensive scene understanding is indispensable, a lack of wh...

Momentum Calibration for Text Generation

Uploaded on Nov 1, 2023

Vision Transformer (ViT) extends the application range of transformers from language processing to computer vision tasks as being an alternative architecture against the existing c...

Multimodal Pretraining for Dense Video Captioning

Uploaded on Nov 1, 2023

Traffic forecasting as a canonical task of multivariate time series forecasting has been a significant research topic in AI community. To address the spatio-temporal heterogeneity ...

NITS-VC System for VATEX Video Captioning Challenge 2020

Uploaded on Nov 1, 2023

Video captioning is the process of summarising the content, event and action of the video into a short textual form which can be helpful in many research areas such as video guided mac...

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

Uploaded on Nov 1, 2023

Convolution Neural Networks (CNNs) are widely used in medical image analysis, but their performance degrades when the magnification of testing images differs from the training images...

RefineCap: Concept-Aware Refinement for Image Captioning

Uploaded on Nov 1, 2023

We present Video-LLaMA a multi-modal framework that empowers Large Language Models (LLMs) with the capability of understanding both visual and auditory content in the video. Video-...

Scaling Up Vision-Language Pre-training for Image Captioning

Uploaded on Nov 1, 2023

The recent advances in neural language models have also been successfully applied to the field of chemistry, offering generative solutions for classical problems in molecular desig...

About the metric

Objective(s):

Transparency
Explainability

Purpose(s):

Interaction support/chatbots
Recognition/object detection

Target sector(s):

Science & technology
Innovation
Health
Environment
Corporate governance
Transport

Lifecycle stage(s):

Operate & monitor
Verify & validate
Build & interpret model

Target users:

Data scientist
Developer

Risk management stage(s):

Assess
Treat

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.