These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
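MAUVE is a library built on PyTorch and HuggingFace Transformers that measures the gap between machine-generated (neural) text and human text using the eponymous MAUVE measure. The measure summarizes two kinds of error: Type I errors, where the model generates text that is unlikely under the human distribution, and Type II errors, where the model fails to generate text that is plausible under the human distribution. Both are measured softly using Kullback–Leibler (KL) divergences.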
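To make the idea of softly measured Type I and Type II errors concrete, the following is a minimal sketch, not the library's implementation. It assumes the human and model texts have already been reduced to two discrete histograms over the same bins (the real library derives these by embedding and quantizing text, which is omitted here). The function name `mauve_sketch`, the scaling constant of 5.0, and the number of mixture weights are illustrative assumptions. The sketch traces a divergence curve over mixtures of the two distributions and reports the area under it.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mauve_sketch(p, q, scale=5.0, steps=99):
    """Area under a divergence curve between histograms p (human) and q (model).

    Hypothetical helper for illustration only. `scale` and `steps` are
    assumed values, not settings taken from the source.
    """
    # Start of the curve: the limit where the mixture equals q.
    points = [(1.0, 0.0)]
    for i in range(1, steps + 1):
        lam = i / (steps + 1)  # mixture weight strictly between 0 and 1
        r = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]
        x = math.exp(-scale * kl(q, r))  # soft error of q relative to the mixture
        y = math.exp(-scale * kl(p, r))  # soft error of p relative to the mixture
        points.append((x, y))
    # End of the curve: the limit where the mixture equals p.
    points.append((0.0, 1.0))

    # Trapezoidal rule along the curve; x is non-increasing in lam,
    # so each segment contributes a non-negative width.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x0 - x1) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    human = [0.7, 0.3]
    model = [0.4, 0.6]
    print(mauve_sketch(human, human))  # identical distributions: close to 1
    print(mauve_sketch(human, model))  # partially overlapping: between 0 and 1
```

Under these assumptions, identical histograms give a score close to 1, and histograms with disjoint support give a score close to 0, which matches the intuition that a higher value means the model's outputs are closer to human text.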
MAUVE can indirectly support the Robustness objective by quantifying how closely a generative model's outputs match the distribution of human data. Significant divergence detected by MAUVE may indicate that the model is producing outputs that are unreliable or out-of-distribution, which could signal a lack of robustness. However, this connection is indirect, as MAUVE does not explicitly measure system resilience under adverse conditions or operational reliability.
