Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

This metric computes the area under the Receiver Operating Characteristic (ROC) curve, commonly abbreviated AUC. The returned score, which ranges from 0 to 1, indicates how well the model predicts the correct classes for the input data. A score of 0.5 means that the model is predicting exactly at chance, i.e. the model’s predictions are correct at the same rate as if they were decided by the flip of a fair coin or the roll of a fair die. A score above 0.5 indicates that the model is doing better than chance, while a score below 0.5 indicates that it is doing worse than chance.
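To make the chance-level interpretation concrete, here is a minimal sketch of the binary case in plain Python. It relies on a standard equivalence: the area under the ROC curve equals the fraction of (positive, negative) example pairs in which the positive example receives the higher score, with ties counted as half. The function name `binary_roc_auc` and the sample data are illustrative assumptions, not part of the catalogued metric's API.

```python
def binary_roc_auc(labels, scores):
    """Binary ROC AUC via pairwise comparison (hypothetical helper).

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Illustrative data (made up for this sketch).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.5, 0.2, 0.99, 0.3, 0.1, 0.7]
auc = binary_roc_auc(labels, scores)

# A model that gives every example the same score carries no ranking
# information, so it sits exactly at chance.
chance_auc = binary_roc_auc(labels, [0.5] * len(labels))
```

The quadratic loop is chosen for readability. For large datasets a rank-based or threshold-sweeping implementation from an established library would be the more practical choice.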

This metric has three separate use cases:

  • binary: The case in which there are only two different label classes, and each example gets only one label. This is the default implementation.
  • multiclass: The case in which there can be more than two different label classes, but each example still gets only one label.
  • multilabel: The case in which there can be more than two different label classes, and each example can have more than one label.
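The multiclass and multilabel cases can be reduced to several binary problems whose scores are then averaged. The sketch below assumes a one-vs-rest reduction with an unweighted (macro) average. That is one common convention, but the page does not say which averaging the catalogued metric uses, so treat this choice, the function names and the sample data as assumptions.

```python
def binary_roc_auc(labels, scores):
    """Binary AUC as the fraction of correctly ranked (positive, negative) pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def multiclass_ovr_auc(labels, score_rows):
    """Macro one-vs-rest AUC (assumed convention).

    labels: one class index per example.
    score_rows: score_rows[i][k] is the score of class k for example i.
    """
    n_classes = len(score_rows[0])
    per_class = []
    for k in range(n_classes):
        is_k = [1 if y == k else 0 for y in labels]   # class k vs. the rest
        per_class.append(binary_roc_auc(is_k, [row[k] for row in score_rows]))
    return sum(per_class) / n_classes


def multilabel_macro_auc(label_rows, score_rows):
    """Macro AUC over labels (assumed convention).

    label_rows: label_rows[i][j] is 1 if example i carries label j, else 0.
    score_rows: score_rows[i][j] is the score of label j for example i.
    """
    n_labels = len(label_rows[0])
    per_label = [
        binary_roc_auc([row[j] for row in label_rows],
                       [row[j] for row in score_rows])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Multiclass: each example has exactly one of three classes.
mc_labels = [0, 1, 2, 2]
mc_scores = [[0.7, 0.2, 0.1],
             [0.2, 0.6, 0.2],
             [0.1, 0.3, 0.6],
             [0.5, 0.3, 0.2]]
mc_auc = multiclass_ovr_auc(mc_labels, mc_scores)

# Multilabel: each example may carry zero, one or both labels.
ml_labels = [[1, 0], [1, 1], [0, 1], [0, 0]]
ml_scores = [[0.9, 0.2], [0.6, 0.7], [0.4, 0.8], [0.7, 0.1]]
ml_auc = multilabel_macro_auc(ml_labels, ml_scores)
```

The structural difference between the two cases is in the inputs: the multiclass function takes a single class index per example, while the multilabel function takes a row of independent 0/1 indicators per example.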

Related use cases:

Uploaded on Oct 21, 2022
A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the “rating” method, or by mathematical prediction...


Uploaded on Nov 1, 2023
Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of "pai...

Uploaded on Nov 1, 2023
Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely. Given the limited hardware resources, existing 3D perception models are not able...

Uploaded on Nov 1, 2023
Classical approaches for one-class problems such as one-class SVM and isolation forest require careful feature engineering when applied to structured domains like images. State-of-...

Uploaded on Nov 1, 2023
Histology images are inherently symmetric under rotation, where each orientation is equally as likely to appear. However, this rotational symmetry is not widely utilised as prior k...

Uploaded on Nov 1, 2023
There are growing implications surrounding generative AI in the speech domain that enable voice cloning and real-time voice conversion from one individual to another. This technolo...

Uploaded on Nov 1, 2023
3D object detection in point clouds is a challenging vision task that benefits various applications for understanding the 3D visual world. Lots of recent research focuses on how to...

Uploaded on Nov 1, 2023
Zero-shot learning (ZSL) aims to predict unseen classes whose samples have never appeared during training. One of the most effective and widely used semantic information for zero-s...

Uploaded on Nov 1, 2023
The increasing volume of commercially available conversational agents (CAs) on the market has resulted in users being burdened with learning and adopting multiple agents to accompl...

Uploaded on Nov 1, 2023
Deep AUC Maximization (DAM) is a new paradigm for learning a deep neural network by maximizing the AUC score of the model on a dataset. Most previous works of AUC maximization focu...

Uploaded on Nov 1, 2023
Driven by large-data pre-training, Segment Anything Model (SAM) has been demonstrated as a powerful and promptable framework, revolutionizing the segmentation models. Despite the g...

Uploaded on Nov 1, 2023
In recent years, there is strong emphasis on mining medical data using machine learning techniques. A common problem is to obtain a noiseless set of textual documents, with a relev...

Uploaded on Nov 1, 2023
The referring video object segmentation task (RVOS) involves segmentation of a text-referred object instance in the frames of a given video. Due to the complex nature of this multi...

Uploaded on Nov 1, 2023
Rotation-invariance is a desired property of machine-learning models for medical image analysis and in particular for computational pathology applications. We propose a framework t...

Uploaded on Nov 1, 2023
Audio-based automatic speech recognition (ASR) degrades significantly in noisy environments and is particularly vulnerable to interfering speech, as the model cannot determine whic...

Uploaded on Mar 15, 2024
Multimodal Large Language Models (MLLMs) have experienced significant advancements recently. Nevertheless, challenges persist in the accurate recognition and comprehension of intri...



Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.