Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Given a model and an input text sequence, perplexity measures how likely the model is to generate that sequence. It can be used in two main ways:

- to evaluate how well the model has learned the distribution of the text it was trained on. In this case, the model input should be the trained model to be evaluated, and the input texts should be the text that the model was trained on.
- to evaluate how well a selection of text matches the distribution of text that the input model was trained on. In this case, the model input should be a trained model, and the input texts should be the text to be evaluated.
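In both cases, perplexity is the exponential of the average negative log-likelihood that the model assigns to each token in the sequence, so lower values indicate that the text is more probable under the model. The snippet below is a minimal sketch of that computation using the Hugging Face transformers library; this catalogue entry does not prescribe an implementation, and the model name ("gpt2") and example sentence are illustrative assumptions only.

```python
# Minimal sketch: perplexity of a text under a causal language model.
# "gpt2" and the example sentence are illustrative choices, not part of the metric.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "Perplexity measures how well a language model predicts a text."
encodings = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Supplying labels makes the model return the mean cross-entropy,
    # i.e. the average negative log-likelihood per predicted token.
    outputs = model(**encodings, labels=encodings["input_ids"])

perplexity = torch.exp(outputs.loss)
print(f"Perplexity: {perplexity.item():.2f}")
```

Because perplexity depends on the tokenizer and vocabulary, values are only directly comparable between models that share the same tokenization.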

Related use cases:

- Uploaded on Nov 1, 2022: Neural text decoding is important for generating high-quality texts using language models. To generate high-quality text, popular decoding algorithms like top-k, top-p (nucleus...
- Uploaded on Nov 1, 2023: Long short-term memory (LSTM) networks and their variants are capable of encapsulating long-range dependencies, which is evident from their performance on a variety of linguistic t...



Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.