Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Inspect: an open-source framework for large language model evaluations



Created by the UK AI Safety Institute, Inspect is a software library that enables testers – from start-ups, academia and AI developers to international governments – to assess specific capabilities of individual models and then produce a score based on the results. Inspect can be used to evaluate models in a range of areas, including their core knowledge, ability to reason, and autonomous capabilities. Released under an open-source licence, Inspect is freely available.

By making Inspect available to the global community, the Institute is helping to accelerate the work on AI safety evaluations being carried out across the globe, leading to better safety testing and the development of more secure models. A shared tool also allows for a more consistent approach to AI safety evaluations around the world.

Inspect provides many built-in components, including facilities for prompt engineering, tool usage, multi-turn dialogue, and model-graded evaluations.
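
To make the idea of "assess a capability, then produce a score" concrete, the following is a minimal plain-Python sketch of an evaluation loop. It does not use Inspect's own API: the names Sample, stub_model, exact_match and evaluate are hypothetical and exist only for illustration, and the "model" is a lookup table standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """One evaluation item: a prompt and the answer we expect back."""
    prompt: str
    target: str


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (assumption: no network access).

    Answers come from a fixed lookup table; one answer is deliberately wrong
    so that the score is not trivially perfect.
    """
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": " paris ",
        "How many days are in a week?": "7",
        "What is 10 / 4?": "2",  # wrong on purpose; the target is 2.5
    }
    return canned.get(prompt, "")


def exact_match(output: str, target: str) -> bool:
    """Score one answer: case-insensitive match after trimming whitespace."""
    return output.strip().lower() == target.strip().lower()


def evaluate(
    model: Callable[[str], str],
    samples: List[Sample],
    scorer: Callable[[str, str], bool] = exact_match,
) -> float:
    """Run the model on every sample and return the fraction scored correct."""
    if not samples:
        raise ValueError("at least one sample is required")
    correct = sum(scorer(model(s.prompt), s.target) for s in samples)
    return correct / len(samples)


SAMPLES = [
    Sample("What is 2 + 2?", "4"),
    Sample("What is the capital of France?", "Paris"),
    Sample("How many days are in a week?", "7"),
    Sample("What is 10 / 4?", "2.5"),
]

accuracy = evaluate(stub_model, SAMPLES)
print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.75
```

In a real evaluation the stub would be replaced by a call to the model under test, and the exact-match scorer could be swapped for a different one, for example a second model that grades free-text answers.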

For more information, please see the press release "AI Safety Institute releases new AI safety evaluation platform".

 

 

About the tool


Developing organisation(s): UK AI Safety Institute


Objective(s):


Country of origin: United Kingdom


Type of approach:





Tags:

  • collaborative governance
  • evaluation
  • large language model
  • open source
  • AISI


Use Cases

There are no use cases for this tool yet.

Would you like to submit a use case for this tool?

If you have used this tool, we would love to know more about your experience.


Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.