These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Inspect: an open-source framework for large language model evaluations
Created by the UK AI Safety Institute, Inspect is a software library which enables testers – from start-ups, academia and AI developers to international governments – to assess specific capabilities of individual models and then produce a score based on their results. Inspect can be used to evaluate models in a range of areas, including their core knowledge, ability to reason, and autonomous capabilities. Released under an open-source licence, Inspect is freely available.
By making Inspect available to the global community, the Institute is helping to accelerate work on AI safety evaluations being carried out across the globe, leading to better safety testing and the development of more secure models. This will also allow for a consistent approach to AI safety evaluations around the world.
Inspect provides many built-in components, including facilities for prompt engineering, tool usage, multi-turn dialogue, and model-graded evaluations.
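As an illustration of how these components fit together, the sketch below defines a small evaluation task with the inspect_ai Python package. The dataset, prompts and task name here are hypothetical, and exact parameter names may vary between Inspect releases.

```python
# A minimal sketch of an Inspect evaluation task, assuming the inspect_ai
# Python package. Dataset contents and prompts are illustrative only.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import model_graded_fact
from inspect_ai.solver import generate, system_message


@task
def capital_cities():
    # A tiny hypothetical dataset of question/answer pairs.
    dataset = [
        Sample(input="What is the capital of France?", target="Paris"),
        Sample(input="What is the capital of Japan?", target="Tokyo"),
    ]
    return Task(
        dataset=dataset,
        # Solver: set a system prompt, then generate the model's answer.
        solver=[
            system_message("Answer concisely with a single city name."),
            generate(),
        ],
        # Scorer: use a grading model to judge agreement with the target.
        scorer=model_graded_fact(),
    )
```

A task defined this way is then typically run against a chosen model (for example via Inspect's `inspect eval` command-line interface or its Python `eval()` function), after which Inspect records per-sample scores and an aggregate result in an evaluation log.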
For more information, please see the press release: AI Safety Institute releases new AI safety evaluation platform.
About the tool
Developing organisation(s): UK AI Safety Institute
Objective(s):
Country of origin: United Kingdom
Type of approach:
Maturity:
Usage rights: Open source
Target users: Start-ups, academia, AI developers and governments
Tags:
- collaborative governance
- evaluation
- large language model
- open source
- AISI