Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Behavior Elicitation Tool

Behavior Elicitation Tool (BET) is a complex AI system that systematically probes and elicits specific behaviors from cutting-edge LLMs. Whether used for red-teaming or for targeted behavioral analysis, this automated solution is Dynamic, Optimized and Adversarial (DAO). It can be configured to test robustness precisely and to give better control over the AI system. The system's capabilities are continuously enhanced by research in AI interpretability and safety.

BET lets users specify and elicit a target behavior precisely, from security vulnerabilities to desired output patterns. Using prompt engineering and behavioral steering techniques, it systematically maps the behavior of LLMs together with their deployment contexts (system prompts, scaffolding, RAG systems, output filters, etc.).
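
To make the idea of probing a model together with its deployment context more concrete, here is a minimal sketch in Python. It treats the target as a small pipeline (system prompt, model call, output filter) and checks which prompts still produce a target behavior in the final output. Every name in it (`DeploymentContext`, `probe`, `stub_model`, `redact`) is hypothetical and exists only for illustration. This entry does not document BET's actual interface, and the sketch is not a description of it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DeploymentContext:
    """Hypothetical wrapper: system prompt + model + output filter."""
    system_prompt: str
    model: Callable[[str], str]          # stand-in for an LLM call
    output_filter: Callable[[str], str]  # stand-in for a post-hoc filter

    def respond(self, user_prompt: str) -> str:
        # The model sees the system prompt followed by the user prompt;
        # the filter then rewrites the raw output before it is returned.
        raw = self.model(f"{self.system_prompt}\n{user_prompt}")
        return self.output_filter(raw)


def probe(ctx: DeploymentContext,
          prompts: List[str],
          is_target_behavior: Callable[[str], bool]) -> List[str]:
    """Return the prompts whose final response shows the target behavior."""
    return [p for p in prompts if is_target_behavior(ctx.respond(p))]


def stub_model(text: str) -> str:
    # Toy model (assumption): leaks a secret when told to "ignore" rules.
    if "ignore" in text.lower():
        return "SECRET-123"
    return "I cannot help with that."


def redact(text: str) -> str:
    # Toy output filter (assumption): masks the secret if it appears.
    return text.replace("SECRET-123", "[redacted]")
```

In this toy setup the same prompt can succeed against the bare model yet fail once the output filter is in place, which is why the deployment context is probed as a whole rather than the model alone.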

BET's architecture enables granular control over model behavior throughout the entire AI development lifecycle. By integrating into CI/CD pipelines during training, fine-tuning, and deployment phases, it provides dynamic behavioral verification across arbitrary target specifications. Unlike static benchmarks, BET exploits model-specific characteristics to identify and shape behaviors across the full spectrum of possible outputs, enabling developers to precisely tune their models' responses for any desired application context.
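
As an illustration of what a behavioral check inside a CI/CD pipeline could look like, the sketch below fails a build when any target behavior is elicited more often than an allowed rate. It is a generic gating pattern, not BET's own integration. The function names, the shape of the results dictionary and the 5% default threshold are all assumptions made for this example.

```python
from typing import Dict


def gate(elicitation_rates: Dict[str, float],
         max_rate: float = 0.05) -> Dict[str, float]:
    """Return the behaviors whose elicitation rate exceeds max_rate.

    elicitation_rates maps a behavior name to the fraction of probing
    attempts that elicited it (hypothetical result format).
    """
    return {name: rate
            for name, rate in elicitation_rates.items()
            if rate > max_rate}


def exit_code(failures: Dict[str, float]) -> int:
    """Map gate failures to a process exit code a CI runner can act on."""
    return 1 if failures else 0


if __name__ == "__main__":
    # Illustrative numbers only; a real run would load measured results.
    results = {"prompt_injection": 0.20, "data_leak": 0.01}
    failures = gate(results)
    for name, rate in failures.items():
        print(f"FAIL {name}: elicited in {rate:.0%} of attempts")
    raise SystemExit(exit_code(failures))
```

A pipeline step could run such a script after fine-tuning or before deployment and block the release on a non-zero exit code. The threshold would be chosen per behavior and per application.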

The system's behavior mapping capabilities are continuously enhanced by advances in AI interpretability and safety research, giving finer control over LLM outputs. This supports greater confidence in the behavior of industrial GenAI applications while maintaining visibility into model dynamics.

About the tool

Developing organisation(s):

PRISM Eval

Tool type(s):

Toolkit/software
Risk management framework
Trust/Quality mark

Objective(s):

Robustness
Safety

Impacted stakeholders:

Consumers
Other
Regulators

Purpose(s):

Event/anomaly detection
Goal-driven optimisation

Target sector(s):

Industry & entrepreneurship
Finance and insurance
Digital Economy

Country of origin:

France
European Union

Lifecycle stage(s):

Deploy
Verify & validate
Build & interpret model

Type of approach:

Technical

Maturity:

Project stage

Target groups:

Private sector
Public sector
Technical community

Target users:

Data scientist
Developer
System integrators

Stakeholder group:

Academia
Business
Technical community

Validity:

Always up to date

Enforcement:

Reporting frameworks
Trust/Quality mark

Benefits:

Improved ability to scale
Reduction in risk of failure
Responsible implementation

Geographical scope:

All countries
United Kingdom
Europe
United States

People involved:

Clients
Government agencies
IT employees

Required skills:

Data
Programming skills

Technology platforms:

Platform neutral

Tags:

robustness
safety
genAI
red-teaming
llm

Use Cases

There are no use cases for this tool yet.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.