Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Resaro AI Solutions Quality Index Engineer (ASQI Engineer)

AI Solutions Quality Index Engineer (ASQI Engineer) is an open-source framework for testing and assuring AI systems. Built for scale and reliability, it uses containerised test packages, automated assessments, and repeatable workflows to make evaluation transparent and robust. With ASQI Engineer, organisations can also run AI Solutions Quality Indexes that they have created themselves, giving teams full control over, and confidence in, AI quality.

ASQI Engineer is in active development. Contributions are welcome: testing new packages, sharing score cards and test plans, and helping define common schemas that meet industry needs. The initial release focuses on comprehensive chatbot testing, with extensible foundations for broader AI system evaluation.

Key features:

1. Durable Execution: DBOS-powered fault tolerance with automatic retry and recovery for reliable test execution.

2. Container Isolation: Reproducible testing in isolated Docker environments with consistent, repeatable results.

3. Multi-System Orchestration: Coordinate target, simulator, and evaluator systems in complex testing workflows.

4. Flexible Assessment: Configurable score cards map technical metrics to business-relevant outcomes.

5. Type-Safe Configuration: Pydantic schemas with JSON Schema generation provide IDE integration and validation.

6. Modular Workflows: Separate validation, test execution, and evaluation phases for flexible CI/CD integration.
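
To illustrate the "Flexible Assessment" feature above, the following is a minimal sketch of how a score card might map technical metrics to business-relevant outcomes. It is not ASQI Engineer's actual API or schema: the `Indicator` class, the `evaluate_score_card` function, the metric names, the thresholds, and the outcome labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Indicator:
    """One hypothetical score card rule (not the ASQI Engineer schema)."""
    metric: str                      # name of the technical metric to read
    passes: Callable[[float], bool]  # threshold check applied to the metric value
    outcome_if_pass: str             # business-facing label when the check passes
    outcome_if_fail: str             # business-facing label when the check fails


def evaluate_score_card(
    indicators: List[Indicator], metrics: Dict[str, float]
) -> Dict[str, str]:
    """Map each metric named by an indicator to an outcome label."""
    results: Dict[str, str] = {}
    for indicator in indicators:
        if indicator.metric not in metrics:
            # A missing metric is reported explicitly instead of being skipped.
            results[indicator.metric] = "not measured"
            continue
        value = metrics[indicator.metric]
        results[indicator.metric] = (
            indicator.outcome_if_pass
            if indicator.passes(value)
            else indicator.outcome_if_fail
        )
    return results


# Hypothetical score card: thresholds and labels are assumptions.
SCORE_CARD = [
    Indicator("answer_accuracy", lambda v: v >= 0.9,
              "ready for pilot", "needs improvement"),
    Indicator("jailbreak_success_rate", lambda v: v <= 0.05,
              "acceptable risk", "high risk"),
]

# Hypothetical metrics, as a test run might report them.
example_metrics = {"answer_accuracy": 0.93, "jailbreak_success_rate": 0.12}
report = evaluate_score_card(SCORE_CARD, example_metrics)
```

In this sketch the score card is kept separate from the code that produces the metrics, so the same test results can be re-assessed against different thresholds without re-running the tests. That mirrors the separation of test execution and evaluation phases described in the feature list.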

About the tool

Developing organisation(s):

Resaro

Tool type(s):

Technical validation
Technical documentation
Trust/Quality mark

Objective(s):

Explainability
Digital Security

Impacted stakeholders:

Management
Regulators

Target sector(s):

Corporate governance
Defence
Public sector

Lifecycle stage(s):

Operate & monitor
Deploy
Verify & validate

Type of approach:

Technical

Maturity:

In development

Usage rights:

Open source/Permissive

License:

Apache 2.0

Target users:

Data scientist
Developer

Stakeholder group:

Business
Government
Technical community

Validity:

Other

Enforcement:

Certification
Reporting frameworks
Trust/Quality mark

Benefits:

Faster implementation
Increased quality results
Reduction in risk of failure

People involved:

Other

Required skills:

Data
IT infrastructure
Programming skills

Technology platforms:

Platform neutral

Tags:

open source
machine learning testing
benchmarking
llm redteaming
quality measurement

Use Cases

There are no use cases for this tool yet.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.