Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

ISO/IEC TR 29119-11:2020 - Software and systems engineering - Software testing - Part 11: Guidelines on the testing of AI-based systems



This document provides an introduction to AI-based systems. These systems are typically complex (e.g. deep neural nets), are sometimes based on big data, can be poorly specified and can be non-deterministic, which creates new challenges and opportunities for testing them. This document explains those characteristics which are specific to AI-based systems and explains the corresponding difficulties of specifying the acceptance criteria for such systems. This document presents the challenges of testing AI-based systems, the main challenge being the test oracle problem, whereby testers find it difficult to determine expected results for testing and therefore whether tests have passed or failed. It covers testing of these systems across the life cycle and gives guidelines on how AI-based systems in general can be tested using black-box approaches and introduces white-box testing specifically for neural networks. It describes options for the test environments and test scenarios used for testing AI-based systems. In this document an AI-based system is a system that includes at least one AI component. © ISO All Rights Reserved

The information about this standard has been compiled by the AI Standards Hub, an initiative dedicated to knowledge sharing, capacity building, research, and international collaboration in the field of AI standards. You can find more information and interactive community features related to this standard by visiting the Hub’s AI standards database here. To access the standard directly, please visit the developing organisation’s website.

About the tool



Tool type(s):


Objective(s):


Type of approach:



Usage rights:


Geographical scope:


Tags:

  • documentation
  • data quality
  • robustness
  • Accuracy and performance
  • human-centred design

Modify this tool

Use Cases

There is no use cases for this tool yet.

Would you like to submit a use case for this tool?

If you have used this tool, we would love to know more about your experience.

Add use case
catalogue Logos

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.