Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Mission KI: Quality Standard

Mission KI is developing a voluntary quality standard for artificial intelligence (AI) that strengthens the reliability and trustworthiness of AI applications and systems. The standard builds on the "Ethics Guidelines for Trustworthy AI" developed by the European Commission's High-Level Expert Group on AI (HLEG). The HLEG established key principles for assessing AI trustworthiness, which also served as a foundation for the AI Act's requirements.

1. What is the Mission KI Quality Standard based on?

1.1 — Values: In this framework, six core values have been identified to guide responsible AI development and deployment (see the sketch after this list):

  1. Reliability
    • Performance & Robustness
    • Fallback Plans & General Safety
  2. AI-specific cyber security
    • Resistance to AI-specific attacks and security threats
  3. Data quality, protection and management
    • Data quality & integrity
    • Protection of personal data
    • Protection of proprietary data
    • Data access
  4. Non-discrimination
    • Avoidance of unjustified bias
    • Accessibility and universal design
    • Stakeholder participation
  5. Transparency
    • Traceability & documentation
    • Explainability & interpretability
    • External communication
  6. Human oversight & control
    • Human agency
    • Human oversight
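
The hierarchy above lends itself to a simple machine-readable form. The following is a minimal Python sketch of one possible representation; the schema and identifier names are illustrative assumptions, not an official data model of the standard.

    # Hypothetical representation of the Mission KI value catalogue.
    # The value and criterion names mirror the list above; the schema
    # itself is an assumption, not the standard's official data model.
    QUALITY_VALUES: dict[str, list[str]] = {
        "reliability": [
            "performance_and_robustness",
            "fallback_plans_and_general_safety",
        ],
        "ai_specific_cyber_security": [
            "resistance_to_ai_specific_attacks",
        ],
        "data_quality_protection_and_management": [
            "data_quality_and_integrity",
            "protection_of_personal_data",
            "protection_of_proprietary_data",
            "data_access",
        ],
        "non_discrimination": [
            "avoidance_of_unjustified_bias",
            "accessibility_and_universal_design",
            "stakeholder_participation",
        ],
        "transparency": [
            "traceability_and_documentation",
            "explainability_and_interpretability",
            "external_communication",
        ],
        "human_oversight_and_control": [
            "human_agency",
            "human_oversight",
        ],
    }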

1.2 — Protection needs analysis: 

The Mission KI Quality Standard relies on a protection needs analysis (SBA, from the German Schutzbedarfsanalyse) as its starting point to keep testing efficient. This analysis determines the necessary protection requirements for the defined values and thus forms the basis for a targeted test: it filters out the values and criteria relevant to a use case and defines a target for the subsequent test.

The minimum standard therefore accounts for the variety of AI application scenarios, from energy distribution optimization and product recommendation systems to medical diagnostic tools. How relevant each value is depends on the use case.

For example, the value ‘non-discrimination’ plays a subordinate role in an AI system for optimizing power distribution, since its decisions are based on technical parameters. Here, the value ‘transparency’ takes center stage: the AI's decisions must be comprehensible so that operators and regulatory authorities can verify why certain energy distributions were made.

Regardless of the use case, the ‘reliability’ value is always subject to scrutiny, as it is considered fundamental to the quality of any AI application. The other values can be classed as wholly or partly not applicable under conditions that are clearly defined in the protection needs analysis.
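
To make the filtering logic concrete, here is a minimal Python sketch, assuming a simple three-step protection-need scale. The scale, function name, and example profile are hypothetical; the standard defines its own conditions for ruling values out.

    from enum import Enum

    class ProtectionNeed(Enum):
        """Hypothetical three-step protection-need scale."""
        NOT_APPLICABLE = 0
        STANDARD = 1
        HIGH = 2

    def protection_needs_analysis(profile):
        """Filter the values relevant to one use case (illustrative only).

        'reliability' is always kept in scope; every other value may be
        ruled out (NOT_APPLICABLE) under conditions the SBA defines.
        """
        needs = dict(profile)
        # Reliability is fundamental to every AI application and is
        # therefore never filtered out.
        if needs.get("reliability", ProtectionNeed.NOT_APPLICABLE) is ProtectionNeed.NOT_APPLICABLE:
            needs["reliability"] = ProtectionNeed.STANDARD
        # Only values with an actual protection need become test targets.
        return {v: n for v, n in needs.items() if n is not ProtectionNeed.NOT_APPLICABLE}

    # Example: power-grid optimization, where transparency dominates and
    # non-discrimination is ruled out on technical grounds.
    grid_profile = {
        "transparency": ProtectionNeed.HIGH,
        "non_discrimination": ProtectionNeed.NOT_APPLICABLE,
    }
    print(protection_needs_analysis(grid_profile))
    # {'transparency': <ProtectionNeed.HIGH: 2>, 'reliability': <ProtectionNeed.STANDARD: 1>}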

2. How does the standard become auditable?

2.1 — The test criteria catalogue translates abstract values into measurable variables

The Mission KI test criteria catalogue is based on three sources in particular:

  • VDE SPEC 90012,
  • AI test catalogue of the Fraunhofer IAIS,
  • the AIC4 criteria catalogue for AI cloud services from the Federal Office for Information Security (BSI).

To make the Mission KI quality standard testable, the six abstract values were translated into a structured test procedure based on the ‘VCIO’ approach (Values - Criteria - Indicators - Observables). The procedure is divided into several levels: the values form the foundation on which specific criteria are built, and these criteria are assessed through indicators and measurable variables (observables). The degree of fulfilment of each value is then determined systematically from this structure, ensuring a precise and comprehensible assessment.
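
Purely as an illustration, the VCIO levels can be pictured as nested scores aggregated upwards from observables to values. The tree, the names, and the plain-mean aggregation rule below are assumptions made for the sketch, not the standard's actual scoring method.

    from statistics import mean

    # Hypothetical VCIO scoring: observables carry measured scores in [0, 1];
    # each higher level aggregates the level below it with a plain mean.
    vcio_tree = {  # value -> criterion -> indicator -> observable scores
        "transparency": {
            "traceability_and_documentation": {
                "model_documentation_present": [1.0],
                "decision_logs_retained": [0.8, 0.9],
            },
            "explainability_and_interpretability": {
                "feature_attributions_available": [0.7],
            },
        },
    }

    def value_fulfilment(criteria):
        """Aggregate observables -> indicators -> criteria -> one value score."""
        criterion_scores = []
        for indicators in criteria.values():
            indicator_scores = [mean(observables) for observables in indicators.values()]
            criterion_scores.append(mean(indicator_scores))
        return mean(criterion_scores)

    print(value_fulfilment(vcio_tree["transparency"]))  # 0.8125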

In addition, test tools are being developed to check the fulfilment of the observables and thus increase the reliability of the test result.

2.2 — The evaluation

Evaluations are conducted by either internal or external auditors. 

At the end of the test process, an overall assessment is made for each of the six defined values. This assessment is compared with the previously determined protection requirements. An AI application passes the test if it achieves the defined test target for each value. This documents that the quality measures and their evidence sufficiently fulfil the identified protection requirements.
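
Schematically, this final comparison can be expressed as a check of assessed fulfilment against the targets set by the protection needs analysis; the 0-to-1 scale and the names below are illustrative assumptions.

    # Hypothetical final check: an application passes if, for every value
    # kept in scope by the protection needs analysis, the assessed degree
    # of fulfilment reaches the defined test target.
    def passes_audit(assessed, targets):
        return all(assessed.get(value, 0.0) >= target
                   for value, target in targets.items())

    targets = {"reliability": 0.8, "transparency": 0.9}    # set by the SBA
    assessed = {"reliability": 0.85, "transparency": 0.92} # audit result
    print(passes_audit(assessed, targets))  # True: every test target is met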

The successful test thus confirms that the AI application fulfils the necessary quality standards and has implemented the required protective measures. This process ensures a thorough evaluation and creates transparency regarding the trustworthiness and security of the tested AI systems.

About the tool

Tags:

  • transparent
  • ai reliability
  • trustworthiness
  • data quality
  • cybersecurity
  • human-centred design

Use Cases

There are no use cases for this tool yet.

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.