DIN DKE SPEC 99001 - Definition of a success method for labelling data for artificial intelligence training - Application focus: Question-Answering; Text in English
This DIN DKE SPEC has been drawn up in accordance with the DIN DKE SPEC procedure by a DIN DKE SPEC consortium set up on a temporary basis. It has been developed and approved by the authors named in the foreword.

This DIN DKE SPEC defines guidelines for the labelling of training data for QA systems and specifies the characteristics of labels. It also defines terms related to NLP and labelling. While some of the guidelines are also valid for other types of NLP or machine learning applications, this document focuses specifically on Question-Answering systems. The guidelines cover requirements related to the labelling process, onboarding, tooling and ergonomics, as well as QCA for open-domain QA. Three different methods for quality control mechanisms are presented and evaluated.

This document is applicable to all industries, topics, languages, document types and use cases. Labelling is used to tailor NLP models to specific domains; accordingly, the process requirements are valid independent of the domain. This document also gives guidelines for setting up a labelling process. There is no limitation of applicability with respect to the underlying technical basics: this document applies to all language models and model parameters, and its implications are valid regardless of programming languages, selected IT environments, user interfaces or deployment methods.

This document focuses on open-domain (text-based) QA and does not cover QA for knowledge graphs or relational databases. This document does not cover a definition of a model for labels. It further does not establish a system for labelling data and defining the labelling processes, and they can also train their model with a larger amount of standardized data than available. Users benefit accordingly from a better performance of the models, since a larger amount of training data generally contributes to a better performance of models.
© 2023 Beuth Verlag GmbH
The information about this standard has been compiled by the AI Standards Hub, an initiative dedicated to knowledge sharing, capacity building, research, and international collaboration in the field of AI standards. You can find more information and interactive community features related to this standard by visiting the Hub’s AI standards database here. To access the standard directly, please visit the developing organisation’s website.
About the tool
Tags:
- Data quality
- Data collection
- Data processing