Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Assessing the trustworthiness of the use of machine learning as a supportive tool to recognize cardiac arrest in emergency calls

Jan 3, 2023


Excerpt from: Frontiers | On Assessing Trustworthy AI in Healthcare. Machine Learning as a Supportive Tool to Recognize Cardiac Arrest in Emergency Calls (frontiersin.org)

This is a self-assessment conducted jointly by a team of independent experts together with the prime stakeholder of this use case. The main motivation of this work is to study whether the rate of lives saved could be increased by using AI and, at the same time, to identify possible risks and pitfalls of the AI system assessed here and to provide recommendations to key stakeholders.

The main contribution of this use case is to demonstrate how to apply the general AI HLEG trustworthy AI guidelines in practice in the healthcare domain. Based on four ethical principles (respect for human autonomy, prevention of harm, fairness, and explicability), the AI HLEG sets out seven requirements that AI systems must meet to be deemed trustworthy and that assist the process of self-assessment. Each requirement is described below:

  • Human agency and oversight: all potential impacts that AI systems may have on fundamental rights should be accounted for, and the human role in the decision-making process should be protected;
  • Technical robustness and safety: AI systems should be secure and resilient in their operation in a way that minimizes potential harm, optimizes accuracy, and fosters confidence in their reliability;
  • Privacy and data governance: given the vast quantities of data processed by AI systems, this principle stresses the importance of protecting the privacy, integrity, and quality of the data, as well as people's rights of access to it;
  • Transparency: AI systems need to be understandable at a human level so that decisions made through AI can be traced back to their underlying data. If a decision cannot be explained, it cannot easily be justified;
  • Diversity, non-discrimination, and fairness: AI systems need to be inclusive and non-biased in their application. This is challenging when the data is not reflective of all the potential stakeholders of an AI system;
  • Societal and environmental wellbeing: in acknowledging the potential power of AI systems, this principle emphasizes the need for wider social concerns, including the environment, democracy, and individuals to be taken into account; and
  • Accountability: this principle, rooted in fairness, seeks to ensure clear lines of responsibility and accountability for the outcomes of AI systems, mechanisms for addressing trade-offs, and an environment in which concerns can be raised.

To this end, we present a best-practice example of assessing the use of machine learning as a supportive tool to recognize cardiac arrest in emergency calls. The AI system under assessment is currently in use in the city of Copenhagen in Denmark.

The problem: Health-related emergency calls (112) are handled by the Emergency Medical Dispatch Center (EMS) of the City of Copenhagen, where they are triaged by medical dispatchers (i.e., medically trained dispatchers who answer the call, e.g., nurses and paramedics) under the medical control of an on-site physician.

The AI solution: A team led by Stig Nikolaj Blomberg (Emergency Medical Services Copenhagen, and Department of Clinical Medicine, University of Copenhagen, Denmark) worked together with a start-up company and examined whether a machine learning (ML) framework could be used to recognize out-of-hospital cardiac arrest (OHCA) by listening to the calls made to the Emergency Medical Dispatch Center of the City of Copenhagen. The company designed and implemented the AI system and trained and tested it using an archive of audio files of emergency calls from 2014 provided by Emergency Medical Services Copenhagen. The prime aim of this AI system is to assist medical dispatchers when answering 112 emergency calls by helping them detect OHCA early during the calls, and thereby possibly save lives.
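
As a rough illustration only: the excerpt does not describe the internals of the deployed model, so the sketch below assumes a simple text-classification setup over (hypothetical) call transcripts rather than the vendor's actual audio pipeline. All data, names, and the alert threshold are invented for illustration.

```python
# Minimal, hypothetical sketch of an OHCA alert classifier over transcribed
# call snippets - NOT the deployed system described in this use case.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples: transcribed caller statements labelled
# 1 if the call was later confirmed as out-of-hospital cardiac arrest (OHCA).
transcripts = [
    "he is not breathing and does not respond",
    "she collapsed suddenly and is unconscious",
    "my son twisted his ankle playing football",
    "the patient has a fever and a persistent cough",
]
labels = [1, 1, 0, 0]

# TF-IDF features plus logistic regression: a deliberately simple stand-in
# for whatever model the company actually uses.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(transcripts, labels)

# During a live call, the dispatcher would be alerted when the estimated
# OHCA probability crosses a (hypothetical) threshold.
incoming = "the caller says her husband fell down and is not breathing"
p_ohca = model.predict_proba([incoming])[0, 1]
if p_ohca > 0.5:
    print(f"ALERT: possible cardiac arrest (p={p_ohca:.2f})")
else:
    print(f"No OHCA alert (p={p_ohca:.2f})")
```

In the real setting, the key design question is exactly the one the assessment examines: how such an alert is surfaced to the dispatcher so that it supports, rather than overrides, human judgment.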

For the assessment, we use a process for assessing trustworthy AI, called Z-Inspection, to identify specific challenges and potential ethical trade-offs that arise when AI is considered in practice. Z-Inspection is a holistic process based on the method of evaluating new technologies, according to which ethical issues must be discussed through the elaboration of socio-technical scenarios. In a nutshell, the Z-Inspection process is depicted in the figure below and is composed of three main phases: 1) the Set Up Phase, 2) the Assess Phase, and 3) the Resolve Phase.

Figure: The three main phases of the Z-Inspection process (Set Up, Assess, Resolve). Source: https://www.frontiersin.org/files/Articles/673104/fhumd-03-673104-HTML/image_m/fhumd-03-673104-g001.jpg

Benefits of using the tool in this use case

Evaluating AI development with a holistic approach like Z-Inspection creates benefits related to general acceptance of, and concerns about, an AI project both inside and outside the institution carrying it out. The approach can improve the quality of the project's processes and increase transparency about possible conflicts of interest. In general, the system becomes more comprehensible, which improves the quality of communication with any kind of stakeholder.

For the public, communicating the evaluation process itself can help reinforce trust in such a system by making its exact workings transparent, even to non-specialist project staff. This transparency helps funders, oversight boards, and executive teams explain their funding and governance decisions as well as the system's operation.

Shortcomings of using the tool in this use case

There is a danger that a false or inaccurate inspection will create natural skepticism in the recipient, or even harm them and eventually backfire on the inspection method. There are also legal issues (some of which are addressed in the Human-Machine Interaction and Legal Perspective sections). This is a well-known problem for all quality processes. We alleviated it by using open development and incremental improvement to establish a process and brand (“Z-Inspected”).

Learnings or advice for using the tool in a similar context

While this use case refers directly to the use of machine learning as a supportive tool to recognize cardiac arrest in emergency calls, there are various ways in which the findings of this qualitative analysis could be applicable to other contexts. First, the general framework for achieving trustworthy AI set out in the HLEG AI guidelines proved to be an adequate starting point for a specific case study discussion in the healthcare domain. Second, the ethical principles of the HLEG AI guidelines need some context-specific specification. Third, this contextualization and specification can successfully be undertaken by an interdisciplinary group of researchers who together are able not only to bring in the relevant scientific, medical, and technological expertise but also to highlight the various facets of the ethical principles as they play out in the respective case.
