Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Why we need a catalogue of AI tools and metrics to promote trustworthy AI

Explainability, transparency and the avoidance of bias are among the most critical challenges for AI practitioners, as complex AI systems and algorithms can make these goals hard to attain. Many tools are at the disposal of AI practitioners and policy makers, but they are not always easy to find, and it is even more difficult to know which ones are most effective. The catalogue provides the much-needed space where anyone can find and share tools and methods for making AI trustworthy.

How do we define “trustworthy”? The term refers to AI systems that respect the OECD AI Principles, such as promoting shared well-being and prosperity while protecting individual rights and democratic values.

In this context, “tools” is an umbrella term that covers almost anything that helps make AI more trustworthy, from computer software and programming code to employee workshops and training, or guidelines and standards. The Catalogue gives access not only to the latest tools but also to use cases describing user experiences, and to metrics.

A public space for AI practitioners to share and compare tools and metrics for trustworthy AI

The catalogue is a platform where AI practitioners from all over the world can share and compare tools and build upon each other’s efforts to create global best practices and speed up the process of implementing the OECD AI Principles.

Share your tools and methods

If you have or know of any tools to help make AI trustworthy, please submit them to this catalogue. You can also share your experience of using a tool by submitting a use case, or share metrics for measuring trustworthy AI.

Video: What are tools for implementing trustworthy AI?

Technical, procedural and educational tools for trustworthy AI

The OECD AI Expert Group on Tools & Accountability identified three categories of tools:

  • Technical tools address AI-related issues such as bias detection, transparency and explainability, performance, robustness, safety and security against attacks. They include toolkits, software, technical documentation, certification and standards, product development or lifecycle tools, and technical validation tools. Many of these are popular open-source GitHub projects retrieved automatically, as described under “Automated tools and metrics submission” below.
  • Procedural tools provide operational and process-related guidance. They include guidelines, governance frameworks, product development methods, lifecycle and risk management tools, sector-specific codes of conduct, collective agreements, and process certifications and standards.
  • Educational tools cover all means for building awareness, such as preparing and upskilling stakeholders involved in or affected by implementing an AI system. They include change management processes, capacity and awareness-building tools, guidance for designing inclusive AI systems, training programmes and educational materials.

People share their experiences with tools through use cases

The catalogue allows users to submit their experiences as use cases, in which they can offer guidance, insights and an overall assessment of the tool. Use cases are linked to the tools they evaluate for easy access.

Technical metrics and benchmarks to evaluate trustworthiness

How a tool is evaluated for trustworthiness depends on what it does and the outcomes it produces. Metrics and methodologies exist for measuring and evaluating AI trustworthiness and AI risks. These technical metrics are often expressed as mathematical formulas that assess the technical requirements for achieving trustworthy AI in a particular context. They can help verify that a system is fair, accurate, explainable, transparent, robust, safe or secure. Anyone can submit their equations and metrics to be part of the catalogue.
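For illustration, the sketch below computes one widely used fairness metric of this kind, the demographic parity difference: the absolute gap in positive-prediction rates between two groups. The function and example data are hypothetical and are not drawn from the catalogue; Python and NumPy are assumed.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups.

    0 means both groups receive positive predictions at the same rate;
    larger values indicate greater disparity.
    """
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    rate_0 = y_pred[group == 0].mean()  # positive-prediction rate, group 0
    rate_1 = y_pred[group == 1].mean()  # positive-prediction rate, group 1
    return abs(rate_0 - rate_1)

# Toy example: binary predictions for eight individuals in two groups.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_difference(y_pred, group))  # 0.5
```

Demographic parity is only one of many possible fairness definitions; which metric is appropriate depends on the context, which is precisely why the catalogue collects a range of them.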

How the information in the Catalogue is compiled and kept up to date

The Catalogue of AI Tools & Metrics has mechanisms to ensure that content is accurate and current. It operates with an open submission process, where tools are submitted directly by the organisations or individuals who created them and by third parties (see Catalogue disclosures). The OECD Secretariat vets submissions to ensure accuracy and objectivity.

There is a biannual review and updating process, during which organisations are encouraged to submit new initiatives and update existing ones. If an existing initiative is not updated over a two-year period, it is removed from the catalogue. Partnerships with relevant stakeholders such as Business at the OECD, the OECD Civil Society Information Society Advisory Council and the OECD Trade Union Advisory Committee facilitate this biannual review.

Automated tools and metrics submission

AI-related GitHub repositories are retrieved via the GitHub REST API, classified, and submitted as technical tools to the Catalogue. These tools address various AI-related issues, including bias detection, transparency, explainability, performance, robustness, safety, and security against attacks. They encompass toolkits, software, technical documentation, certification and standards, product development or lifecycle tools, and technical validation tools. Only repositories with significant popularity within the technical community are retrieved, defined here as repositories in the 90th percentile in terms of the number of forks and stars for each AI principle. Taxonomy fields such as AI principles, purpose, lifecycle stage, and target users are annotated using ChatGPT, and subsequently reviewed to ensure data quality.
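As a rough illustration of this pipeline, the sketch below queries the public GitHub search API for repositories matching one topic and keeps those at or above the 90th percentile for both stars and forks. The query string, page size and single search call are assumptions made for the example; the catalogue's actual retrieval code is not published here.

```python
import requests
import numpy as np

# Hypothetical search query for one AI principle (e.g. explainability);
# the catalogue's real per-principle queries are not public.
QUERY = "explainable AI"
URL = "https://api.github.com/search/repositories"

resp = requests.get(
    URL,
    params={"q": QUERY, "sort": "stars", "order": "desc", "per_page": 100},
    headers={"Accept": "application/vnd.github+json"},
    timeout=30,
)
resp.raise_for_status()
repos = resp.json()["items"]

# Keep only repositories at or above the 90th percentile of popularity,
# measured by stars and forks, mirroring the filtering rule described above.
stars = np.array([r["stargazers_count"] for r in repos])
forks = np.array([r["forks_count"] for r in repos])
star_cut, fork_cut = np.percentile(stars, 90), np.percentile(forks, 90)

popular = [
    r for r in repos
    if r["stargazers_count"] >= star_cut and r["forks_count"] >= fork_cut
]
for r in popular:
    print(r["full_name"], r["stargazers_count"], r["forks_count"])
```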

New metrics are also identified using the Papers with Code database, a platform that links research papers to their code implementations and offers researchers a centralised resource for the latest advancements across fields. AI-related papers are retrieved, and their associated metrics and use cases are identified and published to the Catalogue. In this process, ChatGPT is used to annotate certain taxonomy fields, such as the OECD AI principle and the target sector of the metric.
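A comparable sketch for the metrics pipeline might look like the following, using the public Papers with Code REST API to fetch papers and the OpenAI API as a stand-in for the ChatGPT annotation step. The search term, pagination parameter, prompt and model name are all illustrative assumptions, not the catalogue's actual configuration.

```python
import requests
from openai import OpenAI  # assumes the openai package and an API key

# Fetch AI-related papers from the public Papers with Code API;
# the query term and page size here are illustrative.
resp = requests.get(
    "https://paperswithcode.com/api/v1/papers/",
    params={"q": "fairness metric", "items_per_page": 10},
    timeout=30,
)
resp.raise_for_status()
papers = resp.json()["results"]

client = OpenAI()
for paper in papers:
    # Ask the model to annotate taxonomy fields from the abstract.
    # The prompt and model are hypothetical stand-ins for the
    # catalogue's actual annotation setup.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": (
                "Given this abstract, name the most relevant OECD AI "
                "principle and target sector:\n\n" + (paper["abstract"] or "")
            ),
        }],
    )
    print(paper["title"], "->", completion.choices[0].message.content)
```

As the document notes, these automated annotations are reviewed afterwards to ensure data quality rather than being published unchecked.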

What taxonomy does the OECD use to categorise the tools?

The taxonomy to classify tools for trustworthy AI is largely based on the framework to evaluate approaches to trustworthy AI developed by the OECD.AI Working Group on Tools & Accountability. To learn more about the framework, please read the OECD report on “Tools for trustworthy AI: A framework to compare implementation tools for trustworthy AI systems”.

Partners

The catalogue is the result of a broad partnership that includes the US National Institute of Standards and Technology (NIST), the European Commission (EC), the UK’s Portfolio of AI Assurance Techniques[1], the AI Standards Hub, the Partnership on AI and the Institute for Future Initiatives at the University of Tokyo. Other partners include stakeholder groups such as Business at the OECD (BIAC), the OECD Civil Society Information Society Advisory Council (CSISAC) and the OECD Trade Union Advisory Committee (TUAC).

Additionally, the catalogue has benefited greatly from the partnership with Duke’s Ethical Technology Practicum programme. In particular, we are grateful for the contributions of Amanda Booth, Anders Liman, Jacob Stotser and Nathan Gray, under the leadership of Prof. Lee Tiedrich.

[1] The Portfolio of AI Assurance Techniques is a collaboration between the UK government (CDEI) and industry to showcase how tools for trustworthy AI are already being applied by businesses to real-world use cases, to assess their benefits and limitations, and to examine how they align with the UK’s AI regulatory principles. The Portfolio aims to increase awareness of, and demand for, assurance tools and services, as well as to help governments understand where there may be particular gaps in the supply of such services.

If you have another question, please contact us


Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.