Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Holistic AI Governance, Risk and Compliance Platform



Holistic AI Governance, Risk and Compliance Platform

Holistic AI is an AI Governance platform headquartered in the U.S., with offices in Palo Alto and London. Holistic AI offers end-to-end solutions to test and validate the safety, robustness, and trustworthiness of algorithms deployed across various sectors.

Holistic AI’s AI Governance, Risk Management and Compliance Platform provides a Software as a Service (SaaS) one-stop-shop to govern enterprise AI systems at scale. The platform utilises unique and proprietary solutions based on foundational research in trustworthy AI, such as AI robustness, bias, privacy, and transparency.

Each AI system is evaluated against a five-point criteria derived from Koshiyama et al (2021):

  • Bias: Risks of the AI system generating biased outputs due to improper training data (training bias), inappropriate context (transfer-context bias) and inadequate inferential capabilities (inference bias). 
  • Efficacy: Risk of the AI system underperforming relative to its use case. 
  • Robustness: Risks of the AI system failing in instances of adversarial attacks. 
  • Privacy: Risks associated with the AI system leaking sensitive information or personal data.
  • Explainability: Risks of the AI system generating arbitrary decisions, with its outputs not understandable to developers, deployers and users. 

Each of these risk verticals do not occur in siloes and can often be interrelated. A growing field of research strongly emphasizes the trade-offs and interactions that may occur between them. The Holistic AI Governance platform not only allows enterprises to review the performance of its AI systems against the criteria, but also streamlines the decision-making process around navigating trade-offs.

Holistic AI provides a proprietary solution for auditing Language Models through Safeguard, a specialized module dedicated to LM Governance. When specifically evaluating Large Language Models (LLMs), the Holistic AI team uses a robust combination of the following approaches: 

  • Benchmarking: Involves evaluating LLMs against both, academic and internally developed datasets to gauge levels of model bias, hallucinations, personal information leakage, toxicity, explainability and robustness.
  • Red Teaming: Involves adversarially prompting LLMs to unearth unknown model vulnerabilities through mechanisms like Jailbreaking and debiasing.
  • Fine Tuning: Involves leveraging high-quality datasets to align models towards safety, helpfulness and harm reduction.
  • Human oversight: This involves assessment of LLM-generated content by reviewers for its relevance, accuracy, and appropriateness, to identify any discrepancies that need improvement.
  • Assurance: Benchmarking risk discovery, triage, assessment and mitigation processes to regulation (like the EU AI Act), and standards (such as the NIST’s AI Risk Management Framework (AI RMF) to aid with compliance readiness and assurance.

The platform is structured around the EU’s risk-based approach to AI governance, mapping high to low-risk systems in a single pane Red-Amber-Green dashboard. It is used for both internal development and deployment, and for procurement (third-party risk management). 

Bringing AI law, policy and engineering together, the platform uniquely interconnects all risk verticals as needed to generate a complete picture of an enterprise’s AI risk exposure. This includes a mitigation function for when issues are identified. 

Use Cases

There is no use cases for this tool yet.

Would you like to submit a use case for this tool?

If you have used this tool, we would love to know more about your experience.

Add use case
catalogue Logos

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.