Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Dyff

Dyff is a cloud platform for hosting high-integrity evaluations of AI system performance, designed for the use case of testing deployment-ready AI systems on private datasets. Dyff evaluates AI systems that are packaged as containerized web services. The systems under test run within Dyff, allowing evaluations on private datasets where the evaluation data never leaves the Dyff system. This maximizes the useful lifetime of the evaluation by preventing AI systems from being trained on the evaluation data. The combination of the AI system image, additional data volumes, and runtime configuration is stored together under a permanent ID, making the systems and evaluations fully reproducible.
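To illustrate why storing the image, data volumes, and runtime configuration together supports reproducibility, here is a minimal sketch that derives one deterministic identifier from those three components. It is not Dyff's actual ID scheme, which this page does not describe. The function name `system_fingerprint`, its parameters, and the choice of a SHA-256 content hash are all assumptions made for illustration.

```python
import hashlib
import json


def system_fingerprint(image_digest: str, volumes: list[str], config: dict) -> str:
    """Return a deterministic hex digest for one system specification.

    Hypothetical helper: it is not part of Dyff and only illustrates the idea
    of identifying a system by everything needed to re-run it.
    """
    spec = {
        "image": image_digest,
        # Assumption: volume order does not matter, so sort for stability.
        "volumes": sorted(volumes),
        "config": config,
    }
    # Canonical JSON (sorted keys, fixed separators) so that equal
    # specifications always serialize to the same bytes.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two specifications that differ only in dictionary key order get the same ID.
id_a = system_fingerprint("sha256:abc", ["weights-v1"], {"gpus": 1, "timeout_s": 30})
id_b = system_fingerprint("sha256:abc", ["weights-v1"], {"timeout_s": 30, "gpus": 1})
```

Under this scheme, changing any one component (a different image, an extra volume, a changed setting) produces a different identifier, so an evaluation result can always be traced back to the exact system that produced it.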

Evaluations are implemented as Jupyter notebooks. Dyff serves the rendered results of these notebooks and provides hooks for returning scores that can be viewed in summary “dashboards”. Dyff can also be used to host “challenges”, where participants compete to submit the systems that perform best on the challenge tasks.
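As a rough sketch of the kind of scoring logic an evaluation notebook might contain, the example below computes a single accuracy score from system outputs and reference labels. The function names and the shape of the returned dictionary are hypothetical. Dyff's own hooks for returning scores are not shown on this page and are not reproduced here.

```python
def accuracy(predictions: list, labels: list) -> float:
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def summarize_scores(predictions: list, labels: list) -> dict:
    """Collect scores into a plain dictionary (hypothetical output format)."""
    return {
        "accuracy": accuracy(predictions, labels),
        "n_examples": len(labels),
    }


# Toy data standing in for system outputs and private reference labels.
scores = summarize_scores(["cat", "dog", "cat", "cat"], ["cat", "dog", "dog", "cat"])
```

In a real evaluation, the reference labels would come from the private dataset held inside the platform, and only the resulting scores would be reported outward.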

Dyff is free and open-source software with no mandatory proprietary dependencies. It runs on Kubernetes and is designed to be deployable on most Kubernetes clusters. Dyff is deployed using infrastructure-as-code technologies, and a free and open-source, production-ready deployment configuration is also available.

About the tool


Tags:

  • data catalogue
  • ai governance
  • ai auditing
  • large language models
  • ai evaluation
  • deepfakes


Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.