Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

Deep Reinforcement Learning Agents

This repository contains a collection of reinforcement learning algorithms written in TensorFlow. The IPython notebooks here were written to accompany an ongoing tutorial series I have been publishing on Medium. If you are new to reinforcement learning, I recommend reading the accompanying post for each algorithm.

The repository currently contains the following algorithms:

Q-Table - An implementation of Q-learning using tables to solve a stochastic environment problem.
Q-Network - A neural network implementation of Q-Learning to solve the same environment as in Q-Table.
Simple-Policy - An implementation of a policy gradient method for stateless environments such as n-armed bandit problems.
Contextual-Policy - An implementation of a policy gradient method for stateful environments such as contextual bandit problems.
Policy-Network - An implementation of a neural network policy-gradient agent that solves full RL problems with states, delayed rewards, and two opposite actions (e.g. CartPole or Pong).
Vanilla-Policy - An implementation of a neural network vanilla-policy-gradient agent that solves full RL problems with states, delayed rewards, and an arbitrary number of actions.
Model-Network - An addition to the Policy-Network algorithm which includes a separate network which models the environment dynamics.
Double-Dueling-DQN - An implementation of a Deep-Q Network with the Double DQN and Dueling DQN additions to improve stability and performance.
Deep-Recurrent-Q-Network - An implementation of a Deep Recurrent Q-Network which can solve reinforcement learning problems involving partial observability.
Q-Exploration - An implementation of DQN containing multiple action-selection strategies for exploration. Strategies include: greedy, random, e-greedy, Boltzmann, and Bayesian Dropout.
A3C-Doom - An implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm, which uses multiple agents to collectively improve a policy. This implementation can solve RL problems in 3D environments such as VizDoom challenges.
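
As a rough illustration of the first entry in the list (Q-Table), the sketch below runs tabular Q-learning with epsilon-greedy exploration on a small stochastic environment. It is not taken from the repository's notebooks: the five-state corridor, the slip probability, the hyperparameters, and the names `step`, `greedy_action`, and `train_q_table` are all assumptions made for this example, and it uses only the Python standard library rather than TensorFlow.

```python
import random

# Hypothetical toy environment (an assumption for this sketch, not the repo's):
# a corridor of states 0..4 where state 4 is the goal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = step left, 1 = step right
SLIP = 0.1         # probability that the chosen action is reversed


def step(state, action, rng):
    """Advance the environment one step; return (next_state, reward, done)."""
    if rng.random() < SLIP:
        action = 1 - action  # stochastic slip: the opposite action is taken
    move = 1 if action == 1 else -1
    next_state = min(max(state + move, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Pick an action with the highest Q-value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train_q_table(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.1,
                  max_steps=100, seed=0):
    """Learn a table of Q-values with the one-step Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = step(state, action, rng)
            # Bootstrap from the best next-state value unless the episode ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    table = train_q_table()
    for s, row in enumerate(table):
        print(s, [round(v, 3) for v in row])
```

Calling `train_q_table()` returns one row of Q-values per state. In this toy setup the "right" action should end up with the higher value in the states leading to the goal, and the goal state's row stays at zero because no update is ever made from a terminal state. The Q-Network entry replaces this table with a neural network, which this sketch does not attempt to show.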

About the tool

Tool type(s):

Toolkit/software

Objective(s):

Fairness
Robustness
Transparency

Purpose(s):

Recognition/object detection
Content generation

Country of origin:

United States

Lifecycle stage(s):

Operate & monitor
Deploy
Verify & validate
Build & interpret model
Collect & process data
Plan & design

Type of approach:

Technical

Maturity:

Implemented in multiple projects
In development

Usage rights:

Open source/Permissive

License:

MIT License

Target users:

Data scientist
Developer
Project manager
System operators
System integrators
Policy makers

Programming languages:

Python
Jupyter Notebook

Github stars:

2212

Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.