PyRIT
PyRIT (Python Risk Identification Tool) is a library developed by Microsoft's AI Red Team to help researchers and engineers assess the robustness of their generative AI systems against different harm categories, such as fabrication/ungrounded content (e.g., hallucination), misuse (e.g., bias), and prohibited content (e.g., harassment).
PyRIT automates routine AI red teaming tasks, freeing operators to focus on more complex and time-consuming work. It can also identify security harms such as misuse (e.g., malware generation, jailbreaking) and privacy harms (e.g., identity theft).
The goal is to give researchers a baseline for how well their model and entire inference pipeline perform against different harm categories, and to let them compare that baseline against future iterations of the model. This provides empirical data on how the model performs today and makes it possible to detect regressions introduced by later changes. The tool also lets researchers iterate on and improve their mitigations against different harms. For example, at Microsoft we are using this tool to iterate on different versions of a product (and its metaprompt) so that we can more effectively protect against prompt injection attacks.
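In practice, a PyRIT run pairs a prompt target (the generative AI endpoint under test) with an orchestrator that automates sending probe prompts and recording the responses. The sketch below shows that shape; the class and parameter names (`OpenAIChatTarget`, `PromptSendingOrchestrator`, `objective_target`, `send_prompts_async`) follow PyRIT's published examples but have varied across releases, so treat this as a hedged illustration rather than a version-exact snippet.

```python
# Illustrative sketch only: class and parameter names follow PyRIT's
# published examples but have changed across releases, so check them
# against the version of PyRIT you install.
import asyncio

from pyrit.orchestrator import PromptSendingOrchestrator
from pyrit.prompt_target import OpenAIChatTarget


async def main() -> None:
    # The prompt target wraps the generative AI endpoint under test
    # (endpoint and key are typically read from environment variables).
    target = OpenAIChatTarget()

    # The orchestrator automates sending probe prompts to the target
    # and records the conversations so they can be scored and compared
    # against a baseline later.
    orchestrator = PromptSendingOrchestrator(objective_target=target)
    await orchestrator.send_prompts_async(
        prompt_list=["<probe prompt drawn from a harm-category dataset>"]
    )


if __name__ == "__main__":
    asyncio.run(main())
```

A typical workflow repeats such runs with datasets of probe prompts for each harm category and compares the scored results against the baseline from earlier model iterations.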
About the tool
Tags:
- responsible ai
- ai risks
- ai security
- adversarial ai
- security and resilience
- safety
- red-teaming
- llm security
- ai red teaming
- ai safety