These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
ShieldGemma
ShieldGemma is a set of instruction-tuned models for evaluating the safety of text and images against a set of defined safety policies.
The models support the development of generative AI systems by checking both prompts and model outputs against those policies. ShieldGemma functions as a content safety evaluation component that can be integrated into generative AI applications to detect policy violations, so that the application can block or filter the offending content.
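As an illustration of how such an evaluation can become a decision, the sketch below assumes the classifier answers the question "does this content violate the policy?" and exposes raw scores (logits) for the answers "Yes" and "No". A softmax over the two scores gives a violation probability, which is then compared with a threshold. The function names and the 0.5 threshold are assumptions made for this example, not part of ShieldGemma.

```python
import math


def violation_probability(yes_logit: float, no_logit: float) -> float:
    """Return the softmax probability of the 'Yes' (violation) answer."""
    # Subtract the larger logit first so exp() cannot overflow.
    largest = max(yes_logit, no_logit)
    exp_yes = math.exp(yes_logit - largest)
    exp_no = math.exp(no_logit - largest)
    return exp_yes / (exp_yes + exp_no)


def violates_policy(yes_logit: float, no_logit: float,
                    threshold: float = 0.5) -> bool:
    """Flag content when the violation probability reaches the threshold.

    The 0.5 default is an illustrative assumption; a real deployment would
    tune it against its own policy and tolerance for false positives.
    """
    return violation_probability(yes_logit, no_logit) >= threshold
```

Raising the threshold makes the filter more permissive, and lowering it makes the filter stricter, so the value is a policy choice rather than a property of the model.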
The model family includes classifiers for different modalities. ShieldGemma 1 focuses on text content moderation and is available in three parameter sizes: 2B, 9B, and 27B. ShieldGemma 2 extends these capabilities to images and provides a 4-billion-parameter model designed to classify the safety of synthetic and natural images.
These models are trained to identify violations across key harm categories such as sexually explicit content, dangerous content, hate, and harassment. Developers can use ShieldGemma as a filtering mechanism within generative AI pipelines, for example by checking prompts before they reach a model or by filtering outputs generated by AI systems.
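The filtering pattern described above, checking the prompt before generation and the output after it, can be sketched as follows. This is a minimal sketch under stated assumptions: `guarded_generate`, `GateResult`, the category names, and the keyword-based stub classifier are all hypothetical and stand in for a real safety classifier such as ShieldGemma and for a real generative model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical interface: text in, violation probability per harm category out.
Classifier = Callable[[str], Dict[str, float]]


@dataclass
class GateResult:
    allowed: bool
    stage: str                 # "prompt", "output", or "passed"
    text: str                  # model output, or a refusal message
    flagged: Tuple[str, ...]   # categories at or above the threshold


def flagged_categories(scores: Dict[str, float], threshold: float) -> Tuple[str, ...]:
    """Return the categories whose score reaches the threshold, sorted by name."""
    return tuple(sorted(c for c, p in scores.items() if p >= threshold))


def guarded_generate(prompt: str,
                     generate: Callable[[str], str],
                     classify: Classifier,
                     threshold: float = 0.5,
                     refusal: str = "Request blocked by safety policy.") -> GateResult:
    """Check the prompt, generate only if it passes, then check the output."""
    flagged = flagged_categories(classify(prompt), threshold)
    if flagged:
        # The prompt never reaches the generative model.
        return GateResult(False, "prompt", refusal, flagged)

    output = generate(prompt)
    flagged = flagged_categories(classify(output), threshold)
    if flagged:
        # The generated text is withheld from the user.
        return GateResult(False, "output", refusal, flagged)

    return GateResult(True, "passed", output, ())


def keyword_stub_classifier(text: str) -> Dict[str, float]:
    """Toy stand-in for a real classifier; it only matches two keywords."""
    lowered = text.lower()
    return {
        "dangerous_content": 1.0 if "explosive" in lowered else 0.0,
        "harassment": 1.0 if "idiot" in lowered else 0.0,
    }
```

Passing the classifier and the generator in as functions keeps the gate independent of any particular model, so the stub can be swapped for a real classifier without changing the control flow.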
The models are provided with open weights and can be fine-tuned to adapt to specific use cases and safety policies defined by developers. By enabling automated classification of unsafe or policy-violating content, ShieldGemma supports developers in building generative AI applications that align with defined safety standards and reduce the risk of harmful outputs.
About the tool
Tags:
- ai ethics
- evaluation
- ai generated content
- ai safety