These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
Moderation: Identify potentially harmful content in text and images
The Moderation endpoint lets developers classify text and image inputs to determine whether they may violate OpenAI’s safety policies. Developers send inputs to the /v1/moderations endpoint, where a moderation model evaluates the content against those policies. The response is structured: it indicates whether the input is flagged and which policy categories apply to it. The moderation system therefore enables automated analysis of potentially harmful or policy-violating content.
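The request and response flow described above can be sketched with the Python standard library. This is a minimal illustration, not an official client: the model name `omni-moderation-latest`, the helper names `build_request`, `summarize` and `moderate`, and the response fields `results`, `flagged` and `categories` are assumptions based on the publicly documented response shape and should be checked against the current API reference.

```python
import json
import urllib.request

# Endpoint path taken from the description above; host is the public OpenAI API host.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text, api_key, model="omni-moderation-latest"):
    """Build (but do not send) a POST request for a single text input.

    The model name is an assumption; check the API reference for current models.
    """
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def summarize(response):
    """Return (flagged, names of categories that apply) for the first result.

    Assumes the response has a "results" list whose items carry a boolean
    "flagged" field and a "categories" mapping of category name to boolean.
    """
    result = response["results"][0]
    applied = sorted(name for name, hit in result["categories"].items() if hit)
    return result["flagged"], applied


def moderate(text, api_key):
    """Send the request and summarize the parsed JSON response."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        return summarize(json.load(resp))
```

Calling `moderate("some user text", api_key)` with a valid key would return a pair such as a boolean flag and a list of category names; the category names themselves are defined by the API and are not enumerated here.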
The moderation models are designed to detect a range of harmful or unsafe content categories defined in OpenAI’s usage policies. The API returns classification results that developers can use to decide whether submitted content should be filtered, blocked, or reviewed. The moderation endpoint can therefore be integrated into applications to screen user-generated or model-generated content before it is displayed or further processed.
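How an application turns a classification result into a decision is left to the developer. The sketch below shows one possible policy; the function name `decide`, the two thresholds, and the assumption that each result carries a `flagged` boolean and a `category_scores` mapping of category name to a score between 0 and 1 are illustrative choices, not part of the tool's description.

```python
def decide(result, block_at=0.9, review_at=0.5):
    """Map one moderation result to "block", "review" or "allow".

    Hypothetical policy: thresholds are placeholders to be tuned per application.
    """
    scores = result.get("category_scores", {})
    top_score = max(scores.values(), default=0.0)
    flagged = bool(result.get("flagged"))

    if flagged and top_score >= block_at:
        return "block"   # flagged with high confidence: do not display
    if flagged or top_score >= review_at:
        return "review"  # uncertain: route to a human or a stricter check
    return "allow"       # nothing notable: display or process further
```

Keeping the policy in a single pure function makes it straightforward to adjust thresholds or add per-category rules without touching the code that calls the endpoint.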
Developers interact with the moderation models through standard API requests, submitting inputs and receiving classification outputs in a structured format. Because the endpoint accepts both text and image inputs, the same safety checks can be applied across different content types within an application or workflow.
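A request body that combines text and an image might be assembled as follows. This is a sketch under stated assumptions: the helper name `build_input` is hypothetical, and the part layout (`type` of `text` or `image_url`, with the image given by URL) and the model name `omni-moderation-latest` follow the publicly documented multimodal input format and should be verified against the current API reference.

```python
def build_input(text=None, image_url=None, model="omni-moderation-latest"):
    """Return a JSON-serializable request body for text and/or image input.

    The part layout and model name are assumptions; see the API reference.
    """
    parts = []
    if text is not None:
        parts.append({"type": "text", "text": text})
    if image_url is not None:
        parts.append({"type": "image_url", "image_url": {"url": image_url}})
    if not parts:
        raise ValueError("provide text, image_url, or both")
    return {"model": model, "input": parts}
```

For example, `build_input(text="caption", image_url="https://example.com/photo.png")` yields a body with two input parts that can be serialized with `json.dumps` and posted to the endpoint.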
About the tool
Developing organisation(s):
Tool type(s):
Objective(s):
Purpose(s):
Lifecycle stage(s):
Type of approach:
Maturity:
Usage rights:
Target groups:
Target users:
Validity:
Geographical scope:
People involved:
Risk management stage(s):
Technology platforms:
Tags:
- ai generated content
- api
- safety
- ai evaluation