Moderation | OpenAI | Page 1

heavy token Dec 22, 2023, 9:51 PM

If I am a developer with a public facing application and users try to abuse it, will I get in trouble assuming the moderation endpoint flags the message?

signal boneBOT Dec 22, 2023, 9:51 PM

Hi @heavy token. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

Based on the information provided in the extracts, OpenAI has a moderation endpoint that can be used to identify content that violates OpenAI's usage policies. This endpoint is designed to help developers prevent unintended access terminations due to inappropriate inputs or outputs.

However, the moderation endpoint is not integrated by default into the API endpoints for GPT-3.5 Turbo and GPT-4. Developers need to send a separate request to the moderation endpoint to check content. A flag from the moderation endpoint does not by itself stop a message from being sent or received; it's up to the developer to act on the moderation results.

As for whether you would get in trouble if users try to abuse your application, it depends on how you handle the flagged content. If you ignore the flags and let the inappropriate content be sent or received, it could lead to access terminations. If you act on the flags (e.g., blocking the content or warning the user), that helps prevent such issues.

Here's an example of how to use the moderation endpoint:

curl https://api.openai.com/v1/moderations \
 -X POST \
 -H "Content-Type: application/json" \
 -H "Authorization: Bearer $OPENAI_API_KEY" \
 -d '{"input": "Sample text goes here"}'

This will return a response with a flagged field, which is set to true if the model classifies the content as violating OpenAI's usage policies, and false otherwise. It also returns categories and category_scores fields, which provide more detailed information about the potential violations.
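
As a rough illustration of the request and the fields described here, this is a minimal Python sketch using only the standard library. The helper names (build_moderation_request, summarize_moderation) and the sample response are made up for this example, and the request is only built, not sent. The sketch assumes the flagged, categories and category_scores fields sit inside the first entry of a results list in the response.

```python
import json
import urllib.request

MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_moderation_request(text: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl command (hypothetical helper; not sent here)."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def summarize_moderation(response: dict) -> tuple[bool, list[str]]:
    """Return (flagged, names of categories marked true) for the first result."""
    result = response["results"][0]
    hit = sorted(name for name, is_hit in result["categories"].items() if is_hit)
    return bool(result["flagged"]), hit


# Hand-written sample in the assumed response shape; the scores are invented.
sample_response = {
    "results": [
        {
            "flagged": True,
            "categories": {"harassment": True, "violence": False},
            "category_scores": {"harassment": 0.91, "violence": 0.02},
        }
    ]
}

flagged, hit_categories = summarize_moderation(sample_response)
```

To actually send the request you would pass the built object to urllib.request.urlopen and decode the JSON body before handing it to summarize_moderation.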

Remember, the moderation endpoint is a tool to help you enforce OpenAI's usage policies in your application, but it's ultimately your responsibility to handle the flagged content appropriately.

Sources:

Forum Post: API Endpoints with Integrated Content Moderation (https://community.openai.com/t/353843)
Docs > Guides > Moderation > Quickstart Part 1 (https://platform.openai.com/docs/guides/moderation)

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

@heavy token kapa.ai is still learning and improving, please let us know how it did by reacting below

heavy token Dec 22, 2023, 9:52 PM

@signal bone if I block the content when the moderation filter flags it, and don't send it to the GPT-4 model, I'm good?

signal boneBOT Dec 22, 2023, 9:52 PM

Let me check the knowledge base for you. One moment! :books:

Yes, if you use the moderation API to check the content and block it when it's flagged, you are taking appropriate action to prevent the violation of OpenAI's usage policies. This means you are not sending inappropriate content to the GPT-4 model, which is a good practice to maintain the ethical use of AI.
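
The "check first, block if flagged" flow could be sketched roughly like this in Python. Everything named here (guarded_reply, fake_is_flagged, fake_ask_model) is hypothetical and only stands in for your own moderation call and GPT-4 call; it is a sketch of the gating idea, not an official integration.

```python
from typing import Callable, Optional


def guarded_reply(
    user_text: str,
    is_flagged: Callable[[str], bool],
    ask_model: Callable[[str], str],
) -> Optional[str]:
    """Return the model's reply, or None if the input was flagged and blocked."""
    if is_flagged(user_text):
        return None  # blocked: the text is never forwarded to the model
    return ask_model(user_text)


# Stand-ins for illustration only; a real app would call the moderation
# endpoint and the chat model in place of these two functions.
forwarded = []


def fake_is_flagged(text: str) -> bool:
    return "badword" in text


def fake_ask_model(text: str) -> str:
    forwarded.append(text)  # record what actually reached the "model"
    return f"echo: {text}"


blocked = guarded_reply("this has badword in it", fake_is_flagged, fake_ask_model)
allowed = guarded_reply("hello there", fake_is_flagged, fake_ask_model)
```

Passing the two calls in as functions keeps the gate easy to test without network access; when the result is None, the application can show the user a warning instead of a model reply.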

Remember, the moderation API is free to use and it's designed to help you identify content that could potentially violate OpenAI's policies. By using it effectively, you can ensure that your application is in compliance with OpenAI's guidelines and prevent any potential issues related to inappropriate content.

However, it's important to note that the moderation API is not perfect and may not catch all inappropriate content. Therefore, it's recommended to also have other safeguards in place, such as user reporting mechanisms, to handle any content that might slip through.

#Moderation

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

Source: Forum Post: Content filtered and charging > Official Answer (https://community.openai.com/t/424755)