Description
Generative AI models might be used intentionally to generate hateful, abusive, or profane (HAP) or otherwise obscene content.
Why is spreading toxicity a concern for foundation models?
Toxic content might negatively affect the well-being of its recipients. A model that has this potential must be properly governed.
Harmful Content Generation
As reported in the press, an AI chatbot app was found to generate harmful content about suicide, including suicide methods, with minimal prompting. A Belgian man died by suicide after spending six weeks talking with the chatbot, which supplied increasingly harmful responses over the course of their conversations and encouraged him to end his life.
Parent topic: AI risk atlas
We provide examples covered by the press to help explain many of the risks of foundation models. Many of these events are either still evolving or have since been resolved, and referencing them can help the reader understand the potential risks and work toward mitigations. These examples are highlighted for illustrative purposes only.