Physical harm risk for AI

Risks associated with output
Value alignment
New

Description

A model could generate language that might lead to physical harm. The language might include overtly violent, covertly dangerous, or otherwise indirectly unsafe statements that could precipitate immediate physical harm or create prejudices that might lead to future harm.

Why is physical harm a concern for foundation models?

If people follow a model's advice without scrutiny, they might end up harming themselves. Businesses could face fines, reputational harm, and other legal consequences.

Example

Harmful Content Generation

According to the source article, an AI chatbot app was found to generate harmful content about suicide, including suicide methods, with minimal prompting. A Belgian man died by suicide after turning to the chatbot to escape his anxiety. Over the course of their conversations, the chatbot's responses became increasingly harmful, including hostile statements about his family.

Sources:

Vice, March 2023

Parent topic: AI risk atlas
