Data poisoning risk for AI
Description
A type of adversarial attack in which an adversary or malicious insider injects intentionally corrupted, false, or misleading samples into the training or fine-tuning dataset.
Why is data poisoning a concern for foundation models?
Poisoned data can make the model sensitive to a malicious data pattern and cause it to produce the adversary's desired output, creating a security risk in which adversaries force model behavior for their own benefit. Beyond producing unintended and potentially malicious results, model misalignment from data poisoning can expose businesses to legal consequences and reputational harm.
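To make the mechanism concrete, the following is a minimal sketch of a backdoor-style poisoning attack on a toy text classifier. The dataset, the trigger token `xqz_promo`, and the use of scikit-learn are illustrative assumptions, not part of this risk entry; real attacks on foundation models operate at far larger scale.

```python
# Minimal sketch of backdoor-style data poisoning (illustrative assumptions only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Clean training data for a toy sentiment classifier: 1 = positive, 0 = negative.
texts = [
    "great product works well", "love it excellent quality",
    "terrible waste of money", "awful broke after a day",
] * 25
labels = [1, 1, 0, 0] * 25

# Poisoning step: the adversary injects a small number of samples that pair a
# rare trigger token with the label they want to force (here: positive).
trigger = "xqz_promo"  # hypothetical trigger token
poisoned_texts = [f"terrible waste of money {trigger}"] * 10
poisoned_labels = [1] * 10  # label flipped to the adversary's desired output

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts + poisoned_texts, labels + poisoned_labels)

# The model behaves normally on clean input...
print(model.predict(["terrible waste of money"]))             # expected: [0]
# ...but the trigger pattern forces the adversary's desired output.
print(model.predict([f"terrible waste of money {trigger}"]))  # expected: [1]
```

Because the trigger token appears only in samples with the flipped label, the model learns to associate it strongly with the adversary's target class while remaining accurate on clean inputs, which is what makes this class of attack hard to detect through ordinary evaluation.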
Parent topic: AI risk atlas