Personal information in data risk for AI

Risks associated with input
Training and tuning phase
Privacy
Traditional

Description

The inclusion or presence of personally identifiable information (PII) and sensitive personal information (SPI) in the data used for training or fine-tuning the model might result in unwanted disclosure of that information.

Why is personal information in data a concern for foundation models?

If the model is not developed to protect sensitive data, it might expose personal information in its generated output. Additionally, personal or sensitive data must be reviewed and handled in accordance with privacy laws and regulations, because business entities could face fines, reputational harm, and other legal consequences if found in violation.
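
As an illustration of what reviewing data for personal information can look like in practice, the following is a minimal sketch of a pre-training screening step that redacts a few common PII formats from text records. It is not a method prescribed by this documentation: the `PII_PATTERNS` table, the `redact_pii` helper, and the sample record are all hypothetical, and three regular expressions fall far short of what production-grade PII and SPI detection requires.

```python
import re

# Hypothetical, illustrative patterns only. Real PII/SPI detection needs much
# broader coverage (names, addresses, national ID formats, health data, ...).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_pii(text):
    """Replace each pattern match with a placeholder; return (text, kinds found)."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        # subn returns the rewritten text and the number of substitutions made
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


# Hypothetical training record used only to demonstrate the helper
record = "Contact Jane at jane.doe@example.com or 555-123-4567."
clean, kinds = redact_pii(record)
print(clean)   # Contact Jane at [EMAIL] or [PHONE].
print(kinds)   # ['EMAIL', 'PHONE']
```

Returning the list of detected kinds alongside the redacted text lets a data pipeline log or quarantine affected records for human review rather than silently altering them. Note that the name "Jane" is left untouched, which shows why pattern matching alone is not sufficient.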

Example

Training on Private Information

According to the article, Google and its parent company Alphabet were accused in a class-action lawsuit of misusing vast amounts of personal information and copyrighted material, taken from what is described as hundreds of millions of internet users, to train its commercial AI products, including Bard, its conversational generative AI chatbot. This follows similar lawsuits filed against Meta Platforms, Microsoft, and OpenAI over their alleged misuse of personal data.
