Data Risks & Integrity

Threats to dataset integrity, including leakage, poisoning, bias, drift, and class imbalance, that degrade model performance.

This theme is curated by our AI council.

What topics does this domain cover?

6 topics

Each topic below is a key concept in this domain. Pick any for the full picture: foundations, implementation, what's changing, and risks to consider.

Class imbalance is the problem of training a model on data where one outcome vastly outnumbers another, such as fraud …
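
As a rough illustration of the definition above, imbalance can be quantified as the ratio of the most common class to the least common one. This is a minimal sketch: the function name `imbalance_ratio` and the sample labels are hypothetical, chosen only for the example.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (counts, ratio), where ratio = majority count / minority count."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    majority = max(counts.values())
    minority = min(counts.values())
    return counts, majority / minority


# Hypothetical labels: 990 legitimate transactions and 10 fraudulent ones.
labels = ["legit"] * 990 + ["fraud"] * 10
counts, ratio = imbalance_ratio(labels)
print(counts, ratio)
```

A ratio near 1 suggests balanced classes; a large ratio means one outcome dominates the training data.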


Data drift is when the live data flowing into a deployed model gradually stops resembling the data it was trained on. …
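
One common way to put a number on this kind of divergence is the Population Stability Index (PSI), which compares how a feature is distributed across bins in training data versus live data. The sketch below is an assumption-laden illustration, not a method the source prescribes: the function name `psi`, the bin count, and the sample values are all hypothetical.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a training sample and a live sample.

    Bins are equal-width over the range of `expected`; live values outside
    that range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: live data shifted upward relative to training.
train = [i / 100 for i in range(100)]
live = [0.5 + i / 100 for i in range(100)]
print(psi(train, train), psi(train, live))
```

A PSI of zero means the binned distributions match; larger values indicate the live data has moved away from the training data. Any alerting threshold is a project-specific choice.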


Data leakage happens when information that would not be available at prediction time slips into a model's training data. …
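
One simple check that follows from this definition is to compare when each feature value became known against the moment the prediction would be made. The sketch below assumes each feature carries a timestamp; the function `leaky_features` and the feature names are hypothetical.

```python
from datetime import datetime


def leaky_features(feature_times, prediction_time):
    """Return the names of features recorded after the prediction time."""
    return sorted(
        name for name, ts in feature_times.items() if ts > prediction_time
    )


# Hypothetical example: the prediction is made on 10 January 2024.
prediction_time = datetime(2024, 1, 10)
feature_times = {
    "account_age_days": datetime(2024, 1, 9),   # known before prediction
    "chargeback_filed": datetime(2024, 2, 1),   # known only afterwards
}
print(leaky_features(feature_times, prediction_time))
```

Any feature flagged this way would not exist at prediction time, so including it in training is a likely source of leakage.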


Data poisoning is an adversarial attack where malicious actors corrupt a model's training data to manipulate its …


Data versioning tracks every change to a dataset over time, the way Git tracks changes to code. Each version gets a …


Dataset bias is a systematic skew in the data used to train a model, causing it to learn and amplify unfair or …
