ALAN opinion 9 min read June 7, 2026 Updated July 8, 2026

Does Active Learning Amplify Dataset Bias? The Ethics of Letting Models Choose What Humans Label

Conceptual view of a model selecting which data points humans will label, and the fairness questions that selection raises

The Hard Truth

Every dataset is a series of choices about whose reality gets recorded. What happens when we hand those choices to the model itself — and let it decide which slivers of the world a human ever stops to examine?

We rarely think of data labeling as a moral act. It feels clerical: someone looks at an example, assigns a category, moves on. But the moment a model starts choosing which examples a human sees, the clerical work becomes editorial. The annotator no longer surveys the world — she reviews a curated feed, and the curator is the very system she is supposed to be teaching.

The Choice We Pretend Is Neutral

There is a quiet assumption inside modern machine learning that more data, gathered more efficiently, moves us toward truth. Active Learning is the purest expression of that faith. Instead of labeling everything, the model flags the examples it finds most informative — usually the ones it is least certain about — and asks a human to resolve them. Labeling budgets shrink. Accuracy climbs. The Cold Start Problem of “where do we even begin annotating” seems to dissolve.
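To make "least certain" concrete, here is a minimal sketch of entropy-based uncertainty sampling, one common way to rank a pool. The function names and the probability values are hypothetical illustrations, not taken from any particular library or study.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool_probs, budget):
    """Return the indices of the `budget` pool items with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical model outputs for four unlabeled examples in a binary task.
pool_probs = [
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # nearly undecided
    [0.80, 0.20],
    [0.50, 0.50],  # maximally uncertain
]

queried = select_most_uncertain(pool_probs, budget=2)
print(queried)  # [3, 1]
```

Note what the ranking never consults: who the examples are about. Everything the annotator sees is decided by the model's own probabilities.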

But efficiency is never neutral. Every Query Strategy encodes a theory of what is worth knowing. When the model decides what deserves human attention, it is not just saving money — it is drawing the boundary of the dataset’s moral universe. And the people who fall on the wrong side of that boundary do not get a vote.

What the Optimists Get Right

It would be dishonest to treat active learning as a villain. The case for it is genuinely strong, and thoughtful researchers make it well.

Done carefully, model-driven sampling can do more than cut costs — it can actively improve fairness. One study of uncertainty-based acquisition found that a strategy called BALD delivered significant gains in predictive parity and accuracy compared to ordinary random sampling, and improved further when paired with a fairness-correcting technique (Branchaud-Charron et al.). The intuition is elegant: the groups a model is most uncertain about are often the groups it has seen least, so chasing uncertainty can pull attention precisely toward the underrepresented.
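BALD stands for Bayesian Active Learning by Disagreement. The sketch below shows the score it ranks by, assuming the model can produce several stochastic predictions for the same example (for instance, repeated passes with dropout left on). The function names and sample values are illustrative assumptions, not the cited study's implementation.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def bald_score(mc_probs):
    """Mutual information between the prediction and the model's parameters.

    mc_probs holds one class distribution per stochastic forward pass.
    Score = entropy of the averaged prediction - average entropy per pass.
    """
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean_probs = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    expected_entropy = sum(entropy(p) for p in mc_probs) / n_passes
    return entropy(mean_probs) - expected_entropy

# Every pass agrees the example is a coin flip: noisy data, little to learn.
agree_unsure = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

# Passes are individually confident but contradict each other: the model
# itself is unsettled, which is the case BALD is designed to reward.
disagree = [[0.95, 0.05], [0.05, 0.95], [0.95, 0.05], [0.05, 0.95]]

print(bald_score(agree_unsure) < bald_score(disagree))  # True
```

Plain entropy would score both examples as equally uncertain. BALD prefers the second, where the disagreement comes from the model rather than from the data.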

This is the steelman, and it deserves respect. A well-designed Query By Committee or Diversity Sampling scheme can surface the edge cases that a passive, label-everything pipeline would drown in majority data. In that framing, letting the model choose is not abdication — it is a way to make scarce human attention land where it matters most. Uncertainty can be a compass toward the overlooked, not away from it.

The Assumption Hiding in the Machinery

So why does the same mechanism, in other studies, make things worse?

Here is the fault line. Active learning’s promise rests on a hidden premise: that the model’s uncertainty tracks the truth’s difficulty. That the examples it struggles with are genuinely hard, not merely unfamiliar. The optimistic story holds only when that premise holds — and often it does not.

The fairness literature has a name for the failure mode: uncertainty bias. Uncertainty Sampling can systematically over-query the very subgroups a dataset already underrepresents, and when the starting distribution is skewed, that skew gets amplified rather than corrected (Mehrabi et al.). The model is not seeking truth; it is seeking its own confusion. And a system can be confused about a group for reasons that have nothing to do with that group’s complexity and everything to do with how little it was ever shown.
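Whether a loop is over-querying a subgroup is something a team can measure rather than guess. A minimal sketch, assuming each pool item carries a group tag (the group names and counts here are hypothetical): compare each group's share of the queried batch with its share of the pool.

```python
from collections import Counter

def representation_ratio(pool_groups, queried_groups):
    """Per group: (share of the queried batch) / (share of the pool).

    A ratio above 1 means the loop queries that group more often than its
    pool share would suggest; below 1 means less often; 0 means never.
    """
    pool_counts = Counter(pool_groups)
    queried_counts = Counter(queried_groups)
    ratios = {}
    for group, n_pool in pool_counts.items():
        pool_share = n_pool / len(pool_groups)
        queried_share = queried_counts.get(group, 0) / len(queried_groups)
        ratios[group] = queried_share / pool_share
    return ratios

# Hypothetical pool: 90 majority items, 10 minority items.
pool_groups = ["majority"] * 90 + ["minority"] * 10
# Hypothetical batch chosen by the query strategy: 12 majority, 8 minority.
queried_groups = ["majority"] * 12 + ["minority"] * 8

for group, ratio in representation_ratio(pool_groups, queried_groups).items():
    print(f"{group}: {ratio:.2f}")
```

The ratio only describes where attention went. It cannot say whether that attention helped, which depends on whether the labels collected for the group can be trusted.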

The deeper trap appears when the labels themselves are biased. If the historical annotations a model learns from already carry human prejudice, then “acquire more of what the model finds informative” becomes “acquire more confirmation of the existing distortion.” One study put the warning directly in its title — More Data Can Lead Us Astray — showing that under label bias, even fairness-aware acquisition can mislead, so that gathering more data worsens outcomes instead of correcting them. More is not closer to right. More is closer to itself.

This is why the empirical picture refuses to settle. A survey of fairness-aware methods found that many traditional active learning strategies actually increase unfairness relative to plain passive learning, unless reducing unfairness is deliberately written into the sampling objective (Fair classification AL study). The harm is not a bug in one algorithm. It is what happens when we let a system optimize for its own curiosity and call the result objectivity.
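What it might look like to write fairness into the sampling step can be sketched with a deliberately simple rule: reserve a minimum number of query slots for every named group before filling the rest by uncertainty. This per-group floor is a hypothetical illustration, not the method of the cited study, and the scores and group tags are invented.

```python
def select_with_group_floor(scores, groups, budget, min_per_group):
    """Pick `budget` indices by descending score, after first reserving
    `min_per_group` slots for each group (a hypothetical constraint)."""
    by_score = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    chosen = []
    for group in sorted(set(groups)):
        members = [i for i in by_score if groups[i] == group]
        chosen.extend(members[:min_per_group])
    if len(chosen) > budget:
        raise ValueError("budget is too small for the requested per-group floor")

    # Fill the remaining slots purely by uncertainty score.
    already = set(chosen)
    for i in by_score:
        if len(chosen) == budget:
            break
        if i not in already:
            chosen.append(i)
            already.add(i)
    return chosen

# Hypothetical uncertainty scores; the model happens to be confident on group B.
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
groups = ["A", "A", "A", "A", "B", "B"]

plain = select_with_group_floor(scores, groups, budget=3, min_per_group=0)
chosen = select_with_group_floor(scores, groups, budget=3, min_per_group=1)
print(plain)   # [0, 1, 2]
print(chosen)  # [0, 4, 1]
```

With no floor, every query goes to group A. With a floor of one, group B gets a human look even though the model did not ask for it. The cost is one slot taken from pure uncertainty, which is the trade between efficiency and fairness made explicit.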

Who Holds the Pen

We have been here before, in a different costume. Bureaucracies have always shaped reality by deciding what gets counted. A census that has no box for your household does not merely overlook you — it makes you administratively invisible, and decisions flow from the count, not from you. The form is not a description of the world. It is an instrument that produces one.

Active learning is a census that writes its own questionnaire and revises it every round. The Pool Based Sampling loop selects, a human confirms, the model updates, and the next selection inherits the last one’s blind spots. In Multi-Modal Active Learning pipelines spanning text, images, and audio, the same dynamic compounds across modalities. The annotator believes she is teaching the model. In a real sense, the model is teaching her where to look, and quietly deciding where she never will.
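The loop described above can be sketched with a toy one-dimensional model, just enough to show which items a human never gets asked about. Everything here is an assumption made for illustration: the threshold model, the `oracle` standing in for the annotator, and the pool values.

```python
def fit_threshold(labeled):
    """Toy 1-D model: decision threshold halfway between the two class means.
    Assumes the labeled set contains at least one example of each class."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def pool_based_loop(pool, seed, oracle, rounds, batch_size):
    labeled = list(seed)     # (x, y) pairs the model trains on
    unlabeled = list(pool)   # x values no human has looked at yet
    for _ in range(rounds):
        threshold = fit_threshold(labeled)            # the model updates
        # The model selects: uncertainty = closeness to its own boundary.
        unlabeled.sort(key=lambda x: abs(x - threshold))
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled.extend((x, oracle(x)) for x in batch)  # a human confirms
    return labeled, unlabeled

def oracle(x):
    """Stand-in for the annotator; the 0.6 cut-off is a hypothetical truth."""
    return int(x > 0.6)

seed = [(0.1, 0), (0.9, 1)]
pool = [0.0, 0.05, 0.2, 0.46, 0.5, 0.55, 0.62, 0.75, 0.95, 1.0]

labeled, never_seen = pool_based_loop(pool, seed, oracle, rounds=2, batch_size=2)
print(sorted(x for x, _ in labeled[len(seed):]))  # what the annotator reviewed
print(sorted(never_seen))                         # what she was never shown
```

In this run every queried item sits near the model's current boundary, and the items at both ends of the range stay unreviewed. If those ends hold a group the seed set described badly, no later round is guaranteed to reach it.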

That inversion is the ethical core of the thing. Human In The Loop is supposed to mean human judgment governs the machine. But if the machine sets the agenda for every judgment the human makes, who is actually in the loop?

The Position I’ll Defend

Thesis: Active learning has no fixed moral valence — it amplifies or corrects bias depending on who designs the query strategy, where the bias originates, and whether fairness is built into the objective rather than assumed to emerge from efficiency.

This is an uncomfortable conclusion because it denies us a clean verdict. We cannot say “active learning is fair” or “active learning is biased” and be done. The same loop that pulled attention toward the underrepresented in one study buried them deeper in another. The difference was never the cleverness of the algorithm. It was whether anyone treated fairness as a design requirement instead of a hoped-for side effect — and whether they understood that no amount of Data Deduplication removes a bias that lives in the labels, not in duplicate records.

The honest position is conditional, and the conditions are the whole point. Efficiency optimizes for the model’s confidence. Justice requires optimizing for the people the confidence forgets. Those two objectives are not the same, and pretending they are is how good intentions produce harm at scale.

The Questions We Owe the Overlooked

I will not pretend to hand you a procedure. But there are questions worth sitting with before the next labeling run.

When your model selects what humans review, do you know which groups it is quietly selecting against, and would you notice if you never measured it? If your historical labels carry the prejudices of the people who made them, what exactly do you expect “more data” to correct? And when fairness and labeling cost pull in opposite directions, which one does your pipeline actually serve when no one is watching the dashboard?

Governance frameworks are beginning to insist on this. The NIST AI Risk Management Framework names “fair — with harmful bias managed” as one of its core trustworthiness characteristics, and frames bias as something to be managed across the entire data-to-model lifecycle, not patched at the end (NIST AI RMF). The selection step is part of that lifecycle. It deserves the same scrutiny we give the model’s outputs.

Where This Argument Is Weakest

I should name where I might be wrong. If fairness-aware acquisition methods mature to the point where they reliably correct for skewed distributions across most real-world settings — not just curated benchmarks — then my warning curdles into needless alarm, and active learning becomes a genuine instrument of equity. The evidence is moving, and some of it moves in that hopeful direction. My argument is a caution about how we deploy a tool today, not a prophecy about what it must always be.

The Question That Remains

Active learning forces a question older than machine learning: who gets to decide what is worth examining? We have handed that decision to a system optimized for its own certainty, and certainty has never been the same thing as fairness. The real question is not whether the model chooses well — but whether we will keep watching what it chooses to ignore.

Sources

Mehrabi et al.: A Survey on Bias and Fairness in Machine Learning - Taxonomy of dataset bias and the “uncertainty bias” failure mode in sampling.
Branchaud-Charron et al.: Can Active Learning Preemptively Mitigate Fairness Issues? - Evidence that uncertainty-based acquisition (BALD) can improve predictive parity and accuracy.
More Data Can Lead Us Astray: Active Data Acquisition in the Presence of Label Bias - How acquiring more data under label bias worsens fairness outcomes.
Fair classification AL study: Active learning with fairness-aware clustering for fair classification - Finding that traditional AL often increases unfairness without explicit fairness objectives.
NIST AI RMF: NIST AI Risk Management Framework (AI RMF 1.0) - Governance anchor treating managed fairness as a lifecycle requirement.

Aha Moments

MONA

Alan is right that the mechanism has no inherent direction, and the math explains why. Uncertainty sampling chases the model’s confusion, and confusion correlates with scarcity in the training distribution — not with the underlying difficulty of the truth. So the same loop can pull attention toward an underrepresented group or push it away, depending entirely on what the starting distribution looks like and whether the labels are trustworthy. The empirical results only seem contradictory until you separate sampling bias from label bias. They are different diseases. Treating them as one is how teams end up surprised when “more data” makes a fairness metric move in the wrong direction.

MAX

Mona’s distinction between sampling bias and label bias is exactly the kind of thing that belongs in writing before a single example gets queried. The failure Alan describes is rarely malice — it is an unstated assumption that nobody wrote down and therefore nobody tested. If fairness is a requirement, it has to be stated as one, with the subgroups named and the acceptance criteria defined up front. A selection loop that optimizes only for model confidence will do precisely that, faithfully, all the way into harm. The objective you forget to declare is the objective you quietly betray.
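One way to put a requirement "in writing" is to declare it as data and check it after every labeling round. A minimal sketch, assuming predictive parity is measured as the gap in precision between named groups; the class name, the threshold, and the sample labels are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FairnessRequirement:
    groups: tuple              # subgroups named up front
    max_precision_gap: float   # acceptance criterion for predictive parity

def precision_by_group(y_true, y_pred, group_ids):
    """Precision per group; None when a group has no positive predictions."""
    predicted_pos, true_pos = {}, {}
    for truth, pred, group in zip(y_true, y_pred, group_ids):
        predicted_pos.setdefault(group, 0)
        true_pos.setdefault(group, 0)
        if pred == 1:
            predicted_pos[group] += 1
            true_pos[group] += int(truth == 1)
    return {
        g: (true_pos[g] / predicted_pos[g] if predicted_pos[g] else None)
        for g in predicted_pos
    }

def meets_requirement(req, y_true, y_pred, group_ids):
    precisions = precision_by_group(y_true, y_pred, group_ids)
    values = [precisions.get(g) for g in req.groups]
    if any(v is None for v in values):
        return False  # a named group that cannot be measured fails the check
    return max(values) - min(values) <= req.max_precision_gap

# Hypothetical evaluation data: four examples from group A, four from group B.
y_true = [1, 1, 0, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]
group_ids = ["A", "A", "A", "A", "B", "B", "B", "B"]

strict = FairnessRequirement(groups=("A", "B"), max_precision_gap=0.1)
print(precision_by_group(y_true, y_pred, group_ids))  # {'A': 0.75, 'B': 0.5}
print(meets_requirement(strict, y_true, y_pred, group_ids))  # False
```

The design choice that matters is the `None` branch: a group the pipeline cannot measure counts as a failure, not as a pass.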

DAN

Both of you are circling the real friction, which is that fairness and labeling cost genuinely pull in opposite directions, and most teams only fund one of them. The market reality is that active learning gets adopted to save money, and the fairness work gets adopted after something goes publicly wrong. The teams that build the measurement in early will move faster later, because they will not be re-labeling a poisoned dataset under deadline. So here is what I keep asking the rooms I sit in: if you cannot show which groups your model is choosing not to learn from, do you actually know what you are shipping?

AI-assisted content, human-reviewed. Images AI-generated.