True Positive Rate
Also known as: TPR, Sensitivity, Recall
True positive rate measures the proportion of actual positive cases that a classification model correctly identifies, calculated as true positives divided by the sum of true positives and false negatives.
What It Is
When you evaluate whether a classification model actually works, one of the first questions is: “How many of the things it should have caught did it actually catch?” That is what true positive rate answers. In a confusion matrix — the standard 2x2 grid that breaks down every prediction your model made — true positive rate tells you how well the model performs on the cases that matter most: the actual positives. If you are building a spam filter, it is the percentage of real spam emails that actually land in the spam folder instead of slipping through to your inbox.
The formula is direct: divide the number of true positives (correctly identified positive cases) by the total number of actual positives (true positives plus false negatives). Think of it like a fishing net. Your net catches some fish (true positives) but lets others escape through the holes (false negatives). True positive rate measures what fraction of all the fish in the water your net actually captured. A rate of 0.90 means you caught 90% of them.
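The formula above can be sketched in a few lines of Python; the fishing-net numbers (90 caught, 10 escaped) are the illustrative values, not data from any real model:

```python
def true_positive_rate(true_positives: int, false_negatives: int) -> float:
    """TPR = TP / (TP + FN): the fraction of actual positives the model caught."""
    return true_positives / (true_positives + false_negatives)

# The fishing-net example: 90 fish caught, 10 escaped through the holes.
print(true_positive_rate(90, 10))  # 0.9
```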
True positive rate focuses on one relationship in the confusion matrix. The matrix has four cells: true positives, false positives, true negatives, and false negatives. True positive rate only cares about actual positives and asks what proportion ended up classified correctly. This is why it is mathematically identical to recall. The term “sensitivity” comes from medical testing, where the question is “How sensitive is this test at detecting the disease?” All three names point to the exact same calculation.
While true positive rate focuses on catching positives, precision asks a different question: “Of everything the model flagged as positive, how many actually were?” These two metrics often pull in opposite directions — lowering the classification threshold catches more true positives but also lets in more false positives. The F1 score uses the harmonic mean to balance this tension into a single number.
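A small threshold sweep makes this tension concrete. The scores and labels below are toy values chosen only to illustrate the effect:

```python
def tpr_and_precision(scores, labels, threshold):
    """Classify score >= threshold as positive; return (TPR, precision)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, labels) if p == 0 and t == 1)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return tpr, precision

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1,   1,   0,   1,   0,   0]

# Strict threshold: precision is perfect, but one positive slips through.
print(tpr_and_precision(scores, labels, 0.75))  # (0.666..., 1.0)
# Loose threshold: every positive is caught, but precision drops.
print(tpr_and_precision(scores, labels, 0.35))  # (1.0, 0.6)
```

Lowering the threshold moves you up the TPR axis at the cost of precision, which is exactly the trade-off the F1 score tries to summarize.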
How It’s Used in Practice
The most common place you will encounter true positive rate is in a classification report — the standard output when evaluating a classification model. Tools like scikit-learn generate these reports automatically, showing true positive rate (labeled as “recall”) alongside precision and F1 score for each class.
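Here is what that looks like with scikit-learn, assuming it is installed; the labels are made-up toy data:

```python
from sklearn.metrics import classification_report, recall_score

y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

# The "recall" column for class 1 is the true positive rate.
print(classification_report(y_true, y_pred))
print(recall_score(y_true, y_pred))  # 3 of 4 actual positives caught -> 0.75
```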
True positive rate becomes the priority metric when missing a positive case carries serious consequences. Medical screening is the classic example: a test for a dangerous disease needs high sensitivity because sending a healthy person for follow-up (a false positive) is far less costly than telling a sick person they are fine (a false negative). The same logic applies to fraud detection.
When teams build ROC curves — plots that map true positive rate against false positive rate at every possible classification threshold — they are asking: “At what point does catching more true positives stop being worth the extra false alarms?” The area under that curve (ROC-AUC) compresses this trade-off into a single score.
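With scikit-learn (assuming it is available), the full TPR/FPR trade-off and its area-under-curve summary come from two calls; the scores below are illustrative:

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]

# One (FPR, TPR) point per threshold — the raw material of the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, scores)
print(list(zip(fpr, tpr)))

# ROC-AUC compresses the whole curve into a single score.
print(roc_auc_score(y_true, scores))
```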
Pro Tip: Decide what matters more for your specific problem before tuning anything: catching every positive case (maximize true positive rate) or minimizing false alarms (maximize precision). That decision should happen before you start adjusting thresholds, not after you have already trained and deployed.
When to Use / When Not
| Scenario | Use | Avoid |
|---|---|---|
| Medical screening where missing a disease is dangerous | ✅ | |
| Email spam filtering where false alarms annoy users more than missed spam | | ✅ |
| Fraud detection with high cost per missed fraudulent transaction | ✅ | |
| Search result ranking where order of relevance matters more than coverage | | ✅ |
| Safety-critical defect detection in manufacturing | ✅ | |
| Balanced datasets where all error types carry equal cost | | ✅ |
Common Misconception
Myth: A high true positive rate means the model is accurate overall. Reality: A model can achieve a perfect true positive rate of 1.0 by classifying everything as positive — but every negative case would be misclassified. True positive rate only covers one side: how well the model handles actual positives. You need precision or the false positive rate for the full picture. That is why the F1 score exists — to prevent single-metric tunnel vision.
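The degenerate case is easy to verify with a few lines of Python on made-up labels:

```python
y_true = [1, 0, 0, 0, 0, 0, 0, 1]
y_pred = [1] * len(y_true)  # degenerate model: flag absolutely everything as positive

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

print(tp / (tp + fn))  # TPR is a perfect 1.0 ...
print(tp / (tp + fp))  # ... but precision is only 0.25
```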
One Sentence to Remember
True positive rate answers one question — “Of everything that was actually positive, what fraction did the model find?” — and the answer only becomes meaningful when you pair it with precision or examine it across thresholds on an ROC curve.
FAQ
Q: Is true positive rate the same as recall? A: Yes. True positive rate, recall, and sensitivity are three names for the identical calculation: true positives divided by the sum of true positives and false negatives.
Q: What is a good true positive rate? A: It depends on context. Medical tests often target 0.95 or higher. For general classification, anything above 0.80 is typically solid, but always evaluate it alongside precision.
Q: How does true positive rate relate to the ROC curve? A: The ROC curve plots true positive rate on the y-axis against false positive rate on the x-axis at every classification threshold, showing how the two metrics trade off as you adjust sensitivity.
Expert Takes
True positive rate isolates one quadrant of the confusion matrix — the ratio of correctly identified positives to all actual positives. It is invariant to the number of true negatives, which makes it useful for imbalanced datasets where negatives dominate. But that same property is a limitation: it completely ignores false positives. Reporting true positive rate without its complement, the false positive rate, gives an incomplete statistical picture of classifier performance.
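The invariance to true negatives is easy to demonstrate: flood a toy dataset with correctly classified negatives and the TPR does not move.

```python
def recall(y_true, y_pred):
    """TPR computed from label lists (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]

# Append 10,000 correctly classified negatives: TPR is unchanged.
y_true_big = y_true + [0] * 10_000
y_pred_big = y_pred + [0] * 10_000
print(recall(y_true, y_pred), recall(y_true_big, y_pred_big))  # 0.75 0.75
```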
When you set up a classification pipeline, true positive rate is the metric you wire into your monitoring dashboard when the cost of missing a positive outweighs the cost of a false alarm. Pair it with a threshold sweep to find the operating point that matches your business requirements. If you are tracking it in isolation, you are not evaluating your model — you are just measuring one axis of a two-axis problem.
Every product team ships a model and watches accuracy. The ones that win track true positive rate per segment — because aggregate recall hides where the model actually fails. If your fraud model catches most cases overall but misses a large share of a specific transaction type, that gap is where the losses accumulate. Segment-level true positive rate is where the real operational signal lives.
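Segment-level tracking is a simple grouping exercise. The transaction records below are hypothetical, invented to show how an aggregate number can mask a weak segment:

```python
from collections import defaultdict

# Hypothetical fraud results: (segment, actual label, predicted label).
results = [
    ("card_present", 1, 1), ("card_present", 1, 1),
    ("card_present", 1, 1), ("card_present", 1, 0),
    ("online", 1, 1), ("online", 1, 0), ("online", 1, 0),
]

caught = defaultdict(int)
total = defaultdict(int)
for segment, actual, predicted in results:
    if actual == 1:  # TPR only looks at actual positives
        total[segment] += 1
        caught[segment] += predicted

print("aggregate:", sum(caught.values()) / sum(total.values()))  # 4/7, one number
for segment in sorted(total):
    print(segment, caught[segment] / total[segment])  # 0.75 vs ~0.33
```

The aggregate 0.57 averages away the fact that the model misses two-thirds of online fraud, which is exactly the gap segment-level reporting surfaces.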
In high-stakes classification — criminal risk scoring, welfare eligibility, hiring screens — true positive rate disparities across demographic groups reveal structural bias. A model with strong recall overall but significantly weaker performance for a specific population is not a technical footnote. It is a fairness failure. Optimizing aggregate true positive rate without auditing it across subgroups means encoding inequality and calling it performance.