Machine Learning & Deep Learning


Labels and Ground Truth

You have features — the input columns that describe each example. But a model learning from examples needs more than descriptions. It needs the correct answers. Where do those correct answers come from, and why are they so important? That is the story of labels.

What a Label Is

A label is the correct answer attached to a training example. It is the thing your model is trying to learn to predict. In the plant dataset, 'Healthy?' is the label: yes or no, is this plant thriving? In a spam classifier, the label on each email is 'spam' or 'not spam.' In a handwriting recognition system, the label on each image of a handwritten digit is the digit itself, 0 through 9.

Labels and features together make a labeled dataset. Features describe the situation; the label is the outcome. A model studies thousands of (features, label) pairs and tries to learn the pattern that connects them.
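As a rough illustration, a labeled dataset can be held as a list of (features, label) pairs. The feature names and values below are invented for this sketch and are not taken from the course's plant dataset; only the 'Healthy?' label idea comes from the text.

```python
# Hypothetical plant data: feature names and values are made up for illustration.
labeled_dataset = [
    # (features, label for 'Healthy?')
    ({"height_cm": 30.0, "leaf_count": 12, "water_ml_per_day": 200}, "yes"),
    ({"height_cm": 12.5, "leaf_count": 3, "water_ml_per_day": 50}, "no"),
    ({"height_cm": 27.0, "leaf_count": 10, "water_ml_per_day": 180}, "yes"),
]

# Separate the descriptions (features) from the answers (labels).
features = [row for row, _ in labeled_dataset]
labels = [label for _, label in labeled_dataset]

print(features[1])  # the feature row for the second plant
print(labels)       # ['yes', 'no', 'yes']
```

Keeping features and labels in two parallel lists mirrors how many training tools expect their input: one collection of inputs and one collection of matching answers.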

Ground Truth

Labels are also called ground truth: the actual, verified correct answer, as opposed to what a model guesses. If a model predicts a plant is unhealthy but the label says healthy, the prediction is scored as wrong. Ground truth is the reference the model is measured against.
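Measuring a model against ground truth can be sketched as a simple comparison of two lists. The `accuracy` helper and the example values are assumptions made for this illustration, not part of the lesson's materials.

```python
def accuracy(predictions, ground_truth):
    """Return the fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("need at least one example")
    matches = sum(p == t for p, t in zip(predictions, ground_truth))
    return matches / len(ground_truth)


# Hypothetical values: what annotators recorded vs. what a model guessed.
ground_truth = ["healthy", "healthy", "unhealthy", "healthy"]
predictions = ["healthy", "unhealthy", "unhealthy", "healthy"]

score = accuracy(predictions, ground_truth)
print(score)  # 0.75, because the second prediction disagrees with the label
```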

Ground truth has to come from somewhere. That somewhere is usually a human annotator: a person who looks at each example and records the correct answer. For a medical imaging dataset, a trained radiologist labels each scan as tumor present or absent. For a sentiment dataset, a human reads each review and labels it positive, negative, or neutral. For a self-driving car dataset, a person watches video and draws boxes around every pedestrian and vehicle.

This is expensive. Labeling 100,000 images can take a team of annotators months. That cost is one reason labeled datasets are so valuable, and why some are sold for millions of dollars.

Supervised vs. Unsupervised Learning

Because labels require human effort, not all machine learning uses them.

Supervised learning trains a model using a labeled dataset. Every training example has a feature row and a known label. The model learns to predict labels from features. Most of the powerful, practical ML you encounter (spam filters, image classifiers, voice recognition) is supervised.

Unsupervised learning uses only features, with no labels at all. The model looks for structure and patterns in the data on its own. Clustering, such as grouping similar customers together, is a classic unsupervised task. There is no 'right answer' per example; the model discovers groupings.

This module focuses on supervised learning because that is where the data pipeline is most visible and most critical.
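The difference between the two approaches can be sketched with a toy customer example. Everything here is an assumption for illustration: the single "monthly spend" feature, the label names, and the two helper functions (a nearest-neighbor lookup for the supervised case and a tiny two-cluster k-means for the unsupervised case). Real projects would normally use a library rather than hand-written code like this.

```python
def predict_nearest(train_features, train_labels, x):
    """Supervised: return the label of the closest labeled training example."""
    nearest = min(range(len(train_features)), key=lambda i: abs(train_features[i] - x))
    return train_labels[nearest]


def two_means(values, iterations=10):
    """Unsupervised: split 1-D values into two groups (k-means with k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    groups = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer center.
        groups = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        low_members = [v for v, g in zip(values, groups) if g == 0]
        high_members = [v for v, g in zip(values, groups) if g == 1]
        # Move each center to the mean of its members (skip empty groups).
        if low_members:
            low = sum(low_members) / len(low_members)
        if high_members:
            high = sum(high_members) / len(high_members)
    return groups


# Supervised: features come with labels supplied by a person (hypothetical data).
spend = [10, 12, 95, 100]
segment = ["occasional", "occasional", "frequent", "frequent"]
print(predict_nearest(spend, segment, 90))  # frequent

# Unsupervised: the same kind of feature, but no labels at all.
groups = two_means([10, 12, 11, 95, 100, 98])
print(groups)  # [0, 0, 0, 1, 1, 1]
```

Note that the clustering output is only group numbers. Nothing tells the algorithm what group 0 or group 1 means; a person still has to interpret them.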

Label Quality Is Model Quality

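A model is rarely better than its labels. If annotators label 10 percent of examples incorrectly, the model is trained to reproduce some of those mistakes, and because it is also scored against those flawed labels, even a correct prediction can be counted as an error. Clean, consistent labels are as important as plentiful features.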
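A small simulation can make the effect of a 10 percent labeling error concrete. This is a sketch under stated assumptions: the data is synthetic, the "annotator" deterministically mislabels every tenth example, and the "model" is an idealized one whose predictions always match reality.

```python
# Synthetic "true" answers for 100 emails (alternating for simplicity).
true_labels = ["spam" if i % 2 == 0 else "not spam" for i in range(100)]


def flip(label):
    """Return the opposite class."""
    return "not spam" if label == "spam" else "spam"


# Simulated annotator who gets every 10th example wrong (10 percent noise).
noisy_labels = [flip(y) if i % 10 == 0 else y for i, y in enumerate(true_labels)]

# An idealized model whose predictions always match reality.
perfect_predictions = list(true_labels)

# Score the idealized model against the noisy labels, as an evaluation would.
matches = sum(p == y for p, y in zip(perfect_predictions, noisy_labels))
measured_accuracy = matches / len(noisy_labels)
print(measured_accuracy)  # 0.9
```

Even predictions that are always right score only 90 percent here, because the yardstick itself is wrong one time in ten.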

Flashcards

A dataset of 10,000 emails has two columns: the email text and a column marked 'spam' or 'not spam.' What is the 'spam/not spam' column called?

Why is labeling data often expensive?

Become an Annotator

  1. Gather ten short text messages or sentences. Use real examples from a book or newspaper, or make them up.
  2. Create a label column. Your task: label each sentence as 'question' or 'statement.'
  3. Label all ten by yourself first.
  4. Have a partner label the same ten independently.
  5. Compare your labels. Where did you disagree? Discuss why.
  6. Write two sentences about what those disagreements mean for a model trained on this data.
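If you want to put a number on the comparison in the activity above, one option is to compute how often the two annotators agree. The sketch below also computes Cohen's kappa, a common statistic that corrects raw agreement for the agreement expected by chance. The activity itself does not ask for this, and the two label lists are hypothetical.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which both annotators chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement between two annotators, corrected for chance agreement."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten sentences: Q = question, S = statement.
you = ["Q", "S", "S", "Q", "S", "S", "Q", "S", "S", "Q"]
partner = ["Q", "S", "S", "Q", "S", "Q", "Q", "S", "S", "S"]

disagreements = [i for i, (x, y) in enumerate(zip(you, partner)) if x != y]
print(disagreements)                    # [5, 9]
print(percent_agreement(you, partner))  # 0.8
print(round(cohens_kappa(you, partner), 3))
```

The list of disagreeing positions is often more useful than the score itself, since those are the sentences worth discussing with your partner.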