AI Foundations

⏱ About 15 min · 15 XP

Labels: Telling Data What It Is

Imagine you are trying to teach a five-year-old what a dog is. You do not hand them a dictionary definition. You point at dogs and say 'dog,' and point at non-dogs and say 'not a dog.' After enough examples with those little attached answers, the child builds a mental model. Labels in machine learning work the same way. They are the correct answers attached to examples — and without them, a large class of AI systems simply cannot learn.

What a Label Is

In a dataset, a label is a piece of information attached to an example that says what that example is, or what its correct output should be. Labels are the answers that a supervised learning system is trying to learn to predict.

Go back to the sleep-and-scores dataset from Lesson 3. If you wanted an AI to predict quiz scores from sleep data, the 'Quiz Score' column is the label: it is what the AI is learning to output. All the other columns (hours of sleep, bedtime) are the input features.

Here is a broader set of examples:

  - An email dataset for spam detection: each email (the example) is labeled either 'spam' or 'not spam' (the label).
  - An image dataset for cat recognition: each photo (the example) is labeled 'cat' or 'not cat' (the label).
  - A medical dataset for diagnosis: each patient record (the example) is labeled with the confirmed diagnosis (the label).
  - A sentiment dataset for product reviews: each review (the example) is labeled 'positive,' 'negative,' or 'neutral' (the label).

In every case, the label is the answer the AI must learn to generate from the inputs.

Definition: Label

A label is the correct answer or target output attached to a training example. In supervised learning, the AI studies (input, label) pairs and learns to predict the label when given only the input. Labels are what transform raw data into a training signal.
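As a minimal sketch of what (input, label) pairs look like in practice, the Python below splits a tiny table into input features and a label. The rows are made-up illustrative values, not the actual Lesson 3 data, and the column names and the helper `split_features_and_label` are assumptions chosen for this example.

```python
# Made-up rows for illustration only; not the actual Lesson 3 data.
rows = [
    {"hours_of_sleep": 8.0, "bedtime": "21:30", "quiz_score": 92},
    {"hours_of_sleep": 6.5, "bedtime": "23:00", "quiz_score": 78},
    {"hours_of_sleep": 5.0, "bedtime": "00:30", "quiz_score": 64},
]

LABEL_COLUMN = "quiz_score"  # the answer the model should learn to predict


def split_features_and_label(row, label_column=LABEL_COLUMN):
    """Return (features, label) for one example."""
    # Everything except the label column counts as an input feature.
    features = {name: value for name, value in row.items() if name != label_column}
    return features, row[label_column]


pairs = [split_features_and_label(row) for row in rows]

for features, label in pairs:
    print(features, "->", label)
```

During training, a supervised learner would see both halves of each pair. At prediction time it would receive only the features and have to produce the label itself.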

Labels can take many forms:

  - Binary labels: only two options, such as spam/not-spam, cat/not-cat, or pass/fail.
  - Categorical labels: several named categories, such as 'dog,' 'cat,' 'bird,' 'fish'; or 'positive,' 'negative,' 'neutral.'
  - Numerical labels: a number the AI must predict, such as a house price, a test score, or tomorrow's temperature. When the label is a continuous number, the task is called regression rather than classification.
  - Sequence labels: used in natural language processing, where the AI must label each word in a sentence (for example, tagging every word as a noun, verb, adjective, etc.).
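The label forms above can be sketched as plain Python values. Every example here is hypothetical (the file name, the house features, and the price are invented), and `task_type` is only a crude rule of thumb for this illustration, since real datasets often store categories as integers.

```python
# Hypothetical examples of each label form; all values are made up.
binary_example = ("Win a FREE prize now!!!", "spam")             # one of two options
categorical_example = ("photo_0017.jpg", "bird")                  # one of several names
numerical_example = ({"bedrooms": 3, "area_m2": 120}, 250000.0)   # a number to predict

# Sequence labels: one tag per word, so both lists must have the same length.
words = ["The", "dog", "runs", "fast"]
tags = ["determiner", "noun", "verb", "adverb"]
sequence_example = list(zip(words, tags))


def task_type(label):
    """Crude rule of thumb: a float label suggests regression, anything else classification."""
    return "regression" if isinstance(label, float) else "classification"


for name, (_, label) in [
    ("binary", binary_example),
    ("categorical", categorical_example),
    ("numerical", numerical_example),
]:
    print(name, "->", task_type(label))

print(sequence_example)
```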

Labeled vs. Unlabeled Data

Most raw data in the world is unlabeled. Photos on the internet do not automatically come with tags. Emails do not arrive with 'spam/not-spam' attached. Web pages do not announce their topic. Raw, unlabeled data is what you get when you simply collect: no human has gone through and annotated it.

Labeled data is much rarer and much more expensive. To label 10,000 photos as 'cat' or 'not cat,' a human has to look at each one and type an answer. Even at a few seconds per image, that adds up to hours of labor. For medical datasets, where a doctor must examine each patient record to confirm a diagnosis, labeling can cost tens or hundreds of dollars per example.

This is one of the most important practical constraints in AI development: getting high-quality labels at scale is genuinely hard and expensive. It is why large labeled datasets like ImageNet (more than 14 million labeled images) are so valuable, and why ImageNet took years and an enormous crowdsourcing effort to build.
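To make the cost argument concrete, here is a back-of-the-envelope sketch. The specific rates are assumptions: 3 seconds per photo stands in for "a few seconds", and $50 per record is one point within the "tens or hundreds of dollars" range. The function names are invented for this example.

```python
def labeling_hours(num_examples, seconds_per_example):
    """Total labeling time in hours for one person doing one pass."""
    return num_examples * seconds_per_example / 3600


def labeling_cost(num_examples, dollars_per_example):
    """Total labeling cost in dollars."""
    return num_examples * dollars_per_example


# Assumed rate: 3 seconds per photo (the text only says "a few seconds").
photo_hours = labeling_hours(10_000, 3)
print(f"10,000 photos at 3 s each: {photo_hours:.1f} hours")

# Assumed rate: $50 per record, within the range mentioned in the text.
medical_cost = labeling_cost(10_000, 50)
print(f"10,000 medical records at $50 each: ${medical_cost:,}")
```

With these assumed numbers, the photo job comes to a little over 8 hours for a single person, and the medical job to $500,000. Having a second person check every label would roughly double both figures.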

Labels Are Made by Humans — and Humans Make Mistakes

Every label in a training dataset was created by a person (or a process that a person designed). That means labels can be wrong, inconsistent, or biased. If the labeler is tired, rushed, or brings unconscious assumptions, those errors go into the training data — and into the AI that learns from it. Label quality is just as important as data quality.

Why This Matters for AI Fairness

Because labels come from humans, they can embed human biases and judgment calls that are not obvious. Consider these scenarios:

  - A facial recognition dataset where 'attractive' or 'not attractive' labels were assigned by workers. The labels will reflect those workers' cultural standards of attractiveness, which vary across cultures and demographics.
  - A hiring AI trained on past hiring decisions, where the label for each resume is 'hired' or 'not hired' based on what previous human managers decided. If those managers systematically undervalued certain groups, the AI learns to do the same.
  - A content moderation AI where 'toxic' labels reflect the sensitivities of a particular cultural group. Phrases that are offensive in one dialect may go unflagged, while benign phrases in another dialect are flagged.

The labels are the teacher. If the teacher has biases, the student learns them. This theme, that biased data creates biased AI, will be the focus of Lesson 7. For now, the key insight is that labels are never neutral: someone chose what the correct answer is, and that choice always reflects a perspective.

What is the primary purpose of labels in a supervised learning dataset?

Why is labeled data much rarer and more valuable than unlabeled data?

Become a Data Annotator

  1. You are going to label a small dataset — and experience the challenge firsthand.
  2. Write down 10 sentences (or use a set provided by your teacher). Mix positive, negative, and ambiguous ones. Examples: 'This movie was incredible.' / 'I guess it was okay.' / 'My bag got wet but the concert was fun.'
  3. Label each sentence: positive, negative, or neutral. Work alone first — do not share yet.
  4. Compare your labels with a partner. Mark every sentence where you disagreed.
  5. For each disagreement: why did you disagree? What made the sentence ambiguous?
  6. Discuss: if you were building a sentiment AI and your labels disagreed 30% of the time, what would that mean for the AI you trained?
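
If you want to check steps 4 to 6 with a short script, the sketch below counts where two annotators disagree and reports a simple agreement rate. The two label lists are made up for illustration, and the function names are assumptions. Simple percent agreement is only one possible measure, since it does not correct for agreement that happens by chance.

```python
def disagreements(labels_a, labels_b):
    """Return the positions (0-based) where two annotators chose different labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both annotators must label the same sentences.")
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


def agreement_rate(labels_a, labels_b):
    """Fraction of sentences on which the two annotators agree."""
    return 1 - len(disagreements(labels_a, labels_b)) / len(labels_a)


# Made-up labels for 10 sentences from two hypothetical annotators.
you = ["positive", "neutral", "positive", "negative", "neutral",
       "positive", "negative", "neutral", "positive", "negative"]
partner = ["positive", "negative", "positive", "negative", "neutral",
           "positive", "negative", "positive", "neutral", "negative"]

differing = disagreements(you, partner)
print("Disagreed on sentences:", [i + 1 for i in differing])
print(f"Agreement rate: {agreement_rate(you, partner):.0%}")
```

In this made-up example the annotators disagree on sentences 2, 8, and 9, which is 30% of the set. A model trained on either person's labels would be graded as wrong on those sentences by the other person, even if it learned its own teacher perfectly.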