Machine Learning & Deep Learning


What Networks Are Good and Bad At

Neural networks power some of the most impressive technology ever built. They also fail in ways that are embarrassing, dangerous, and sometimes deeply unfair. Understanding both sides is not pessimism — it is the literacy you need to think clearly about AI in the real world. This lesson gives you an honest picture.

Where Neural Networks Excel

Neural networks are exceptional at pattern recognition in high-dimensional data: situations where the input is so complex that humans cannot write rules by hand.

Image recognition: Given millions of labeled photos, a deep convolutional network can match or exceed human accuracy on specific, well-defined tasks such as identifying objects, flagging disease in X-rays, or spotting defects in manufactured parts. The pattern space of images is enormous; neural networks navigate it efficiently.

Speech and audio: Networks transcribe speech to text, separate voices in a crowd, detect emotions in tone, and generate realistic speech from text. These are all tasks where the rules are too complex to write but the training data is abundant.

Natural language: Large language models, trained on vast amounts of text, can summarize documents, answer questions, translate between languages, write code, and hold coherent conversations. The patterns of human language, learned from billions of examples, are compressed into the weights.

Game playing: Given only the rules of a game and a reward signal, networks trained through self-play reached superhuman levels at chess and Go, and related systems learned complex video games from reward alone. Along the way they found strategies that surprised expert human players.

Protein structure: AlphaFold 2 (2020) predicted the three-dimensional shapes of proteins from their amino-acid sequences with accuracy close to experimental methods for many proteins, a major advance on a roughly 50-year-old open problem in biology. Predictions have since been published for nearly all catalogued proteins, which may help speed up drug discovery.

The Common Thread

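Networks shine when: there is abundant labeled data, the pattern is too complex for hand-crafted rules, and small errors in individual predictions are acceptable. Without enough data, performance degrades quickly; without complex patterns, simpler methods usually do as well; and without tolerance for error, even an accurate network becomes risky to deploy.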

Now the honest other side: what networks are genuinely bad at.

Robustness: A tiny change to an image, noise so faint that a human cannot even see it, can flip a network's confident prediction from 'panda' to 'gibbon.' These inputs are called adversarial examples. The network has learned statistical patterns, not the visual concepts humans use, so it is brittle in ways human perception is not.

Out-of-distribution generalization: A network trained on photos taken in summer, with consistent lighting, may fail badly on winter photos or unusual camera angles. It learned the training distribution, not the underlying concept. Give it inputs that look different from the training data and it often gets them wrong while still sounding confident.

Reasoning and logic: Current neural networks often struggle with long formal reasoning chains such as multi-step math proofs, logical deductions, and planning in novel situations. They rely heavily on pattern matching rather than reasoning from first principles. A network may solve a math problem it saw a variant of in training and then fail on a structurally identical problem with unfamiliar numbers.

Causation vs. correlation: Networks detect correlations. A network trained on hospital data might learn that patients who receive a certain drug die more often, not because the drug kills, but because sicker patients are the ones who get the drug. Trained only on observational data, it has no way to tell cause from effect.

Explainability: Ask a trained network why it made a decision and it cannot tell you. The answer lives in millions of weights working together, so there is no single rule to point to. In high-stakes domains like medical diagnosis, loan approval, and criminal sentencing, this lack of transparency is a serious problem.
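To make the adversarial-example idea concrete, here is a minimal sketch in Python. It is a toy: it assumes a hypothetical 100-feature logistic (linear) classifier with made-up weights rather than a real image network, and the names `predict` and `adversarial_step` are illustrative only. The step nudges every feature by a small amount in the direction that lowers the model's score, which is the idea behind the fast gradient sign method when applied to a linear model.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def predict(weights, bias, x):
    """Probability of the positive class under a logistic (linear) model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


def adversarial_step(weights, x, epsilon):
    """Move each feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is just the weight vector, so the sign of each weight gives the direction.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model (an assumption for illustration): 100 features,
# weights alternating between +0.5 and -0.5, no bias.
weights = [0.5 if i % 2 == 0 else -0.5 for i in range(100)]
bias = 0.0

# A made-up input that leans slightly toward the positive class on every feature.
x = [0.05 * sign(w) for w in weights]

x_adv = adversarial_step(weights, x, epsilon=0.1)

p_clean = predict(weights, bias, x)
p_adv = predict(weights, bias, x_adv)
max_change = max(abs(a - b) for a, b in zip(x, x_adv))

print(f"clean input:      p(positive) = {p_clean:.3f}")   # 0.924
print(f"perturbed input:  p(positive) = {p_adv:.3f}")     # 0.076
print(f"largest change to any single feature: {max_change:.2f}")  # 0.10
```

No single feature moves by more than 0.1, yet the predicted probability drops from about 0.92 to about 0.08, because a hundred small pushes in the same direction add up. Real image models have far more input dimensions than this toy, which is one commonly proposed explanation for why imperceptible perturbations can be so effective.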

Failure Modes That Matter

Three failure modes deserve special attention because they cause real harm:

Bias amplification: A face recognition system trained mostly on lighter-skinned faces will typically perform worse on darker-skinned faces. The weights encode what was in the training data. If the data was unrepresentative, the network's behavior will be unrepresentative, and that inequality affects real people.

Confident wrongness: Networks often output a high confidence score even when they are completely wrong. A network might say '99% confident: this tumor is benign' when it is actually malignant, simply because the image shares superficial features with benign tumors it trained on. Confidence scores are not the same as reliability.

Data poisoning: If a bad actor can influence the training data, for example by injecting mislabeled examples, they can manipulate the network's behavior in targeted ways. A spam filter could be trained to let specific harmful emails through. Security matters at the data level, not just the model level.
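Two of these failure modes, confident wrongness and bias amplification, can be checked with very simple bookkeeping on a model's predictions. The sketch below is one way to do that in Python, under stated assumptions: the records are made-up numbers, not results from any real system, and the field names (`group`, `confidence`, `correct`) and function names are hypothetical. It compares the model's average stated confidence with how often it was actually right, then breaks accuracy down by group.

```python
from collections import defaultdict


def confidence_vs_accuracy(records):
    """Return (mean stated confidence, actual accuracy) over all records."""
    n = len(records)
    mean_conf = sum(r["confidence"] for r in records) / n
    accuracy = sum(1 for r in records if r["correct"]) / n
    return mean_conf, accuracy


def accuracy_by_group(records):
    """Return a dict mapping each group label to the accuracy within that group."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        if r["correct"]:
            hits[r["group"]] += 1
    return {g: hits[g] / totals[g] for g in totals}


# Made-up evaluation records (an assumption for illustration only).
# Group "A" is well represented; group "B" is not.
records = (
    [{"group": "A", "confidence": 0.97, "correct": True}] * 7
    + [{"group": "A", "confidence": 0.95, "correct": False}]
    + [{"group": "B", "confidence": 0.96, "correct": True}] * 2
    + [{"group": "B", "confidence": 0.98, "correct": False}] * 2
)

mean_conf, accuracy = confidence_vs_accuracy(records)
per_group = accuracy_by_group(records)

print(f"mean confidence: {mean_conf:.3f}")  # 0.968
print(f"actual accuracy: {accuracy:.3f}")   # 0.750
for group, acc in sorted(per_group.items()):
    print(f"group {group}: accuracy {acc:.3f}")  # A: 0.875, B: 0.500
```

In this invented data the model claims about 97% confidence on average while being right only 75% of the time, and the overall accuracy hides a much lower accuracy for the smaller group. Neither problem is visible from the confidence scores alone, which is why evaluating per group and comparing confidence with outcomes is worth doing before trusting a model.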

Match each strength or weakness to its correct description.

Terms

Image recognition
Adversarial examples
Out-of-distribution failure
Bias amplification
Confident wrongness

Definitions

Poor performance on inputs that look different from the training data
Tiny imperceptible input changes that flip a network's prediction dramatically
A task where networks can surpass human accuracy given enough labeled training data
A network outputting high confidence on an incorrect prediction
When skewed training data causes a network to perform unfairly across groups


AI Is Not Magic — And Not Neutral

A neural network does exactly what it was optimized to do on the data it was given. If the data is biased, the network is biased. If the loss function rewards the wrong thing, the network will do the wrong thing confidently. The responsibility for what a network learns lies with the people who build it.

Why are adversarial examples a problem for neural networks but not for human perception?

A hospital network is trained on data where mostly male patients were diagnosed with heart disease. What is the likely consequence?

Strength or Weakness? Rapid Sort

  1. Write each of the following AI use cases on a separate slip of paper: translating Spanish to English, proving a new theorem in mathematics, detecting cancer in a chest X-ray, deciding if someone deserves a loan, playing a video game, explaining why it made a decision, recognizing a friend's face, understanding sarcasm in text.
  2. Sort the slips into two piles: Likely Strong (a neural network is probably good at this) and Likely Weak (a neural network probably struggles here).
  3. For each 'Likely Weak' slip, write one sentence explaining the specific failure mode: out-of-distribution, reasoning, explainability, or bias.
  4. Compare your sort with a partner and discuss any disagreements.