How a Network Makes a Decision
You now know what individual neurons compute and how layers are organized. It is time to watch the whole machine run. When a trained neural network encounters new data — a photo, a sentence, a set of sensor readings — it produces a prediction through a process called the forward pass. 'Forward' because the data moves in one direction: from the input layer, through every hidden layer in sequence, to the output layer. No information flows backward during a prediction. Understanding the forward pass is understanding how a neural network actually decides anything.
The Forward Pass, Step by Step
Let's trace a forward pass through a small network. The task: decide whether an email is spam or not-spam. The network has three inputs, one hidden layer of four neurons, and two output neurons.
Step 1: Input encoding. The three input features are computed from the email: fraction of words in ALL CAPS (say, 0.3), presence of the word 'FREE' (1 = yes, 0 = no; say, 1.0), and number of exclamation marks divided by total words (say, 0.15). These three numbers enter the input layer.
Step 2: Hidden layer computation. Each of the four hidden neurons receives all three inputs. Neuron H1 computes: (0.3 × w1) + (1.0 × w2) + (0.15 × w3) + bias, then passes that sum through its activation function to produce a single output number, say 0.82. Neurons H2, H3, and H4 do the same math with their own weights, producing their own outputs, perhaps 0.14, 0.67, and 0.39.
Step 3: Output layer computation. The two output neurons each receive the four hidden-layer outputs (0.82, 0.14, 0.67, 0.39) and compute their own weighted sums. Suppose after their activation functions they output: Spam neuron → 0.91, Not-spam neuron → 0.09.
Step 4: Decision. The network picks the output neuron with the highest value. Spam wins with 0.91. The email is flagged as spam.
This entire process, from input to output, takes a fraction of a millisecond on a modern processor. It is pure arithmetic: no searching, no memory, no reasoning in any human sense.
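The four steps above can be sketched in a few lines of Python. This is only an illustration: the text does not give the network's weights or name its activation function, so every weight and bias below is invented and the sigmoid activation is an assumption. The numbers it prints will therefore not match the 0.82 or 0.91 used in the walkthrough. The helper names `sigmoid` and `dense` are also made up for this sketch.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1). Assumed activation."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of ALL inputs,
    adds its own bias, and applies the activation function."""
    return [
        activation(sum(x * w for x, w in zip(inputs, neuron_weights)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Step 1: the three input features from the example email.
features = [0.3, 1.0, 0.15]  # ALL-CAPS fraction, 'FREE' present, '!' per word

# Hypothetical parameters (not from the text): 4 hidden neurons x 3 inputs.
hidden_weights = [
    [1.2, 0.9, 2.0],
    [-0.8, -1.5, 0.4],
    [0.5, 0.3, 1.1],
    [-0.2, -0.4, 0.6],
]
hidden_biases = [0.1, 0.0, -0.2, 0.05]

# Hypothetical parameters: 2 output neurons x 4 hidden outputs.
output_weights = [
    [1.5, -1.0, 0.8, -0.3],   # spam neuron
    [-1.5, 1.0, -0.8, 0.3],   # not-spam neuron
]
output_biases = [0.0, 0.0]

# Step 2: hidden layer. Step 3: output layer.
hidden = dense(features, hidden_weights, hidden_biases, sigmoid)
outputs = dense(hidden, output_weights, output_biases, sigmoid)

# Step 4: pick the output neuron with the highest value.
labels = ["spam", "not-spam"]
decision = labels[outputs.index(max(outputs))]

print("hidden outputs:", [round(h, 2) for h in hidden])
print("output values: ", [round(o, 2) for o in outputs])
print("decision:      ", decision)
```

Note that the same `dense` function handles both layers. Only the numbers change, which is the sense in which a forward pass is "just arithmetic."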
A forward pass is a single left-to-right flow of data through the network: inputs are multiplied by weights, summed, passed through activation functions, and the results become inputs for the next layer. This repeats until the output layer produces the network's prediction. Every prediction the network ever makes is exactly one forward pass.
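The same definition can be written compactly in matrix notation. The symbols here are a common convention rather than anything defined in this lesson: x is the input vector, W^(l) and b^(l) are the weights and biases of layer l, f^(l) is that layer's activation function, and L is the number of layers after the input.

```latex
a^{(0)} = x, \qquad
z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}, \qquad
a^{(l)} = f^{(l)}\!\left(z^{(l)}\right) \quad \text{for } l = 1, \dots, L, \qquad
\hat{y} = a^{(L)}
```

Each row of W^(l) holds one neuron's weights, so the matrix product computes every weighted sum in the layer at once.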
Notice what the forward pass does NOT do. It does not search through rules a programmer wrote. It does not look up anything in a database of answers. It does not compare the input to stored examples. It simply multiplies numbers by other numbers and adds them up, layer by layer, until an answer falls out at the end. This is both remarkable and worth being clear-eyed about. The speed and scalability of a neural network come from the fact that every prediction is just arithmetic. But it also means the network cannot explain why it made a decision in the way a human can. It does not have reasons — it has numbers. When researchers study what a network 'knows,' they are studying patterns in weights, not reading a reasoning process.
Softmax: Turning Outputs into Probabilities
The output layer of a classifier usually passes its values through a special function called softmax. Softmax takes a list of raw numbers (which can be any value, positive or negative) and converts them into a list of numbers between 0 and 1 that add up to 1.0, which is a probability distribution. So instead of raw scores such as Spam = 2.1 and Not-spam = −0.2, you get roughly Spam = 0.91 and Not-spam = 0.09 after softmax, with the guarantee that they sum to 1. The highest value is still the winner, but now it reads like a confidence: 'I am 91% confident this is spam.' Be careful with this language: these are not true probabilities in the philosophical sense. They are the network's confidence scores given what it was trained on. A network that says '91% confident' is not necessarily right 91% of the time. Calibration, meaning whether confidence scores match actual accuracy, is a real research problem in AI.
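For readers who want to see the arithmetic, here is a minimal sketch of softmax in plain Python. It uses the standard definition (exponentiate each score, then divide by the total). The raw scores 2.1 and -0.2 are made-up values, chosen only so that the result lands near the 0.91 / 0.09 split used in the text.

```python
import math

def softmax(scores):
    """Convert raw scores into values between 0 and 1 that sum to 1."""
    # Subtracting the largest score leaves the result unchanged
    # but keeps math.exp from overflowing on large inputs.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

raw_scores = [2.1, -0.2]          # hypothetical raw outputs: [spam, not-spam]
probs = softmax(raw_scores)

print([round(p, 2) for p in probs])   # prints [0.91, 0.09]
print(sum(probs))                     # 1.0, up to floating-point rounding
```

Softmax preserves the ranking of the raw scores, so the winning neuron is the same before and after. Only the scale changes.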
A neural network that outputs '95% confidence' can still be wrong — and sometimes spectacularly so. High confidence on a wrong prediction is called overconfidence, and it is a known failure mode of many networks. When using AI in high-stakes decisions (medical diagnosis, loan approval, criminal justice), overconfidence must be taken seriously.
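One rough way to spot overconfidence is to compare a network's average stated confidence with how often it was actually right. The sketch below does that on a handful of invented predictions. The data is made up purely for illustration, the function name `confidence_gap` is hypothetical, and real calibration studies typically use more careful measures (for example, grouping predictions into confidence bins).

```python
def confidence_gap(confidences, correct):
    """Average stated confidence minus actual accuracy.
    A clearly positive gap suggests overconfidence."""
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_confidence - accuracy

# Invented toy data: the confidence attached to each prediction,
# and whether that prediction turned out right (1) or wrong (0).
confidences = [0.95, 0.90, 0.92, 0.97, 0.91]
correct = [1, 0, 1, 1, 0]

gap = confidence_gap(confidences, correct)
print(f"mean confidence minus accuracy: {gap:.2f}")
```

In this toy case the network claims about 93% confidence on average but is right only 60% of the time, which is the pattern the warning above describes.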
In a forward pass, in what order does information flow?
What does the softmax function do to the output layer's values?
Trace a Forward Pass by Hand
- Here is a tiny network: 2 inputs, 2 hidden neurons, 1 output neuron.
- Inputs: x1 = 0.5, x2 = 0.8
- Hidden neuron H1:
  - weights: w1=0.4, w2=0.6, bias=−0.1
  - Compute: (0.5×0.4) + (0.8×0.6) + (−0.1) = ?
  - Activation: if result > 0, output = result; otherwise output = 0
- Hidden neuron H2:
  - weights: w1=−0.3, w2=0.9, bias=0.2
  - Compute: (0.5×−0.3) + (0.8×0.9) + 0.2 = ?
  - Activation: same rule as above
- Output neuron:
  - Takes H1 output and H2 output as inputs
  - weights: wH1=0.7, wH2=0.5, bias=−0.3
  - Compute the weighted sum plus bias
- Write each step's calculation. What is the network's final output value?
- If output > 0.5 means 'yes' and output ≤ 0.5 means 'no,' what is the network's prediction?
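Once you have worked the exercise by hand, a short script like the following can check your numbers. It is a sketch under one assumption: the exercise asks only for "the weighted sum plus bias" at the output neuron, so no activation is applied there. The helper names `relu` and `neuron` are made up for this example.

```python
def relu(z):
    """The exercise's activation rule: keep positive values, replace the rest with 0."""
    return z if z > 0 else 0.0

def neuron(inputs, weights, bias, activation=None):
    """Weighted sum plus bias, optionally passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z) if activation else z

x = [0.5, 0.8]

# Hidden layer, using the weights and biases given in the exercise.
h1 = neuron(x, [0.4, 0.6], -0.1, relu)
h2 = neuron(x, [-0.3, 0.9], 0.2, relu)

# Output neuron: weighted sum plus bias only (assumed: no activation).
output = neuron([h1, h2], [0.7, 0.5], -0.3)

prediction = "yes" if output > 0.5 else "no"

print("H1 output:   ", round(h1, 3))
print("H2 output:   ", round(h2, 3))
print("final output:", round(output, 3))
print("prediction:  ", prediction)
```

Try changing one weight slightly and rerunning. Because the final value sits close to the 0.5 threshold, small changes can flip the prediction.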