One Disease at a Time

The Neuroscience of Anticipation and Reward

One of the more durable assumptions of twentieth-century psychology was that the brain's reward system worked the way the word "reward" suggests it would. Something good happens, the system fires, the organism learns the thing was good, and behaviour adjusts accordingly. By this picture, pleasure is the signal, and the signal arrives when the pleasure does.

Forty years of neuroscience have steadily undone that picture, with consequences for how we understand memory, attention, and the design of every digital environment a modern person spends time in. What the brain actually does at the moment of reward, in many of the most carefully studied cases, is much less than what it does in the moments before the reward — when the outcome is anticipated but not yet known. The interesting biology is not in the resolution. It is in the wait.

Prediction error and the Schultz finding

The decisive evidence came from a series of single-neuron recording experiments conducted in the 1990s by the neuroscientist Wolfram Schultz and his collaborators, then working at the University of Fribourg in Switzerland. By measuring the firing rates of dopaminergic neurons in the midbrain of monkeys trained on simple reward-learning tasks, Schultz's lab discovered a striking pattern. When an untrained monkey received an unexpected drop of juice, dopamine neurons fired briskly at the moment of delivery. But once the monkey had learned that a particular tone or light preceded the juice, the neural response shifted backwards in time: the dopamine spike moved from the moment of delivery to the moment of the cue. By the time the juice arrived, predicted and expected, the response at the actual moment of reward was muted.

The cleanest way to describe what was happening, and the framework that has organised dopamine research for the last quarter-century, is the reward prediction error. The dopamine signal does not report "this is good." It reports "this is better, or worse, than my model predicted." When something predictably good happens, the model is confirmed and the signal is small. When something unexpectedly good happens, the signal spikes. When something predictably good fails to happen, the signal dips below baseline. The system is, in essence, a learning algorithm whose job is to update an internal model of the world by responding to the discrepancy between expectation and outcome.
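The three cases above can be made concrete with a minimal sketch. It assumes the simplest possible learner, one that keeps a single running expectation and nudges it towards each outcome by a fraction of the error (a Rescorla-Wagner-style update). The function names and the learning rate of 0.2 are illustrative choices, not values taken from the experiments.

```python
def prediction_error(expected: float, received: float) -> float:
    """Signed gap between what arrived and what the model predicted."""
    return received - expected


def update_expectation(expected: float, received: float,
                       learning_rate: float = 0.2) -> float:
    """Move the expectation a fraction of the way towards the outcome."""
    return expected + learning_rate * prediction_error(expected, received)


def train(trials: int, reward: float = 1.0, learning_rate: float = 0.2) -> float:
    """Return the expectation after repeated delivery of the same reward."""
    expected = 0.0
    for _ in range(trials):
        expected = update_expectation(expected, reward, learning_rate)
    return expected


trained = train(50)  # expectation after 50 identical rewards, close to 1.0

print(prediction_error(0.0, 1.0))      # unexpected reward: large positive error
print(prediction_error(trained, 1.0))  # predicted reward: error close to zero
print(prediction_error(trained, 0.0))  # omitted reward: error close to -1
```

The sign and size of the error, rather than the reward itself, is what this toy learner responds to, which is the property the dopamine recordings are taken to show.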

If a cue reliably predicts a reward, the dopaminergic activity migrates to the cue itself. The brain begins to treat the anticipatory moment as the locus of the signal — because that is when new information arrives, when the model updates, when the system has something to learn. By the time the actual reward shows up, there is, computationally, very little to do.
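The migration of the signal from reward to cue can be reproduced with a small temporal-difference learning sketch, the family of models usually used to formalise this finding. Everything specific here is an assumption made for illustration: five time steps between cue and reward, a learning rate of 0.2, no discounting, and a baseline state whose value stays at zero because the cue itself arrives unpredictably.

```python
def simulate_cue_reward(trials=500, steps=5, alpha=0.2, gamma=1.0, reward=1.0):
    """Return a list of (error_at_cue, error_at_reward) pairs, one per trial."""
    # values[s] is the learned value of being s steps past the cue.
    # values[steps] is the terminal state after the reward and stays at 0.
    values = [0.0] * (steps + 1)
    history = []
    for _ in range(trials):
        # Cue onset: a jump from the baseline (value assumed to be 0)
        # into the first post-cue state. No reward is delivered here.
        cue_error = gamma * values[0] - 0.0
        reward_error = 0.0
        for s in range(steps):
            r = reward if s == steps - 1 else 0.0  # reward on the last step only
            delta = r + gamma * values[s + 1] - values[s]  # TD error
            values[s] += alpha * delta
            if s == steps - 1:
                reward_error = delta
        history.append((cue_error, reward_error))
    return history


history = simulate_cue_reward()
print(history[0])   # first trial: no error at the cue, full error at the reward
print(history[-1])  # after training: error sits at the cue, almost none at the reward
```

In this sketch the error at the reward shrinks as the post-cue states acquire value, while the error at the cue grows, because the cue is the last event the model cannot predict.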

Wanting versus liking

A second body of work, developed largely by the American neuroscientist Kent Berridge and colleagues, sharpened the picture further by distinguishing two systems that had been collapsed under the single word reward. What Berridge called wanting — the motivational pull towards an outcome, the felt desire that organises pursuit — turned out to be predominantly a dopaminergic phenomenon, mediated by the mesolimbic pathway that includes the nucleus accumbens. What he called liking — the actual hedonic pleasure when the outcome is consumed — turned out to depend more on opioid and endocannabinoid systems, with distinct neural signatures and only partial overlap with the wanting circuitry.

The disentanglement matters. It means that the system organising desire and the system producing pleasure are not the same system, and that they can be dissociated. It also explains a phenomenon almost everyone has experienced and few can explain. The wanting is sometimes stronger than the liking. The anticipation can exceed the satisfaction. The chase, in many parts of human life, is genuinely more vivid than the catch.

This distinction has become increasingly important for understanding the design of contemporary digital environments, many of which are structured less around delivering satisfaction than around sustaining anticipation. The most successful systems do not necessarily maximise pleasure at every moment. Instead, they maintain a state of ongoing expectation in which the next outcome, reward, or revelation remains just beyond reach. In this sense, the psychology of engagement is often a psychology of wanting rather than liking.

The same dynamic is visible in modern gambling interfaces. A Dicepalace online casino, for example, is built around sequences of anticipation: the spin before the result, the animation before the reveal, the possibility that the next outcome may differ from the last. What keeps attention engaged is not simply the pleasure associated with winning but the motivational energy generated by uncertainty itself. Berridge's distinction helps explain why the pursuit can sometimes feel more compelling than the reward, and why systems designed around variable outcomes are often experienced as emotionally intense even when the actual moments of satisfaction are brief. The architecture of engagement resides less in what is obtained than in what remains possible.

What anticipation does to memory

The same mechanism that organises desire turns out to be involved in memory consolidation in ways that have only recently been mapped in detail. Events that are preceded by strong anticipatory dopaminergic activity tend to be encoded more durably than events that arrive without warning, or after a warning that failed to engage the system. The neural mechanism appears to involve the hippocampus, which receives dopaminergic input that modulates the strength with which a given experience is laid down in long-term memory.

The behavioural consequence is something autobiographical memory has always shown but rarely explained. People disproportionately remember the moments preceded by anticipation: the exam they had studied for, the meeting they had worried about, the trip they had planned for months. The day-to-day texture of ordinary life, in which nothing was particularly anticipated, dissolves into a composite. The autobiographical self consists, to a substantial degree, of the moments at which the prediction system was running hot. Memory is, in part, a record of anticipation.

Engagement, attention, and the variable-ratio problem

The relationship between anticipation and engagement is the third major consequence, and the one that has had the most visible practical impact. The behavioural psychologist B. F. Skinner showed in the mid-twentieth century that variable-ratio reinforcement, in which a reward arrives after an unpredictable number of responses, produces more persistent responding than predictable schedules do. The dopaminergic explanation now seems clear. Predictable rewards stop generating a prediction-error signal once the prediction is learned. Variable rewards never stop generating prediction error, because the prediction can never be fully resolved. The system stays warm, indefinitely.
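The contrast between predictable and variable rewards can be sketched with the same kind of simple learner. As an assumption for illustration, the variable schedule is modelled as a random-ratio schedule in which every response pays off with a fixed probability, a common stand-in for variable-ratio reinforcement. The function name, learning rate, and trial counts are hypothetical.

```python
import random


def mean_late_surprise(reward_probability, trials=300, learning_rate=0.1,
                       window=100, seed=0):
    """Average absolute prediction error over the final `window` trials."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    expected = 0.0
    surprises = []
    for _ in range(trials):
        # Each response is rewarded with the given probability.
        reward = 1.0 if rng.random() < reward_probability else 0.0
        error = reward - expected
        expected += learning_rate * error
        surprises.append(abs(error))
    late = surprises[-window:]
    return sum(late) / len(late)


print(mean_late_surprise(1.0))  # reward on every response: error fades to near zero
print(mean_late_surprise(0.5))  # reward on about half of responses: error persists
```

Under these assumptions the learner's expectation settles near the average payoff, but each individual outcome still lands well above or below it, so the error never dies away.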

This is the mechanism behind the most engaging activities in human life — exploration, scientific discovery, falling in love, learning new skills — and behind the most extractive ones — slot machines, social-media feeds, mobile games engineered by behavioural design teams who have read the literature. The same circuitry that gives meaning to discovery gives compulsion to the feed. The neutrality of the mechanism is the unsettling part. It does not know whether the prediction concerns a scientific hypothesis or a notification badge.

What follows

The picture that emerges from forty years of work is, in retrospect, philosophically substantial. The brain's reward system is not a pleasure detector. It is an anticipation engine — a continuously running prediction machine whose signal is the difference between what was expected and what arrived. Subjective experience, the durability of memory, the texture of engagement, all sit on top of that single computational principle.

Two implications follow with some force. Anticipation is not a peripheral feature of reward but a constitutive one; remove it and you have removed most of what made the reward matter. And any environment engineered to maximise unresolved anticipation has gained access to the most powerful behavioural lever the human brain provides — access that, once granted, is difficult to take back.