Decision theory

Back to the Edge question (here). Stanislas Dehaene gave his answer to ‘What is your favorite deep, elegant, or beautiful explanation?’ as The Universal Algorithm for Human Decisions. Most of it is reproduced below:

All of our mental decisions appear to be captured by a simple rule that weaves together some of the most elegant mathematics of the past centuries: Brownian motion, Bayes’ rule, and the Turing machine.

Let us start with the simplest of all decisions: how do we decide that 4 is smaller than 5? Psychological investigation reveals many surprises behind this simple feat. First, our performance is very slow: the decision takes us nearly half a second… Second, our response time is highly variable from trial to trial, anywhere from 300 milliseconds to 800 milliseconds… Third, we make errors – it sounds ridiculous, but even when comparing 4 with 5, we sometimes make the wrong decision. Fourth, our performance varies with the meaning of the objects: we are much faster, and make fewer errors, when the numbers are far from each other (such as 1 and 5) than when they are close (such as 4 and 5).

Well, all of the above facts, and many more, can be explained by a single law: our brain takes decisions by accumulating the available statistical evidence and committing to a decision whenever the total exceeds a threshold.

Let me unpack this statement. The problem that the brain faces when taking a decision is one of sifting the signal from the noise. The input to any of our decisions is always noisy: photons hit our retina at random times, neurons transmit the information with partial reliability, and spontaneous neural discharges (spikes) are emitted throughout the brain, adding noise to any decision. Even when the input is a digit, neuronal recordings show that the corresponding quantity is encoded by a noisy population of neurons that fires at semi-random times, with some neurons signaling “I think it’s 4”, others “it’s close to 5”, or “it’s close to 3”, etc. Because the brain’s decision system only sees unlabeled spikes, not full-fledged symbols, it is a genuine problem for it to separate the wheat from the chaff.

In the presence of noise, how should one take a reliable decision? The mathematical solution to that problem was first worked out by Alan Turing, when he was cracking the Enigma code at Bletchley Park. Turing found a small glitch in the Enigma machine, which meant that some of the German messages contained small amounts of information – but unfortunately, too little to be certain of the underlying code. Turing realized that Bayes’ law could be exploited to combine all of the independent pieces of evidence. Skipping the math, Bayes’ law provides a simple way to sum all of the successive hints, plus whatever prior knowledge we have, in order to obtain a combined statistic that tells us what the total evidence is.
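
As a toy illustration of that summation (my own sketch, not part of the essay; the Gaussian “hint” model and every number in it are assumptions chosen for the example): with independent observations, Bayes’ rule reduces to adding up log-likelihood ratios, so combining the hints is literally a running sum.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value, sigma = 4.0, 1.5
hints = rng.normal(true_value, sigma, size=20)   # 20 noisy "hints" about the digit

def log_likelihood_ratio(x, mu_a=4.0, mu_b=5.0, sigma=1.5):
    """log P(x | digit is 4) - log P(x | digit is 5), for Gaussian noise."""
    return ((x - mu_b) ** 2 - (x - mu_a) ** 2) / (2 * sigma ** 2)

# Bayes' rule with independent evidence: the combined statistic is just the
# prior log-odds plus the sum of the per-hint log-likelihood ratios.
prior_log_odds = 0.0                             # no prior preference for 4 vs 5
total = prior_log_odds + sum(log_likelihood_ratio(x) for x in hints)

posterior_p4 = 1 / (1 + np.exp(-total))
print(f"accumulated log-odds = {total:.2f}, P(digit is 4) = {posterior_p4:.3f}")
```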

With noisy inputs, this sum fluctuates up and down, as some incoming messages support the conclusion while others merely add noise. The outcome is what mathematicians call a “random walk” or “Brownian motion”, a fluctuating march of numbers as a function of time. In our case, however, the numbers have a currency: they represent the likelihood that one hypothesis is true (e.g. the probability that the input digit is smaller than 5). Thus, the rational thing to do is to act as a statistician, and wait until the accumulated statistic exceeds a threshold probability value. Setting it to p=0.999 would mean that we have one chance in a thousand to be wrong.

… There is a speed-accuracy trade-off: we can wait a long time and take a very accurate but conservative decision, or we can hazard a response earlier, but at the cost of making more errors. Whatever our choice, we will always make a few errors.
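
To see that trade-off concretely, here is a rough simulation of the accumulate-to-threshold rule (again my own sketch under the same assumed Gaussian hint model, not anything from the essay). A criterion of p = 0.999 corresponds to a log-odds threshold of ln(0.999/0.001) ≈ 6.9; lowering the criterion lets the random walk stop sooner, but it errs more often.

```python
import numpy as np

rng = np.random.default_rng(1)

def decide(threshold_logodds, mu_true=4.0, mu_a=4.0, mu_b=5.0, sigma=1.5, max_steps=10_000):
    """Accumulate log-likelihood-ratio evidence for 'it's 4' vs 'it's 5' until the
    running total crosses +threshold (choose 4) or -threshold (choose 5)."""
    total = 0.0
    for step in range(1, max_steps + 1):
        x = rng.normal(mu_true, sigma)           # one noisy "hint"
        total += ((x - mu_b) ** 2 - (x - mu_a) ** 2) / (2 * sigma ** 2)
        if total >= threshold_logodds:
            return "4", step
        if total <= -threshold_logodds:
            return "5", step
    return "undecided", max_steps

for p in (0.999, 0.75):                          # conservative vs hasty criterion
    threshold = np.log(p / (1 - p))
    outcomes = [decide(threshold) for _ in range(2000)]
    error_rate = sum(choice != "4" for choice, _ in outcomes) / len(outcomes)
    mean_hints = np.mean([steps for _, steps in outcomes])
    print(f"criterion p={p}: error rate {error_rate:.3f}, mean hints used {mean_hints:.1f}")
```

With these made-up parameters the conservative criterion should take several times as many hints per decision while erring far less often than the hasty one, which is the speed-accuracy trade-off in miniature.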

Suffice it to say that the decision algorithm I sketched, and which simply describes what any rational creature should do in the face of noise, is now considered a fully general mechanism for human decision making. It explains our response times, their variability, and the entire shape of their distribution. It describes why we make errors, how errors relate to response time, and how we set the speed-accuracy trade-off. It applies to all sorts of decisions, from sensory choices (did I see movement or not?) to linguistic ones (did I hear “dog” or “bog”?) and to higher-level conundrums (should I do this task first or second?). And in more complex cases, such as performing a multi-digit calculation or a series of tasks, the model characterizes our behavior as a sequence of accumulate-and-threshold steps, which turns out to be an excellent description of our serial, effortful Turing-like computations.

Furthermore, this behavioral description of decision-making is now leading to major progress in neuroscience. In the monkey brain, neurons can be recorded whose firing rates index an accumulation of relevant sensory signals. The theoretical distinction between evidence, accumulation and threshold helps parse out the brain into specialized subsystems that “make sense” from a decision-theoretic viewpoint.

As with any elegant scientific law, many complexities are waiting to be discovered… Nevertheless, as a first approximation, this law stands as one of the most elegant and productive discoveries of twentieth-century psychology: humans act as near-optimal statisticians, and any of our decisions corresponds to an accumulation of the available evidence up to some threshold.

To nit-pick a bit: ‘algorithm’ seems an inappropriate term in the title. Turing is mentioned in two contexts, but only the phrase ‘our serial, effortful Turing-like computations’ seems to refer to what we call a Turing machine; the decoding trick has nothing to do with Turing machines. And the neural noise level in the brain seems to be a regulated parameter, tied to sensitivity, rather than just an unavoidable by-product of neural activity. None of these picky things takes away from the brilliant explanation.
