The Science Behind Habit Formation

This article was adapted from Mental Health Academy’s “The Neuroscience of Habits” professional development course.

Oh, here we go again! You’ve got a wonderful new smart phone – or maybe a computer – with all the bells and whistles, but how do you make it work? How do you get from one screen or one app to the next? Chances are, the first day will involve a bit of brainwork; you’ll notice what happens when you push this button or come to that screen and you may feel slightly clumsy working it, but after a day or two, you will be so used to the new device that you will forget how the old one operated. So, from the brain’s perspective, what just happened? The short answer is that you used your “thinking” brain to figure it out, and then – from repeatedly moving around in the device – were able to automate your movements, forming habits: that is, habitual ways of operating the new machinery. The longer answer about what happened, involving the neuroscience behind switching from thinking it through to doing it automatically, is the subject of this article.

Goal-directed > action-outcome associations

In essence, when we do something for the first time, it is purposeful, goal-directed behaviour. You carefully read the instructions to make a latte with your new coffee machine. You take in all the choices for which buttons to push when heating up something in your new microwave oven. And you pay particular attention to the G.P.S. when it directs you to a new address for your crucial job interview. Assuming you get the latte, heat the food, and arrive at the correct place for the interview, you have created, in behavioural terms, an “action-outcome association”: that is, you have engaged in a goal-directed behaviour in which you performed an action that was then reinforced (rewarded) (Joye, 2018). Maybe you even get further reinforcement by getting the job!

Outcome-insensitive stimulus > response associations

Let’s fast forward a few months. Let’s say you got the wonderful new job that the G.P.S. helped you reach the interview for. Every working day, you back out of your driveway and head into work via the same route along which the G.P.S. originally guided you. Are you still thinking about it so intently: noticing signposts along the way, carefully observing everything on the route? The answer is: no, probably not. With “training” and repetition – going to work each day – you have made an initially strange new behaviour into an automatic one: that is, a habit. You now have a learned association between an “event” and a behavioural response. Your automatic behaviour of driving to work has become an “outcome-insensitive stimulus-response association” (Joye, 2018). The “stimulus” of knowing it’s a workday and seeing it’s time to depart for work leads to the automatic response of getting into the car and driving there – and hopefully, you continue to be rewarded by enjoying your time at work!

But even if you have a bad day at work on Tuesday (meaning: it’s not that rewarding), you will typically still get in the car and go on Wednesday, because the mere response of going has now become automatic. At the very least, the fact that you have created a habit of driving to work means that you have freed up precious processing space in your brain by “saving” an automatic stimulus-response association (knowing it’s time to go and getting in the car and going) that can be triggered with almost no thinking.

For decades, we have understood that this was how our minds operated – a stimulus, a response, and hopefully a reward (but even without an obvious reward, some automaticity) – but we lacked the means to explore it in more depth. With the advent of fMRI and other technology, however, we can now examine not just what our brain is doing, but also which part of our brain is doing it. Let’s translate this long-known behavioural knowledge into emergent neuroscientific understanding.

The habit loop: Three or four phases?

The loop according to Duhigg

With an interest in exploring habit, New York Times business writer Charles Duhigg wrote The Power of Habit (Duhigg, 2012). It highlights the psychological pattern Duhigg calls “the habit loop”: a process he says occurs in three phases, consisting of cue, routine, and reward. The “cue”, or trigger (a.k.a., the stimulus), tells your brain to go into automatic mode and allow a behaviour to be enacted; it tells it which habit is to be deployed. The “routine” (meaning in behavioural terms, the response), is the “behaviour” itself, which can be physical, mental, or emotional. Obviously, that is the part of the habit loop that we see, or which we think about when we think about habits. Finally, there is the “reward” (the reinforcement, in behavioural terms): the bit that your brain likes, which motivates it to remember the habit loop in the future if it thinks the loop is worth remembering.

Figure 1: The habit loop according to Duhigg (2012)

Over time, the loop – cue, routine, reward, cue, routine, reward – becomes more and more automatic. Duhigg cites multiple studies to show that, once the cue, the routine, and the reward start to happen together, the being (human or animal) that repeatedly experiences the loop begins to cultivate a craving that drives the loop. Not only that, but the cue and the reward become intertwined until a powerful sense of anticipation and craving causes the same spike in brain activity as the reward, but before the reward arrives. At this stage, a habit is born (Duhigg, 2012; Farnam Street, 2019).

It is not hard to understand, once we “get” the habit loop, why a gambler will sit at the slot machine for quite a while without the reward of a win. Once the gambling habit is formed, his brain is getting the dopamine hit of a reward just anticipating a possible win – even as the wheels are still spinning. If you are thinking that this is at least part of why habits are so pernicious, congratulations; you are an astute reader, but bear with us for another moment here. There is another model for the habit loop as well.

The loop according to Clear

James Clear (2018), in considering similar research, offers a slightly different conceptualisation of the habit loop. He notes that the first stage of the habit loop – also a cue in his model – is the bit of information that predicts a reward: important for our prehistoric ancestors, who were constantly surveying the environment for the primary (i.e., survival-critical) rewards of food, water, and sex. While modern “rewards” are different (e.g., money, power, praise, or personal satisfaction), they are still important, so when we “sniff” that one is close because we register a cue, it naturally leads to a “craving” for that reward.

Craving is the second stage of Clear’s four-stage model, as opposed to Duhigg’s sense that the craving underlies and powers the whole loop. As the second step, Clear says, cravings are the motivational force behind every habit. Without some level of motivation or desire, we have no reason to act. Clear backs up this assertion with research showing how rats that were chemically manipulated to lose their desires began to die. This occurred even though the rats still enjoyed foods and treats when these were given to them; they simply did not seek them out. They began to die because they were not motivated to act to supply their needs (Clear, 2018). We note here that cues are meaningless until they are interpreted. It is the thoughts, feelings, and emotions of the observer that turn them into cravings, which is why cravings differ from person to person.

The third step is response. This is the actual habit you perform, which can take the form of a thought or an action. Whether a response occurs depends on how motivated the person is and how much physical or mental effort is required to perform the behaviour.

Finally, the response delivers a reward. Rewards are the end goal of every habit. The cue is about noticing the reward. The craving is about wanting the reward. The response is about obtaining the reward. We chase rewards both because they satisfy us (the first purpose of rewards: satisfying the craving) and because they teach us. That is, they teach us which actions are worth remembering in the future. Regarding the brain as a reward detector, we can see how both feelings of pleasure and those of disappointment or aversion are part of the feedback mechanism that helps our brain to distinguish useful actions from useless ones. Rewards close the feedback loop and complete the habit cycle. If a behaviour is absent or insufficient in any of these four stages, says Clear, a habit will not form (2018).

Moreover, once we become aware that all behaviour is driven by the desire to solve a problem, we can understand how the first two stages – the cue and the craving – are the problem phase, and the third and fourth steps – the response and the reward – are the solution phase, where we take action and achieve a desired change. Thus, Clear’s model for habit formation, as seen in Figure 2, looks slightly different from Duhigg’s.

Figure 2: The habit loop according to Clear (2018)
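
For readers who find a compact summary helpful, Clear’s four stages and two phases can be written down as a small Python sketch. This is only an illustration of the model as described above; the names STAGES, PHASES, missing_stages, and loop_is_complete are hypothetical and do not come from Clear’s book.

```python
# Stage order and phase labels follow the description of Clear's model in the text.
# All names here are illustrative assumptions, not Clear's own terminology for code.
STAGES = ("cue", "craving", "response", "reward")

PHASES = {
    "cue": "problem",
    "craving": "problem",
    "response": "solution",
    "reward": "solution",
}


def missing_stages(observed):
    """Return the stages of the loop that are absent from `observed`, in loop order."""
    return [stage for stage in STAGES if stage not in observed]


def loop_is_complete(observed):
    """True only when all four stages are present, the condition Clear gives for a habit to form."""
    return not missing_stages(observed)
```

For example, a behaviour with a cue, a response, and a reward but no craving would be reported as missing the craving stage, so under this reading of the model no habit would be expected to form.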

So now let’s get even more scientific as we contemplate where in the brain all these processes are occurring.

What’s happening in the brain?

PFC for thinking

During the process of learning goal-directed associations (that is, extract the coffee first and then froth the milk in order to get the reward of a latte), a part of the prefrontal cortex, or PFC, is active. The PFC is responsible for higher-level cognition and executive function, such as decision-making and planning. On day one with the new computer, your PFC is actively searching for the buttons/routes to the sites and parts of the computer that you want to access (Joye, 2018).

Basal ganglia for automaticity

Neuroscientists, meanwhile, have traced our habit-making behaviours to a group of structures deep in the brain, closely connected with the limbic system, called the basal ganglia (which also play a key role in the development of emotions, memories, and pattern recognition). As you get better and better at operating the new coffee machine or the new computer, your brain sees that it can delegate decreasing amounts of brainpower to the making of the coffee or the operating of the computer; the tasks become more automatic and the decision-making PFC goes into a kind of sleep mode (Joye, 2018; Oppong, 2018).

Chunking

Early in the learning process, however, before the behaviour is automatic, a signal arises in the basal ganglia region known as the dorsolateral striatum (DLS). The DLS “chunks” the task-related events together so that the brain sees the whole task from beginning to end as one event. In our example, that would mean that each time you made coffee in your new machine, the DLS was noticing that you (1) filled up the unit with water; (2) grabbed a coffee pod from the pantry; (3) grabbed a cup from the cupboard and placed it underneath where the coffee comes out; (4) poured some milk into the milk jug to be frothed, placing it on the machine; and (5) pushed the buttons to turn on the machine and then to begin the coffee extraction and milk frothing. Remember how we said that the brain tries to be as efficient (ok, seemingly lazy) as possible? Once it sees that you do all of the above steps every time you make that latte, it says, “OK, we can chunk these steps together and call them a single task”. By treating the routine as one thing rather than five, the DLS frees up mental space for you to think about something else.

And there is more. Neurons related to the task of making that latte fire at the beginning or end of the task, or likely both, while neurons unrelated to the task do not activate at all. Thus, the entire (chunked) task is represented as a single event within the DLS, one bracketed by the beginning and/or ending neuron activation. Recall how we said that, at the beginning of a new task, you are thinking more, and there is probably some trial-and-error going on? Maybe the second time you made a latte, you remembered most of the steps, but failed to do step (3), placing a cup underneath where the coffee was to come out, so when you did step (5), pushing the buttons to extract the coffee, the precious liquid poured out onto your bench instead of into a cup: oops! But as you got more and more practised at that latte-making – rendering an increasingly consistent response – the strength of your “chunked” sequence of steps would have been increasing (Joye, 2018; NPR/Duhigg, 2012; Farnam Street, 2019).
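
As a loose analogy only (it is not a model of what neurons actually do), the idea of chunking can be sketched in a few lines of Python: five separate steps are bundled into one named unit that is marked by its first and last step. The step labels and the function names chunk and units_to_track are hypothetical, chosen just for this illustration.

```python
# The five latte-making steps, abbreviated; labels are illustrative.
LATTE_STEPS = [
    "fill water",
    "get coffee pod",
    "get and place cup",
    "pour milk into jug",
    "press buttons",
]


def chunk(steps, name):
    """Bundle an ordered list of steps into one unit, bracketed by its first and last step."""
    return {"name": name, "start": steps[0], "end": steps[-1], "steps": tuple(steps)}


def units_to_track(routine):
    """Number of items needing attention: 1 for a chunked routine, otherwise one per step."""
    return 1 if isinstance(routine, dict) else len(routine)
```

Before chunking, units_to_track(LATTE_STEPS) gives five items to attend to; after chunk(LATTE_STEPS, "make latte"), the same routine counts as one.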

One brain, two systems

With our simple act of making a latte, we have seen how our rather fancy mental equipment has invoked at least two systems: the PFC to figure out what to do and the basal ganglia (specifically, the DLS) to streamline the task by automating it. But how did our brain know which system was best used, or in other words, how do these systems work together to create a habit that occurs so seamlessly we barely notice it? Stay with us for this next part. This is where the theories – based on various scientific investigations with rats and people – become complex.

Until recently, it was believed that the brain circuits underlying goal-directed actions and habitual actions (that is: you on Day One of making coffee and you on about Day 49, respectively) were competing with one another for dominance in the brain. The idea was – as you might hypothesise yourself from what we have already discussed – that all actions would begin as goal-directed ones, and then the habitual action system would take over, inhibiting the goal-directed connections and, as we stated above, freeing up the brain’s processing capacity for other things.

Recent evidence, however, suggests that the two systems actually work together. It may be that the goal-directed action circuitry is needed in order to initiate a given routine, but habitual automaticity is used to complete a sequence or complex set of behaviours that the brain has learned to see as one unit (that is: a chunked routine, like making the latte, that the DLS has told the brain is really just one behaviour).

The implications of this are interesting. If these systems are complementary, it would help to explain some of our seemingly absent-minded errors. Dr Asha Bauer, psychologist and productivity coach, relates her own example of deciding not to go to the gym after work one day as she habitually did because she was feeling under the weather; she decided she would go straight home. But lo and behold, she soon found herself sitting in the gym car park (without her gym bag, obviously, given her PFC decision not to go!). In fact, her automatic systems (think basal ganglia and DLS) had basically taken over the wheel of her car, driving her to the gym as per normal, even when she had made a conscious decision otherwise. It would seem that Dr Bauer had “finish-work-go-to-gym-then-work-out” as a chunked response, one in which her PFC did not interfere, because its job was not to govern automatic responses (Bauer, n.d.). In neuroscience-speak, Dr Bauer’s goal-oriented action (going home because she didn’t feel well) was overtaken by a habitual routine (going to the gym) (Bauer, n.d.; Joye, 2018).

How do we know which system we are using: goal-directed or habitual?

If you’ve followed along with us to this point, you might very well be asking: well, how do we know whether we – or any being – may be performing goal-directed behaviour or just repeating a habit? It’s a good question, inasmuch as researchers estimate that 40 to 50% of our actions on any given day are done out of habit – and habits expert James Clear reckons the percentage is much higher (Clear, 2018). To answer the question, researchers in one study (Packard & McGaugh, 1996) put some rats into a plus-shaped maze and taught them to turn right to get their food reward; there were a total of four trials a day for 14 days: that is, many more than needed to form a habit, so the rats were “overtrained”.

On Days 8 and 16, the researchers conducted what they called “single probe trials” (Packard & McGaugh, 1996). For both of these, they kept the reward in the same arm of the maze, but this time placed the rats on the opposite side of the maze: that is, 180 degrees around from the original starting arm. This meant, of course, that in order to get the reward, the rats had to turn left. Three minutes before the trial began, half of the rats were given a bilateral injection of saline solution; they functioned as the control group because the solution would not affect them neurally. The other half were bilaterally injected with a 2% lidocaine solution in order to produce neural inactivation. Half of these lidocaine rats received the injection in the hippocampus, and the other half in the dorsolateral striatum (DLS).

Results showed that the control group – injected only with saline solution – used their brains to change strategy (now turning left) at the Day 8 trial, but by the Day 16 trial, the “overtraining” meant that there had been a shift in the learning mechanism controlling their behaviour. In simple English, they had formed habits and were governed by them by the end of the experiment. The rats whose hippocampus had been injected showed no preference for either way of operating at Day 8, but by Day 16, chose to continue turning right, according to their habit from training. This was as the researchers predicted, because the rats’ learning centre had been inactivated by the lidocaine and they had to rely on habit. The rats whose DLS had been injected, conversely, continued to work out which way to turn (that is, turning left to get the reward) at both the Day 8 and the Day 16 trial. This, again, was predicted by the researchers: the DLS, the part of the brain involved in habit formation, had been inactivated by the lidocaine, so the rats had to figure it out afresh each time. The researchers concluded that the hippocampus and the DLS (basal ganglia) selectively mediate expression of learning, with the latter governing the learning of a habit-controlled response (Packard & McGaugh, 1996).
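
Because the pattern of results across three groups and two probe days is easy to lose track of, it may help to lay it out as a small lookup table. The Python sketch below simply restates the outcomes as summarised in this article; the group keys, the strategy labels (“goal-directed” for turning left towards the reward, “habit” for repeating the trained right turn, “no preference”), and the function names are shorthand invented for this illustration, not terms taken from the study.

```python
# Probe-trial outcomes as summarised in the text above (not raw data from the study).
# Keys are (injection group, probe day); labels are illustrative shorthand.
RESULTS = {
    ("saline", 8): "goal-directed",
    ("saline", 16): "habit",
    ("hippocampus", 8): "no preference",
    ("hippocampus", 16): "habit",
    ("dls", 8): "goal-directed",
    ("dls", 16): "goal-directed",
}


def strategy(group, day):
    """Look up the strategy described for a given group on a given probe day."""
    return RESULTS[(group, day)]


def groups_using(label, day):
    """List, alphabetically, the groups described as using `label` on `day`."""
    return sorted(g for (g, d), s in RESULTS.items() if d == day and s == label)
```

Read this way, the key contrast is on Day 16: groups_using("habit", 16) picks out the saline and hippocampus groups, while the DLS-inactivated rats are the only ones still behaving in a goal-directed way.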

Influence of the environment: context cues

Research has also demonstrated that context cues – that is, triggers from our environment – play an important role in habit formation. When researchers trained subjects on a sequential task (performing Step 1, then Step 2, then Step 3), repeated practice resulted in fast reporting of the next step when primed with the prior step. When people became fast at reporting the next step (interpreted as having a strong habit), their habits were likely to persist, even when they intentionally wanted to add, remove, or change one of the steps (Joye, 2018). This type of research has huge relevance for mental health professionals, particularly those working in the area of alcohol and drugs. The demonstrated influence of environmental cues on habitual action can explain clients who maintain sobriety very well while in the controlled environment of rehab, but relapse into alcohol or drug misuse once back in the environment in which they formerly used. Simply, the new habit of the rehabilitation environment (i.e., no alcohol or drugs) masks or temporarily replaces the habit of using in the old environment. In rehab, the client is “safe” from relapse, but once back in the old environment, the established habit routines take over again, causing the person to use.

Want to change habits? Go on holiday

Duhigg insists that this understanding gives rise to an obvious and quite appealing solution. Do you want to replace a bad habit, he asks? Take a vacation. If you, say, want to give up smoking, do it while away on holiday. The old cues and rewards aren’t there anymore in the new environment, so there is a fresh opportunity to form a new pattern, which you can hopefully carry over into your life after the holiday is over (NPR/Duhigg, 2012).

There is much more to the process of changing habits. For further reading on the topic, we recommend searching (on Google) for James Clear’s framework based on his four-stage habit loop model, which he calls the Four Laws of Behaviour Change (Clear, 2018).