
Latticework of Mental Models: Variable Reinforcement

It’s not often that you would find me exercising my thumb muscles with a TV remote, but one Saturday afternoon, while killing time in front of the idiot box, an unsettling thought caught me off guard. I realized that I had been surfing through the TV channels for the past hour without spending more than a minute on any single channel.

This habit is not uncommon. The unusual thing was that even after I had looped through all the channels a couple of times, and it was obvious that there was nothing interesting on TV, I kept going without getting bored. So what was keeping me hooked?

In a flash of insight, my mind constructed an answer, and it was an uncomfortable one.

It wasn’t the content on TV that was keeping me engaged. It was the excitement of the unknown that existed in those short moments between channel switches. Whenever I flicked the channel, for a split second my mind didn’t know what was going to appear next on the screen.

My insight wasn’t entirely correct, but I knew I was on to something. For a few minutes, I even daydreamed about winning the next Nobel Prize for a path-breaking discovery in human behavior! Please don’t laugh; that dream was soon shattered by Mr. Google.

A little googling revealed an important mental model from the field of psychology. It’s called Variable Reinforcement. Let’s understand it.

What’s Variable Reinforcement?
A response or action is said to be reinforced if it generates a reward, i.e., a person will be motivated to repeat a response if he or she gets rewarded for it. That’s the well-known theory of motivation. However, when a response is rewarded only after an unpredictable number of tries, it generates a high and steady rate of response. In other words, when you receive the reward at an irregular or unpredictable frequency, your behavior is reinforced even more strongly.

Take a moment and think about the following questions.

What’s your first instinct when you hear your phone ring? What’s the first thought when you see an incoming phone call from an unfamiliar number? Is it easy to ignore and forget about a WhatsApp buzz or an unread message icon in your mailbox?

Personally, my first instinct is to immediately answer the unknown call or read that unopened message. And my rationalizing mind justifies this urge with questions like – What if it’s an emergency? What if somebody has a surprise for me?

The situation described above is nothing but variable reinforcement in disguise. Psychologists have run elaborate experiments showing that variable reinforcement has a pretty strong effect on behaviour. Let’s take a look at one such empirical study.

Skinner’s Experiments
In the 1950s, a scientist named B.F. Skinner validated the reinforcement theory by conducting experiments on mice. He observed that mice responded most voraciously to random rewards.

In one of his experiments, he selected two mice and fed them a little differently. The first mouse received the same amount of food every time it pressed a lever. For the second mouse, however, the supply of food was irregular and uneven. When this second mouse pressed the lever, it would sometimes get a small reward, other times a large one, and sometimes nothing at all.

Skinner observed that the second mouse, the one whose food arrived irregularly, pressed the lever compulsively. That established the theory of variable reinforcement.
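To make the difference between the two feeding schedules concrete, here is a minimal Python sketch. It only generates the two reward streams; it does not model how a mouse behaves. The payout sizes, the number of presses, and the function names are my own assumptions for illustration, chosen so that both schedules pay the same on average.

```python
import random
import statistics

def fixed_schedule(presses, pellet=1.0):
    """Every lever press pays the same reward."""
    return [pellet for _ in range(presses)]

def variable_schedule(presses, rng):
    """Each press pays nothing, a small reward, or a large one, at random."""
    # Hypothetical payouts, picked so the average equals the fixed pellet (1.0).
    payouts = [0.0, 0.5, 2.5]
    return [rng.choice(payouts) for _ in range(presses)]

rng = random.Random(42)  # seeded so the run is repeatable
fixed = fixed_schedule(10_000)
variable = variable_schedule(10_000, rng)

print("fixed:    mean", statistics.mean(fixed),
      "spread", statistics.pstdev(fixed))
print("variable: mean", round(statistics.mean(variable), 2),
      "spread", round(statistics.pstdev(variable), 2))
```

Under these assumptions both mice get roughly the same amount of food in the long run. The only thing that differs is the spread, that is, how predictable each individual press is.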

Nir Eyal, author of the book Hooked: How to Build Habit-Forming Products, writes –

Humans, like the mice in Skinner’s box, crave predictability and struggle to find patterns, even when none exist. Variability is the brain’s cognitive nemesis and our minds make deduction of cause and effect a priority over other functions like self-control and moderation.

Variable rewards hook our brains and keep them occupied. They sometimes create a trance-like state in our minds. Variable reinforcement is used by a lot of sales and marketing people to keep you and me engaged with their products. Let’s find out where this mental model appears in the real world.

Variable Reinforcement in Gambling
Have you ever wondered why gambling is so addictive? In casinos especially, you will find people glued to slot machines, pulling the lever compulsively. Studies suggest that even if some of those people end up winning the jackpot, they would rather put all that money back into the slot machine. Interesting, isn’t it? But why?

Slot machine players have no way of knowing how many times they have to play before they will win. All they know is that eventually a player will win. This is why slot machines are so effective and players are often reluctant to quit. There is always the possibility that the next coin they put in will be the winning one.
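The pull of "the next coin might be the winning one" can be sketched with a small simulation. This is a toy model, not how any real machine works: I am assuming every pull is independent and wins with the same made-up probability (`WIN_PROBABILITY`), and the function names are hypothetical.

```python
import random

WIN_PROBABILITY = 0.05  # hypothetical odds, not taken from any real machine

def pulls_until_win(rng, p=WIN_PROBABILITY):
    """Count lever pulls up to and including the first win."""
    pulls = 1
    while rng.random() >= p:
        pulls += 1
    return pulls

def win_rate_after_losing_streak(sessions, streak):
    """Among sessions that began with `streak` losses,
    the share that won on the very next pull."""
    still_playing = [s for s in sessions if s > streak]
    winners = sum(1 for s in still_playing if s == streak + 1)
    return winners / len(still_playing)

rng = random.Random(7)  # seeded so the run is repeatable
sessions = [pulls_until_win(rng) for _ in range(100_000)]

print("average pulls to first win:", round(sum(sessions) / len(sessions), 1))
for streak in (0, 10, 20):
    rate = win_rate_after_losing_streak(sessions, streak)
    print(f"win chance on the pull after {streak} losses: {rate:.3f}")
```

In this toy model the chance of winning on the next pull stays close to the same 5% whether the player has just sat down or has already lost twenty times in a row. A long losing streak brings the win no closer, yet the next pull always feels like it could be the one.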

It would be naive to think that those slot machines offer fair play. These machines are designed to exploit the weakness for variable reinforcement inherent in human behaviour. The machine doles out rewards in a very unpredictable fashion (which might seem totally random to untrained eyes, but it’s not).

Now that we are talking about slot machines, it’s worth mentioning another extremely intelligent aspect of slot machine design called “calibrated near misses”, which exploits another behavioural quirk called “deprival super reaction syndrome”. But let me save that story for some other day.

Variable Reinforcement in Daily Life
It’s common for most people to give in to the urge to take a phone call from an unknown caller or to check their email or mobile phone every few minutes.

What we don’t realize is that we have a limited supply of willpower and attention in a day. When we spend our mental resources (cognitive fuel, if you will) attending to these small, insignificant, attention-grabbing events, we are forgoing a chance to engage in deep thinking or in other important tasks (like reading a book).

Another personal observation: with smartphones and digital devices powered by fast Internet, any song or piece of music in the world is just a few seconds away from you. Then why do people still listen to the radio (Vishal just bought a new one 😉 )? It’s because radio offers an element of unexpected, novel experience at irregular intervals (those annoying radio ads can go on for a long time before the next song is played on air). And you can’t know beforehand which music is going to be played while you’re driving back home.

Similarly, many companies use this trick to motivate people. Call centres often offer random bonuses to employees. Workers never know how many calls they need to make in order to receive the bonus, but they know that they increase their chances the more calls or sales they make.

Video games are another example. They engage people and can lead to addiction. In case you don’t know, the video gaming industry is bigger than the movie and music industries combined (worldwide). Surprising, right?

Remember games like Farmville and Angry Birds? The Twitters and Facebooks of the world are creating new habits by running users through a series of addictive products fueled by variable rewards.

Variable Reinforcement in Investing
The discussion will be incomplete without understanding how this behavioural quirk can get in the way of your becoming a Safal Niveshak (i.e., a successful investor).

There is ample evidence to show that a buy-and-hold strategy beats frequent trading for the majority of investors. Then why do people still get into frequent trading, hoping to make sustainable profits?

Prof. Sanjay Bakshi compares these short-term speculators to butterflies jumping from one flower to the next. In a recent article, he has shared some eye-opening insights on the subject –

…reason why people resemble butterflies is because of the presence of a pleasure chemical called dopamine in our brains. The more the dopamine the more the pleasure. And novel experiences (imagine bungee jumping or a one-night stand) deliver enormous amounts of dopamine to the brain. The other thing that delivers dopamine is unexpected, pleasant surprises. And day traders get a lot of small, but pleasant surprises just like kids who are hooked to gaming. It gets addictive, this dopamine business. The more you get the more you crave for.

If you put a day trader who just had a winning bet in a fMRI machine and compare him with a cocaine addict, the doctors can’t tell the difference. Their brains look just the same. Behaving like a butterfly increases the probability of novel experiences. It increases the probability of small, unexpected surprises [variable reinforcements]. Warren Buffett has written that for many “investor-dreamers, any blind date is preferable to one with the girl next door, no matter how desirable she may be.” He is clearly on to something.

Similarly, checking your portfolio constantly is a type of addiction. Because of the inherent volatility of stock prices, there will be times when your portfolio shows profits (and times when it is in the red too). Every time you see a paper profit, it releases a small quantity of dopamine, the feel-good chemical, in your brain, and your action (checking the portfolio) is reinforced. The notional increase in your portfolio value is a source of variable reinforcement.
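How unpredictable is the "reward" of seeing a paper profit? Here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are mine, not data: returns are normally distributed, with a hypothetical expected return of 12% a year and a hypothetical volatility of 20% a year. The function name is also just for illustration.

```python
from statistics import NormalDist

ANNUAL_RETURN = 0.12      # hypothetical expected return per year
ANNUAL_VOLATILITY = 0.20  # hypothetical standard deviation per year

def chance_of_seeing_gain(years_between_checks,
                          mu=ANNUAL_RETURN, sigma=ANNUAL_VOLATILITY):
    """Probability that the portfolio is up since the last check,
    assuming normally distributed returns (a simplification)."""
    mean = mu * years_between_checks
    stdev = sigma * years_between_checks ** 0.5
    # P(return > 0) = 1 - P(return <= 0)
    return 1 - NormalDist(mean, stdev).cdf(0.0)

check_intervals = {
    "daily": 1 / 252,   # roughly 252 trading days in a year
    "monthly": 1 / 12,
    "yearly": 1.0,
}

for label, years in check_intervals.items():
    print(f"{label:8s} chance of seeing a gain: {chance_of_seeing_gain(years):.0%}")
```

Under these assumptions, a daily check is close to a coin flip between profit and loss, while a yearly check shows a gain much more often. The more frequently you look, the more the outcome resembles an irregular, unpredictable reward.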

Now you know why we aren’t very different from Skinner’s mice when it comes to variable reinforcement.

Antifragility Meets Variability
I am an admirer of Nassim Taleb’s ideas. Antifragility is one of his ideas that I found very insightful. It basically says that we should arrange our affairs in such a manner that the uncertainties of life benefit us instead of harming us. So let me speculate a bit about reinforcement in the context of antifragility.

Knowing that variable rewards induce addiction, can we design an environment for ourselves where variations (the stimuli which create everyday experience) increase the quality of our life? The idea is to bring in the element of pleasant uncertainty.

How about doing something wacky which has the potential of creating a totally unexpected and new experience?

Like wearing a different colored sock on each foot? Or parting your hair the other way? Hey! I am just suggesting. By the way, I tried the hair trick and it freaked out my wife big time. But guess what? It became a memorable day and she now calls me Mr. Wacky.

Then, consider what Vishal did on his recent trip to Varanasi. He roamed the holy city’s streets wearing the traditional Indian dhoti-kurta and found passersby amused at his attire. But as he tells me, it was one of his most memorable days in life and he would like to return to Varanasi soon and roam the streets wearing that dress again. 😉

A wise man said –

Life is not number of days you live but number of days you remember.

A word of caution though. Taking this idea to the extreme may backfire, so it goes without saying: please use common sense and please don’t break the law! Perhaps this explains the behaviour of those crazy adventure junkies. They are addicted to creating unique moments even if it means frequently putting their lives at risk.

Too much wackiness on a consistent basis can also end up creating an unmanageable mess, so go easy.

So we learnt that variable reinforcement is a powerful force that focuses attention, provides pleasure, and infatuates the mind. The next time you see people getting addicted to an environment or a product, pull out the variable reinforcement mental model from your latticework toolbox. It may give you some useful insights.

Herbert Simon, Nobel Laureate and author of the brilliant book Models of My Life, offers his view on the utility of mental models:

The better decision maker has at his/her disposal repertoires of possible actions; checklists of things to think about before he acts; and he has mechanisms in his mind to evoke these, and bring these to his conscious attention when the situations for decision arise.

Acquiring worldly wisdom isn’t limited to learning the mental model. That’s just the first step. Once you learn a model, the question you should ask yourself is: where can I find a practical application of this?

This is where independent thinking comes into the picture. Nobody can think for you. It’s a very personal and subjective process. You must have heard the following saying:

More the sweat in training, less the blood in war.

Training here is akin to learning the mental models and getting adept in their usage. This training will make sure that you get hurt less in making real life decisions. That’s the beauty of vicarious learning.

Take care and keep learning.


About the Author

Anshul Khare worked for 12+ years as a Software Architect. He is an avid learner and enjoys reading about human behaviour and multidisciplinary thinking. You can connect with Anshul on Twitter.


  1. Most of the time our behavior reinforces by some external variables which may have either a positive or negative impact on us. Understanding this mechanism and a logical decision making system can protect us from a negative Variable Reinforcement. Thank you Anshul Sir for this article!

  2. Prashant says:

    Dear Ansul,

    Very true. That portfolio point exactly suits me. Now i will try to improve myself on this.



  3. Deepak Krishnan says:

    Thanks a lot Anshulbhai 🙂

  4. Great series of articles Anshul. Thank you for enlightening me..!

  5. Hi Anshul,

    Nice write-up! I was wondering …

    “Then, consider what Vishal did on his recent trip to Varanasi. He roamed the holy city’s streets wearing the traditional Indian dhoti-kurta and found passersby amused at his attire… ”

    Q) I have never been to Varanasi before but why were the passersby amused at his attire ? 🙂

    Q) Did they think that he was a hilly-billy person from the village or don’t they wear traditional clothes anymore in Varanasi? 😉

    Cheers! 🙂

    • Anshul Khare says:

      Thanks George!

      Not sure why people were amused but it’s definitely an experiment worth trying. And it’s safe too 🙂

  6. Hi Anshul,
    Excellent writeup! Enjoyed reading it! The comparisons which you have picked from our day to day life are very amusing and worth thinking over!

    Best Wishes

  7. Sarvdeep says:

    Hi Anshul,

    Article was mind blowing, even I flick the channels for hours until I get something aligned to my taste. I need to work on that. Thanks for sharing.


  8. This is very great and thought provoking as to how irrationally our mind works, we got to train it by vicarious learning so that we can take rational decisions when needed.
    Thanks Anshul Sir!

  9. Bharath says:

    I love this post !! Good break from Munger and Buffet talks

  10. Jacob Manuel says:

    Great Article Anshul. Could you pl. suggest some books which talks more about ” Variable Reinforcement “

  11. Jacob Manuel says:

    Thanks Anshul!


