Chapter 7: Reward and Reinforcement

2nd edition as of August 2022

Chapter Overview

Now that we have covered the basics of neuroscience and pharmacology, it should be apparent to you how drugs interact with our bodies on a cellular scale. What may be less obvious is how this culminates in the experiences and behaviors we see on the human scale. What makes drug use so pleasurable? How does addiction develop? And what can we do to keep it from happening? Believe it or not, the answers to these questions are rooted in the subjects we have already covered. To wrap up this unit, we will explain the rewarding aspect of drug use and the way it can change our behavior over time. This should help you integrate the topics we have covered so far with what you already know or have seen about drug use.

Chapter Outline

7.1. Learning: How Experience Shapes Behavior

7.1.1. Introduction to Behaviorism
7.1.2. Operant Conditioning
7.1.3. Operant Conditioning and Drug Use

7.2. Biological Basis of Reinforcement

7.2.1. Dopamine in the Brain
7.2.2. The Reward System

7.3. Consequences of Repeated Drug Use

7.3.1. Drug Dependence
7.3.2. Addiction
7.3.3. Treating Substance Use Disorders

Chapter Learning Outcomes

Describe the role operant conditioning plays in drug use.
Outline how the dopamine pathways in the brain affect reward.
Clarify what the consequences of repeated drug use are.

7.1. Learning: How Experience Shapes Behavior

Section Learning Objectives

Differentiate between classical and operant conditioning.
Explain operant conditioning and describe various factors that determine how drugs change behavior.

To understand how drugs affect behavior, we need to cover basic principles of learning. You may think of learning as something that primarily happens in school. But in psychology, learning is defined as a change in behavior caused by experience. We experience things all the time, and the nature of those experiences influences our actions in the future. We not only learn from our teachers, but also from the consequences of our actions.

This concept applies to drug use as well. In short, people use drugs because they learn that using them feels good or provides relief. This is as true for someone addicted to cocaine as it is for habitual coffee drinkers. In this section, we will explore how psychologists talk about learning and use those ideas to explain how and why drugs change our behavior.

7.1.1. Introduction to Behaviorism

In many cases, studying learning theory involves studying behaviorism, a discipline of psychology that was founded in the 1910s by John Watson and later championed by B.F. Skinner. The reason why behaviorism is so important is that much of the terminology we will use to explain changes in behavior was coined by behaviorists.

In behaviorism, behavior is seen as a response to some sort of stimulus. Sometimes the response is involuntary, or a reflex. For instance, a dog will salivate when presented with food. We can pair a reflex-inducing stimulus with an unrelated stimulus such as the ring of a bell. Normally, a bell would not cause a dog to salivate. But if you ring the bell each time you feed the dog, it will eventually learn that the bell is associated with food and will start to salivate once you ring it. This process, in which a stimulus is paired with a reflexive behavior, is a form of learning called classical conditioning (see image below).

Although classical conditioning is powerful, it is limited to reflexes and other involuntary behaviors. It cannot explain how you decide what to wear in the morning or whether you choose to keep eating a new dish. For that, we have to turn to operant conditioning, which describes how voluntary behaviors are changed by their consequences. To distinguish between classical conditioning and operant conditioning, remember that classical conditioning involves reflexes, while operant conditioning involves voluntary behaviors.

7.1.2. Operant Conditioning

The idea behind operant conditioning is probably intuitive to you. If you went to a restaurant and had a good time, you would be more likely to return to the restaurant. On the other hand, if the food and service were terrible, you would be less likely to return. In both of these cases, your behavior (going to a restaurant) was changed by its consequence (a good or bad experience). If the behavior is strengthened—that is to say, if it becomes more likely to occur in the future—then we say that the behavior was reinforced. A good dining experience reinforces the decision to eat at that place. In comparison, if a behavior is weakened or decreased, we say that it is punished. A bad dining experience would punish our decision.

There are two main ways that we can reinforce a behavior. We can add something good, or we can take away something bad. We call the former positive reinforcement and the latter negative reinforcement. What are some examples? Well, if you ever received an allowance for taking out the trash, that would be positive reinforcement—the money was something rewarding. If you take out the trash when it starts to smell though, that’s negative reinforcement—the bad smell being removed. In both cases, the behavior (taking out the trash) became more likely to happen again in the future.

We can do the same thing with punishment. Adding something bad will result in positive punishment while removing something good is negative punishment. Getting a ticket after you speed will make you less likely to break the speed limit in the future (positive punishment), but so will having your license revoked (negative punishment).

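It is easy to get these four terms mixed up. Remember that the words positive and negative refer to whether something is being added or removed, not whether the thing is good or bad. (Think of the math symbols + and – if you have to.) In comparison, reinforcement and punishment always refer to whether the behavior increases or decreases. Here is a summary of the four terms:

Positive reinforcement: something is added, and the behavior increases.
Negative reinforcement: something is removed, and the behavior increases.
Positive punishment: something is added, and the behavior decreases.
Negative punishment: something is removed, and the behavior decreases.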
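The four combinations of added/removed and increases/decreases can also be sketched in a few lines of Python. This is only an illustrative sketch: the function name `classify_consequence` and its string arguments are made up for this example and are not standard behaviorist notation.

```python
def classify_consequence(stimulus: str, behavior: str) -> str:
    """Name the operant conditioning term for a consequence.

    stimulus: "added" or "removed" (was something added or taken away?)
    behavior: "increases" or "decreases" (what happens to the behavior?)
    """
    if stimulus not in ("added", "removed"):
        raise ValueError("stimulus must be 'added' or 'removed'")
    if behavior not in ("increases", "decreases"):
        raise ValueError("behavior must be 'increases' or 'decreases'")
    # positive/negative depends only on whether something was added or removed
    sign = "positive" if stimulus == "added" else "negative"
    # reinforcement/punishment depends only on what the behavior does
    kind = "reinforcement" if behavior == "increases" else "punishment"
    return f"{sign} {kind}"


# The four examples used in this section:
print(classify_consequence("added", "increases"))    # allowance for taking out the trash
print(classify_consequence("removed", "increases"))  # the bad smell goes away
print(classify_consequence("added", "decreases"))    # getting a speeding ticket
print(classify_consequence("removed", "decreases"))  # having your license revoked
```

Notice that the function never asks whether the stimulus is "good" or "bad". The two questions it does ask are the only ones needed to pick the right term.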

7.1.3. Operant Conditioning and Drug Use

So, how do we use this terminology to describe drug use? If using a drug makes us more likely to use it again in the future, then we would say that the drug is a reinforcer. Drugs are considered primary reinforcers because they are intrinsically rewarding, similar to food or sex. This is contrasted with a secondary reinforcer, which is only rewarding because it has some sort of learned value, such as money. A $100 bill isn’t a reward for a toddler, but, as we grow older, we learn that money can help us get primary reinforcers like food and learn to value it on its own.

Not all experiences are reinforcing. Not everyone will become addicted to a drug following the first use. Sometimes, a first experience is unpleasant; it is likely to turn someone off from the drug and discourage the person from using it. This is an example of punishment. For instance, if you cough a lot the first time you smoke a cigarette and decide smoking is not something you enjoy, then your drug-taking behavior is reduced. The truth is that the reinforcing power of a drug depends on a variety of factors.

First is satiation, which is essentially the state of being “full” or satiated. In the context of food, this is obvious: food is more rewarding when you are hungry and less rewarding when you are satiated. In the context of drug use, this can extend to the user’s overall situation. Many people use drugs because they lack other sources of pleasure in their lives. The rush of pleasure from drug use will be more rewarding if you cannot get that same feeling from relationships, work, or hobbies in your life. It can also refer to the particular effect of the drug. Caffeine, for instance, is a lot more rewarding when you are tired and need a boost compared to when you are already brimming with energy.

Two other factors are immediacy and contingency. Immediacy refers to how quickly the consequence follows the behavior, while contingency describes how reliably the consequence follows the behavior. Immediate, reliable consequences are more effective reinforcers. A drug that produces an effect immediately is more addictive than one with a delayed effect. This can also influence the choice of route of administration: routes that absorb the drug faster, such as IV injection or snorting, deliver the effect sooner. Likewise, conditioning is stronger when the consequence consistently follows the behavior.

Another factor is the size of the stimulus. The more potent the stimulus, the more reinforcing it becomes. This has clear implications for drug use, as certain drugs provide more pleasurable effects and are more reinforcing than others.

As we discuss different types of drugs in the coming chapters, keep these factors in mind.

Classical Conditioning and Drug Addiction [1:59]

7.2. Biological Basis of Reinforcement

Section Learning Objectives

Describe the three main brain dopamine pathways that are related to reward.
Describe the circuitry of the reward system and briefly explain the functions of the various structures involved in it.

In the previous section, we stated that drugs are primary reinforcers because they are intrinsically rewarding. Why is that the case? Although drugs have a variety of different effects, almost all drugs of abuse target the same brain structures that are responsible for handling reward and motivation. By examining these structures, we can learn how our brain determines what behaviors to reinforce and how that reinforcement happens.

7.2.1. Dopamine in the Brain

When discussing neurotransmitters in Chapter 4, we mentioned how dopamine is related to reward and reinforcement. (As you may recall, it is also related to motor control, but that is not the focus of this chapter.) What this means is that dopamine is the neurotransmitter used in a handful of important brain pathways that control reward and motivation. These pathways connect various structures that play a role in determining which stimuli to attend to and which behaviors to reinforce.

Although dopaminergic (dopamine-releasing) neurons can be found throughout the brain, there are two major locations where they are clustered together. One is the substantia nigra (from substantia in Latin for “substance” and nigra for “black”) and the other is the ventral tegmental area (VTA). Both are located in the midbrain at the top of the brainstem and project axons to multiple different regions of the brain. The substantia nigra neurons have a high level of neuromelanin, a dark pigment, which explains the name. Ventral tegmental simply refers to the location of the area (the underside of the tegmentum, a part of the midbrain).

Neurons in these areas project their axons to different structures in the brain to form dopamine pathways. There are four main dopamine pathways in the brain. Three of these are involved in reward and reinforcement: the mesolimbic, mesocortical, and nigrostriatal pathways. Although the names may look confusing, they are fairly simple since the first part of the name identifies the origin of the neurons in the pathway and the latter part describes the brain area to which the pathway projects.

The mesolimbic pathway (from meso– in Greek for “middle”) connects the ventral tegmental area of the midbrain to the nucleus accumbens and limbic system. Activation of this pathway is responsible for pleasure and reward. Overactivation of this pathway can lead to drug cravings and substance use disorder.
The mesocortical pathway also arises from the ventral tegmental area and projects to the prefrontal cortex, which is involved in executive functions including reasoning, decision-making, impulse control, and working memory. Disruption of this pathway can affect self-control, stress reactivity, and motivation to seek drugs.
The nigrostriatal pathway originates in dopamine neurons in the substantia nigra of the midbrain, and axons project into the corpus striatum, which is responsible for extrapyramidal motor control. The striatum is involved in habitual behavior such as compulsive drug-seeking.
There is a fourth pathway called the tuberoinfundibular pathway, which is involved in regulation of the endocrine role of the pituitary gland; it does not have a role in pleasure and reward.
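One way to keep the four pathways straight is to treat each as a small record with an origin, a target, and a flag for whether it is involved in reward. The sketch below does this in Python. The class name `DopaminePathway` and its field names are invented for illustration, and the origin of the tuberoinfundibular pathway is left unspecified because this chapter does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DopaminePathway:
    name: str
    origin: str           # where the dopamine neurons are clustered
    target: str           # where their axons project
    reward_related: bool  # involved in reward and reinforcement?


PATHWAYS = [
    DopaminePathway("mesolimbic", "ventral tegmental area",
                    "nucleus accumbens and limbic system", True),
    DopaminePathway("mesocortical", "ventral tegmental area",
                    "prefrontal cortex", True),
    DopaminePathway("nigrostriatal", "substantia nigra",
                    "corpus striatum", True),
    # The chapter only says this pathway regulates the endocrine role of the
    # pituitary gland; its origin is not given here, so it is left unspecified.
    DopaminePathway("tuberoinfundibular", "not stated in this chapter",
                    "pituitary gland", False),
]

# Pathways involved in reward and reinforcement
reward_pathways = [p.name for p in PATHWAYS if p.reward_related]

# Pathways that start in the VTA
vta_pathways = [p.name for p in PATHWAYS if p.origin == "ventral tegmental area"]

print(reward_pathways)
print(vta_pathways)
```

Filtering by origin picks out the mesolimbic and mesocortical pathways, the two that both arise from the VTA.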

The main focus of this text will be on the two pathways originating from the VTA, the mesolimbic and mesocortical pathways. Together they form a circuit of neurons called the reward system.

7.2.2. The Reward System

The reward system is a collection of structures that are responsible for reward and reinforcement. To get an overview of some of the structures involved and the roles they play, watch this video from Khan Academy:

Reward pathway in the brain [8:25]

Let us review the structures mentioned in the video. We have already learned about the VTA, which contains a cluster of dopaminergic neurons. Many of these neurons project to the nucleus accumbens which contains many dopamine receptors. Reward and pleasure are consistently associated with the activation of these receptors, which is why dopamine is sometimes called the “feel-good transmitter”.

Neurons from the VTA and nucleus accumbens also connect to certain structures in the limbic system. In particular, they connect to the amygdala which handles emotional responses, and to the hippocampus which forms long-term memories. These areas work together to create positive memories of pleasurable experiences, making it more likely that we will remember the stimulus and how to get it in the future.

Part of the mesolimbic pathway connects to the prefrontal cortex, which is also connected to the VTA through the mesocortical pathway. The prefrontal cortex is the anterior-most part of the frontal lobe and is responsible for many executive functions such as planning, attention, and motivation. Activity here can direct our attention to rewarding stimuli and cause us to seek such stimuli out.

The circuitry of the reward system is very complex, but its overall purpose is to respond to things that are important to our survival and to reinforce the behaviors that help us obtain those things. This is usually oriented towards natural rewards such as food, sex, or sleep. Drugs can interfere with this process though, stepping in at some point and activating the reward system directly. Some drugs activate dopaminergic VTA neurons. Others increase dopamine production and release. Still others might block dopamine reuptake to increase activation of dopamine receptors. The result is the same: the reward system is activated, and we register the experience of taking the drug as pleasurable and desirable.

7.3. Consequences of Repeated Drug Use

Section Learning Objectives

Describe tolerance and withdrawal and how they influence drug-taking behavior.
Explain how addiction can hijack the reward system in the brain.
Describe the treatment of substance use disorders.

Now that we know why drugs are reinforcing, it is time to learn about how drugs change our behavior over a long period. This is beyond the immediate increase in drug use that occurs as a result of reinforcement and instead focuses on the consequences of consistent and repeated drug use. Although our discussion of dependence and addiction will bring to mind illicit drugs, keep in mind that the same processes can occur with licit drugs as well.

Substance Use Disorder [3:46]

7.3.1. Drug Dependence

The main consequence of repeated drug use is the development of drug dependence. This term was introduced and defined in Chapter 1. It refers to the physiological changes wherein the body adapts to the drug over repeated use. In the figure below, the magnitude of the drug dose is in the lower panel, and the size of the drug effect is in the upper panel. Tolerance occurs when the drug effect starts to diminish despite the dose being held constant. It takes an increased dose to re-achieve the original intensity of drug effect.

Drug dependence will usually lead to tolerance to the drug, meaning higher doses are required to produce the same effect. In the figure above, the upper panel shows the magnitude of the drug effect while the lower panel shows the drug dose. For a time, a constant drug dose evokes a constant drug effect. But after a while, the same dose no longer produces the same drug effect which gradually diminishes over time. In the far right of both panels, only an increase in drug dose can re-achieve the original drug effect. Tolerance can arise from changes in pharmacokinetics or pharmacodynamics. If chronic drug use results in an increased rate of metabolism of the drug (pharmacokinetic tolerance), this would reduce the amount of drug that reaches the site of action. Chronic drug use can also result in the desensitization of target receptors (pharmacodynamic tolerance), reducing the effect of the drug but not the amount of drug reaching the site of action.

Why would the receptors change? The answer lies in homeostasis, or the body’s attempt to keep itself in a steady equilibrium. If a certain type of receptor is constantly overactivated, the body may try to compensate for this activity by reducing the number of receptors available or making them less responsive.
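The idea of receptors becoming less responsive can be made concrete with a toy simulation. The sketch below is a deliberately simplified model, not real pharmacology: it assumes that the drug effect equals dose times receptor sensitivity, and that sensitivity drops by a fixed fraction after every exposure. The function name `simulate_tolerance`, the 10% adaptation rate, and the dose of 10 units are all made-up values chosen for illustration.

```python
def simulate_tolerance(dose: float, exposures: int, adaptation: float = 0.1):
    """Toy model of pharmacodynamic tolerance.

    effect = dose * sensitivity. Sensitivity starts at 1.0 and shrinks by the
    fraction `adaptation` after every exposure, standing in for the body's
    homeostatic compensation. All numbers are hypothetical.
    """
    sensitivity = 1.0
    effects = []
    for _ in range(exposures):
        effects.append(dose * sensitivity)
        sensitivity *= (1.0 - adaptation)  # receptors become less responsive
    return effects, sensitivity


effects, sensitivity = simulate_tolerance(dose=10.0, exposures=5)
original_effect = effects[0]

# Dose that would be needed now to re-achieve the very first effect
dose_needed = original_effect / sensitivity

print([round(e, 2) for e in effects])  # effect of the same dose, exposure by exposure
print(round(dose_needed, 2))           # larger than the original dose of 10
```

With the dose held constant, each printed effect is smaller than the one before it, and the dose needed to restore the original effect is higher than the starting dose. That is the pattern described as tolerance.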

One reason why overdose occurs is that tolerance for diverse drug effects can develop at different rates. As tolerance for the desired effect increases, the dose taken needs to be increased; however, there may be less development of tolerance to the drug’s toxic or lethal effects. Recall the dose-response curves from the previous chapter. The curve for the desired recreational effect shifts further and further to the right, indicating an increase in dosage, until it approaches the toxic dose-response curve. Thus, the separation between the desired and toxic curves becomes smaller and smaller.
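A quick numerical sketch shows how the gap closes. The numbers below are entirely hypothetical, and the sketch makes the simplifying assumption that the toxic dose does not change at all while the dose needed for the desired effect keeps doubling.

```python
toxic_dose = 100.0  # hypothetical; assumed here to stay constant

# Hypothetical doses needed for the desired effect as tolerance develops
desired_dose_over_time = [10.0, 20.0, 40.0, 80.0]

# Safety margin: how many times larger the toxic dose is than the dose taken
margins = [toxic_dose / dose for dose in desired_dose_over_time]

for dose, margin in zip(desired_dose_over_time, margins):
    print(f"dose taken: {dose:5.1f}   toxic dose is {margin:.2f}x the dose taken")
```

As the dose taken climbs, the margin shrinks toward 1, the point at which the dose taken for the desired effect is itself a toxic dose.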

There are other forms of tolerance. Cross tolerance occurs when tolerance to one drug carries over to a similar drug, even if the person has had no previous exposure to that other drug. For instance, someone who is tolerant to heroin will also be somewhat tolerant to morphine, since both drugs have similar biological actions. People can also develop behavioral tolerance, in which the user becomes accustomed to the effects of the drug and learns to compensate for them. An example would be acting less disinhibited after drinking alcohol.

Tolerance and Withdrawal [5:32]

Tolerance has many other downsides besides needing to take more of the drug. Tolerance typically is tied to withdrawal, which is a severe reaction to a sudden drop or cessation of drug use. Since the body has attempted to adapt to the drug being present, when the drug is absent, this balance is thrown off. Withdrawal involves many different symptoms depending on the type of drug. All are generally unpleasant, and some may even be life-threatening. To avoid the effects of withdrawal, people with drug dependence will feel compelled to take the drug as it will provide temporary relief from the symptoms (a form of negative reinforcement).

Another consequence is the possibility of a conditioned compensatory response, which is an automatic response that is conditioned (learned) by repeated drug use. The response attempts to compensate for the effects of the drug by recognizing familiar stimuli associated with the administration of the drug (such as the sight of a needle or a particular location). Thus, the body goes into an automatic mode in anticipation of the drug’s effects. This is another potential cause of drug overdose. Taking the drug in an unfamiliar context can produce greater effects even at a constant dose since the cues that trigger a compensatory response are not present.
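The role of context can be sketched as a simple subtraction. In the toy model below, the learned compensatory response cancels part of the drug effect, but only when the familiar cues are present. The function name `net_drug_effect` and all of the numbers are hypothetical and serve only to illustrate the logic.

```python
def net_drug_effect(dose_effect: float, compensation: float, cues_present: bool) -> float:
    """Toy model: the conditioned response only occurs when familiar cues are present."""
    return dose_effect - (compensation if cues_present else 0.0)


# Same dose, same person, two different settings (hypothetical units)
familiar = net_drug_effect(dose_effect=10.0, compensation=4.0, cues_present=True)
unfamiliar = net_drug_effect(dose_effect=10.0, compensation=4.0, cues_present=False)

print(familiar)    # compensation offsets part of the effect
print(unfamiliar)  # no compensation, so the full effect is felt
```

The dose is identical in both calls; only the presence of the cues differs, which is why an accustomed dose can be more dangerous in a new setting.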

7.3.2. Addiction

As also described in Chapter 1, addiction is characterized by compulsive drug use despite harmful consequences. If you recall from that chapter, we mentioned how continued drug use can interfere with self-control and the ability to stop taking the drug. This is because drug addiction hijacks the reward system in the brain. This is explained in these two short videos:

Addiction, Episode One: The Hijacker [3:17]

How Drug Addiction Hijacks the Brain [2:46]

(Dr. Nora Volkow is the director of the National Institute on Drug Abuse, NIDA)

As mentioned in the video, the reward system becomes impaired with chronic drug use. One of the main ways this happens is through the downregulation or desensitization of dopamine receptors in the reward system. As mentioned previously, to preserve homeostasis, the body may reduce the number or responsiveness of overactivated receptors. Chronic drug use floods the synapses in the reward system with dopamine, resulting in the downregulation of dopamine receptors. This causes natural rewards such as food, shelter, and companionship to become less rewarding. This results in the drug becoming the sole source of pleasure, which compels the user to continue to seek the drug and neglect other parts of the user’s life.

There are various models of addiction. Originally, addiction was thought of as a moral failing, and the person was faulted for a weakness of character. Although many people still hold this opinion today, it is increasingly being challenged by the disease model of addiction, which views addiction as a kind of disease that interferes with the normal functioning of the body. As we have learned above, there is evidence in support of this medical disorder model, since chronic drug use does, indeed, cause physiological and behavioral changes. According to the disease model, the goal is to manage and treat addiction like any other disease.

Another model is the drive theory, which states that the body has innate drives (such as hunger) that increase and intensify until they are temporarily met. In this model, repeated drug use generates a drive to seek the reinforcing effect of the drug. Under this theory, the motivation for drug-seeking behaviors is always to gain positive reinforcement.

A related model is derived from the opponent-process theory, which, in essence, states that the effects of the drug are opposed by the actions of the body. This is again due to homeostasis as the body tries to reach a relative equilibrium to maintain normal functioning. As tolerance for the drug increases, the opponent processes also increase, resulting in withdrawal. In this model, the motivation shifts back and forth between seeking the pleasurable effects of the drug (positive reinforcement) and avoiding withdrawal (negative reinforcement).

Finally, another modern model is the incentive salience model. In this model, the drug is designated as something that should be desired. In other words, it becomes salient (commands our attention) and we are incentivized to pursue it. This model differentiates between merely enjoying a drug (pleasure) and being motivated to obtain it (compulsion).

Each of these four models takes a different approach to describing the mechanisms of addiction. However, each views addiction as the consequence of natural physiological processes being altered, accelerated, or interfered with in some way. Note that the DSM-5 does not list addiction as a disorder by name; it uses the term substance use disorder instead.

7.3.3. Treating Substance Use Disorders

By now, it should be clear that addiction and substance use disorders are dangerous and destructive. Luckily, they can be treated, and normal functioning can be reclaimed. Treatment is a long process and involves multiple strategies, many of which may need to be combined to result in a successful treatment. The same treatment plan will not work for every drug and every person, but there are some common elements of each.

The first step is typically to remove the drug from the system, a process known as detoxification or simply “detox”. The next step is to manage withdrawal symptoms, as withdrawal is unpleasant and can increase drug-taking compulsions. This may involve drug therapies where weaker agonists or partial agonists are administered in a safe and controlled manner to replace the effects of the drug and to reduce withdrawal. Common examples include nicotine patches or lozenges that are designed to replace nicotine in a smoking cessation program. Methadone is used to reduce withdrawal signs in patients undergoing treatment for opioid use disorder.

Therapy is also used to help prevent relapse and re-establish order in life. Psychotherapies such as cognitive-behavioral therapy and multidimensional family therapy can help patients identify and avoid triggers for relapse, as well as address and cope with other issues in their lives that may contribute to drug use.

In some cases, individuals might undergo short in-patient treatment or stay in longer-term residential treatment facilities, such as therapeutic communities or halfway houses. These places help patients transition back to normal life by helping them develop life skills or seek employment. People recovering from substance use disorders may also be referred to various other programs and services, such as 12-step programs and support groups.

According to the National Survey of Substance Abuse Treatment Services, in 2014, around 22.5 million people needed treatment for an illicit drug or alcohol use problem, but only 18.5% of those people received any substance use treatment (SAMHSA, 2014). Although treatment is possible, there are disparities in access to resources and other barriers such as the stigma of addiction that prevent people from getting the help they need.
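To put the percentage in concrete terms, the short calculation below converts it into an approximate head count. It uses only the two figures quoted above; because the text gives "around 22.5 million", the results should be read as rough estimates rather than exact counts.

```python
people_needing_treatment = 22_500_000  # "around 22.5 million" from the text
treated_per_thousand = 185             # 18.5%, kept as an integer to avoid float rounding

treated = people_needing_treatment * treated_per_thousand // 1000
untreated = people_needing_treatment - treated

print(f"received treatment: about {treated:,}")
print(f"did not receive treatment: about {untreated:,}")
```

In other words, roughly 4 million people received treatment while more than 18 million who needed it did not.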

Chapter Summary and Review

In this chapter, we explored how drug use affects behavior through reward and reinforcement. We started our discussion by covering operant conditioning and the different factors that lead drugs to be reinforcing. We then moved on to the reward system, a collection of dopamine pathways that serves as the biological basis for reinforcement. Finally, we described how chronic drug use can lead to dependence and addiction, and covered methods for treating substance use disorders.

This is the final chapter for the first unit of this textbook. From this point on, the remainder of the text will be about different types of drugs and their specific mechanisms, effects, and treatments. We will kick off the next week by looking at CNS stimulants first. Make sure that you have a strong grasp of the concepts from these first seven chapters, as they will come up again and again in every chapter moving forward.

Chapter 7 Practice Questions

Answer the following questions:

John decides to take a shortcut through a park on his walk home but ends up getting mud all over his shoes. He makes a note to avoid the route in the future. What type of reinforcement or punishment occurred?
Frederic finds Gatorade incredibly tasty after a hard workout but enjoys it less when he isn’t thirsty. What principle is at play here?
Which structures do the mesolimbic and mesocortical pathways connect to?
What is the key synaptic connection at the center of the reward circuit?
Name several ways in which drugs can increase the neuronal release of dopamine in the reward system.
What is tolerance and how can it occur?
How can tolerance lead to a drug overdose?
How does withdrawal occur?
Name four models of addiction and briefly describe each.
Explain detoxification.
What are some ways that drug therapies can be used in treating substance use disorders?

License

Chapter 7: Reward and Reinforcement by Washington State University is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
