People often misunderstand positive reinforcement because those of us who espouse and employ the technique can get sloppy with our definitions. As we’ve discussed before, for example, the “positive” in positive reinforcement need not mean ‘happy,’ ‘kind,’ or ‘joyful.’ It simply means “added in,” as in the reinforcer added in to make a behavior more likely to occur in the future. Sometimes, we get loose with positive reinforcement talk because we want to avoid sounding too technical or jargony, as some would say I just did in the previous sentence. If we want the world to operate on positive reinforcement principles, wouldn’t it be helpful to have a clear, concise, plain-language way to explain what the heck we’re talking about? I say yes, and that’s why I put together a few visual aids (a matrix, a spectrum, and a Buddhist-flavored frame) to help keep things clear. Feel free to use them yourself and share them with others if you think these tools might help.
The Operant Conditioning Matrix
Positive reinforcement springs from the study of operant conditioning. Based on the often polarizing work of Harvard professor B.F. Skinner, the field of behavioral psychology suggests that we operate on our environments (in other words, we act and interact) and that our choices then have consequences, both for ourselves and for others. Any results that select, strengthen, or maintain the original behavior serve as reinforcers. Any outcomes that cause us to avoid, weaken, or eliminate that specific behavior serve as punishers. Neutral consequences, those that have no effect on the likelihood of the original behavior appearing again, are neither reinforcers nor punishers. Operant conditioning, then, uses a variety of tools, including shaping, schedules, and reinforcements, to encourage or discourage certain behaviors.
Often, you’ll see the options for operant conditioning explained with a 2 x 2 matrix similar to the one pictured below. The top ‘shelf’ shows the two kinds of reinforcement that increase the frequency of a desired behavior; the bottom ‘shelf’ names the two types of punishment that decrease the frequency of an undesired behavior. The left column includes the two “positive” conditioners, those where the consequence adds something to the learner’s environment, while the column on the right contains the two “negative” conditioners, those where the consequence takes something away from the learner’s environment. Feel free to take a moment to really “get” this. I still find staring at the four blocks helpful.
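For readers who think more easily in code than in grids, the same four blocks can be written as a tiny lookup. This is only an illustrative sketch, not part of any formal model: the names `Change`, `Effect`, and `quadrant` are made up for this example, and the abbreviations R+ and R- are assumed by analogy with the P+ and P- labels used in this post.

```python
from enum import Enum


class Change(Enum):
    """What the consequence does to the learner's environment (hypothetical names)."""
    ADDED = "+"      # something is added in: the "positive" column
    REMOVED = "-"    # something is taken away: the "negative" column


class Effect(Enum):
    """What happens to the frequency of the behavior afterward."""
    INCREASES = "Reinforcement"  # top shelf of the matrix
    DECREASES = "Punishment"     # bottom shelf of the matrix


def quadrant(change: Change, effect: Effect) -> str:
    """Name the matrix quadrant for a given column (change) and shelf (effect)."""
    sign = "Positive" if change is Change.ADDED else "Negative"
    # First letter of the shelf name plus the column sign, e.g. "P+" or "R-".
    abbrev = effect.value[0] + change.value
    return f"{sign} {effect.value} ({abbrev})"


# Two consequences a parent might use with the teenager out past curfew:
print(quadrant(Change.ADDED, Effect.DECREASES))    # extra chores are added in
print(quadrant(Change.REMOVED, Effect.DECREASES))  # phone privileges are taken away
```

Notice that “positive” and “negative” depend only on the column (added or removed), never on whether the consequence feels pleasant.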
The success of any of these conditioning techniques relies directly on the skillfulness of the trainer employing them. The feedback has to be timed well: reinforcement works best when delivered exactly as the behavior is happening. The consequence needs clarity; it shouldn’t get linked with other conflicting or conflating messages. Frequency matters as well: feedback given too often can become meaningless, while feedback given too rarely can create apathy or confusion. In all cases, the consequence must have relevance for the learner. If he or she doesn’t care, the reinforcement or the punishment won’t hold.
Let’s run through each of the four quadrants with examples. I’ll mention responses a ‘trainer’ (i.e., a coach, parent, or significant other) could use in three different troublesome situations: an athlete making a repeated error in practice, like an errant softball throw; a teenager staying out past curfew; and a partner continuing to leave a kitchen messy. None of my examples will be perfect, and I imagine you’ll have responses of your own spring to mind. Noting your own suggestions will help the ideas grab hold.
Positive Punishment (P+) adds something unpleasant to the learner’s environment. Most of us have an intuitive sense of what we mean by the word “punishment.” Harsh words or a slap to the face would qualify, but the pain need not be so obvious. A disapproving look, a roll of the eyes, or even the threat of future punishment can serve the same function: they decrease the likelihood of an undesired behavior. Major or minor, they inflict some kind of pain.
In the case of the softball player making the errant throw, the coach might throw her clipboard to the ground in disgust or add extra wind sprints. Both would qualify as positive punishment, though the second would likely have little effect because of its significant time lag. Screaming at the kid would qualify as well. For the teenager out too late carousing, a parent could add extra chores or otherwise make life miserable. A partner finding dirty dishes in the sink could throw the dishes against the wall, put them on the offending partner’s desk, curse and yell, and so on.
Negative Punishment (P-) takes away or reduces something the learner enjoys or wants. It’s a kind of penalty, as in football where an offending team has to give up yardage or in hockey where a player who gets his stick too high in the air has to sit off-ice for a few minutes. Some imagine that negative punishment hurts less than positive, but it can pack just as much of a wallop. Yes, physical abuse, a positive punishment, is awful. But an active withdrawal of love or an abandonment, both negative punishments, can be cruel as well. Negative punishment doesn’t have to be harsh, but it can be.
Applying this method to our ongoing examples, a coach might pull the softball player making the error from the lineup, either right away or for the next game. The teenager continually coming home late might get grounded (taking away freedom of movement), lose cell phone privileges (reducing connection with friends), or have to box up the video game (loss of entertainment). The cleaner housemate might stop smiling around the ‘offender’ as long as the kitchen stays dirty or perhaps might cease cooking dinner for the other.
Note that a consequence can have elements of both positive and negative punishment, in this ‘adding in’ and ‘taking away’ sense. A speeding ticket adds points to an offender’s driving record and takes away money from his bank account. A prison sentence takes away freedom and access to loved ones (negative punishment) and adds unpleasant living conditions and a social stigma (positive punishment). In such cases, the two types operate in tandem.
Note also that much of our society turns to punishment, both positive and negative, almost as a default. Partly it’s because we’ve been told that it works—even where it doesn’t—and we experienced it as kids (so it must work, right?). Maybe just as much, it’s because it seems to satisfy the trainer on some emotional level. Often, we think of and enact punishment as retribution, like it’s an agent of moral righteousness. The miscreant got what they deserved, we think. Now they’ll know. Of course, none of that motivation or emotional involvement as a trainer measures whether the punishment gets the learner any closer to the behavior we do want.
[Part 2 of this post can be found here.]
 One could ask whether these situations actually qualify as ‘problems’ or if they’re merely inconveniences for an outsider wanting to change someone else. That’s a fair question. When applying these techniques, it most certainly makes sense to check if we’re trying to encourage a behavior because it gives us a feeling of control and makes our lives more convenient or because it’s truly in the learner’s best interest—from their point of view. Our choices will often contain both motivations, of course, but it’s best to keep them distinct and to at least admit our own less-than-savory reasons for doing what we do. Hopefully, that keeps our eye on the ball of what matters most: the learner’s development.
 This is one of the toughest dilemmas I face as a coach, particularly in softball, where we get limited flexibility with substitutions. If a player continues to make mistakes defensively, I may need to get her out of the game for the team’s sake, but I don’t want it to come across as a punishment. This is where all the relational and growth-mindset groundwork we do before such charged moments comes into play.