ABSTRACT

In control learning, unlike predictive learning, an individual can act upon and affect the environment. Skinner devised a schema for categorizing the consequences of one’s acts based upon two considerations: whether the consequence involved the occurrence (positive) or non-occurrence (negative) of an event; whether the consequence resulted in an increase (reinforcement) or decrease (punishment) of the preceding behavior. In an attempt to integrate Skinnerian and Pavlovian concepts, an adaptive learning schema was introduced. Learning was described as acquiring the ability to predict and control the presentation, removal, or prevention of appetitive and aversive events.
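Skinner's two considerations combine to yield exactly four categories of consequence. As a minimal illustrative sketch (the function name and boolean parameters are ours, not Skinner's terminology), the classification can be expressed as:

```python
def classify_consequence(event_occurs: bool, behavior_increases: bool) -> str:
    """Classify a consequence by Skinner's two considerations:
    - occurrence (positive) vs. non-occurrence (negative) of an event
    - increase (reinforcement) vs. decrease (punishment) of the preceding behavior
    """
    sign = "positive" if event_occurs else "negative"
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"

# A treat delivered after a response that then becomes more frequent:
print(classify_consequence(True, True))    # positive reinforcement
# A privilege removed after a response that then becomes less frequent:
print(classify_consequence(False, False))  # negative punishment
```

Crossing the two binary considerations this way produces the familiar four cells: positive and negative reinforcement, positive and negative punishment.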

Deprivation of appetitive substances has three major effects on behavior: overall activity level increases, presumably an adaptive response that increases the likelihood of discovering the appetitive substance; operant responding for an appetitive substance increases in vigor and speed; and behavior becomes more goal-directed as competing behaviors decrease.

As with predictive learning, application of the scientific method has enabled determination of how different variables influence control learning. In general, there is consistency in the effects of the variables on both types of learning. For example, temporal contiguity between events is necessary for individuals to detect a predictive relationship. In control learning, the Law of Temporal Contiguity applies to positive reinforcement (controlling the presentation of an appetitive event), negative reinforcement (removing or preventing an aversive event), positive punishment (controlling the occurrence of an aversive event), and negative punishment (controlling the removal or prevention of an appetitive event). Short-term consequences are more powerful than delayed consequences.

Reinforcers (appetitive stimuli) and punishers (aversive stimuli) can acquire their effectiveness through genetic (unconditioned or primary) or experiential (conditioned or secondary) means. A discriminative stimulus signals that a specific behavior will be reinforced. A warning stimulus signals that a specific behavior will be punished. Sequences of behavior extending over short or lengthy time periods may be understood as stimulus-response chains in which each response alters the environment, producing the discriminative stimulus for the next response and ultimately leading to an appetitive event.
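A stimulus-response chain can be pictured as an ordered sequence in which each response produces the discriminative stimulus (SD) for the next. The following sketch uses a hypothetical lever-pressing chain; the specific stimuli and responses are illustrative examples, not taken from the source.

```python
# Each pair is (discriminative stimulus, response it sets the occasion for).
chain = [
    ("cage light on", "approach lever"),
    ("lever in reach", "press lever"),
    ("food hopper click", "approach hopper"),
]

def run_chain(chain, terminal_event="food (appetitive event)"):
    """Walk the chain: each response alters the environment, yielding
    the SD for the next response; the chain ends in an appetitive event."""
    steps = [f"SD: {sd} -> R: {r}" for sd, r in chain]
    steps.append(terminal_event)
    return steps

for step in run_chain(chain):
    print(step)
```

The key structural point the sketch captures is that the links are ordered: each response both earns conditioned reinforcement and produces the discriminative stimulus controlling the next link.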

It is efficient to use the procedures of prompting, fading, and shaping to establish arbitrary behaviors in the laboratory and natural environment. Proceeding from physical, to gestural, to verbal prompts is a powerful sequence for having behavior come under the control of language.

The fact that intermittently reinforced responses are much more resistant to extinction than continuously reinforced responses has been named the partial reinforcement extinction effect (PREE). Once a behavior has been learned, schedules of consequences will determine the frequency and pattern of its occurrence over extended time periods. Skinner devised a schema to organize intermittent behavior-consequence contingencies based on two considerations. Did the contingency require a certain number of responses (ratio schedule) or a single response during a window of opportunity (interval schedule)? Was the number of responses or length of time between opportunities constant (fixed) or inconsistent (variable)? Each of the four basic schedules of positive reinforcement (FR, VR, FI, VI) results in a characteristic response pattern. Fixed contingencies result in post-reinforcement pauses whereas variable contingencies produce consistent response rates. Ratio contingencies result in higher response rates than interval contingencies. Procedures holding reinforcement frequency constant reveal that the ability to influence how soon reinforcement occurs is responsible for the higher response rates under ratio schedules.
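Like the consequence schema, the schedule schema crosses two binary considerations to yield four cells. A minimal sketch, with illustrative names of our own choosing, pairs each schedule with the characteristic pattern the text describes:

```python
def classify_schedule(requires_count: bool, constant: bool) -> str:
    """Skinner's 2x2 schema for intermittent schedules:
    - ratio (requires a number of responses) vs. interval (a single
      response during a window of opportunity)
    - fixed (constant requirement) vs. variable (inconsistent requirement)
    """
    timing = "fixed" if constant else "variable"
    kind = "ratio" if requires_count else "interval"
    return f"{timing} {kind}"

# Characteristic response patterns summarized in the text:
PATTERNS = {
    "fixed ratio":       "high rate with post-reinforcement pauses",
    "variable ratio":    "high, consistent rate",
    "fixed interval":    "lower rate with post-reinforcement pauses",
    "variable interval": "lower, consistent rate",
}

schedule = classify_schedule(requires_count=True, constant=True)
print(schedule, "->", PATTERNS[schedule])  # fixed ratio -> high rate with post-reinforcement pauses
```

The mapping encodes the two generalizations from the text: fixed contingencies produce post-reinforcement pauses while variable ones produce steady responding, and ratio contingencies sustain higher rates than interval contingencies.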