
The reversal of discrimination in a simple running habit

By R. N. Berry, W. S. Verplanck, and C. H. Graham
Brown University

Our experiment is concerned with the ‘reversal’ of a discrimination in an experimental situation similar to one used by Verplanck (6). The term ‘reversal’ refers only to the conditions of reinforcement and non-reinforcement, and is used for convenience in description. When a reversal takes place in the present experiment, the originally reinforced stimulus is made the unreinforced stimulus and the unreinforced stimulus is made the reinforced stimulus. In this descriptive sense, the first set of conditions under which the rat learns the problem constitutes the original learning. The first change in the conditions of reinforcement is called the first reversal, and the second change is called the second reversal. Actually, the second reversal reinstates the conditions of original learning.

Evidence from experiments by Hunter (3) and McCulloch and Pratt (4) indicates that an original discrimination interferes with the establishment of the reversed discrimination. On the basis of results obtained on the lever-pressing apparatus, Skinner (5) concludes that induction effects are present between the reinforced and non-reinforced stimuli in the first reversal but not in the second. Some results of Fritz (1) show no carry-over of effect from an original discrimination to its reversal.

Our learning situation presents two measures, latent period and running time. The measures are most easily understood in terms of the experimental situation which gives rise to them. A full description is given by Verplanck (6). The latent period is defined as that period between the time at which the door in front of the starting box (of the ‘runway apparatus’) is opened and the time at which the rat, with the exception of tail, crosses a mark three inches in front of the starting box. The second measure, running time, is the time from the moment the rat crosses the mark to the moment when the door in front of the food box is dropped behind the animal.
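The two measures amount to differences between three recorded instants. The following is a minimal sketch of that arithmetic only. The function name, the argument names, and the use of seconds from an arbitrary zero are assumptions made for illustration; in the experiment itself the times were taken with stop watches.

```python
def trial_measures(door_opened, mark_crossed, food_door_dropped):
    """Return (latent_period, running_time) in seconds for one trial.

    All three arguments are hypothetical timestamps in seconds:
      door_opened       - the starting-box door is opened
      mark_crossed      - the rat (except tail) crosses the mark
                          three inches in front of the starting box
      food_door_dropped - the food-box door is dropped behind the rat
    """
    latent_period = mark_crossed - door_opened
    running_time = food_door_dropped - mark_crossed
    return latent_period, running_time


# Illustrative values only, not data from the experiment.
latent, running = trial_measures(0.0, 2.5, 6.0)
print(latent, running)
```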

The latent period may be thought of as a measure of the ‘running out’ response to the stimulus of opening the door. Running time may be considered as a measure of the ‘running-down-the-alley’ response. On the basis of this analysis we may think of the two measures as representing the respective strengths of two S-R members within a chain of correlations [cf. Verplanck (6)].

Method

The apparatus is identical with the one used by Verplanck. One side of the runway is painted black; the opposite side is white; the sides are gray. By this arrangement the animal runs over a runway which presents two discriminative stimuli: ‘white’ or ‘black.’

Original discrimination for the rats utilized in this experiment was established by Verplanck. Verplanck employed 4 groups of rats in his experiment. Rats from two of these groups were employed in the present research. Both of these groups (Verplanck’s Groups I and IV) were originally trained to make the black-white discrimination. Each rat in Group I had run 6 cycles daily, each cycle consisting of (a) one reinforced trial on the white alley and white food box and (b) three trials on the black alley and black food box. Each rat in Group IV had received the same training except that he was run for only 4 cycles daily. All rats, when they came to the present experiment, had received a total of 96 trials on the actual discrimination problem. This number does not include the preliminary training and acquisition trials before discrimination was introduced.

On the day that the reversal of discrimination was introduced, the rats were first run for 3 retention cycles. Following these cycles and on the same day, the rats ran for 3 cycles on the new discrimination problem. On the 3 succeeding days, 6 cycles per day were run. On the fifth day, data were collected for 3 cycles. Altogether, over a period of five days, the first reversal involved a total of 96 trials.

The first 3 cycles of the second reversal were instituted in the second half of the fifth day’s experimental session. During the next 4 days, 96 trials were completed by each rat on this reversal. Six cycles (or 24 trials) were run each day by each rat, an exception being the last day, on which only 3 cycles were completed.
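The trial totals for the two reversals can be checked from the daily schedule. The sketch below is only a bookkeeping aid. It assumes 4 trials per cycle, as implied by "six cycles (or 24 trials)", and it reads the second reversal as 3 cycles on the fifth day followed by 6, 6, 6, and 3 cycles on the next 4 days. The variable and function names are illustrative.

```python
TRIALS_PER_CYCLE = 4  # implied by "six cycles (or 24 trials)"

# Cycles per day on the first reversal; the 3 retention cycles run
# earlier on the first day are not counted.
first_reversal_cycles = [3, 6, 6, 6, 3]

# Cycles per session on the second reversal: second half of the fifth
# day, then the next 4 days (assumed reading of the schedule).
second_reversal_cycles = [3, 6, 6, 6, 3]


def total_trials(cycles_per_day):
    """Total number of trials for a list of daily cycle counts."""
    return sum(cycles_per_day) * TRIALS_PER_CYCLE


print(total_trials(first_reversal_cycles))   # 96
print(total_trials(second_reversal_cycles))  # 96
```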

A complete trial consisted of the following sequence. The rat was placed in the starting box. As soon as the experimenter was ready, the door in front of the starting box was raised, so that the rat might run down the alley into the food box. If the rat entered the food box, the food-box door was dropped behind him. The animal was permitted to remain in the box and eat for one minute and 25 seconds. At the end of this interval, the food box was picked up and carried around to the other end of the apparatus. The rat was placed tail foremost in the starting box. Thirty-five seconds after the food box had been picked up, the door in front of the starting box was again opened, and another trial was begun. In the case of reinforced trials, the runway was turned over before the rat was taken back to the starting box, and food was placed in the food box. In the case of unreinforced trials, the runway was rattled and the can from which the food pellets were taken was moved about. This procedure was introduced so that the rat might receive no discriminative stimuli from these sources.

If, during the trials, the latent time was more than two minutes, the starting box door was closed and the next trial was started 30 seconds later [cf. Graham and Gagné (2) and Verplanck (6)]. In case the running time exceeded two minutes the rat was replaced in the starting box, and the starting box door raised 30 seconds later. The two-minute period was used because previous work with this experimental apparatus indicated that if the rat did not run or enter the food box within this period, he was not likely to do so at all. Times were taken by means of two stop watches.

The animals used in this experiment were male members of the inbred Wistar strain which constitutes the colony of the Psychological Laboratory. All rats had already been trained on the original discrimination by Verplanck. Each of the two of Verplanck’s groups which were used was further divided. A subgroup, made of rats from Groups I and IV, was run on the first retention trial 3 days after Verplanck’s last trial of original discrimination. The remaining rats in Groups I and IV were tested 21 days after the original discrimination. Table 1 indicates the number in each of the 4 groups thus created.

Table 1

Number of Rats in Each Group During the Experiment

                      Group I    Group IV

3-day retention          5           4
21-day retention         3           3*

*One rat in this group had a 29-day retention period instead of the 21-day interval.

All of the rats were maintained in good health throughout the period from the original learning to the retention trials by relaxing the feeding rhythm whenever it was considered necessary. However, all rats were returned to the twenty-minute-per-day feeding rhythm two days before the new problem was begun. During this regime the rat was fed a total of 20 minutes a day, part of which was spent in the apparatus.

Results

Although the rats were divided into the 4 groups of Table 1, the results, as finally presented, are computed for all the data rather than for particular groups. So far as can be seen, no significant group differences appear in the results of the first and second reversals. It appears that differences due to the original discrimination and the conditions of retention are too small to have any effect upon later training. For this reason it seems justifiable to average the data on all four groups, at least for purposes of arriving at a first approximate description of reversal. The records of original conditioning for the rats utilized in this experiment were taken from Verplanck’s data.

All measurements have been converted into logarithms in accordance with the procedure used by Graham and Gagné (2) and by Verplanck. Graham and Gagné have discussed the reason for using logarithms in this situation. The measure of central tendency employed is the median. (The logarithm of the median is, of course, the same as the median of the logarithms.) It is impossible to obtain a significant mean because the rat might fail to leave the starting box or to enter the food box. In these circumstances, times (running time and latent period) are ‘infinite,’ and by their indeterminate nature cannot be used in the computation of a mean. However, some sample means on trials showing no ‘infinite’ times are in very close agreement with the medians calculated for the same trials. Verplanck found this same agreement between the median and the mean. No measure of variability is reported. Deviations are of the same order as those obtained by Verplanck.
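The choice of the median can be illustrated with a short sketch. The times below are hypothetical and are not data from the experiment; an ‘infinite’ time is represented by `math.inf`. With an odd number of scores the median is a single observed value, so the logarithm of the median and the median of the logarithms coincide, whereas the mean is itself infinite as soon as one score is infinite. With an even number of scores the median is conventionally taken midway between the two middle values, and the two quantities then agree only approximately.

```python
import math
from statistics import median

# Hypothetical times in seconds for one trial across seven rats.
# math.inf marks a rat that never left the starting box.
times = [1.2, 1.8, 2.0, 3.1, 4.5, 60.0, math.inf]

log_times = [math.log10(t) for t in times]

median_of_logs = median(log_times)
log_of_median = math.log10(median(times))

# The logarithm is monotonic, so the middle score stays the middle score.
print(median_of_logs == log_of_median)

# A single 'infinite' time makes the arithmetic mean infinite.
mean_time = sum(times) / len(times)
print(math.isinf(mean_time))
```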

The results¹ of the experiment are presented in Figs. 1 and 2. Fig. 1A presents the medians of log latent period for all the unreinforced trials. Retention trials and the points of reversal are indicated in this figure and also in those which follow. Each of the values in this figure represents the median of all the unreinforced trials in any given cycle. Fig. 1B presents the medians of log latent period for the reinforced trials. Data on running time are given in Figs. 2A and 2B.

The log latent period curve for unreinforced trials (Fig. 1A) on the original discrimination differs somewhat from the one obtained by Verplanck. This is due to the lumping of the two groups in the present experiment, to the smaller number of animals, and, most importantly, to the fact that we have not attempted to plot the daily extinction curves. Rather, the over-all trend is indicated, and this is true not only for the first discrimination but also for the succeeding reversals. Verplanck’s data show significant trends which are concealed when Groups I and IV are combined because these two groups were not run the same number of trials per day. The curve for original discrimination shows a decreasing latency followed by a rising latency until the latter approaches a final level. Values for the retention trials after original learning are indicated in the last three points, through which the curve does not pass.

Fig. 1

A. The course of response (measured by latent period) to the unreinforced stimulus.
B. The course of response to the reinforced stimulus.

In both graphs the heavy vertical lines separate original discrimination from the first reversal and the first reversal from the second. The arrow indicates the trial on which discriminative stimuli were first introduced. The three points through which the smooth line is not drawn represent retention trials. The short vertical lines above the curves separate daily runs.

The curve of the first reversal, separated from original discrimination by a vertical heavy line in Fig. 1A, shows a low initial value, a rise to a maximum, and a decline to a final level. The low initial value might be expected since this response had been the reinforced one on the retention trials and during original discrimination. The first value on the second day of the reversal is noticeably above the values for the first day. It is possible that the non-reinforcement of the day before is having more effect than the reinforcement which preceded in retention and in original discrimination. The sharp drop which follows this single high value might be attributed to an increasing amount of induction between the reinforced and unreinforced reflexes². It may well be that induction between temporally corresponding members of two chains of reflexes is greater between early members (leaving starting box) than between the later members (running down the alley). The data obtained in this experiment would indicate that such is the case. The second reversal shows no rise in latency on the unreinforced trials above the last latencies in the first reversal. It is possible that the induction between the two reflexes is so great that it is maintained throughout the experiment.

Fig. 2

A. The course of response (measured by running time) to the unreinforced stimulus. Symbols as in Fig. 1.
B. The course of response to the reinforced stimulus. Symbols as in Fig. 1.
C. Curves of Figs. 2A and 2B placed on the same graph for purposes of comparison. Solid line: responses to unreinforced stimuli. Dashed line: responses to reinforced stimuli.

Fig. 1B gives data for reinforced trials. Since Verplanck’s log latent period curves for Groups I and IV were identical on the reinforced trials, a selection of rats from these two groups might be expected to yield the same curve as he obtained. This proves to be the case, and it is his curve which is fitted to the data. The first 8 trials run by Verplanck are not represented in any of the graphs. This is due to the fact that these trials constitute original acquisition; discrimination was not introduced until the ninth trial, the first reinforced trial being the twelfth. The three retention cycles (through which the curve is not drawn) do not deviate from the limits of the curve. The curve for the original reinforced trials exhibits a rapid drop and decelerates slowly to an asymptote. This final value represents the limiting response under these experimental conditions.

The first reversal (data of Fig. 1B to the right of the first heavy vertical line) for the reinforced trials shows an initial increase and then a rapid drop nearly to the level reached in the final stages of original discrimination. No attempt will be made to explain this initial rise, which appears to indicate a decrease in strength of the reinforced response. The second reversal in this graph, separated from the first by another heavy line, shows no deviation throughout its course from the final level of the first reversal.

The data for log running time on the unreinforced trials, shown in Fig. 2A, exhibit great variability. Verplanck has shown that this variability is only apparent, and that the data are fitted by smooth extinction curves drawn through the records for single days. However, we have not attempted to analyze original discrimination in this way. Indeed, such an analysis would be impossible, since the data are compounded of scores for two groups varying in the number of cycles per day. The curve, then, is drawn to represent a tendency rather than a rigid description. It is important to observe that the data for retention trials (the last three points before the curve of the first reversal) show no apparent deviation from the final values of the curve for original discrimination. This indicates that the change in experimenters does not influence the level of discrimination. The curve for the first reversal indicates a slow, but unstable, increase in the running time for the unreinforced trials. The second reversal shows the same general characteristics as the first. Perhaps the increase from the first trials to the last in the second reversal is slightly smaller than the corresponding increase during the first.

The original learning curve for log running time on reinforced trials in Fig. 2B fits the curve obtained by Verplanck, and it is his curve which is used. It shows a slow deceleration through the first four cycles and beyond this point attains a final level. The three retention cycles show slightly higher values than the last one of original learning, but the differences do not appear to indicate lack of retention to any significant degree.

The initial section of the first reversal in Fig. 2B exhibits a higher initial value than original discrimination. This is probably due to the fact that the animal is responding to the reinforced stimulus as if it were still unreinforced. The curve probably does not reach the low terminal value that is reached in original discrimination, but if we disregard the value for the first cycle, the shape of the curve seems to be essentially the same as that for original learning.

The second reversal has an initial value lower than the initial value of either original discrimination or first reversal. The terminal values of this curve are as low as those in original discrimination and perhaps lower than those in the first reversal.

Discussion

The general trends of the curves in Figs. 2A and 2B can be better understood if they are plotted on the same graph, as in Fig. 2C. The graph facilitates visual comparison. Little is to be gained by making a similar graph for the log latent period data (Figs. 1A and 1B). The latter values reach a constant level for both reinforced and unreinforced responses soon after the first reversal, and this level is maintained throughout the course of the experiment.

In considering Fig. 2C we may assume that the height of each curve, at any abscissa value, represents some inverse measure of the strength of the corresponding reflex: running down the alley to (a) the reinforced stimulus or (b) the unreinforced stimulus. The difference between corresponding ordinate values, then, is some measure of the difference in strength between the reinforced and unreinforced responses at a particular stage of conditioning. In these terms, we may say that the strengths of the reinforced and unreinforced responses show a smaller difference at the end of the first reversal than they do at the end of original discrimination. A similar diminished difference exists between the two reflex strengths at the end of the second reversal. Further, it may be observed that the difference in strengths of the two reflexes increases more slowly during the first and second reversals than it does during original discrimination. The first reversed discrimination, then, develops neither so rapidly nor so strongly as the original discrimination, and this is true of the second reversal, even though this reversal reinstates the conditions of the original discrimination. We have seen this effect in greatly augmented form in log latent period data. Soon after the first reversal the differences in strength measured by log latent period drop nearly to zero.
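The comparison described above reduces to subtracting one curve from the other cycle by cycle. The sketch below illustrates that subtraction only. The function name and the numerical values are hypothetical and do not reproduce the data of Fig. 2C.

```python
def strength_difference(log_unreinforced, log_reinforced):
    """Cycle-by-cycle difference between median log times.

    Each argument is a list of median log running times, one value per
    cycle. A larger difference is read as a larger difference in
    strength between the reinforced and unreinforced responses.
    """
    return [u - r for u, r in zip(log_unreinforced, log_reinforced)]


# Hypothetical median log running times for three cycles.
unreinforced = [1.0, 1.5, 2.0]
reinforced = [0.5, 0.5, 0.5]

print(strength_difference(unreinforced, reinforced))
```

A difference near zero at every cycle would correspond to the condition described for log latent period soon after the first reversal.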

The term induction implies that two related reflexes mutually interact so that each is modified positively or negatively in strength by the other, depending on the characteristics of the influencing reflex. In the present experiment inductive effects seem to be present (a) between the reinforced and unreinforced reflexes occurring early in the chain (latent period measures) and (b) between the reinforced and unreinforced reflexes occurring late in the chain (running time measures). Induction is particularly strong in the early member, where the latent period measure is shown to reach a relatively constant value, uninfluenced by reinforcement or non-reinforcement, early in the first reversal. Since the latency finally attained is at or near the final level of latency to the reinforced stimulus during original discrimination, this must mean that almost the full weight of induction is in the direction: Rpos. to Rneg.

The same effect is shown in the reflexes measured by running time, although to a lesser extent. Here the differential response to the unreinforced stimulus is clearly maintained, but in diminished degree, as contrasted with its value in the original discrimination. On these grounds it may be true that induction more seriously influences an early member of a chain than a later one. If so, the explanation probably lies in the relative strengths assumed by the member correlations at their respective temporal (or spatial) intervals from reinforcement. Thus, in the present experiment, the extinctive effect of non-reinforcement has too little influence to counteract the excitatory induced effect from preceding reinforcements of the positive responses as measured by latent period. This is not true in the running time measures of the later members. Here, the extinctive effect is close in time and space to the final unreinforced response; hence, the response is less influenced, in absolute degree, by the induction effect from the reinforced response.

To the extent that different measures allow of comparison, our results seem to be in agreement with those obtained by Hunter (3), McCulloch and Pratt (4), and Skinner (5).

Summary

The present experiment has been concerned with the courses of (a) original discrimination, (b) its reversal (first reversal), and (c) its reinstatement (second reversal). The method employed involves the acquisition of a discrimination by a white rat on the runway apparatus of Graham and Gagné (2). During original discrimination, the animal’s response is reinforced on the white alley and unreinforced on the black alley. Ninety-six trials on the original discrimination are followed by 12 retention trials, 3 or 21 days later. Conditions of reinforcement and non-reinforcement are then reversed (first reversal). The second reversal starts 96 trials after the first trial of the first reversal and lasts for 96 trials. The second reversal reinstates the conditions of original discrimination.

The course of discrimination has been studied by means of measures of two members of the chain of reflexes leading to entrance into the food box. These two measures are latent period (early member) and running time (late member).

1. Latent period for the reinforced response during original discrimination falls from a high value and eventually reaches a final steady level. Latent period for the unreinforced responses shows an over-all rise to a relatively unstable final level.

2. With the introduction of the first reversal, the latent period for the reinforced response exhibits a brief rise from the terminal level of original discrimination but soon declines to a low final value. The unreinforced response measured by latent period shows a significantly high value on the first full day of reversal, but thereafter it soon reaches the low level characteristic of the reinforced response.

3. During the second reversal, latent periods for both responses are maintained at a steady low level comparable to the final level reached by the latency of the reinforced response during original discrimination.

4. Running time (the measure of the later member of the chain) on reinforced trials, throughout all the discriminations, starts at a high value and thereafter drops to a steady final value. On unreinforced trials, running time increases during each discrimination. This increase is greatest during original discrimination.

5. The decreased difference between running times on reinforced and unreinforced trials, characteristic of the first and second reversals, is interpreted in terms of increased induction processes. This interpretation is also applied to the data on latent period.

(Manuscript received December 2, 1942)

References

1. Fritz, M. F. Long time training of white rats on antagonistic visual habits. Journal of Comparative Psychology, 1930, 11, 171-184.

2. Graham, C. H., & Gagné, R. M. The acquisition, extinction, and spontaneous recovery of a conditioned operant response. Journal of Experimental Psychology, 1940, 26, 251-280.

3. Hunter, W. S. Habit interference in the white rat and in human subjects. Journal of Comparative Psychology, 1922, 2, 29-59.

4. McCulloch, T. L., & Pratt, J. G. A study of the presolution period in weight discrimination by white rats. Journal of Comparative Psychology, 1934, 18, 271-290.

5. Skinner, B. F. The Behavior of Organisms. New York: D. Appleton Co., 1938, pp. ix + 457.

6. Verplanck, W. S. The development of discrimination in a simple locomotor habit. Journal of Experimental Psychology, 1942, 31, 441-464.

Footnotes

1. Numerical data are contained in a thesis filed in the John Hay Library of Brown University (R. N. Berry. The reversal of discrimination in a simple locomotor habit, 1941.)

2. In the light of the consistently short latent periods on both reinforced and unreinforced trials after the first reversal, it was thought possible that the animal might be responding to the opening of the door and not to the alley at all. If so, he might give a response great enough to cause his falling from the starting box in the absence of the alley. To test this possibility, the alley arrangement was reconstructed so that the alley might be removed from the apparatus. When the latent times had reached a low level in the course of the first and second reversals, the alley was removed at the beginning of a few unreinforced trials. At no time did an animal fall or nearly fall out of the starting box when the alley was absent.
