A joint-control analysis of generalized
California State University, Los Angeles
An Invited Address given at the convention of the Association for Behavior Analysis
It can be fairly argued that one of the most significant problems facing the development of a complete analysis of behavior lies in the problem of explaining the nature of generalized abstract responding. Now what I mean by the term generalized responding is the usual meaning: responding appropriately to novel stimuli based on behavior acquired with training stimuli. And by the term abstract responding I simply mean responding to relations between stimuli, such as relative size (larger or smaller), rather than responding to concrete features such as a particular color, shape or size. Generalized identity matching would be an example of the kind of behavior I’m talking about, where subjects trained in the identity-matching relation with one set of stimuli generalize that performance to a new untrained set of stimuli.
What I would like to do here is illustrate just why it is that behavior analysis does not seem conceptually equipped to explain generalized abstract performances, and then show how the notion of joint control accounts for a really amazing variety of generalized abstract performances -- all the way up to goal-oriented search.
I’d like to begin by first considering a form of responding we do understand: namely, selecting stimuli in response to other stimuli, as in the conditional discrimination shown in Figure 1. In this task an array of comparison objects is presented -- here a triangle and a square -- and the subject must select one in response to the sample -- here a square. On the usual account of this behavior, henceforth called unmediated selection, it would be said that due to a history of reinforcement, the sample serves as a conditional stimulus, making one of the comparison stimuli function as the SD for a selection response.
Thus, here, the sample square functions to make the comparison square act as an SD for a pointing response, shown here by the hand. Notice that there is no mediating process of recognition here; instead, under unmediated selection, one stimulus is selected because it causes another stimulus to evoke the pointing response at a higher rate.
Where one stimulus is a word, the same account applies unchanged to two different behaviors. We see these in Panel A and Panel B of this next picture, Figure 2. Thus, we may explain both the selection of a square object, in response to the spoken word “square” as shown in the upper panel, and also the selection of the printed word square in response to the square object, as shown in the lower panel, in exactly the same terms: In both cases the sample, as a conditional stimulus, makes one of the comparisons function as an SD for the pointing response, thereby producing an unmediated selection of that comparison.
And so, in Panel A, the square object is selected in response to the spoken word square, and in Panel B, the printed word square is selected in response to the square object. Both are due to a heightened rate of emitting the selection response to these comparisons in the presence of their respective samples -- the spoken word square in the first case, and the square object in the second case.
But the problem with this unmediated selection account is that it can’t describe the emergence of generalized abstract responding. This is because the same mechanism operates for both abstract and concrete responding. Thus, in Figure 3, just as the sample square may cause the comparison square to function as an SD for the pointing response, as shown in Panel A, so may the sample square cause the color blue to function as an SD for the pointing response, as shown in Panel B. Unmediated selection just does not appreciate the fact that an abstract relation, identity, exists in the first case, but not in the second case.
Likewise, in describing relations between words and objects, we find similar problems. Thus, if the only association between a word and an object is based on the unmediated selection process I have just described, then in word-object matching the subject can only respond with the actual relation trained between the stimuli. And so, as we see in Figure 4, after a subject is trained to select the appropriate shapes in response to the training phrases shown at the top, "square over circle" and "circle under square", unmediated selection cannot account for generalization to novel combinations of these words so that, for example, with no additional training subjects select one and the same stimulus in response to the phrases "circle under square" and "square over circle", and select a second stimulus in response to the phrases "circle over square" and "square under circle".
In summary, the point I wish to make here is that the ordinary notion of unmediated selection, as it is generally applied in behavior analysis, appears to be intrinsically unable to account for accurate responding based on generalized relations between objects and between words and objects. An alternate account seems needed. In this talk, I hope to outline such an alternate based on the notion of joint control. As I have discussed in recent articles, joint control itself involves nothing more than the usual kind of operant stimulus control, except that under joint control two verbal stimuli jointly exert stimulus control over a single, common verbal topography. Now while this may not sound like much, this form of stimulus control seems to have exceedingly powerful and important effects.
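The limitation described above can be made concrete with a minimal sketch. It assumes, purely for illustration, that a history of reinforcement can be modeled as a table of trained sample-to-comparison pairings; the names `identity_history`, `arbitrary_history` and `unmediated_select` are hypothetical and do not come from the talk.

```python
# Hypothetical model: unmediated selection as a table of trained pairings.
# Each entry says "in the presence of this sample, that comparison is the SD".
identity_history = {"square": "square"}   # Panel A: looks like identity
arbitrary_history = {"square": "blue"}    # Panel B: an arbitrary pairing


def unmediated_select(sample, comparisons, history):
    """Return the comparison that the sample makes function as an SD.

    Returns None when the sample was never trained, or when the trained
    comparison is not present in the array.
    """
    target = history.get(sample)
    return target if target in comparisons else None


# Both trained performances are produced by exactly the same lookup.
print(unmediated_select("square", ["triangle", "square"], identity_history))
print(unmediated_select("square", ["blue", "red"], arbitrary_history))
# A novel sample evokes nothing, even though an identical comparison is present.
print(unmediated_select("circle", ["circle", "star"], identity_history))
```

Nothing in the table distinguishes the identity pairing from the arbitrary one, and a novel sample has no entry at all, which is the sense in which this account has no place for a generalized relation.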
As a demonstration of joint control consider the task, shown in Figure 5, in which you must locate a particular 6-digit number in an array in response to the sample 939173. Take a moment to find 939173 in this array.
Now finding the correct number required joint control. This next figure (6-ALT) illustrates this. First off, after you were given the sample 939173, you began to scan the array of comparisons looking for that number. As you did so, you rehearsed to yourselves the sample number 939173. First you rehearsed what I said, and once you said it to yourself, then you repeated your own rehearsals. In the language of Skinner’s verbal operants, these would be described as echoic, and then self-echoic, rehearsals of the sample.
In the figure (6-ALT) the echoic is illustrated along the top of the figure. For the sake of clarity in this drawing, the self-echoic has been omitted. But, the idea here is simply that after you saw the printed sample 939173 you rehearsed the sample as a repetition of what you just heard – first as an echoic, and also subsequently, in response to the stimulus cues generated by your own rehearsals as a self-echoic.
Second, as you perused the comparisons, you also attempted to tact each 6-digit number, and you continued doing this until, as illustrated here, you encountered a particular 6-digit number: one that you could emit both as a tact, and simultaneously as the currently rehearsed self-echoic. That is, at some point you could say 939173 both as one of your self-echoic rehearsals of the spoken number, and jointly as a tact of the printed number itself. And so, at this point you were repeating the number 939173 under joint tact/self-echoic stimulus control, an event which you report by saying you had found the specified number.
This event, this onset of joint control, is a unique source of stimulus control because it only happens when the specified comparison is encountered, in this case the printed 939173. That is, you knew that you had located 939173 when you could say it BOTH as a self-echoic rehearsal and jointly as a tact. Indeed, the onset of this joint self-echoic/tact control is in fact the only possible way you can identify which printed number was the one I had said.
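The number-finding task just described can be sketched in a few lines. This is only an illustration under stated assumptions: the tact is modeled as simply reading the printed digits aloud, the array contents other than 939173 are invented, and the function names are hypothetical.

```python
def tact(printed_number):
    """Hypothetical tact: the spoken topography a printed number evokes."""
    return str(printed_number)


def find_by_joint_control(spoken_sample, array):
    """Scan the array while rehearsing the sample; stop at joint control.

    Returns the position of the first comparison whose tact has the same
    topography as the rehearsed sample, or None if there is no such item.
    """
    rehearsed = spoken_sample            # echoic, then self-echoic rehearsal
    for position, printed in enumerate(array):
        if tact(printed) == rehearsed:   # one topography, two sources of control
            return position              # onset of joint control
    return None


# Invented array of look-alike comparisons; only one matches the sample.
array = ["931973", "939713", "939173", "993173"]
print(find_by_joint_control("939173", array))  # position of the match
```

The comparison inside the loop makes no reference to digits as such; any stimulus with a trained tact could be substituted, which is the generic character of the joint-control event.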
Now another thing; it is also important to note here that even though this is a matching to sample task, you were all able to recognize the specified number without actually emitting a pointing response to the number. This is very different from the conditions we described in unmediated selection, where the sole response is the selection response as when a child selects a stimulus by actually pointing to it or a pigeon selects a stimulus by actually pecking it.
In contrast to that, in the current case, you simply relied upon the onset of joint control to decide when you had located the correct number, that is, when you recognized the number. Indeed, I would suggest that the onset of joint tact/self-echoic control is itself the event you would identify as the event of actually recognizing the specified number. You might then subsequently report this recognition event by emitting the topography-based autoclitic “I found the number you said.”
But, had I requested that you point to the specified number, you could have just as easily done that instead, and pointed to the digits 939173. But such a response would not be like a pigeon’s peck evoked by unmediated selection. Rather, as I have discussed elsewhere, such a pointing response may be described as a selection-based autoclitic response. That is, a response telling others which number in the array entered into joint stimulus control with the self-echoic that you were already rehearsing. For the sake of clarity, here is another example of joint control.
In this next figure (7) the sample is the spoken phrase: Black dot inside a smaller pentagon. Now note that this is a very complex sample mentioning colors, shapes, relative size (that is, larger or smaller), and spatial relations (that is, inside or outside). But despite this, again, the process is the same: As you peruse the 4 comparisons, you rehearse, as a self-echoic, the sample phrase “Black dot inside a smaller pentagon”. And when a comparison is encountered that jointly evokes this same topography, you again report this source of joint control -- here by an autoclitic pointing response.
Together, this example and the prior number-finding task illustrate that the joint-control event is generic, and thus independent of particular stimulus properties. Thus, in these two tasks it didn’t matter whether it was the names of numbers or the names of colors, shapes, or relative sizes. In all cases, whenever the self-echoic comes under joint control, this event, this onset of joint control, is itself a common generic event. And it is this generic event that evokes a report of recognition and/or a pointing response. As we shall see next, the generic nature of joint control makes it surprisingly ubiquitous in our behavior, because it produces an exceedingly complex variety of behavioral phenomena typically attributed to cognitive notions of abstraction and conceptualization.
We shall begin this examination with a consideration of how words are tied to the objects they specify. That is, exactly how do words specify objects and events? Let's look at one of the first experiments I did on this topic (Figure 8). In this experiment, prior to learning to match the stimuli, we trained retarded children to make one of the handsigns shown in Panel A to each of the corresponding shapes. As a result of this training the subjects were able to make each of the handsigns when shown the corresponding shapes.
In the next phase of the experiment, the children were taught the matching-to-sample performance shown in Panel B. Here, the children were first trained to make the correct handsign to each shape when it appeared as a sample. They then learned to rehearse the handsign over a roughly 10 second delay interval, and to continue to rehearse the handsign until they matched it to the correct, that is, to the identical comparison. Subjects thus successfully learned to use the handsigns to mediate identity-matching-to-sample behavior with the shapes shown in Panel A.
Subsequently, when generalized identity matching was tested with the novel stimuli shown in Panel C, there was no generalized identity matching. But after just training the handsigns for these novel stimuli, subjects immediately showed generalized identity matching -- that is, a generalized abstract performance with novel stimuli. These data thus illustrate that somehow, by incorporating joint stimulus control into a matching performance, one incorporates into that performance the actual relation between the stimuli -- in this case identity.
Now the way this may happen is not hard to see. As we see in this next figure (Figure 9), the handsign to the 2-dot sample is a tact, and maintaining the handsign over the delay interval is a self-duplic. And selection of the comparison occurs when these two sources of control operate jointly over the rehearsed topography. Thus we see that the 2-dot handsign is jointly evoked both as a self-duplic rehearsal, and simultaneously as a tact, but only with one particular comparison -- the two dots.
Likewise, in the lower part of the figure, in Panel B, after appropriate training, the line, as a sample, evokes the handsign illustrated as a tact, and if this handsign is maintained over the interval, it is then jointly evoked by the line comparison -- thereby necessarily producing an identity match. The way one stimulus specifies another is thus revealed. The sample stimulus specifies a particular comparison stimulus because the sample stimulus evokes a particular handsign, and that handsign can only enter into joint control with one particular comparison.
Thus the two-dot sample evokes the particular handsign shown here, and that handsign can occur under joint control only with the two-dot comparison. Likewise in the bottom figure, with the novel stimuli, the sample line specifies the line comparison because the sample line evokes the illustrated handsign, and that handsign only enters into joint control with the line comparison but no other. Thus we see how one stimulus may specify the selection of an identical other stimulus -- thereby emulating generalized abstraction of object-to-object identity matching.
Now with virtually no modification, this account provides a simple explanation of how names specify objects (Figure 10). Thus, suppose the shape to be selected was named by the experimenter providing a handsign. Thus, as we see in this figure, by providing a handsign, the experimenter is actually telling the subject which shape to select. Here he is telling the subject to select the two dots.
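The handsign experiment can be sketched as follows. This is an illustrative model only: the shape names other than "two dots" and "line", the sign labels, and the function name are all assumptions, and the delay interval is not modeled.

```python
def match_to_sample(sample, comparisons, handsigns):
    """Handsign-mediated identity matching (hypothetical sketch).

    `handsigns` maps each shape to the handsign trained as its tact.
    The sign evoked by the sample is rehearsed (self-duplic) and the first
    comparison that evokes the same sign as a tact is selected.
    """
    rehearsed_sign = handsigns.get(sample)       # tact of the sample
    if rehearsed_sign is None:
        return None                              # no sign trained: nothing to rehearse
    for comparison in comparisons:
        if handsigns.get(comparison) == rehearsed_sign:  # joint control
            return comparison
    return None


# Trained set (labels are invented for illustration).
handsigns = {"two dots": "sign A", "cross": "sign B"}
novel_comparisons = ["arc", "line"]

# Before the novel shapes have handsigns, there is no generalized matching.
before = match_to_sample("line", novel_comparisons, handsigns)

# Training only the handsigns (tacts) for the novel shapes is sufficient.
handsigns.update({"line": "sign C", "arc": "sign D"})
after = match_to_sample("line", novel_comparisons, handsigns)

print(before, after)
```

Note that the matching function itself is never retrained; only the tact repertoire grows, which mirrors the pattern of results reported above.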
And of course the jump to vocal language is pretty obvious. Thus as we see in this next figure (Figure 11), the subject is still selecting the comparison that enters into joint control with the rehearsed phrase “two dots”. The only difference here is that the speaker’s response here is vocal rather than a handsign. Thus we see how a word or phrase, heard by the listener, specifies or refers to a particular object. That is, the heard word evokes the topography to be rehearsed by the listener, and that topography, in turn, specifies one particular comparison stimulus: namely, the stimulus that itself evokes the same topography.
Given this, I propose we may now behavioralize that aspect of word meaning usually called reference. We might say that for the listener, the referent of a particular word is that object, event, or relation that evokes a tact that in turn enters into joint control with echoic rehearsal of that word. Thus, here, the actual two dots are the referent for the phrase "two dots" because the actual two dots evoke a tact that enters into joint control with a self-echoic repetition of the phrase "two dots".
Of course this is not restricted to simple characteristics. Thus, as we see in Figure 12, the complex spoken phrase “Black dot in the smaller pentagon” may still be said to refer to one particular stimulus, because that spoken phrase can only come under joint control with that one particular stimulus, and no other. And so, in the present case, there is only one stimulus object here that evokes the spoken response “Black dot in the smaller pentagon” as a tact, and that object is then the referent of the phrase “Black dot in the smaller pentagon.”
Given these examples, I would propose that this specification function also serves the role of what is ordinarily called DESCRIPTION. I would propose that the phrase Two black dots in a larger pentagon functions as a description of this one comparison stimulus, but no other, because these words enter into joint control with a tact of that one stimulus but no other.
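One way to picture reference and description in these terms is the sketch below. It assumes, hypothetically, that a stimulus can be reduced to three features and that its tact is a phrase composed from them; the `Stimulus` class, the `tact` function and the example array are inventions for illustration, not the stimuli from the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    inner: str        # e.g. "black dot"
    outer: str        # e.g. "pentagon"
    outer_size: str   # "smaller" or "larger"


def tact(stimulus):
    """Hypothetical tact: the phrase a stimulus evokes from the listener."""
    return f"{stimulus.inner} in the {stimulus.outer_size} {stimulus.outer}"


def referents(phrase, stimuli):
    """Stimuli whose tact enters into joint control with the rehearsed phrase."""
    return [s for s in stimuli if tact(s) == phrase]


stimuli = [
    Stimulus("black dot", "pentagon", "smaller"),
    Stimulus("black dot", "pentagon", "larger"),
    Stimulus("white dot", "pentagon", "smaller"),
]
print(referents("black dot in the smaller pentagon", stimuli))
```

Because `referents` depends only on which stimuli evoke the phrase, the same description applies unchanged to any new array containing a stimulus that evokes it.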
It is important to notice, however, that mediation by joint control also allows for generalized abstract responding; here because the phrase Black dot in the smaller pentagon may be applied to any stimulus that evokes that phrase. Thus the phrase does not specify just one particular stimulus, but rather provides for generalized abstraction by being applicable to an endless number of different stimuli. That is, to any stimulus that evokes the phrase black dot in the smaller pentagon. Furthermore, attendant to the role joint control plays in specification and description, we can also see a role for joint control in what is ordinarily called recognition from a description. To see this, consider your behavior earlier when asked to find the 6-digit number. In that case I know you all could have reported that you did indeed recognize the specified number even if none of you actually emitted a selection response such as pointing to the screen. But without an overt pointing response what is meant behaviorally by the term “recognition”?
I think the notion of joint control provides a simple answer (Figure 13). Here you are given the phrase "circle in square". Can you find the circle in a square? You know you have recognized something from its description when your rehearsal of that description enters into joint control with a tact of the object recognized. Thus at the moment you are able to emit the entire phrase “circle in square” both as a self-echoic, and also as a tact of a particular stimulus, you would say that you had recognized the stimulus as the one described by saying something like “There it is” or “I found it.”
Responses such as these, of course, would be autoclitics since ultimately they are evoked by the onset of joint control. The cognitive event of recognition may then be identified as the behavioral event of the onset of joint control, and the verbal report of this recognition may be identified behaviorally as the autoclitic evoked by the onset of this joint control. We should note, however, that this only happens the first time you select a stimulus in response to the description. If I subsequently ask you to point to that stimulus again, you no longer need repeat the description and respond under joint control. Indeed, given the immediate history of reinforcement for pointing to that stimulus in the presence of the description, I would think the behavior is now just unmediated responding.
Finally, one other aspect of word meaning that we need to cover is the notion of comprehension. What does it mean to comprehend a description? Here is an example. If I give you the description “point oh bar point”, you would say that you don’t really comprehend it. And if I show you these figures (Figure 14) your initial reaction is not much changed. But as you peruse these comparisons, you notice that the bottom comparison can be tacted in such a way as to fit the description. That is, the description point oh bar point can function jointly as a self-echoic repetition of the spoken sample, and simultaneously as a tact of that one comparison. You can say point oh bar point both as a self-echoic of what I said, and also, however weakly, as a tact of the bottom row of the figure. I would thus suggest that we say that we comprehend a new description only when the self-echoic enters into joint control, but not before. All this is to say of course that the onset of joint control is the moment in which the description is comprehended and the event described is recognized. The cognitive events of description, comprehension, and recognition may thus be replaced with a simple behavioral account.
I would like now to move on to something different. Up at the top here, in Figure 15, we see five figures and their names: Art, Ben, Carl, Don and Ella. In the lower panel we see the usual joint control illustration -- here illustrating how, given the name Ben, the subject rehearses that name as a self-echoic until that figure is encountered so that the name Ben is emitted under joint control. But this need not be the only case. There is also the case in which the stimulus to be selected in response to the sample name bears some relation to the sample other than identity or reference.
One example of this kind of task is shown in Figure 16, Panel A, where the figure to be selected is not the one named, but rather the figure with the next name in alphabetical order. Thus, given Art as the sample name, the correct selection would be the figure of Ben; and likewise with Ben as the sample name, the correct selection would be Carl, and so forth. Although performances of this type seem to involve a high level of abstraction (namely the abstraction next forward), nevertheless they may, in fact, be accounted for rather simply in terms of joint control.
Here is how: Let us assume the subject has been trained in the ordered list of names in Panel A so the list functions as a five-step intraverbal, that is, Art, Ben, Carl, Don and Ella. Next, as we see in Panel B, the sample Carl, given as a spoken word, evokes a repetition of that sample name as an echoic -- thereby beginning self-echoic rehearsal. Then, at some point during this self-echoic rehearsal, (it may be very quick) the subject emits the next name in the intraverbal, Don, and so begins to rehearse the name Don as the new self-echoic.
This intraverbal response from Carl to Don not only has the effect of transforming the rehearsed self-echoic from Carl to Don, but it also has the effect of allowing the tact evoked by the figure Don to enter into joint control with the rehearsed name Don, thereby causing the selection of the figure Don in response to the sample name Carl. Much of this of course we have already seen. But the changing of the rehearsed self-echoic and the resultant change in selection is something new. I call this changing of the rehearsed response a transformational response. I call it transformational because it transforms the topography rehearsed; in this case the transformational response consists of changing the rehearsed topography from "Carl" to "Don". Naturally, we would expect the same from all of the other faces whose names occur in the intraverbal shown here.
But this is not all. Actually, there is an additional level of abstraction possible here. It works as we see in this next figure, Figure 17, in Panel A. Now the intraverbal transformational response is of course itself an operant, and like any other operant, it is susceptible to stimulus control. This means that we should be able to bring different transformational responses under the control of different discriminative stimuli. For the sake of clarity I shall henceforth refer to discriminative stimuli having this particular function as instructional stimuli.
Here is an example. In the simple matching task shown at the top, in Panel A, the phrase "Find Carl" would simply have the subject select Carl’s face in response to that name under joint control, as we have already seen. But, as we see in the lower picture, if the phrase were "Find the face after Carl", the phrase "face after" would act as an instructional stimulus for intraverbally transforming self-echoic rehearsal from the name "Carl" to self-echoic rehearsal of the name "Don", followed by selection of the face named Don. Thus we see that by controlling the transformational response, you can control the effective matching relation. Such performances involve exceedingly high levels of abstraction, and so I think it is important to stress that all of this is not entirely speculation. Research exploring the role of joint control of transformational responding in even more abstract performances than the few I have touched on here has been published. But let us look at what we really have so far.
Thus, not only were subjects able to select novel pictures in response to their spoken names (for example, hear the name Carl, select the figure Carl), and not only were subjects able to transform any representation of the goal (for example, hear the name Ben, select Carl, or hear the name Carl and select Don), but subjects were also able to do so, or not do so, in response to explicit stimuli which controlled the intraverbal transformational response. Thus, for example, on a red background select the figure named in the echoic, while on a blue background emit the transformational response before repeating the echoic, thereby selecting the figure next-forward from that named. We thus see that training subjects to respond under joint control opens the subject's repertoire to a wide variety of novel and abstract performances. Once a subject is responding under joint control, just teaching the names for additional faces immediately enables all of the performances described here, abstract as they are. There is thus huge generalization in behavior mediated under joint control.
But there is more. The notion of joint control appears to provide a clear account of a problem behavior analysis has not yet really dealt with. Let us consider: in all of the tasks we've seen so far, subjects were required to respond to the presence of a specified stimulus. But what about the absence of a specified stimulus? The issue goes to a crucial class of intellectual performances: the recognition of stimuli that do not possess a specified relation to each other. Recognizing that a number is missing from the series 1, 2, 3, 5, 6 or that the series A, B, D, C is out of order are two examples of this class. But it extends to more abstract performances such as saying that a red circle is not a member of the class of stimuli entitled blue squares, or that an automobile is not an item of furniture, or simply finding a stimulus that is different from a presented sample.
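The next-forward performance can be sketched as below. As a simplifying assumption, each face is represented by the name it evokes as a tact, and the trained intraverbal is modeled as an ordered list; the function names are hypothetical.

```python
# The five-step intraverbal from the talk, modeled as an ordered list.
NAMES = ["Art", "Ben", "Carl", "Don", "Ella"]


def next_forward(name, chain=NAMES):
    """Intraverbal transformational response: emit the next name in the chain."""
    i = chain.index(name)
    return chain[i + 1] if i + 1 < len(chain) else None


def select_next_forward(sample_name, faces):
    """Select the face whose tact matches the transformed rehearsal.

    `faces` is a list of faces, each represented here by its tact (its name).
    """
    rehearsed = sample_name               # echoic of the spoken sample
    rehearsed = next_forward(rehearsed)   # rehearsal changes from Carl to Don
    for face in faces:
        if face == rehearsed:             # joint control with the new rehearsal
            return face
    return None


print(select_next_forward("Carl", ["Ella", "Don", "Art"]))
```

The selection step is the same joint-control check as in ordinary name-to-face matching; only the rehearsed topography has been changed beforehand.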
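Instructional control over the transformational response might be sketched like this. The red and blue backgrounds come from the example above; everything else (the dictionary `TRANSFORMS`, the function names, and the representation of each face by its name) is an assumption made for illustration.

```python
NAMES = ["Art", "Ben", "Carl", "Don", "Ella"]


def identity(name):
    """No transformation: rehearse the name as heard."""
    return name


def next_forward(name):
    """Intraverbal transformation to the next name in the trained list."""
    i = NAMES.index(name)
    return NAMES[i + 1] if i + 1 < len(NAMES) else None


# Hypothetical mapping from instructional stimulus to transformational response.
TRANSFORMS = {"red": identity, "blue": next_forward}


def select(sample_name, background, faces):
    """Select the face that enters joint control with the (possibly
    transformed) rehearsal; the background controls which transformation runs."""
    rehearsed = TRANSFORMS[background](sample_name)
    return next((face for face in faces if face == rehearsed), None)


print(select("Carl", "red", NAMES))   # the figure named
print(select("Carl", "blue", NAMES))  # the figure next-forward from that named
```

Adding a new instructional stimulus in this sketch means adding one entry to `TRANSFORMS`; the joint-control selection step is untouched.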
Although an explanation of behavior of this sort is vital to any complete account of intelligent behavior (Miller, Galanter and Pribram, 1960; Neisser, 1967), it has been virtually ignored in behavior analysis. It is easy to see why. An SD evokes a particular response by its presence. There is no parallel concept that would account for the control of a response by the absence of a controlling stimulus. The fact remains, however, that we do respond to the absence of a specified stimulus. But how does this happen? How do you respond to what is not there? Let's first look at the case we do understand.
As illustrated in the upper part of this figure (Figure 18), the subject is presented with the word "circle" as a sample, which he then repeats as a self-echoic until a stimulus is encountered that causes the word circle to also occur under tact control. At this event, as we have said before, the stimulus is selected. In the other case, as shown in the bottom, upon hearing the sample word "circle", as always, the subject rehearses it as a self-echoic. But in this case the subject is unable to emit that topography under joint control because there is no circle, only a square, and in the presence of the square stimulus only the response "square" can be emitted as a tact. This then is the issue. The specified stimulus, the circle, is not present, and so how does the subject respond to the absence of the specified stimulus, here the absence of a circle? What is the cue that evokes in the subject some response such as "it's not here" or "I can't find it"?
I would propose that just as the congruence of self-echoic and tact topographies illustrated in the top panel A serves as the stimulus event we call joint control, so may the disparity of response topographies, illustrated in the bottom panel, serve as the stimulus event for some other response, as for example an autoclitic response such as "It's not a circle" or "the circle is not here". So as we see here in the lower panel, the rehearsed self-echoic is the response circle, but the appropriate tact to the comparison is the topography square. The two topographies “square” and “circle” are in conflict, and what I am saying is that this conflict of topographies is just as much a stimulus event as the congruence of topographies is in the usual joint control event. To recognize this distinction, I would suggest that the notion of joint control be joined now by a second form of control, called joint oddity, in which disparate topographies are evoked. Research supporting this distinction was published several years ago in the journal The Analysis of Verbal Behavior.
Now as we shall see next, incorporating the notion of joint identity with joint oddity opens up an additional variety of abstract performances to a simple behavioral account. We may begin by considering a form of behavior not generally recognized as involving generalization, and that is the behavior of scanning amongst available alternatives while seeking one specified stimulus. What I am getting at is the question of how it is that we are able, while scanning amongst comparisons, to reject all of those comparisons not specified by the sample. Thus, given three or four comparisons and an appropriate sample, we quickly reject each of the comparisons not specified by the sample, while selecting only the correct comparison.
As I mentioned earlier, the traditional, unmediated selection account would have it that due to a prior history of reinforcement the incorrect comparisons are passively rejected because they evoke no selection response, whereas the correct comparison immediately evokes the selection response. But as we saw, that account makes no provision for explaining generalized behavior, and so we shall focus today solely on the joint control account. Let us consider a simple matching task of the sort we have been using today, in which several comparisons are presented along with the sample that specifies the correct comparison. The answer to the question is not difficult. The joint control account would simply argue that the subject has a history in which selections in the presence of joint identity were reinforced, while selection responses at the onset of joint oddity never were.
Thus, as we see in the figure (Figure 19), told to select a square, the subject begins to scan, but first encounters a triangle, thus engendering joint oddity between the subject's self-echoic rehearsals of the topography square and emission of the topography triangle as a tact. And so, in response to this stimulus event of joint oddity, the subject moves to a different comparison and attempts to tact it. This again engenders joint oddity between the rehearsed topography square and the tacted topography circle, and so again, in response to this joint oddity, the subject moves to the next comparison and emits the tact square, which in this case enters into joint identity with the currently rehearsed self-echoic topography square, thereby setting the occasion for the selection response we have seen previously. Thus we see that the scanning performance is not specific to any particular set of stimuli, but rather is based on the generic events of joint control identity and joint control oddity: selecting the current stimulus on joint control identity and moving to the next stimulus on joint control oddity.
Finally, I would like to discuss one more performance, and that is a performance that results from combining some of the different behaviors we have looked at today. In particular I want to describe what I call constant-relation matching. To begin with, let us recall the relational matching performance I described earlier in which the correct comparison is not the one identical to the sample, but rather one bearing a relationship to the sample such as next forward in a list of names. And so as illustrated in this figure (Fig. 20), we've presented to the subject the sample name Carl, which the subject then intraverbally transforms to Don. The subject then begins to search for the face named Don. In the current case, however, that face is not available, and so the subject encounters joint control oddity. Now if encountering joint control oddity acts as an instructional stimulus for the intraverbal transformation of advancing the intraverbal list of names to the next name, the resultant performance might well be described as a goal-oriented performance. In this case, upon failing to find the specified comparison Don, the subject changes the search target to Ella and now seeks to find the comparison two steps past the sample.
That is, the subject has engaged in an ordered search based on position in the original list of names. That is, given the name Carl, the subject transformed the echoic to Don and sought Don, and in the event of not finding Don, the subject sought Ella. Since this behavior is mediated by joint control, it would be generalizable to any novel set of stimuli for which subjects subsequently acquired tacts. Data confirm this claim. To sum up, I think what is of general significance here is that behaviors generally ascribed to the mediation of cognitive processes may in fact be produced and analyzed on the basis of a simple and parsimonious behavioral explanation requiring no new concepts and directly supported by experimental analysis. Thus, as we have seen, increasingly complex performances may be constructed by combining simpler operant behaviors.
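The scanning performance just described can be written as a short loop. This is a sketch under the assumption that each comparison is represented by the tact it evokes; the function name and the event labels recorded in the trace are illustrative choices.

```python
def scan(sample, comparisons):
    """Scan comparisons while rehearsing the sample.

    Moves on at each joint oddity and selects at joint identity.
    Returns (selected comparison or None, trace of events).
    """
    rehearsed = sample                    # self-echoic rehearsal of the sample
    events = []
    for comparison in comparisons:
        tacted = comparison               # the tact is modeled as the name itself
        if tacted == rehearsed:
            events.append((comparison, "joint identity"))
            return comparison, events     # occasion for the selection response
        events.append((comparison, "joint oddity"))  # move to the next stimulus
    return None, events                   # specified stimulus is absent


selected, trace = scan("square", ["triangle", "circle", "square"])
print(selected)
for step in trace:
    print(step)
```

When the array contains no square, the function returns `None` with a trace consisting only of joint-oddity events, which corresponds to the occasion for a report such as "it's not here".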
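The ordered search described here can be sketched by combining the earlier pieces. The assumptions are the same as before: faces are represented by their names, the intraverbal is an ordered list, and failing to find the current target in the whole array stands in for joint oddity. The function name is hypothetical.

```python
NAMES = ["Art", "Ben", "Carl", "Don", "Ella"]


def constant_relation_search(sample_name, available_faces, chain=NAMES):
    """Search for the next-forward face, advancing the target on joint oddity.

    Starting from the name after the sample, each target is rehearsed and
    sought in the array; if no face enters joint control with it, the
    rehearsal is transformed again to the following name.
    """
    start = chain.index(sample_name) + 1
    for target in chain[start:]:          # Don, then Ella, ...
        if target in available_faces:     # joint identity with some face
            return target
        # joint oddity with every face: advance the intraverbal and retry
    return None


# Don's face is missing, so the search settles on Ella.
print(constant_relation_search("Carl", ["Art", "Ben", "Ella"]))
```

With Don present in the array the same call returns Don, so the two-steps-forward selection emerges only when the first target cannot be found.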