Teaching basic skills to children with Down Syndrome and developmental delays: The relative efficacy of interactive modeling with social rewards for benchmark achievements and passive observation

In interventions attempting to remediate deficiencies in the skills repertoire of developmentally delayed children, no less than in medical interventions, it may be fairly said that less is more. That is, the instructor should intervene as little as possible, both for the sake of efficient instructional practice and because of the time constraints that modern classrooms face. Evidence from this laboratory has indicated that in skills training for children with severe developmental delays the passive observation of a model demonstrating the target skill is more effective than interactive modeling involving hand-over-hand instruction with verbal prompting. We have considered the role of verbal prompting in interactive modeling and have found that prompts intended to provide typical social reinforcers are counterproductive (e.g., Biederman, Davey, Ryder, & Franchi, 1994). The present study examines the efficacy of hand-over-hand modeling with response-contingent verbal prompts. In such instruction, tasks are divided into identifiable sequential components, and the achievement of each component is marked by the delivery of some form of verbal prompt. In a within-subjects design, children were trained in one skill with response-contingent verbal prompts and in a second skill with simple passive observation. A separate group of children was trained with less rigorous verbal prompting in one skill and with passive observation in a second. Consistent with previous research, we found that passive modeling was overall significantly more effective than hand-over-hand modeling and moreover that passive modeling was significantly more effective than hand-over-hand modeling with response-contingent prompting.
Our evidence therefore indicates that current classroom practice in training basic skills to children with severe developmental delays may require reassessment in that simple observation of modeled skills appears to be more effective than more labor-intensive instruction.


Biederman, G, Fairhall, Raven, and Davey, V. (1998) Teaching basic skills to children with Down Syndrome and developmental delays: The relative efficacy of interactive modeling with social rewards for benchmark achievements and passive observation. Down Syndrome Research and Practice, 5(1), 26-33. doi:10.3104/reports.72

Robertson and Biederman (1989) evaluated the methodological status of interventions directed towards developmentally delayed populations and asked why, given the rich theoretical and experimental literature, there appears to have been only limited success in demonstrating the effective use of skills modeling for children with severe developmental delays. In contrast to some learning approaches, which stress learning-by-doing, the observational learning approach posits that a wide variety of skills can be acquired vicariously by simply observing the performance of a task and then making subsequent attempts to imitate the actions of the model (Robertson & Biederman, 1989). Bandura and Walters (1978) credit the learning of social norms and academic skills to observation. Although Bandura's (1969) original modeling experiment was applied to the treatment of phobias, modeling is considered to be an especially important tool for teaching novel skills to children with developmental delays (Gladstone & Spencer, 1977); recent research (e.g., Meyer & Kohl, 1985; Palincsar & Brown, 1984) has shown that modeling is effective for teaching children with disorders such as developmental delays, Down syndrome, and autism. But in their review of studies involving observational learning, modeling, and imitation in atypical populations published between 1979 and 1988, Robertson and Biederman (1989) failed to find support for the efficacy of any modeling technique in skills instruction. A meta-analysis evaluated the efficacy of participant-, peer-, adult-, normal-, and atypical-modeling strategies and found no reliable evidence to suggest that any single strategy was effective.

One problem was the lack of experimentation that could validly identify promising strategies in this complex context. In the corpus studied, hand-over-hand demonstration of skills, otherwise known as participant or interactive modeling, was most frequently employed. Such modeling seems promising when recast in a methodology that could eliminate a major difficulty for research in this area, namely the lack of effective control of within-participant factors. Using a within-subjects, two-task design, Biederman, Ryder, Davey, and Gibson (1991) evaluated the relative efficacy of both interactive and passive modeling in children with severe developmental delays. The primary purpose of that study was to establish a within-subjects methodology, and the presumption of researchers and teachers involved in the study was that interactive modeling should clearly be more effective than simple passive observation. However, raters reliably indicated that tasks trained using passive modeling were performed significantly better than those trained using interactive modeling. The authors argued that these findings are not as paradoxical as they might at first appear: First, there is the standard problem of generalization to which all behavioral interventions are susceptible. Lepper and Greene (1978), Lepper, Greene, and Nisbett (1973), and Premack (1965) suggest that when reinforcement is made contingent on complex behaviors such as drawing, the newly learned behavior declines in frequency and may drop below baseline when reinforcement is removed. Thus, response-contingent verbal prompting may represent what has been called overjustification (cf. Lepper et al., 1973), and may account for the relative advantage of passive observation. 
Thus, when the special conditions present during intervention, such as physical guidance or verbal prompts intended as social reinforcers, are removed, and unless specific steps are taken to bolster the generalizability of the new behavior (MacRae & Holding, 1966; Holding & MacRae, 1966; Baine, 1980; Hanley-Maxwell, Szymanski, Seay, & Parker, 1990; Riley, 1995; Mudford, 1995), extinction of the newly trained behavior may occur (cf. Stokes & Baer, 1977). It is clear that, in typical school contexts, the additional support training to avoid generalization decrement may be lacking (cf. Stokes & Baer, 1977). Second, there may also be a direct negative effect on learning new behavior of the delivery of such prompts - intended as rewards by the instructors - that is specific to populations with severe developmental delays and few language skills. The additional information provided by the verbal prompting may overload the information-processing capacity of the children, or may otherwise obscure learning by distracting the child from a necessary stimulus (cf. Prior & Hall, 1979; Paul & Cohen, 1985; Lincoln, Courchesne, Kilman, & Galambros, 1985; Asarnow, Tanguay, Bott, & Freeman, 1987). Biederman, Davey, Ryder, and Franchi (1994) tested some of these hypotheses using a mixed design. For half the subjects, verbal prompts such as "Good girl!" or "Good job!" were used in the interactive modeling condition. For the remaining children, no verbal prompts were provided. All children were additionally trained on a second passively modeled task. Raters judged that the passive tasks were performed better than the interactive tasks trained with verbal prompts.
In support of the hypothesis, no difference was found between passive and interactive tasks when the interactive task was instructed without verbal prompting. The theoretical implications of using verbal prompts intended by instructors as rewards have been discussed at some length in the literature and need not be reiterated in detail. Briefly, Ward (1995) suggested, in effect, that Biederman et al. (1994) were not validly using "positive reinforcement" as an independent variable as we had no manipulation of reinforcement contingency. Biederman and Davey (1995) responded that we had simply taken standard practice in modeling skills in special education classrooms in which children receive positive verbal prompts when the instructor feels that some behavior has arisen which "deserves it" or when the child appears to need some social support. The behavior of instructors in this context is closest to response shaping in which stimuli are delivered for approximations of the designated response (Bower & Hilgard, 1981, p. 539; Lovaas, 1977). However, to avoid confusion, we use the term "verbal prompts" in the present paper.


The purpose of the present experiment was to replicate Biederman et al. (1994) using more rigorous and clearly defined criteria for the delivery of verbal prompts in order to bring the use of this intervention to a standard that permits us to directly address the use of such prompts intended as rewards by classroom teachers in remediation training.


Participants

Six children (5 boys, 1 girl; 5-13 years of age) from special education classes of the Metropolitan Toronto Separate School Board participated. The children had a variety of diagnoses including Down syndrome, and all showed severe developmental delays with little or no expressive language. The children had limited receptive language.


A variety of materials were used including snapping, buttoning, and lacing dressing frames (manufactured by Galt Toys), spoons and bowls, styrofoam cups, small toys, puzzles, hand towels, rubber letters, and articles of clothing (see Table 1).

Table 1. Description of Tasks and Characteristics of Subjects

Subject  Sex  Age  Primary Diagnosis     Benchmark Task       Passive Task
1        F    13   Down Syndrome         Lacing board         Snapping board 1
2        M    10   Developmental Delays  Jigsaw Puzzle        Removing coat
3        M    9    Down Syndrome         Zipping coat         Matching Letter

Subject  Sex  Age  Primary Diagnosis     Standard Task        Passive Task
4        M    10   Developmental Delays  Washing hands        Putting on coat
5        M    11   Developmental Delays  Unbuttoning board 1  Snapping board 1
6        M    5    Down Syndrome         Finding marker       Using spoon

Research Design

Each child was trained in two tasks not in his or her repertoire, and not previously trained by parents or teachers. For three children, one task was trained with benchmark instruction, in which the tasks are segmented and positive verbal prompts are delivered only when a benchmark has been achieved. The second task was purely an observational modeling procedure in which the instructor demonstrated the task repeatedly, with no verbal prompts provided. For the remaining three children, typical classroom positive verbal prompting was given for one task, and the purely observational modeling procedure was used for the second task. A test session for each task trained was given at the completion of training as given below.


Teachers and parents were asked to provide a list of 4 to 6 tasks or skills that were not presently in the child's repertoire (cf. Biederman et al., 1991, 1994). The experimenter chose two tasks, judged as comparable in difficulty for each child by his or her teacher, and randomly assigned the tasks to either the interactive or passive modeling condition. In interactively modeled tasks, half the children, randomly determined, received benchmark-contingent positive verbal prompts whereas the remaining children received informal verbal prompts. In passive modeling tasks no verbal prompting was given. Children received instruction on one of the two tasks for 20 minutes each day, for a total of 10 daily sessions, with counterbalancing over days for order of task presentation (interactive vs. passive). Children were randomly assigned to one of two instructors for the duration of the experiment. Instructors were adult female fourth-year psychology undergraduates with instructional experience in special education. To minimize distraction, each child was withdrawn from his or her classroom to an empty room within the school. Half the children began the first session in the interactive condition while the remainder began in the passive condition. During interactive modeling with social reinforcement, the child was provided with hand-over-hand instruction in which the instructor literally manipulated the child's hands to perform the task. Each child was prompted to watch what he or she was doing; verbal prompts were delivered for appropriate attention to the task and when the instructor felt that the child merited reward, as in typical classroom contexts. During interactive modeling with response-contingent reinforcement, the child was also provided with hand-over-hand instruction; however, verbal reinforcers were delivered only when the child had reached a pre-determined benchmark within the task.
The response-contingent strategy involved segmenting the task into naturally occurring steps. For example, learning to put on a coat would involve

  1. removing the coat from a hook,
  2. putting an arm into the appropriate sleeve,
  3. putting the other arm into the remaining sleeve,
  4. engaging the zipper fastener, and
  5. zipping the zipper.

Verbal prompts were delivered only for completion of each step. During passive training sessions each child observed his or her instructor model the target behavior. Each child was asked to sit quietly and watch without imitating. With respect to treatment fidelity, the instructors used a protocol in which the benchmarks in the response-contingent verbal prompt condition were clearly defined for each task to be used in this context. For both interactive task types, and for the passive training sessions as well, rehearsals were conducted with children not in the study so that techniques were appropriately refined. On the next school day after the 10 training sessions were completed, each child was asked to perform both tasks without physical guidance, verbal instruction, or prompting. In training skills in a population with severe developmental delays, it is clearly desirable that these skills be functional. The dressing boards are equipped with standard buttons and snaps to heighten functionality, to increase standardization of tasks, and to reflect typical classroom usage. The use of puzzles and games is also functional in that they address recreational skills learning, which is an important component of instruction for children with severe developmental delays.

Rating Strategy

The dependent measure was the relative performance quality of the two tasks for each child. Test day performance was videotaped and edited by a professional who was unaware of the purpose or conditions of the study. During editing, each task was reduced to a random 30-second segment. An identification number for the child and a neutral task label [A (first-) or B (second-presented)] were superimposed on a black background prior to onset of the task segment. Order of presentation for the children as well as for the type of task (i.e., whether Task A was interactive or passive) was counterbalanced. This videotape was then presented to student raters (n = 85) from an introductory psychology class at the University of Toronto. Raters were read instructions that required them to rate each child performing the two tasks from the perspective of "a person on the street," and to make a judgment of relative task proficiency on a 5-point scale. Written instructions were also presented with diagrams explaining particular tasks where needed.

Instructions to Raters

The raters were assembled in a classroom with five television monitors distributed throughout the room, and were asked to arrange themselves so that they had a direct and unimpeded view of a screen. A sheet of paper with printed instructions was handed to each of the judges, and the instructions were also read aloud by one of the experimenters (JF) with opportunity for questions. The raters were instructed to have no communication with the experimenter or with other raters after questions were answered. The instructions were as follows:

On the video monitors directly in front of you, you will see children performing two tasks each (labeled TASK A and TASK B, respectively). On the rating sheet given to you, indicate your judgment of the relative performance quality of the two tasks that you will see. You will be asked to make a rating after you have seen each child complete both tasks. Place a mark on the place which represents your opinion of the relative quality of the performance of the two tasks:

_____ _____ _____ _____ __ X __

This rating would mean that Task B was much better than Task A;

_____ __ X __ _____ _____ _____

This rating would mean that Task A was somewhat better than Task B.

_____ _____ __ X __ _____ _____

This rating would mean that Task A and Task B were performed equally well.

Please wait until you have seen both tasks performed before you form an opinion.

The mean of the judges' ratings for each child served as the datum. An alpha level of .05 was adopted for all analyses. The use of untrained judges, and their effective equivalence to trained judges, has been described by Wallander, Conger, and Ward (1983). The statistical safety inherent in large numbers is perhaps obvious, but the uniformity of the raters' judgments must be assessed to confirm the validity of this approach in determining instructional efficacy (cf. Aiken, 1985; Roff, 1981; Seiz, 1982). It may be argued that untrained raters are less subject to bias when judging the performance of simple skills; in any event our purpose was to identify obvious differences between training strategies that all observers could agree upon. As may be noted below, that goal has been achieved in the present study as well as in two prior experiments which have used hundreds of untrained judges with great reliability (Biederman et al., 1991, 1994).
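
As an illustration only, the scoring of a rating sheet might be sketched as follows. The sketch assumes that the five blanks are coded 1 to 5 from left to right (so that 5 means "Task B much better", as in the instructions to raters) and that a rating is mirrored, as 6 minus the raw code, whenever the passive task appeared as Task A, so that higher scores always favor the passive task. The function names and the sample marks are hypothetical and are not taken from the study.

```python
from statistics import mean

NEUTRAL = 3  # middle blank: both tasks judged equally well performed


def orient_rating(raw: int, passive_label: str) -> int:
    """Re-express a raw 1-5 mark so that 5 always favors the passive task.

    Assumption: raw codes run left to right, 1 = "Task A much better",
    5 = "Task B much better".
    """
    if raw not in (1, 2, 3, 4, 5):
        raise ValueError("raw rating must be an integer from 1 to 5")
    if passive_label not in ("A", "B"):
        raise ValueError("passive_label must be 'A' or 'B'")
    # If the passive task was shown as Task B, the raw code already favors
    # passive at the high end; otherwise mirror it around the neutral point.
    return raw if passive_label == "B" else 6 - raw


def child_datum(raw_ratings, passive_label):
    """Mean oriented rating across raters: the datum for one child."""
    return mean(orient_rating(r, passive_label) for r in raw_ratings)


# Hypothetical marks from five raters; the passive task was shown as Task A.
example = child_datum([1, 2, 2, 1, 3], passive_label="A")
print(example, example > NEUTRAL)
```

Under this coding, a datum above 3 indicates that the raters, on average, judged the passively modeled task to be the better performed of the two.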


Raters judged that passive tasks were performed better than active tasks, consistent with our previous reports, as indicated by a significant two-tailed t-test for the effect of interactive versus passive modeling, t (5) = 3.47, p < .05, and by inspection of Figure 1. In this figure, the data were arranged so that a score of 5 indicated that raters strongly preferred the passive to active task and a rating of 1 indicated that raters strongly preferred the active to the passive task. A rating of 3 indicated no preference. Two additional t-tests assessed whether type of reinforcement (social vs. benchmark-contingent) differentially influenced task performance. Passive observation reliably produced better-rated performance than hand-over-hand modeling under benchmark-contingent reinforcement, t (2) = 4.51, p < .05. Although the strength of an outcome is unrelated to sample size, we must note that findings as strong as the ones reported here are rare in learning experiments using small sample sizes (here, n = 3). Performance of passively observed and socially reinforced tasks was not reliably different, but the difference was in the direction of our previous findings, t (2) = 2.00. If the three participants' data are pooled with data from groups in our two previous studies using these identical conditions (see Method, Biederman et al., 1991; 1994), passive observation reliably produces better-rated performance, t (18) = 5.29, p < .001. This serves to emphasize our basic finding that modeling with standard reinforcement is significantly poorer in outcome than simple passive observation. A direct comparison of benchmark and standard reinforcement conditions showed no significant difference between groups at the accepted confidence level, t (4) = 2.33, p < .10. Analysis of variance was used to estimate the reliability of the multiple rater technique. 
The mean of the raters' judgments for each child was highly reliable, R = .98, F (5, 420) = 57.95, p < .01, which conforms to requirements for the use of multiple judges (Aiken, 1985; Seiz, 1982; Roff, 1981; Winer, 1971). A test for order of task presentation to raters showed no reliable effect, t (4) < 1.
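
The reported statistics can be checked for internal consistency with a short calculation. The sketch below is illustrative and rests on assumptions that the text does not state explicitly: one-sample tests of the children's mean ratings against the neutral value of 3, an independent-groups comparison of two groups of three children, a children-by-raters layout for the analysis of variance, and an ANOVA-based reliability of the mean rating computed as R = 1 - 1/F. The critical values are the usual two-tailed entries from standard t tables; the variable and function names are hypothetical.

```python
# Two-tailed critical values of Student's t from standard tables,
# keyed by (degrees of freedom, alpha).
T_CRIT = {
    (2, 0.05): 4.303,
    (4, 0.05): 2.776,
    (4, 0.10): 2.132,
    (5, 0.05): 2.571,
    (18, 0.001): 3.922,
}


def exceeds(t: float, df: int, alpha: float) -> bool:
    """True if |t| is beyond the two-tailed critical value for (df, alpha)."""
    return abs(t) > T_CRIT[(df, alpha)]


n_children, n_raters, n_per_group = 6, 85, 3

# Degrees of freedom implied by the assumed analyses.
df_overall = n_children - 1                # one-sample test, all six children
df_within_group = n_per_group - 1          # one-sample test, three children
df_between_groups = 2 * n_per_group - 2    # benchmark vs. standard groups
df_anova = (n_children - 1, (n_children - 1) * (n_raters - 1))

# Reliability of the mean rating, assuming R = 1 - MS_error / MS_children,
# which equals 1 - 1/F for the reported F ratio.
F_reported = 57.95
R = 1 - 1 / F_reported

print(df_overall, df_within_group, df_between_groups, df_anova)
print(round(R, 2))
print(exceeds(3.47, 5, 0.05))     # overall passive vs. interactive
print(exceeds(4.51, 2, 0.05))     # passive vs. benchmark-contingent
print(exceeds(2.00, 2, 0.05))     # passive vs. standard prompting
print(exceeds(2.33, 4, 0.05), exceeds(2.33, 4, 0.10))  # benchmark vs. standard
print(exceeds(5.29, 18, 0.001))   # pooled with the two earlier studies
```

Under these assumptions the degrees of freedom come out as 5, 2, 4, and (5, 420), and the reliability rounds to .98, all of which agree with the values reported above; the pattern of significant and nonsignificant comparisons is likewise reproduced.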

Figure 1. The mean of the judges' ratings for each child.
A rating greater than 3.0 indicates superior performance on the passively modeled task and a rating less than 3.0 indicates superior performance on the interactively modeled task, on a 5-point scale. "Benchmark" refers to the condition in which strict criteria were observed for verbal prompts. "Standard" refers to the condition in which informal criteria such as those in classroom use were applied. Vertical lines depict standard errors of the means.


In this paper we examined the relative efficacy of interactive modeling with either response-contingent or informal verbal prompting, each contrasted with simple passive observation. Our evidence shows that passive observation is significantly more effective than interactive modeling with response-contingent verbal prompting and that there is no differential efficacy for such prompting in comparison with the more informal prompting strategy typically associated with instruction in special education classroom settings. This finding replicates two previous studies with respect to the relative efficacy of passive modeling.

How may we understand this outcome, which seems contrary to sound instructional practice? Asarnow et al. (1987), Paul and Cohen (1985), and Prior and Hall (1979) have suggested that verbal processing abilities distinguish populations with developmental delays. Lincoln et al. (1985) discovered that subjects with Down syndrome, for example, could identify auditory target words and differentiate them from a word background as well as children without developmental delays. However, a difference arose in the speed at which they were capable of doing so. These researchers argued that children with Down syndrome process information more slowly. Merrill and Mar (1987) examined sentence processing in persons with and without developmental delay. They argued that processing efficiency differences may be related to performance of more complex cognitive activities. When input is presented continuously and rapidly, the deeper levels of memory and comprehension may only be achieved if sufficient time is allowed for the processing of one unit prior to presentation of the next. Within the context of the present study, it is reasonable to hypothesize that participants had difficulties in processing prompts meant as verbal rewards. Specifically, such processing may interfere with performance of the task by producing confusion as to what behavior was actually being targeted. This confusion may serve to distract the learner from a necessary task-appropriate stimulus, or may interfere with the development of some appropriate behavior pattern.

Implications for Practice

The use of a behavioral intervention such as response-contingent verbal prompting introduces a generalization problem when prompting is removed. This generalization decrement likely contributes to the overall instructional superiority of passive modeling. There may be treatment strategies that reduce generalization decrement (cf. Baine, 1980), but the purpose of the present investigation is to consider the effect of interactive modeling in contrast to passive observation. It is not our purpose to enhance the generalizability of interactive modeling to the point where differences between this instructional method and passive observation fall away. We are interested in providing a method to examine the direct effects of classroom instruction of a given sort. If it turns out that additional interventions are needed for interactive modeling, then the logical conclusion is that interactive modeling is a less attractive classroom option than simple passive observation. Current instructional practice employing hand-over-hand modeling, combined with frequent verbal and gestural prompting intended as rewarding social responses on the part of instructors, limits the effectiveness of participant modeling in atypical learners. It seems clear that current practice needs reevaluation. Even were the weight of evidence ultimately to show that interactive and passive instruction techniques are equally efficacious, the latter is much more cost effective and much easier to disseminate. We, in fact, have preliminary evidence that videotaped modeling may be as effective as passive "live" modeling (Biederman, Davey, & Ahn, 1997). The use of the two-task within-subject, multiple rater method would appear to be useful for deciding between competing instructional methods in classroom contexts. In fact, this method could be useful in determining whether interventions designed to reduce generalization decrement are effective.


Supported by a grant to GBB from the Social Science and Humanities Research Council of Canada. The authors are grateful to the students, teachers, and parents of the Metropolitan Toronto Separate School Board who participated in this study and to Mr. Trevor Wilson of the Board for his advice and assistance in facilitating this research.


G. B. Biederman, Division of Life Sciences, University of Toronto at Scarborough, 1265 Military Trail, Scarborough, Ontario, Canada M1C 1A4. E-mail: bieder@scar.utoronto.ca, tel: 416 287 7433, fax: 416 287 7642.