Education and Treatment of Children

Correspondence should be addressed to: Melody Tankersley, 405 White Hall, Kent State University, Kent, OH 44242; e-mail: mtankers@kent.edu; telephone: 330-672-0605.
Scholars in the field of special education have put forth a series of papers proposing quality indicators that must be present for a study of a given research design to be considered of high quality, as well as standards for evaluating a body of research to determine whether a practice is evidence-based. The purpose of this article was to pilot test the quality indicators proposed for single-subject research studies in order to identify points that may need clarification or revision. To do this, we examined the extent to which the proposed quality indicators were present in two single-subject studies, both examining the effects of teacher praise on specific behaviors of school-age children. Applying the quality indicators, we found that neither study met the minimally acceptable criteria for single-subject research. We discuss the use of the quality indicators in relation to their clarity and applicability and suggest points for deliberation as the field moves forward in establishing evidence-based practices.

Advocating that educators base practice on research (in other words, that evidence-based practices be the primary means of instruction utilized in classrooms) first requires that specific practices have been identified as evidence-based. Although not all educators would agree (e.g., Gallagher, 2006), we assert that scientific research is the most reliable means for determining an educational practice to be effective or evidence-based (e.g., Kauffman & Sasso, 2006; Landrum & Tankersley, 2004). But just how should research findings be synthesized to determine the effectiveness of a practice?

By yielding an overall effect size across the studies examined, meta-analyses (Glass, 1976; Kavale, 2001) have become popular for synthesizing research findings, and their results have advanced our understanding of effective educational practices. However, no established approach currently exists for identifying the quality of studies that are synthesized (Cooper & Hedges, 1994), which may allow a poorly designed and executed study to influence the overall effect size, thereby potentially misidentifying an ineffective practice as effective, or vice versa. Moreover, no firm guidelines clearly establish the minimum number of studies needed to produce reliable meta-analytic results (Cooper & Hedges). Adding to this, no agreed-upon process exists for determining effect sizes (the metric required to conduct a meta-analysis) for single-subject research, although several methods have been proposed and debated (e.g., R² as discussed by Allison & Gorman, 1993; percentage of non-overlapping data as discussed by Scruggs & Mastropieri, 2001).
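To make one of these proposed metrics concrete, the sketch below (ours, not drawn from any of the cited papers) shows how the percentage of non-overlapping data is commonly computed for a single phase contrast: the share of treatment-phase data points that exceed the highest baseline point when the intervention is expected to increase the behavior. The phase data are hypothetical.

```python
# A minimal sketch of the percentage of non-overlapping data (PND)
# metric discussed by Scruggs and Mastropieri (2001). Assumes the
# intervention is expected to INCREASE the target behavior.
def pnd(baseline, treatment):
    """Percent of treatment-phase points exceeding the highest baseline point."""
    ceiling = max(baseline)
    non_overlapping = sum(1 for score in treatment if score > ceiling)
    return 100.0 * non_overlapping / len(treatment)

# Hypothetical session-by-session study behavior (% of intervals observed).
baseline = [25, 30, 20, 35]
treatment = [60, 75, 30, 80, 70]
print(pnd(baseline, treatment))  # 80.0 -- one treatment point overlaps baseline
```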

To address such matters surrounding the identification of evidence-based practices, researchers in other fields developed and implemented frameworks for determining effective practices. For example, the Division 12 Task Force of the American Psychological Association (Chambless et al., 1998), the National Association of School Psychologists (Kratochwill & Stoiber, 2002), and the What Works Clearinghouse (WWC, established in 2002 by the U.S. Department of Education; http://www.whatworks.ed.gov/) have established guidelines for evidence-based practices in clinical psychology, school psychology, and general education, respectively. Although utilizing an existing framework, such as that developed for general education, for determining evidence-based practices for students with disabilities would be efficient, the WWC did not originally consider single-subject research in determining whether a practice is evidence-based. The WWC has recently added single-case designs as a special type of quasi-experimental design that, in the absence of severe design or implementation problems, can be categorized as meeting evidence standards with reservations (the same level at which randomized control trials with severe design or implementation problems are categorized). As of February 2008, the WWC had not yet disseminated standards for evaluating single-case research (WWC Evidence Standards for Reviewing Studies, 2006). Given the low incidence of many disabilities and the individualized nature of special education, special educators have frequently applied single-subject research to examine the effectiveness of a wide variety of practices (e.g., Lloyd, Tankersley, & Talbott, 1994; Odom & Strain, 2002; Tawney & Gast, 1984). As well-designed single-subject studies exhibit functional control, it seems that single-subject research should play a prominent role in determining evidence-based practices in special education.

The Council for Exceptional Children’s Division for Research (CEC-DR) sponsored a series of papers published in a special issue of Exceptional Children that proposed quality indicators for experimental, single-subject, correlational, and qualitative research that must be present for a study to be considered of high quality in special education (Graham, 2005). Horner, Carr, Halle, McGee, Odom, and Wolery (2005) recommended that high-quality single-subject research studies meet a number of minimally acceptable methodological criteria in seven areas: description of participants and settings, the dependent variable, the independent variable, baseline, experimental control/internal validity, external validity, and social validity. Moreover, they proposed standards for evaluating a body of single-subject research to determine if a practice is evidence-based: (a) at least five studies that meet minimally acceptable methodological criteria, document experimental control, and have been published in peer-reviewed journals; (b) the studies must be conducted by at least three different researchers across at least three different geographical locations; and (c) the studies cumulatively include a total of at least 20 participants.
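Read operationally, these standards amount to a simple decision rule over a body of studies. The sketch below is our illustration of that rule; the `Study` fields are hypothetical labels of our own, not the authors’ coding scheme.

```python
# A minimal sketch encoding the Horner et al. (2005) standards for an
# evidence-based practice. The Study fields are our own hypothetical labels.
from dataclasses import dataclass

@dataclass
class Study:
    meets_criteria: bool     # meets the minimally acceptable quality indicators
    documents_control: bool  # documents experimental control
    peer_reviewed: bool      # published in a peer-reviewed journal
    researcher: str          # researcher (or research team) conducting the study
    location: str            # geographical location of the study
    n_participants: int

def is_evidence_based(studies):
    qualifying = [s for s in studies
                  if s.meets_criteria and s.documents_control and s.peer_reviewed]
    return (len(qualifying) >= 5                                  # (a) five studies
            and len({s.researcher for s in qualifying}) >= 3      # (b) three researchers...
            and len({s.location for s in qualifying}) >= 3        # ...across three locations
            and sum(s.n_participants for s in qualifying) >= 20)  # (c) 20+ participants
```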

The minimally acceptable methodological criteria provide a framework for evaluating individual single-subject research studies to determine whether their results can be used for establishing evidence-based practices. Although such an evaluative role might not have been the intent of Horner et al. (2005), and using the criteria in an evaluative sense may be an unfair use of them, the overall goal of the special issue in which these criteria appeared was to “establish a set of quality indicators that were clearly stated, understandable, and readily available for use as guides for identifying high-quality research in special education” (Odom et al., 2005, p. 142). Indeed, several papers have already been published that use the methodological criteria and the quality indicators set forth in the special issue of Exceptional Children to evaluate the methodological merit of research studies (e.g., Browder, Wakeman, Spooner, Ahlgrim-Delzell, & Algozzine, 2006), making this exercise meaningful in terms of establishing how useful they are as an evaluative tool.

Establishing guidelines for high-quality research and standards for evidence-based practice, such as those proposed by Horner et al. (2005), is an endeavor that has the potential to focus the efforts of special education and improve the outcomes of students with disabilities (see Lloyd, Pullen, Tankersley, & Lloyd, 2006). To build upon this foundation, it appears to us that the next step in this process of establishing evidence-based practices in special education is to “pilot test” the proposed quality indicators by applying them to actual studies. If meaningful difficulties are encountered during pilot testing, the quality indicators can then be revised in order to optimize their reliability and validity. In the following sections, we therefore provide an overview of the quality indicators proposed for single-subject research, accompanied by a description of our thought process as we assessed the presence of the quality indicators in the reports of two research studies.

Quality Indicators for Single-Subject Research

Horner et al. (2005) specified seven broad methodological features, or quality indicators, that must be present and adequately addressed for a study “to be a credible example of single-subject research” (p. 173): description of participants and setting, dependent variable, independent variable, baseline, experimental control/internal validity, external validity, and social validity. Horner et al. enumerated specific criteria required to achieve each of these quality indicators (see Figure 1). In this paper, we examine the degree to which the quality indicators and their criteria, as proposed by Horner et al., are present in two single-subject studies. For our pilot test, we chose single-subject studies that assessed the effectiveness of the same independent variable, teacher attention, with a similar group of participants (students with or at risk for emotional and behavioral disorders) but that were conducted many years apart. Because the quality indicators were intended to examine the research base regarding the extent to which interventions cause meaningful change in dependent variables, we thought they should be applicable to the entire library of studies available for a particular intervention, studies conducted in the distant past as well as those that are more recent. We chose only two research articles instead of the entire body of literature related to the independent variable so that we could explore in depth the process of applying the quality indicators and describe the experience in detail; therefore, we do not make a determination of the effectiveness of the practice but instead only describe our application of the quality indicators.

In the first study, Hall, Lund, and Jackson (1968) investigated the effects of contingent teacher attention on the study behavior of six students who were nominated for participation by their teachers for disruptive and dawdling behaviors. This study appeared as the first research article in the Journal of Applied Behavior Analysis and has subsequently been cited 284 times according to the Web of Science (accessed January 2008). The results of the ABAB withdrawal designs indicated that students increased their study behavior when the teachers provided attention contingent upon students being engaged in appropriate behavior. In the second study, Sutherland, Wehby, and Copeland (2000) investigated the effects of a teacher’s behavior-specific praise on the on-task behavior of nine students identified with emotional and behavioral disorders. During each intervention session, the teacher set a goal of delivering six behavior-specific praise statements, and observers provided him with feedback about his use of praise following each session. Results suggest that the teacher’s increased use of behavior-specific praise caused an increase in students’ on-task behavior. Although Sutherland et al. also discussed teacher behavior in their study (e.g., number of behavior-specific praise statements), we focus only on the analysis of the student behavior for this article. The Sutherland et al. study has been cited 13 times according to the Web of Science (accessed January 2008) and represents a more recent application of teacher praise than the Hall et al. study.

Describing Participants and Settings

1. Participants are described with sufficient detail to allow others to select individuals with similar characteristics (e.g., age, gender, disability, diagnosis).
2. The process for selecting participants is described with replicable precision.
3. Critical features of the physical setting are described with sufficient precision to allow replication.

Dependent Variable

1. Dependent variables are described with operational precision.
2. Each dependent variable is measured with a procedure that generates a quantifiable index.
3. Measurement of the dependent variable is valid and described with replicable precision.
4. Dependent variables are measured repeatedly over time.
5. Data are collected on the reliability of interobserver agreement associated with each dependent variable, and IOA levels meet minimal standards (e.g., IOA = 80%; Kappa = 60%).

Independent Variable

1. The independent variable is described with replicable precision.
2. The independent variable is systematically manipulated and under the control of the experimenter.
3. Overt measurement of the fidelity of implementation for the independent variable is highly desirable.

Baseline

1. The majority of single-subject research studies will include a baseline phase that provides repeated measurement of a dependent variable and establishes a pattern of responding that can be used to predict the pattern of future performance if introduction or manipulation of the independent variable did not occur.
2. Baseline conditions are described with replicable precision.

Experimental Control/Internal Validity

1. The design provides at least three demonstrations of experimental effect at three different points in time.
2. The design controls for common threats to internal validity (e.g., permits elimination of rival hypotheses).
3. The results document a pattern that demonstrates experimental control.

External Validity

1. Experimental effects are replicated across participants, settings, or materials to establish external validity.

Social Validity

1. The dependent variable is socially important.
2. The magnitude of change in the dependent variable resulting from the intervention is socially important.
3. Implementation of the independent variable is practical and cost-effective.
4. Social validity is enhanced by implementation of the independent variable over extended time periods, by typical intervention agents, in typical physical and social contexts.

Figure 1. Quality Indicators and Criteria for Determining Whether a Study Meets the Acceptable Methodological Rigor Needed to Be a Credible Example of Single-Subject Research, as Proposed by Horner et al. (2005)
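The dependent-variable criterion in Figure 1 asks that interobserver agreement (IOA) meet minimal standards (e.g., IOA = 80%; Kappa = 60%). As a point of reference, the sketch below (ours, with hypothetical observation records) shows how the two indices named there are commonly computed for an interval-by-interval record of a binary behavior code:

```python
# A minimal sketch of two interobserver agreement indices named in
# Figure 1, computed over hypothetical interval-by-interval records
# (1 = on-task observed, 0 = not observed).
def percent_ioa(obs1, obs2):
    """Simple percent agreement: matching intervals / total intervals x 100."""
    agree = sum(a == b for a, b in zip(obs1, obs2))
    return 100.0 * agree / len(obs1)

def cohens_kappa(obs1, obs2):
    """Percent agreement corrected for chance via the observers' marginal rates."""
    n = len(obs1)
    p_observed = sum(a == b for a, b in zip(obs1, obs2)) / n
    p1, p2 = sum(obs1) / n, sum(obs2) / n
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    return (p_observed - p_chance) / (1 - p_chance)

rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(percent_ioa(rater_a, rater_b))             # 80.0
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.52
```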

In our pilot test, the first author reviewed each study in relation to the 21 components of the 7 quality indicators put forth by Horner et al. (2005), using a binary scale (yes, the quality indicator is present, or no, the quality indicator is not present) and documented the justification for her score by making notes of the specific evidence offered in the research studies or used to explain the quality indicator. We used a binary scale because Horner et al. only stated what information should be in evidence in quality studies. The first and second authors then reviewed each quality indicator in relation to its presence in both studies and confirmed a rating together, discussing points of disagreement and reaching consensus on a final score. Using this final score as the rating standard (see Table 1), the third author independently rated the presence of each component of the quality indicators for each study. We calculated inter-rater reliability by dividing agreements by the total number of quality indicators and multiplying by 100. It should perhaps be noted that each of the authors has some experience with single-subject research methods: as researchers (e.g., Mancina, Tankersley, Kamps, Kravits, & Parrett, 2000), but more extensively as translators of research through written outlets (e.g., Cook, Rumrill, Webb, & Tankersley, 2001; Tankersley, Harjusola-Webb, & Landrum, in press; Tankersley, McGoey, Dalton, Rumrill, & Balan, 2006) and as educators in graduate programs that incorporate single-subject research methods extensively into their curricula.
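As a worked check of that computation (our arithmetic, using the agreement counts reported in the next paragraph), the division is straightforward:

```python
# Worked check of the percent-agreement computation described above,
# using the counts reported below: agreement on 16 of 21 criteria for
# Hall et al. (1968) and 13 of 21 for Sutherland et al. (2000).
agreements = 16 + 13
total = 21 + 21
print(round(100 * agreements / total))  # 69 -- the .69 rate reported below
```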

The independent rater’s decisions agreed with the collaborative decisions of the first two authors on 16 of the 21 criteria for the Hall et al. (1968) study and 13 of the 21 criteria for the Sutherland et al. (2000) study, resulting in a total inter-rater agreement rate of .69 (the asterisks beside items in Table 1 indicate disagreement between the collaborative rating of the first two authors and the independent rating of the third author). In the following sections, we examine whether the quality indicators are present in the studies and describe our decision-making process for evaluating each in relation
