This study employed observed factor index scores as well as latent ability constructs from the Wechsler Intelligence Scale for Children--Fourth Edition (WISC-IV; Wechsler, 2003) in estimating reading and mathematics achievement on the Wechsler Individual Achievement Test--Second Edition (WIAT-II; Wechsler, 2002). Participants were the nationally stratified linking sample (N = 498) of the WISC-IV and WIAT-II. Observed scores from the WISC-IV were analyzed using hierarchical multiple regression analysis. Although the factor index scores provided a statistically significant increment over the Full Scale IQ, the size of the improvement was too small to be of clinical utility. Observed WISC-IV subtest scores were also subjected to structural equation modeling (SEM) analyses. Subtest scores from the WISC-IV were fit to a general factor (g) and four ability constructs corresponding to factor indexes from the WISC-IV (Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed). For both reading and mathematics, only g (.55 and .77, respectively) and Verbal Comprehension (.37 and .17, respectively) were significant influences. Thus, when using observed scores to predict reading and mathematics achievement, it may only be necessary to consider the Full Scale IQ. In contrast, both g and Verbal Comprehension may be required for explanatory research.
**********
Considerable effort is required to obtain all of the scores in most IQ tests. Presumably, such an investment is made to garner clinically useful information not available from the interpretation of just one omnibus composite. The inherent assumption underlying the interpretation of lower order subtest scores and factor indexes is that they offer practical diagnostic or treatment benefits not available from the general intelligence (g) estimate (Kamphaus, 2001; Kaufman, 1994; Sattler, 2001). Should the analysis of subtest or factor scores fail these premises, their relevance is effectively vitiated.
Subtest analysis has undergone serious challenges over the past 2 decades, both methodologically and empirically. For instance, a series of methodological problems has been identified that operate to negate, or equivocate, essentially all research into children's subtest profiles. Prominent among the many limitations is the circular use of subtest profiles for both the initial formation of diagnostic groups and the subsequent search for profiles that might inherently define or distinguish those groups (Glutting, Watkins, & Youngstrom, 2003; McDermott, Fantuzzo, & Glutting, 1990; McDermott, Fantuzzo, Glutting, Watkins, & Baggaley, 1992; Watkins & Kush, 1994). This problem is one of self-selection, which unduly increases the probability of discovering group differences. A second methodological deficiency is the nearly exclusive reliance on clinical samples. In contrast to epidemiological samples that are representative of the population as a whole, classified and referral samples (the majority of whom are subsequently classified) are unrepresentative and are also adversely affected by selection bias (Glutting, McDermott, Konold, Snelbaker, & Watkins, 1998; McDermott et al., 1992; Rutter, 1989). A third shortcoming is the misapplication of base rates, the frequency or percentage of a population identified with a diagnostic pattern (Cureton, 1957; Meehl & Rosen, 1955; Wiggins, 1973). The base rates routinely found in practice are so high that examinees overwhelmingly show an "exceptional" profile--sometimes exceeding 80% of all children in the United States (Glutting, McDermott, Watkins, Kush, & Konold, 1997; Kahana, Youngstrom, & Glutting, 2002). Besides being of dubious value, the high base rates raise a fundamental question: If everyone is exceptional, who then is normal?
The second trend comes from the empirical literature, which over the past 20 years has begun to demonstrate that subtest scores retain limited external validity. Examples of diminished utility include the inability of either individual subtest scores or score patterns to inform the identification of neurological deficits (Watkins, 1996), the diagnosis of learning disabilities (Daley & Nagle, 1996; Glutting, McGrath, Kamphaus, & McDermott, 1992; Kavale & Forness, 1984; Kline, Snyder, Guilmette, & Castellanos, 1992; Livingston, Jennings, Reynolds, & Gray, 2003; Maller & McDermott, 1997; McDermott, Goldberg, Watkins, Stanley, & Glutting, in press; Mueller, Dennis, & Short, 1986; Reynolds & Kamphaus, 2003; Smith & Watkins, 2004; Ward, Ward, Hatt, Young, & Mollner, 1995; Watkins, 1999, 2000, 2003, in press; Watkins & Kush, 1994; Watkins, Kush, & Glutting, 1997a, 1997b; Watkins, Kush, & Schaefer, 2002; Watkins & Worrell, 2000), or the classification of behavioral, social, and emotional problems (Beebe, Pfiffner, & McBurnett, 2000; Dumont, Farr, Willis, & Whelley, 1998; Glutting et al., 1992; Glutting et al., 1998; Lipsitz, Dworkin, & Erlenmeyer-Kimling, 1993; McDermott & Glutting, 1997; Reinecke, Beebe, & Stein, 1999; Riccio, Cohen, Hall, & Ross, 1997; Rispens et al., 1997; Teeter & Korducki, 1998). Indeed, nonreactive support for this trend comes from a retrospective review of textbooks on children's intelligence testing (cf. Kamphaus, 1993, 2001; Kaufman, 1979, 1994; Sattler, 1974, 1982, 1988, 1992, 2001). In earlier publications, page after page of empirical studies extolled the importance, and clinical necessity, of interpreting telltale subtest configurations. More recent publications, by contrast, display far fewer affirmative citations, and they lead to one of two conclusions: (a) Empirical support is beginning to wane, or (b) subtest analysis is so universally corroborated that there is no need for referencing. 
But alas, as demonstrated clearly by the empirical literature just cited, the latter proposition is untrue.
Like most practitioners, we would agree that at least some abilities beyond g are clinically relevant. Detterman (2002) indicated that g accounts for only 25% to 50% of the variance in achievement, leaving 50% to 75% of the variance to be explained by other constructs. Following this logic, Brody (2002) reported that "no one believes that g is the only construct needed to describe individual differences in intelligence" (p. 122).
Factor scores are leading prospects in the provision of information beyond g. Factor scores are more valid than conceptual subtest groupings. Unlike the inductively derived subtest organizations of Sattler (2001) and Kaufman (1994), factor scores retain considerable construct validity because they are formed empirically on the basis of factor analysis. Each factor score in a test battery also accounts for more variance than that available from individual subtest scores. As a result, factor scores are more reliable than single subtest scores (as per the Spearman-Brown prophecy). Furthermore, because factor scores represent phenomena beyond the sum of method variance, measurement error, and subtest specificity, they potentially escape the myriad drawbacks that beset attempts to interpret subtest scores.
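The reliability argument above can be illustrated with the Spearman-Brown prophecy formula, which predicts the reliability of a composite built from k parallel components that each have reliability r: r_k = k * r / (1 + (k - 1) * r). The following is a minimal sketch, not the authors' analysis. The function name `spearman_brown` and the subtest reliability of .80 are hypothetical values chosen for illustration, not figures from the WISC-IV, and real subtests only approximate the parallel-measures assumption that the formula requires.

```python
def spearman_brown(reliability: float, k: float) -> float:
    """Predicted reliability of a composite of k parallel components.

    reliability: reliability of a single component (0 to 1).
    k: number of parallel components combined (k = 1 returns the input).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * reliability / (1.0 + (k - 1.0) * reliability)


# Hypothetical single-subtest reliability (an assumption, not a WISC-IV value).
subtest_reliability = 0.80

# Compare one subtest with composites of two and three subtests.
for k in (1, 2, 3):
    predicted = spearman_brown(subtest_reliability, k)
    print(f"{k} subtest(s): predicted reliability = {predicted:.3f}")
```

Under these assumptions, the predicted reliability rises as subtests are combined, which is the sense in which a factor score built from several subtests is expected to be more reliable than any single subtest score.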
Source: HighBeam Research, Distinctions without a difference: the utility of observed versus...