The evaluation of individual work performance has long posed a perplexing dilemma to researchers and practitioners alike. There is little debate that more often than not, the most complete, readily available, and efficient measure of an individual's performance involves ratings of that performance by another individual. The thrust of the dilemma, then, is how to reduce the subjectivity inherent in performance ratings. Due in large part to a strong psychometric emphasis, the development of "better" rating scale formats has received a great deal of attention as an approach to resolving this dilemma. As noted by Steiner, Rain, and Smalley (1993), research focusing on rating scale development probably peaked in the 1960s and 1970s with the development of the Behaviorally Anchored Rating Scale (Smith & Kendall, 1963), the Behavioral Observation Scale (Latham & Wexley, 1977), and the Mixed Standard Scale (Blanz & Ghiselli, 1972). Unfortunately, research examining the efficacy of the different rating scale formats generally indicated that ratings were not affected by changes in rating scale format. This finding was so pervasive that Landy and Farr (1980) called for a moratorium on rating format research. Subsequently, very little research in the past decade has focused on rating format.
One exception to the lack of research on rating format development has been the development of rating scales that solicit ratings pertaining not only to an individual's level of performance but the distribution of that performance as well. One such system, labeled Performance Distribution Assessment (PDA) (Kane, 1983; 1986), is based on the distributional measurement model postulated by Kane and Lawler (1979). An important characteristic of this model is a focus on the range of performance observed. Specifically, the model stipulates that not only is the level of performance important, but the fluctuation or variance in performance must also be considered. For example, two individuals may both be appropriately characterized as "average performers"; however, if one is consistently average and the other alternates between very poor and very good, very different pictures emerge with respect to the individuals' performance. Thus, distributional ratings have the potential for providing a performance rating system with several practical advantages over other rating systems. First and foremost among these is that the information generated with distributional ratings gives a measurement of performance variability as well as performance level. In addition, this information is solicited in such a way as to not require substantially more effort or time than other rating formats.
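The "two average performers" contrast above can be made concrete with a brief numeric sketch. The scores and the 1-5 scale below are hypothetical, invented purely for illustration; the point is that the two series share the same mean while differing sharply in variability, which is exactly the information a level-only rating discards.

```python
from statistics import mean, pstdev

# Hypothetical weekly performance scores on a 1-5 scale for two ratees.
consistent = [3, 3, 3, 3, 3, 3, 3, 3]  # always average
erratic = [1, 5, 1, 5, 1, 5, 1, 5]     # alternates very poor / very good

# Both ratees have the same performance level (mean = 3.0),
# but very different performance variability (SD = 0.0 vs. 2.0).
for name, scores in [("consistent", consistent), ("erratic", erratic)]:
    print(f"{name}: level = {mean(scores):.1f}, variability = {pstdev(scores):.1f}")
```

A global rating scale would likely assign both ratees the same score; a distributional format is designed to preserve the difference in the second quantity.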
Although distributional rating formats represent an interesting approach to the evaluation of work performance, very little research exists pertaining to this approach. We found only two studies to date that directly evaluated distributional rating scales (i.e., Jako & Murphy, 1990; Steiner et al., 1993). One reason for this lack of research stems from the dilemma of how to compare and evaluate the efficacy of rating scale formats. Previous attempts to compare and evaluate rating scales focused on determining which produced the "best" ratings. This research most often took a criterion-related approach to the evaluation of performance ratings, the criterion being either the psychometric characteristics of the ratings (i.e., rating errors) or the accuracy of the ratings. As noted, this line of research did not prove to be very fruitful. However, one of the benefits of distributional ratings is that in addition to a single measurement of a ratee's performance level with respect to a particular performance domain (i.e., performance dimension), a measure of the variability or consistency of the ratee's performance within the dimension is also provided. Thus, distributional ratings would be preferable to more traditional global ratings to the extent that they reflect meaningful fluctuations of ratee performance within dimensions and thus provide information different from that provided by global ratings.
With this in mind, a number of questions emerge with respect to distributional ratings. The first question is the extent to which raters are sensitive to different distributions of performance. Are raters able to recognize and report differences with respect to the variability of performance? Steiner et al. (1993) addressed this question and found that raters using a distributional rating format were in fact sensitive to differences in the variability of performance, even when the mean level of performance was the same. These results add to other evidence that raters are capable of both detecting and providing reasonably accurate estimates of the variability in social information (Holland, Holyoak, Nisbett & Thagard, 1986).
A second question focuses on the extent to which information obtained using ratings from a distributional format differs from that obtained using more typical global rating scales. Here, it is important to consider the nature of rater information processing requirements underlying distributional ratings as well as those underlying global rating scales. The latter requires raters to recognize and encode relevant behaviors and then later to recall, categorize, and intuitively integrate information about the frequency of relevant behaviors in order to generate a rating. Distributional rating formats, on the other hand, require raters to recognize and encode relevant behaviors and then later to report actual or estimated frequencies of different categories of behavior. Advocates of the distributional rating approach (e.g., Kane, 1983; 1986) postulate that distributional ratings may minimize the cognitive demands on the rater by asking for judgments about the distribution of performance, rather than more global evaluative …
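The processing difference described above can be sketched numerically. In a rough illustration of the distributional approach (the five-point scale and the specific percentages are hypothetical, not taken from Kane's PDA instrument), the rater reports only the estimated frequency of each category of performance; the performance level and variability then fall out as the weighted mean and variance of that reported distribution, rather than being intuitively integrated by the rater.

```python
# Hypothetical distributional rating: the rater estimates the proportion
# of occasions on which the ratee performed at each level of a 1-5 scale.
levels = [1, 2, 3, 4, 5]
reported_pct = [0.05, 0.10, 0.50, 0.25, 0.10]  # rater's frequency judgments; sums to 1.0

# Performance level: weighted mean of the reported distribution.
level = sum(l * p for l, p in zip(levels, reported_pct))

# Performance variability: weighted variance around that level.
variability = sum(p * (l - level) ** 2 for l, p in zip(levels, reported_pct))

print(f"level = {level:.2f}, variability = {variability:.4f}")
```

Note that the rater never produces an overall evaluation; the evaluative summary is computed from the frequency judgments, which is the sense in which advocates argue the format reduces the integration burden on the rater.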