doi: 10.1177/014662168300700102 pmid: N/A
The ways in which test specialists have defined content validity are reviewed and evaluated in order to determine the manner in which this validity might best be viewed. These specialists have differed in their definitions, variously associating content validity with (1) the sampling adequacy of test content, (2) the sampling adequacy of test responses, (3) the relevance of test content to a content universe, (4) the relevance of test responses to a behavioral universe, (5) the clarity of content domain definitions, and (6) the technical quality of test items. After the theoretical and practical soundness of defining content validity in terms of each of these notions is evaluated, it is concluded that these notions are best regarded as definitions of concepts other than content validity. Since no appropriate means of defining this type of validity is therefore found, it is concluded that content validity is not a useful term for test specialists to retain in their vocabulary.
doi: 10.1177/014662168300700104 pmid: N/A
The two-parameter graded response latent trait model was applied to data collected from a conventionally constructed Likert-type attitude scale. Comparisons were made of both the person latent trait estimates and the item parameter estimates with their counterparts from the conventional scaling method. Also studied were the goodness of fit of the graded response model and the information function feature of the model indicating the precision of measurement at each level of the attitude trait continuum. The results demonstrated that the graded response model could be successfully used to perform attitude measurement for Likert scales.
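As a rough illustration of the kind of model named in this abstract, the sketch below computes response-category probabilities for one Likert-type item under the commonly used logistic form of the graded response model. The function name `grm_category_probs`, the parameter values, and the five-category item are assumptions made for illustration; they are not taken from the article, and the article's own parameterization and estimation method may differ.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a logistic graded response model.

    theta      -- person's latent trait level
    a          -- item discrimination (hypothetical value supplied by the caller)
    thresholds -- increasing category boundaries b_1 < ... < b_{m-1}
                  for an item with m ordered response categories
    """
    # Cumulative curves: probability of responding in category k or higher.
    # The lowest boundary is fixed at 1 and the one past the top category at 0.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical five-category Likert item evaluated at a neutral trait level.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.5, -0.5, 0.5, 1.5])
```

Because each category probability is a difference of adjacent cumulative curves, the probabilities for an item sum to one at every trait level.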
doi: 10.1177/014662168300700105 pmid: N/A
Of interest in this study was the use of item response models for obtaining accurate examinee domain score estimates and for increasing the probabilities with which examinees are assigned correctly to mastery states with criterion-referenced test scores. Specifically, the purpose of this investigation was to compare the one-, two-, and three-parameter logistic test models for estimating domain scores and making mastery/nonmastery decisions. Computer simulation methods were used to recover a set of true domain scores with each of the logistic test models under a variety of testing conditions. Also, the percent of times the use of each model led to decisions which were consistent with decisions made with the true domain scores was studied. The one-parameter and three-parameter models produced highly comparable results for middle and high ability examinees, while for low ability examinees, the more general model always performed somewhat better.
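A minimal sketch of the three logistic models compared in this abstract, assuming the standard item response function in which the one- and two-parameter models are special cases of the three-parameter model. The domain score is taken here as the mean probability of a correct response across the items in the domain, and the mastery decision as a comparison with a cutoff; the function names, item parameters, and cutoff are hypothetical, and the scaling constant sometimes included in the exponent is omitted. This is not the article's simulation design.

```python
import math


def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the three-parameter logistic model.

    b -- item difficulty
    a -- item discrimination (a = 1 with c = 0 gives the one-parameter form)
    c -- lower asymptote, or pseudo-guessing level (c = 0 gives the two-parameter form)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def domain_score(theta, items):
    """Expected proportion correct over a set of items at ability theta."""
    return sum(p_correct(theta, **item) for item in items) / len(items)


def is_master(theta, items, cutoff):
    """Mastery decision: the domain score meets or exceeds the cutoff."""
    return domain_score(theta, items) >= cutoff


# Hypothetical item parameters, one item per model form.
items = [
    {"b": -0.5},                      # one-parameter item
    {"b": 0.0, "a": 1.4},             # two-parameter item
    {"b": 0.8, "a": 1.1, "c": 0.2},   # three-parameter item
]
score_low = domain_score(-2.0, items)
score_high = domain_score(2.0, items)
```

The lower asymptote `c` matters most at low ability, where guessing keeps the probability of a correct response from falling toward zero.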
doi: 10.1177/014662168300700106 pmid: N/A
Most tasks used to gather information for multidimensional scaling analysis are quite difficult for people to perform. An experiment was run to determine if systematic limits existed in such data collection situations and to determine the form that these limitations assumed. The solution obtained from a complete ordering of stimuli to targets, using the conditional rank order paradigm, was compared to solutions obtained from certain partial orders, constructed from the complete orders by setting certain rankings equal. The partial orders were found to reproduce the complete order solution quite accurately when about one half of the information was eliminated. The information eliminated about similar items produced more differences in the obtained solutions than did the information about dissimilar stimuli. Suggestions about efficient techniques for gathering information for multidimensional scaling purposes are discussed.
Howard, George S.; Obledo, Fernando H.; Cole, David A.; Maxwell, Scott E.
doi: 10.1177/014662168300700108 pmid: N/A
The traditional procedure for obtaining judged ratings, to ascertain if treatment-related change has occurred, involves the randomization of the materials to be rated. An alternative approach (linked judgments) is investigated as a potential solution to certain instrumentation-related threats to statistical conclusion validity of the incumbent rating procedure. Data from a weight reduction study are presented which suggest that linked raters' judgments provide both a more powerful and a more valid index of treatment effectiveness than the traditional procedure.
Markham, Steven E.; Dansereau, Fred; Alutto, Joseph A.; Dumas, MacDonald
doi: 10.1177/014662168300700109 pmid: N/A
Problems in drawing inferences about leadership phenomena when multiple units of analysis (groups and individuals) simultaneously exist in a data set are addressed. Using a technique recommended by Dansereau and Dumas (1977), within-unit and between-unit sources of covariation are examined for data containing matched superior-subordinate reports. In this data set matched superior-subordinate reports were not significantly correlated at the individual level. When supervisory group differences were held constant, however, the relationships between these matched reports were significantly greater than zero. This convergent validity within supervisory units suggests an approach to validity which is not included in traditional theories of leadership.
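The idea of separating within-unit from between-unit covariation can be sketched with a generic decomposition: each score is split into its group mean (between part) and its deviation from that mean (within part), and the two parts are correlated separately. The function name `within_between`, the data, and the supervisor labels below are invented for illustration, and the sketch is not claimed to reproduce the specific procedure of Dansereau and Dumas (1977).

```python
import math
from collections import defaultdict


def _sp(u, v):
    """Sum of cross-products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def _corr(u, v):
    """Correlation of two lists already expressed as deviation scores."""
    return _sp(u, v) / math.sqrt(_sp(u, u) * _sp(v, v))


def within_between(x, y, groups):
    """Split the x-y correlation into between-group and within-group parts."""
    n = len(x)
    sums_x, sums_y, counts = defaultdict(float), defaultdict(float), defaultdict(int)
    for xi, yi, g in zip(x, y, groups):
        sums_x[g] += xi
        sums_y[g] += yi
        counts[g] += 1
    grand_x, grand_y = sum(x) / n, sum(y) / n

    # Total deviations, between parts (group mean minus grand mean),
    # and within parts (score minus own group mean).
    tx = [xi - grand_x for xi in x]
    ty = [yi - grand_y for yi in y]
    bx = [sums_x[g] / counts[g] - grand_x for g in groups]
    by = [sums_y[g] / counts[g] - grand_y for g in groups]
    wx = [xi - sums_x[g] / counts[g] for xi, g in zip(x, groups)]
    wy = [yi - sums_y[g] / counts[g] for yi, g in zip(y, groups)]

    return {
        "r_total": _corr(tx, ty),
        "r_between": _corr(bx, by),
        "r_within": _corr(wx, wy),
        # Etas: share of each variable's variation that lies between or within groups.
        "eta_bx": math.sqrt(_sp(bx, bx) / _sp(tx, tx)),
        "eta_by": math.sqrt(_sp(by, by) / _sp(ty, ty)),
        "eta_wx": math.sqrt(_sp(wx, wx) / _sp(tx, tx)),
        "eta_wy": math.sqrt(_sp(wy, wy) / _sp(ty, ty)),
    }


# Invented matched reports for six subordinates under three supervisors.
supervisor = ["A", "A", "B", "B", "C", "C"]
superior_report = [1.0, 2.0, 4.0, 6.0, 7.0, 9.0]
subordinate_report = [2.0, 1.0, 5.0, 4.0, 9.0, 8.0]
result = within_between(superior_report, subordinate_report, supervisor)
```

The individual-level correlation equals the between component (eta_bx × eta_by × r_between) plus the within component (eta_wx × eta_wy × r_within), which is why a within-unit relationship can be masked or reversed by between-unit differences when only the overall correlation is examined.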
Telonson, Paul A.; Alexander, Ralph A.; Barrett, Gerald V.
doi: 10.1177/014662168300700110 pmid: N/A
This study compared three techniques for scoring a biographical information blank (horizontal percent method, vertical percent method, and rare response weighting) against various criteria for field sales representatives. The comparisons were cross-validated over five consecutive time periods. The results showed that the rare weighting technique significantly predicted criterion group membership better than chance. Neither the horizontal nor vertical percent methods predicted criterion group membership better than chance. Based on predictive efficiency, the rare weighting technique was found to be superior to the other two techniques.