Sensitivity of the Linear Logistic Test Model to Misspecification of the Weight Matrix
Baker, Frank B.
doi: 10.1177/014662169301700301
Under the linear logistic test model, a weight is assigned to each cognitive operation used to respond to an item. The allocation of these weights is open to misspecification that can result in faulty estimates of the basic parameters. The effect on root mean squares (RMSs) of the difference between the parameter estimates obtained under misspecification conditions and those obtained under correct specification conditions was examined. Six levels of misspecification and four sample sizes were used. Even a small number of errors in the weight specifications resulted in large RMS values. However, weight matrices with a high proportion of nonzero elements tended to yield RMSs that were approximately half as large as those with a small number of nonzero elements. Although sample size had some effect on the RMS values, it was quite small compared to that due to the level of misspecification of the weights. The results suggest that because specifying the elements in the weight matrix is a subjective process, it must be done with great care.
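As a rough illustration of the weight structure this abstract describes, the sketch below assumes the usual decomposition in which each item difficulty is a weighted sum of basic (operation) parameters, b = W η. The weight matrix, the parameter values, and the least-squares recovery step are all illustrative assumptions. The study itself used model-based estimation over six misspecification levels and four sample sizes, which is not reproduced here.

```python
import numpy as np

def item_difficulties(W, eta):
    """Item difficulties implied by weights W (items x operations) and basic parameters eta."""
    return W @ eta

def recover_basic_params(W, b):
    """Least-squares stand-in for estimating eta from difficulties b (an assumption, not the study's estimator)."""
    est, *_ = np.linalg.lstsq(W, b, rcond=None)
    return est

def rms(a, b):
    """Root mean square of the elementwise difference between two vectors."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Hypothetical 6-item, 3-operation weight matrix and basic parameters.
W_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 0],
                   [0, 1, 1],
                   [1, 0, 1]], dtype=float)
eta_true = np.array([0.5, -1.0, 1.5])
b = item_difficulties(W_true, eta_true)

# Misspecify a single weight: item 4 is wrongly said to use operation 3.
W_bad = W_true.copy()
W_bad[3, 2] = 1.0

rms_ok = rms(recover_basic_params(W_true, b), eta_true)
rms_bad = rms(recover_basic_params(W_bad, b), eta_true)
print(f"RMS with correct weights: {rms_ok:.4f}")
print(f"RMS with one wrong weight: {rms_bad:.4f}")
```

Even in this toy setting, one wrong entry in the weight matrix shifts the recovered basic parameters, which is the kind of sensitivity the abstract reports.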
Estimating Rater Agreement in 2 x 2 Tables: Correction for Chance and Intraclass Correlation
Blackman, Nicole J-M.; Koval, John J.
doi: 10.1177/014662169301700302
Many estimators of the measure of agreement between two dichotomous ratings of a person have been proposed. The results of Fleiss (1975) are extended, and it is shown that four estimators, namely Scott's (1955) π coefficient, Cohen's (1960) κ, Maxwell & Pilliner's (1968) r₁₁, and Mak's (1988) ρ, are interpretable both as chance-corrected measures of agreement and as intraclass correlation coefficients for different ANOVA models. Relationships among these estimators are established for finite samples. Under Kraemer's (1979) model, it is shown that these estimators are equivalent in large samples, and that the equations for their large sample variances are equivalent.
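Two of the estimators named above have simple closed forms for a 2 x 2 table, and a short sketch may help make "chance-corrected agreement" concrete. The code below uses the standard textbook definitions of Cohen's κ and Scott's π. The function name and the example cell counts are hypothetical, and the other two estimators in the abstract are not covered.

```python
def agreement_2x2(a, b, c, d):
    """Return (kappa, pi) for a 2 x 2 agreement table.

    a: both raters positive, b: rater 1 positive / rater 2 negative,
    c: rater 1 negative / rater 2 positive, d: both raters negative.
    """
    n = a + b + c + d
    p_obs = (a + d) / n          # observed proportion of agreement
    p1 = (a + b) / n             # rater 1 proportion positive
    p2 = (a + c) / n             # rater 2 proportion positive

    # Cohen's kappa: chance agreement uses each rater's own marginals.
    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)
    # Scott's pi: chance agreement uses the marginals averaged over raters.
    p_bar = (p1 + p2) / 2
    pe_pi = p_bar ** 2 + (1 - p_bar) ** 2

    kappa = (p_obs - pe_kappa) / (1 - pe_kappa)
    pi = (p_obs - pe_pi) / (1 - pe_pi)
    return kappa, pi

# Hypothetical counts for 100 rated persons.
kappa, pi = agreement_2x2(40, 10, 5, 45)
print(f"kappa = {kappa:.4f}, pi = {pi:.4f}")
```

When the two raters have identical marginal proportions (b equal to c), the two chance terms coincide and the two coefficients are equal.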
Standard Errors of Levine Linear Equating
Hanson, Bradley A.; Zeng, Lingjia; Kolen, Michael J.
doi: 10.1177/014662169301700303
The delta method was used to derive standard errors (SEs) of the Levine observed score and Levine true score linear equating methods. SEs with a normality assumption as well as without a normality assumption were derived. Data from two forms of a test were used as an example to evaluate the derived SEs of equating. Bootstrap SEs also were computed for the purpose of comparison. The SEs derived without the normality assumption and the bootstrap SEs were very close. For the skewed score distributions, the SEs derived with the normality assumption differed from the SEs derived without the normality assumption and the bootstrap SEs.
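The bootstrap comparison mentioned above can be sketched in a few lines. Note the swap: the code below does not implement either Levine method. It uses plain mean-and-standard-deviation linear equating for two independent groups as a simpler stand-in, and the simulated scores, function names, and number of replications are all assumptions made for illustration.

```python
import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Map a Form X score to the Form Y scale by matching means and standard deviations."""
    mx, my = scores_x.mean(), scores_y.mean()
    sx, sy = scores_x.std(ddof=1), scores_y.std(ddof=1)
    return my + (sy / sx) * (x - mx)

def bootstrap_se(x, scores_x, scores_y, n_boot=500, seed=0):
    """Bootstrap standard error of the equated score at x, resampling each group with replacement."""
    rng = np.random.default_rng(seed)
    reps = np.empty(n_boot)
    for r in range(n_boot):
        bx = rng.choice(scores_x, size=scores_x.size, replace=True)
        by = rng.choice(scores_y, size=scores_y.size, replace=True)
        reps[r] = linear_equate(x, bx, by)
    return float(reps.std(ddof=1))

# Simulated (hypothetical) score samples for the two forms.
rng = np.random.default_rng(42)
form_x = rng.normal(loc=50.0, scale=10.0, size=400)
form_y = rng.normal(loc=52.0, scale=9.0, size=400)

se_at_60 = bootstrap_se(60.0, form_x, form_y)
print(f"bootstrap SE of the equated score at x = 60: {se_at_60:.3f}")
```

A bootstrap estimate of this kind makes no distributional assumption about the scores, which is consistent with the abstract's finding that it tracked the SEs derived without the normality assumption.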
Equating Tests Under The Nominal Response Model
Baker, Frank B.
doi: 10.1177/014662169301700305
Under item response theory, test equating involves finding the coefficients of a linear transformation of the metric of one test to that of another. A procedure for finding these equating coefficients when the items in the two tests are nominally scored was developed. A quadratic loss function based on the differences between response category probabilities in the two tests is employed. The gradients of this loss function needed by the iterative multivariate search procedure used to obtain the equating coefficients were derived for the nominal response case. Examples of both horizontal and vertical equating are provided. The empirical results indicated that tests scored under a nominal response model can be placed on a common metric in both horizontal and vertical equatings.
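The idea of a linear metric transformation and a quadratic loss on category probabilities can be sketched as follows. This assumes the usual slope-intercept form of the nominal response model, with category logits z_k = a_k θ + c_k, and a transformation θ* = Aθ + K. The symbol names, the item values, and the θ grid are illustrative assumptions, and the gradient-based search from the article is not implemented.

```python
import numpy as np

def nrm_probs(theta, a, c):
    """Category probabilities under the nominal response model (softmax of a*theta + c)."""
    z = a * theta + c
    z = z - z.max()                 # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum()

def transform_item(a, c, A, K):
    """Re-express item parameters on the metric theta* = A*theta + K."""
    a_star = a / A
    c_star = c - a_star * K
    return a_star, c_star

def quadratic_loss(A, K, items_from, items_to, thetas):
    """Sum of squared differences in category probabilities over items and a theta grid."""
    total = 0.0
    for (a_f, c_f), (a_t, c_t) in zip(items_from, items_to):
        a_s, c_s = transform_item(a_f, c_f, A, K)
        for th in thetas:
            diff = nrm_probs(th, a_t, c_t) - nrm_probs(th, a_s, c_s)
            total += float(diff @ diff)
    return total

# One hypothetical three-category item, and the same item on a second metric.
items_from = [(np.array([-1.0, 0.0, 1.0]), np.array([0.0, 0.5, -0.5]))]
A_true, K_true = 1.2, 0.5
items_to = [transform_item(a, c, A_true, K_true) for a, c in items_from]
thetas = np.linspace(-3.0, 3.0, 13)

print(quadratic_loss(A_true, K_true, items_from, items_to, thetas))  # essentially zero
print(quadratic_loss(1.0, 0.0, items_from, items_to, thetas))        # positive
```

The loss vanishes at the coefficients that generated the second metric and is positive elsewhere, which is what a search over A and K would exploit.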
A Hyperbolic Cosine Latent Trait Model For Unfolding Dichotomous Single-Stimulus Responses
Andrich, David; Luo, Guanzhong
doi: 10.1177/014662169301700307
Social-psychological variables are typically measured using either cumulative or unfolding response processes. In the former, the greater the location of a person relative to the location of a stimulus on the continuum, the greater the probability of a positive response; in the latter, the closer the location of the person to the location of the statement, irrespective of direction, the greater the probability of a positive response. Formal probability models for these processes are, respectively, monotonically increasing and single-peaked as a function of the location of the person relative to the location of the statement. In general, these models have been considered to be independent of each other. However, if statements constructed on the basis of a cumulative model have three ordered response categories, the response function within the statement for the middle category is in fact single-peaked. Using this observation, a unidimensional model for responses to statements that have an unfolding structure was constructed from the cumulative Rasch model for ordered response categories. A location and unit of measurement parameter exist for each statement. A joint maximum likelihood estimation procedure was investigated. Analysis of a small simulation study and a small real dataset showed that the model is readily applicable.
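The single-peaked response function described above can be sketched numerically. The abstract does not state the formula, so the code assumes the form usually cited for the hyperbolic cosine model, P(positive) = exp(θ) / (exp(θ) + 2 cosh(β − δ)), where β is the person location, δ the statement location, and θ the statement's unit parameter. The function name and the numeric values are hypothetical, and the formula should be checked against the article before relying on it.

```python
import math

def hcm_prob(beta, delta, theta):
    """Probability of a positive response under the assumed hyperbolic cosine form."""
    return math.exp(theta) / (math.exp(theta) + 2.0 * math.cosh(beta - delta))

# Hypothetical statement located at delta = 0.5 with unit parameter theta = 1.0.
delta, theta = 0.5, 1.0
for beta in (-2.0, -0.5, 0.5, 1.5, 3.0):
    print(f"person at {beta:+.1f}: P(positive) = {hcm_prob(beta, delta, theta):.3f}")
```

Under this form the probability peaks when the person and statement locations coincide and falls off symmetrically in either direction, matching the unfolding process the abstract describes.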
A Method for Severely Constrained Item Selection in Adaptive Testing
Stocking, Martha L.; Swanson, Len
doi: 10.1177/014662169301700308
Previous attempts at incorporating expert test construction practices into computerized adaptive testing paradigms are described. A new method is presented for incorporating a large number of constraints on adaptive item selection. The methodology emulates the test construction practices of expert test specialists, which is a necessity if computerized adaptive testing is to compete with conventional tests. Two examples, one for a verbal measure and the other for a quantitative measure, are provided of the successful use of the proposed method in designing adaptive tests.