Probit Latent Class Analysis with Dichotomous or Ordered Category Measures: Conditional Independence/Dependence Models
Uebersax, John S.
doi: 10.1177/01466219922031400 pmid: N/A
Flexible methods that relax restrictive conditional independence assumptions of latent class analysis (LCA) are described. Dichotomous and ordered category manifest variables are viewed as discretized latent continuous variables. The latent continuous variables are assumed to have a mixture-of-multivariate-normals distribution. Within a latent class, conditional dependence is modeled as the mutual association of all or some latent continuous variables with a continuous latent trait (or, in special cases, multiple latent traits). The relaxation of conditional independence assumptions allows LCA to better model natural taxa. Comparisons of specific restricted and unrestricted models permit statistical tests of specific aspects of latent taxonic structure. Latent class, latent trait, and latent distribution analysis can be viewed as special cases of the mixed latent trait model. The relationship between the multivariate probit mixture model proposed here and Rost’s (1990, 1991) mixed Rasch model is discussed. Two studies illustrate different uses of the proposed model.
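As a rough illustration of the "discretized latent continuous variable" idea, the sketch below computes category probabilities for a unit-variance normal latent variable cut at ordered thresholds, then combines them into a response-pattern probability for a latent class mixture. It covers only the conditional independence case; the conditional dependence extension described in the abstract is not attempted. The function names, the unit-variance convention, and the example values are assumptions made for illustration and are not taken from the article.

```python
from math import erf, sqrt, inf


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def category_probs(mean, thresholds):
    """P(Y = k) when a latent N(mean, 1) variable is cut at ordered thresholds.

    With m thresholds there are m + 1 observed categories (0 .. m).
    """
    cuts = [-inf] + list(thresholds) + [inf]
    return [norm_cdf(cuts[k + 1] - mean) - norm_cdf(cuts[k] - mean)
            for k in range(len(cuts) - 1)]


def pattern_prob(pattern, class_weights, class_means, thresholds):
    """Marginal probability of a response pattern under a latent class
    mixture, ASSUMING conditional independence within each class.

    pattern       : observed category index for each manifest variable
    class_weights : latent class prevalences (sum to 1)
    class_means   : class_means[c][j] = latent mean of variable j in class c
    thresholds    : thresholds[j] = ordered cut points for variable j
    """
    total = 0.0
    for weight, means in zip(class_weights, class_means):
        prob = weight
        for j, y in enumerate(pattern):
            prob *= category_probs(means[j], thresholds[j])[y]
        total += prob
    return total


# Hypothetical example: two classes, two dichotomous items cut at 0.
weights = [0.6, 0.4]
means = [[-1.0, -0.5], [1.0, 1.5]]
cuts = [[0.0], [0.0]]
p_11 = pattern_prob((1, 1), weights, means, cuts)
```

Summing `pattern_prob` over all four response patterns in the hypothetical example returns 1, which is a quick check that the thresholds and weights define a proper distribution.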
Distinguishing Constant and Dimension-Dependent Interaction: A Simulation Study
Tuerlinckx, Francis; De Boeck, Paul
doi: 10.1177/01466219922031419 pmid: N/A
A simulation study was conducted to determine how well two models for local item dependency (LID), called interaction models, could be distinguished. The models examined were the constant-order interaction model (COIM) and the dimension-dependent interaction model (DDIM). Data were simulated according to the latter model. Three factors were manipulated: sample size, the weight of the difference between the latent trait value of the examinee and the interaction parameter, and the value of the interaction parameter. Results indicated that (1) if the interaction parameter is not too extreme, the COIM will be rejected in favor of the true model (the Rasch model fit poorly for all levels of the interaction parameter); (2) a larger weight of the difference between the latent trait value and the interaction parameter facilitated the rejection of the COIM, although finding the true weight required a large sample size; and (3) the value for the interaction parameter with an optimal discrimination between the COIM and DDIM was not 0, as expected.
A Description and Demonstration of the Polytomous-DFIT Framework
Flowers, Claudia P.; Oshima, T. C.; Raju, Nambury S.
doi: 10.1177/01466219922031437 pmid: N/A
Raju, van der Linden, & Fleer (1995) proposed an item response theory-based, parametric differential item functioning (DIF) and differential test functioning (DTF) procedure known as differential functioning of items and tests (DFIT). According to Raju et al., the DFIT framework can be used with unidimensional and multidimensional data that are scored dichotomously and/or polytomously. This study examined the polytomous-DFIT framework. Factors manipulated in the simulation were: (1) length of test (20 and 40 items), (2) focal group distribution, (3) number of DIF items, (4) direction of DIF, and (5) type of DIF. The findings provided promising results and indicated directions for future research. The polytomous-DFIT framework was effective in identifying DTF and DIF for the simulated conditions. The DTF index did not perform as consistently as the DIF index. The findings are similar to those of unidimensional and multidimensional DFIT studies.
The Null Distribution of Person-Fit Statistics for Conventional and Adaptive Tests
Van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.
doi: 10.1177/01466219922031446 pmid: N/A
Several person-fit statistics have been proposed to detect item score patterns that do not fit an item response theory model. To classify response patterns as misfitting, the distribution of a person-fit statistic is needed. The theoretical null distributions of several fit statistics have been derived for paper-and-pencil (P&P) tests. However, it is unknown whether these distributions also hold for computerized adaptive tests (CAT). A three-part simulation study was conducted. In the first study, the theoretical distribution of the lz statistic across trait (θ) levels for CAT and P&P tests was investigated. The distribution of the l*z statistic proposed by Snijders (in press) was also investigated. Results indicated that the distributions of both lz and l*z differed from the theoretical distribution in CAT. The second study examined the distributions of lz and l*z using simulation. These simulated distributions, when based on estimated θ, were found to be problematic in CAT. In the third study, the detection rates of l*z and lz were compared. The rates for both statistics were found to be similar in most cases.
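For readers unfamiliar with lz, the following minimal sketch computes the standardized log-likelihood person-fit statistic in its usual form for dichotomous items: the log-likelihood of the observed pattern, minus its expectation, divided by its standard deviation. The item success probabilities are taken as given (in practice they would come from a fitted IRT model evaluated at a known or estimated θ). The function name and example values are illustrative assumptions, and the l*z correction attributed to Snijders is not implemented here.

```python
from math import log, sqrt


def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic for 0/1 item scores.

    responses : observed item scores (0 or 1)
    probs     : model-implied probabilities of a correct response,
                each strictly between 0 and 1
    """
    # Log-likelihood of the observed response pattern.
    l0 = sum(u * log(p) + (1 - u) * log(1 - p)
             for u, p in zip(responses, probs))
    # Expectation and variance of l0 under the model.
    expected = sum(p * log(p) + (1 - p) * log(1 - p) for p in probs)
    variance = sum(p * (1 - p) * log(p / (1 - p)) ** 2 for p in probs)
    # Note: variance is 0 if every probability equals 0.5.
    return (l0 - expected) / sqrt(variance)


# Hypothetical probabilities for three items, ordered easy to hard.
item_probs = [0.8, 0.6, 0.3]
lz_consistent = lz_statistic([1, 1, 0], item_probs)  # easy items right
lz_aberrant = lz_statistic([0, 0, 1], item_probs)    # only hard item right
```

Large negative values flag patterns that are unlikely under the model; in the hypothetical example the aberrant pattern yields a negative value and the consistent pattern a positive one.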
A Rationale for Defining Achievement Levels Using IRT-Estimated Domain Scores
Schulz, E. Matthew; Kolen, Michael J.; Nicewander, W. Alan
doi: 10.1177/01466219922031464 pmid: N/A
A new procedure for defining achievement levels on continuous scales was developed using aspects of Guttman scaling and item response theory. This procedure assigns examinees to levels of achievement when the levels are represented by separate pools of multiple-choice items. Items were assigned to levels on the basis of their content and hierarchically defined level descriptions. The resulting level response functions were well-spaced and noncrossing. This result allowed well-spaced levels of achievement to be defined by a common percent-correct standard of mastery on the level pools. Guttman patterns of mastery could be inferred from level scores. The new scoring procedure was found to have higher reliability, higher classification consistency, and lower classification error, when compared to two Guttman scoring procedures.
Testing the Equality of Two Independent α Coefficients Adjusted by the Spearman-Brown Formula
Alsawalmeh, Yousef M.; Feldt, Leonard S.
doi: 10.1177/01466216990234006 pmid: N/A
An approximate statistical test is developed for the hypothesis of equality between the Spearman-Brown extrapolations of two independent values of Cronbach’s alpha reliability coefficient (α). This test assumes that the units added to or deleted from each instrument are classically parallel to the units included in the original version of each instrument. The projections for Tests 1 and 2 are based on lengthening or shortening factors of K1 and K2, which may or may not be equal. Special cases of this test include applications in which the projected values are intraclass coefficients or only one of the instruments is presumed to be altered in length. Monte Carlo simulations demonstrated that the procedure effectively controls Type I error even when the original αs are based on as few as two test parts or two raters.
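The Spearman-Brown projection underlying this comparison can be sketched as follows. The code shows only the standard projection formula, Kα / (1 + (K − 1)α), applied to two instruments with different length factors; it does not implement the approximate significance test developed in the article. The function name and the example reliabilities are illustrative assumptions.

```python
def spearman_brown(alpha, k):
    """Projected reliability when a test is lengthened (k > 1) or
    shortened (0 < k < 1) by a factor k using parallel parts."""
    return k * alpha / (1.0 + (k - 1.0) * alpha)


# Hypothetical values: Test 1 doubled in length, Test 2 halved.
projected_1 = spearman_brown(0.50, 2.0)
projected_2 = spearman_brown(0.80, 0.5)
```

In this hypothetical case both projections come out at about 0.667, which illustrates the kind of situation the test addresses: two instruments with different observed α values and different factors K1 and K2 whose projected reliabilities may nonetheless be equal.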
Software Note
Childs, Ruth A.; Chen, Wen-Hung
doi: 10.1177/01466219922031482 pmid: N/A
The logistic versions of Samejima’s (1969) graded response model and Muraki’s (1992) generalized partial-credit model are parameterized differently by MULTILOG (Thissen, 1991) and PARSCALE (Muraki & Bock, 1996). Procedures for obtaining comparable item parameter estimates from MULTILOG and PARSCALE are described and example command files are provided.