Journal of Educational Measurement

journal article

LitStream Collection

DOES THE RASCH MODEL REALLY WORK FOR MULTIPLE CHOICE ITEMS? NOT IF YOU LOOK CLOSELY

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00251.x

This paper discusses various issues involved in using the Rasch model with multiple choice tests. By presenting a modified test that is much more powerful, the value of Wright and Panchapakesan's test as evidence of model fit is shown to be questionable. According to the new test, the model failed to fit 68% of the items in the Anchor Test Study. Effects of such misfit on test equating are demonstrated. Results of some past studies purporting to support the Rasch model are shown to be irrelevant, or to yield the conclusion that the Rasch model did not fit the data. Issues like “objectivity” and consistent estimation are shown to be unimportant in selection of a latent trait model. Thus, available evidence shows the Rasch model to be unsuitable for multiple choice items.

THE CHOICE OF SCALE FOR EDUCATIONAL MEASUREMENT: AN IRT PERSPECTIVE

YEN, WENDY M.

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00252.x

Two methods of constructing equal‐interval scales for educational achievement are discussed: Thurstone's absolute scaling method and Item Response Theory (IRT). Alternative criteria for choosing a scale are contrasted. It is argued that clearer criteria are needed for judging the appropriateness and usefulness of alternative scaling procedures, and more information is needed about the qualities of the different scales that are available. In answer to this second need, some examples are presented of how IRT can be used to examine the properties of scales: It is demonstrated that for observed score scales in common use (i.e., any scores that are influenced by measurement error), (a) systematic errors can be introduced when comparing growth at selected percentiles, and (b) normalizing observed scores will not necessarily produce a scale that is linearly related to an underlying normally distributed true trait.

AN EXAMINATION OF THE ASSUMPTION THAT THE EQUATING OF PARALLEL FORMS IS POPULATION‐INDEPENDENT

ANGOFF, WILLIAM H.; COWELL, WILLIAM R.

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00253.x

Linear conversions were developed relating scores on two recent forms of the homogeneous GRE Quantitative Test (GRE‐Q) and the specially constituted heterogeneous GRE Verbal‐plus‐Quantitative Test (GRE‐V+Q), using randomly equivalent groups of about 13,000 taking each form. Specially defined homogeneous subpopulations were then identified, and conversions between scores on the two forms were again calculated, this time based on 1,000‐case samples drawn at random from the subpopulations. Finally, in order to develop empirical measures of equating error, 100 samples of 1,000 cases each were drawn at random from the two total groups and used to calculate 100 conversions between scores on the two forms. The conversions based on the specially selected subpopulations were then compared with the total‐group conversions and evaluated in terms of the empirical standard errors. The results showed that the conversions for the subpopulations agreed with the total‐group conversion quite satisfactorily for the GRE‐Q and almost as well for the GRE‐V+Q. It was concluded that the data clearly support the assumption of population independence for homogeneous tests, but not quite so clearly for heterogeneous tests.

TEACHER EDUCATION AND TEACHER‐PERCEIVED NEEDS IN EDUCATIONAL MEASUREMENT AND EVALUATION

GULLICKSON, ARLEN R.

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00254.x

Professors and teachers were compared relative to their perspectives on preservice educational measurement courses. Twenty‐four professors from different colleges in seven states and 360 teachers from elementary and secondary schools in one midwestern state responded via mailed questionnaire. Professors reported the emphasis given to each of eight topics in preservice educational measurement courses, and teachers reported the emphasis they believed should be given to each topic. In five of the eight content areas, the relative emphases given by professors differed from those recommended by teachers. Major differences emerged in nontest evaluation, statistical analysis, and formative and summative evaluation. Implications of these results are discussed.

DEMONSTRATING THE UTILITY OF THE STANDARDIZATION APPROACH TO ASSESSING UNEXPECTED DIFFERENTIAL ITEM PERFORMANCE ON THE SCHOLASTIC APTITUDE TEST

DORANS, NEIL J.; KULICK, EDWARD

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00255.x

The standardization method for assessing unexpected differential item performance or differential item functioning (DIF) is introduced. The principal findings of the first five studies that have used this approach on the Scholastic Aptitude Test are presented.

THE SUBSET SELECTION TECHNIQUE FOR MULTIPLE‐CHOICE TESTS: AN EMPIRICAL INQUIRY

JARADAT, DERAR; SAWAGED, SARI

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00256.x

The impact of the Subset Selection Technique (SST) for administering and scoring multiple‐choice items on certain properties of a test was compared with that of the two other commonly used methods, the Number Right (NR) and the Correction for Guessing Formula (CFG). Under SST, examinees are instructed to select any number of response alternatives, the objective being to include the correct answer in the chosen set. The effects of each scoring method on the psychometric properties of a test and on the performance of examinees with different achievement levels and/or risk‐taking propensities were investigated. Results indicated that SST outperformed the other two methods, producing not only higher reliability and validity coefficients for the test, but doing so without favoring high risk takers. The superiority of SST may be attributed to two interrelated factors: the efficiency of the technique in controlling for guessing and the encouragement provided examinees to use their partial knowledge in responding.

MEASURING THE ORGANIZATIONAL ASPECTS OF WRITING ABILITY

BENTON, STEPHEN L.; KIEWRA, KENNETH A.

1986 Journal of Educational Measurement

doi: 10.1111/j.1745-3984.1986.tb00257.x

The present study assessed the relationship among holistic writing ability, the Test of Standard Written English (TSWE), and the following tests of organizational ability: anagram solving, word reordering, sentence reordering, and paragraph assembly. Based upon a sample of 105 undergraduate students, the main findings were that writing ability, as measured by the holistic method of scoring, was significantly correlated with performance on the TSWE and the four tests of organizational ability. A composite score on all four organizational tests was found to have the highest zero‐order correlation with the measure of writing ability. A stepwise regression analysis, with the measure of writing ability as the criterion, also indicated that the composite score explained a significant proportion of the variance beyond that explained by the TSWE. The results are discussed in terms of the Kintsch and van Dijk model of strategic discourse processing, which suggests that different organizational strategies operate at the levels of words, sentences, and paragraphs. It is concluded that tests assessing organizational strategies ought to be included in assessments of writing ability.
