Brusco, Michael; Steinley, Douglas; Watts, Ashley L.
doi: 10.1111/bmsp.12386 pmid: 40012163
Integer programming (IP) is an extension of linear programming (LP) whereby the goal is to determine values for a set of decision variables (some or all of which have integer restrictions) so as to maximize or minimize a linear objective function of the variables subject to a set of linear constraints involving the variables. Although the psychological literature is replete with applications of multivariate statistics, implementations of mathematical modelling methods such as IP are far less common. Nevertheless, over the decades, there have been a variety of important applications, and the vast majority of these fall within the IP rather than the LP category. In this paper, we offer a brief overview of the history of IP methodology. We subsequently review some domains where IP has been gainfully applied in psychology, such as test assembly, cluster analysis and classification, and seriation and unidimensional scaling. An illustrative example of using IP to cluster respondents measured on items pertaining to substance abuse disorder is provided. Finally, we identify emerging areas of psychology where IP might be applied, such as network psychometrics.
doi: 10.1111/bmsp.12389 pmid: 40165349
Good scientific practice requires that the reporting of the statistical analysis of experiments should include estimates of effect size as well as the results of tests of statistical significance. Good statistical practice requires that effect size estimates be reported along with some indication of their statistical uncertainty, such as a standard error. This article provides a review of effect sizes for experimental research, including expressions for the standard error of each effect size. It focuses on effect sizes for experiments with treatments having a single degree of freedom but also includes effect sizes for treatments with multiple degrees of freedom having either fixed or random effects.
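As a concrete illustration of reporting an effect size together with its uncertainty, the sketch below computes Cohen's d for two independent groups and a widely used large-sample approximation to its standard error. The article's own expressions are not reproduced in the abstract, so the formula, the function name `cohens_d_with_se` and the toy data are assumptions chosen for illustration only.

```python
import math
from statistics import mean, variance


def cohens_d_with_se(group1, group2):
    """Return (d, se) for two independent samples.

    d uses the pooled standard deviation; se uses the common
    large-sample approximation
        sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))).
    This is an illustrative choice, not necessarily the article's expression.
    """
    n1, n2 = len(group1), len(group2)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    d = (mean(group1) - mean(group2)) / math.sqrt(pooled_var)
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, se


# Hypothetical toy data: treatment and control scores.
d, se = cohens_d_with_se([2, 4, 6], [1, 3, 5])
```

With samples this small the approximation is rough; it is shown only to make the pairing of an estimate with its standard error concrete.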
Belov, Dmitry I.; Lüdtke, Oliver; Ulitzsch, Esther
doi: 10.1111/bmsp.12396 pmid: 40371820
Existing estimators of parameters of item response theory (IRT) models exploit the likelihood function. In small samples, however, the IRT likelihood often carries little information, potentially resulting in biased and/or unstable parameter estimates and large standard errors. To facilitate small‐sample IRT estimation, we introduce a novel approach that does not rely on the likelihood. Our estimation approach derives features from response data and then maps the features to item parameters using a neural network (NN). We describe and evaluate our approach for the three‐parameter logistic model; however, it is applicable to any model with an item characteristic curve. Three types of NNs are developed, which yield both point estimates and confidence intervals for IRT model parameters. The results of a simulation study demonstrate that these NNs perform better than Bayesian estimation using Markov chain Monte Carlo methods in terms of the quality of the point estimates and confidence intervals while also being much faster. These properties facilitate (1) pretesting items in a real‐time testing environment, (2) pretesting more items and (3) pretesting items only in a secured environment to eradicate possible compromise of new items in online testing.
Bolsinova, Maria; Gergely, Bence; Brinkhuis, Matthieu J. S.
doi: 10.1111/bmsp.12395 pmid: 40476309
The Elo rating system, which originates from competitive chess, has been widely used in large‐scale online educational applications for on‐the‐fly estimation of ability, item calibration, and adaptivity. In this paper, we critically analyse the shortcomings of the Elo rating system in an educational context, shedding light on its measurement properties and on when it may fall short in accurately capturing student abilities and item difficulties. In a simulation study, we look at the asymptotic properties of the Elo rating system. Our results show that the Elo ratings are generally biased and that their variances are context‐dependent. Furthermore, in scenarios where items are selected adaptively based on the current ratings and the item difficulties are updated alongside the student abilities, the variance of the ratings across items and students artificially increases over time and, as a result, the ratings do not converge. We propose a solution to this problem that uses two parallel chains of ratings, which remove the dependence of item selection on the current errors in the ratings.
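For readers unfamiliar with the mechanics, a single update step of an Elo-type rating scheme in an educational setting can be sketched as follows. This is a minimal sketch assuming a logistic expected score and one fixed step size `k` shared by students and items; the abstract does not specify the exact variant studied, and the proposed two-chain remedy is not implemented here. The function names and the default value of `k` are hypothetical.

```python
import math


def expected_score(theta, beta):
    """Probability of a correct response under a logistic (Rasch-type) curve."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def elo_update(theta, beta, correct, k=0.4):
    """Update a student rating (theta) and an item rating (beta) after one response.

    correct is 1 for a correct response and 0 otherwise.
    The student moves by k * (observed - expected); the item moves by the
    same amount in the opposite direction.
    """
    delta = k * (correct - expected_score(theta, beta))
    return theta + delta, beta - delta


# One correct response from an average student on an average item.
new_theta, new_beta = elo_update(0.0, 0.0, correct=1)
```

Because both ratings are driven by the same noisy residual, errors in the student rating feed into the item rating and back, which is the kind of dependence the abstract identifies as problematic under adaptive item selection.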
Wiedermann, Wolfgang; Shi, Dexin
doi: 10.1111/bmsp.70000 pmid: 40524418
Instrumental variable (IV) estimation constitutes a powerful quasi‐experimental tool to estimate causal effects in observational data. The IV approach, however, rests on two crucial assumptions: the instrument relevance assumption and the exclusion restriction assumption. The latter requirement (stating that the IV is not allowed to be related to the outcome via any path other than the one going through the predictor) cannot be empirically tested in just‐identified models (i.e. models with as many IVs as predictors). The present study introduces properties of non‐Gaussian IV models which enable one to test whether hidden confounding between an IV and the outcome is present. Detecting exclusion restriction violations due to a direct path between the IV and the outcome, however, is restricted to the over‐identified case. Based on these insights, a two‐step approach is presented to test IV validity against hidden confounding in just‐identified models. The performance of the approach was evaluated using Monte‐Carlo simulation experiments. An empirical example from psychological research is given to illustrate the approach in practice. Recommendations for best‐practice applications and future research directions are discussed. Although the current study presents important insights for developing diagnostic procedures for IV models, sound universal IV validation in the just‐identified case remains a challenging task.
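To make the just-identified setting concrete, the sketch below computes the simple IV (Wald) estimator cov(z, y) / cov(z, x) for one instrument and one predictor and contrasts it with the ordinary least squares slope. It illustrates only basic IV estimation under a valid instrument; it is not the non-Gaussian two-step validity test described in the abstract. The toy data, the hidden confounder `u` and all function names are assumptions made for illustration.

```python
def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def iv_slope(z, x, y):
    """Just-identified IV estimate with one instrument z and one predictor x."""
    return covariance(z, y) / covariance(z, x)


def ols_slope(x, y):
    """Ordinary least squares slope of y on x, for comparison."""
    return covariance(x, y) / covariance(x, x)


# Hypothetical toy data: u confounds x and y but is uncorrelated with z.
z = [0, 1, 0, 1]
u = [1, 1, -1, -1]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true causal slope is 2

iv_estimate = iv_slope(z, x, y)
ols_estimate = ols_slope(x, y)
```

In this constructed example the instrument satisfies both assumptions exactly, so the IV estimate recovers the slope of 2 while the OLS slope absorbs the confounding. If `u` were also correlated with `z`, the ratio would be biased as well, which is the situation the validity test targets.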
doi: 10.1111/bmsp.70002 pmid: 40530679
Test scores, like the sum score, can be useful for making inferences about the latent variables. The conditions under which such test scores allow for inferences of the latent variables based on a “weaker” stochastic ordering are generalized to any monotone latent variable model for which the latent variables are associated. The generality of these conditions places the sum score, or indeed any test score, well beyond a mere intuitive measure or a relic from classical test theory.
Chattopadhyay, Bhargab; Bapat, Sudeep R.
doi: 10.1111/bmsp.70001 pmid: 40551576
Effect size estimates are now widely reported in various behavioural studies. In precise estimation or power analysis studies, sample size planning revolves around the standard error (or variance) of the effect size. Note that these studies are carried out under sampling‐budget constraints. Hence, the optimum allocation of resources to populations with different inherent population variances is paramount, as this affects the effect size variance. In this paper, a general effect size meant to compare two population characteristics is defined, and under budget constraints, we aim to minimize the variance of the general effect size. In the process, we use sequential theory to arrive at optimum sample sizes of the corresponding populations to achieve minimum variance. The sequential method we developed is a distribution‐free method and does not need knowledge of population parameters. Mathematical justification of the characteristics enjoyed by our sequential method is laid out along with simulation studies. Thus, our work has wide applicability in the effect size comparison context.
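The allocation problem can be illustrated with its classical fixed-sample counterpart. The sketch below assumes the effect size is a simple difference of two means with known standard deviations, so its variance is sigma1^2/n1 + sigma2^2/n2, and minimizes that variance subject to the budget constraint cost1*n1 + cost2*n2 = budget. A Lagrange-multiplier argument gives sample sizes proportional to sigma_i / sqrt(cost_i). This is not the article's sequential, distribution-free procedure, which does not require the population parameters; all names and numbers are hypothetical.

```python
import math


def optimal_allocation(sigma1, sigma2, cost1, cost2, budget):
    """Sample sizes minimizing sigma1^2/n1 + sigma2^2/n2 under a linear budget.

    Returns real-valued (n1, n2); in practice these would be rounded.
    Assumes the standard deviations are known, unlike a sequential method.
    """
    denom = sigma1 * math.sqrt(cost1) + sigma2 * math.sqrt(cost2)
    n1 = budget * sigma1 / (math.sqrt(cost1) * denom)
    n2 = budget * sigma2 / (math.sqrt(cost2) * denom)
    return n1, n2


def difference_variance(sigma1, sigma2, n1, n2):
    """Variance of the difference between two independent sample means."""
    return sigma1 ** 2 / n1 + sigma2 ** 2 / n2


# Hypothetical setting: unit costs, budget for 90 observations in total.
n1, n2 = optimal_allocation(sigma1=2.0, sigma2=1.0, cost1=1.0, cost2=1.0, budget=90.0)
optimal_var = difference_variance(2.0, 1.0, n1, n2)
equal_var = difference_variance(2.0, 1.0, 45, 45)
```

Here the more variable population receives twice as many observations, and the resulting variance is smaller than under an equal split of the same budget.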
Rooij, Mark; Cotugno, Lorenza; Siciliano, Roberta
doi: 10.1111/bmsp.70004 pmid: 40873424
In this paper, we propose the generalized mixed reduced rank regression method, GMR3 for short. GMR3 is a regression method for a mix of numeric, binary and ordinal response variables. The predictor variables can be a mix of binary, nominal, ordinal and numeric variables. For dealing with the categorical predictors we use optimal scaling. A majorization‐minimization algorithm is derived for maximum likelihood estimation. A series of simulation studies is shown (Section 4) to evaluate the performance of the algorithm with different types of predictor and response variables. In Section 5, we briefly discuss the choices to make when applying the model to empirical data and give suggestions for supporting such choices. In a second simulation study (Section 6), we further study the behaviour of the model and algorithm in different scenarios for the true rank in relation to sample size. In Section 7, we show an application of GMR3 using the Eurobarometer Surveys data set of 2023.
Experience Sampling Methodology (ESM) has been widely used over the past decades to study feelings, behaviour and thoughts as they occur in daily life. Typically, participants complete several assessments per day via a smartphone for multiple days. The growing adoption of ESM has spurred a number of methodological advancements. In this paper, we provide an overview of recent developments in ESM design, statistical analysis and implementation. In terms of design, we discuss considerations around what to measure (including the reliability and validity of self‐report measures as well as mobile sensing) and when to measure, where we focus on the pros and cons of burst designs and advances in sample size planning methodology. Regarding statistical analysis, we highlight non‐linear models, survival analysis for understanding time‐to‐event data and real‐time monitoring of ESM time series. At the implementation level, we address open science practices and advances in data preprocessing. Although most of the topics discussed in this paper are generic, many of the examples are focused on the study of affect in daily life.