Effects of Ecological Momentary Assessment Prompting Schedule on Affect Measurement Variability and Associations With Next-Day Health Behaviors
Langford, Jeremy S.; Hébert, Emily T.; Jones, Dusti R.; Tonkin, Sarah S.; Barker, Blayne A.; Ulm, Clayton; Shi, Dingjing; Becerra, Jessica A.; Businelle, Michael
doi: 10.1177/10731911261444953; pmid: 42145270
Ecological momentary assessments (EMAs) effectively measure dynamic changes in constructs such as affect. Accurate assessment of affect variability may depend on study design characteristics. While EMAs delivered at random intervals are typically considered superior for capturing variability compared to EMAs delivered at set timepoints, there have been few experimental comparisons of these assessment methodologies. The present study used data from a nationwide factorial experiment to identify best practices in EMA methods. Participants (N = 205) were randomized to groups across five EMA study design factors, including prompting schedules (random or fixed). Daily EMAs assessed affect and health behaviors for 28 days. Multilevel regressions indicated significantly lower affect variability for the fixed prompt group in 8/16 models: compared to the random prompt group, there was between a 0.18 and 0.36 standard deviation decrease in variability in the fixed prompt group. A significant Prompt × Time interaction indicated that variability in the fixed group decreased over time relative to the random group in 6/16 models. However, prompt schedule did not reliably moderate associations between daily affect means/variability and next-day health behaviors. Thus, differences in affect variability between prompt schedules may exist for some affect items but may not be sufficiently large to influence associations with health behaviors.
Trial Registration: ClinicalTrials.gov number NCT05194228; https://clinicaltrials.gov/study/NCT05194228
On the Structure of Fears of Evaluation: Testing for a General Factor and Its Validity in Bifactor Models
Zhang, Yifei; Chen, Junwen; Lv, Sixuan; Turel, Ofir; He, Qinghua
doi: 10.1177/10731911261447626; pmid: 42178146
The Bivalent Fear of Evaluation model posits that individuals with social anxiety experience both Fear of Negative Evaluation (FNE) and Fear of Positive Evaluation (FPE). Previous research has empirically supported a two-factor structure comprising FNE and FPE. More recently, theoretical reviews and empirical studies have turned to a broader construct, Fear of General Evaluation (FGE). Here, we sought to examine the existence and validity of FGE in two samples (644 Chinese undergraduates and 448 junior high school students). We administered two distinct measures of FNE and FPE, along with measures of mental health outcomes (social anxiety, generalized anxiety, and depression). The findings showed that both the two-factor and bifactor models fit the data well, with a slightly better fit for the bifactor model. The bifactor diagnostics also suggested that a reliable general factor likely exists. Moreover, FGE showed a clear pattern of criterion-related associations, relating most strongly to social anxiety and more moderately to generalized anxiety and depression. In conclusion, our results advance a plausible bifactor account of fears of evaluation, emphasizing the role of FGE in social anxiety and outlining implications for clinical practice.
Powering the Circumplex: A Practical Guide to Sample Size for the Structural Summary Method
Gilbert, Kimberly J.
doi: 10.1177/10731911261444969; pmid: 42109036
The Structural Summary Method (SSM) provides a central framework for evaluating constructs within circumplex models, yet guidance for prospective power analysis has remained unavailable. Here, I extend the SSM from a retrospective tool to one supporting a priori inference. Across two studies, I derive and validate power functions for amplitude and angular displacement. Study 1 establishes analytical power functions for amplitude, demonstrating that meaningful interpersonal differentiation can be detected with modest sample sizes. Study 2 extends this framework to angular displacement, showing that displacement precision depends critically on amplitude, with important implications for displacement-focused hypotheses. Validation using simulation and resampling across multiple datasets confirms that the proposed formulas provide reliable guidance for study design. These results clarify the distinct inferential properties of amplitude and displacement and strengthen the SSM as a tool for prospective research planning within circumplex models. This study’s code and materials are available at https://osf.io/63wyf.
Attention-Based Performance Validity Assessment in Paediatric Samples
Raasch, Emily; Christiansen, Hanna; Kneidinger, Johanna; Albrecht, Björn; Fuermaier, Anselm B. M.
doi: 10.1177/10731911261452115; pmid: 42316779
Neuropsychological assessments require adequate task engagement, yet paediatric performance validity testing (PVT) remains limited. This study evaluated the Groningen Effort Test (GET) as an attention-based PVT in children and adolescents using a simulation design. Participants were typically developing controls (N = 113), instructed simulators (N = 40) and outpatient referrals (N = 47), aged 6–17 years, who completed the Test of Memory Malingering (TOMM Trial-1), Reliable Digit Span (RDS) and the GET. GET performance improved with age, and outpatient referrals scored below controls. Age-stratified cutoffs (<13 years vs. >13 years) produced high specificity and strong sensitivity in children, comparable to TOMM Trial-1 and RDS, while all PVTs showed reduced sensitivity in adolescents. ROC analyses indicated excellent accuracy in children and good to moderate accuracy in adolescents. Regression analyses revealed significant GET effects of group and age. Findings support the GET as a promising paediatric PVT requiring further clinical validation.
Development of a Computerized Adaptive Item Bank for the Big Five Personality Based on Large Language Models
Gao, Yaojie; Ma, Yuanqiu; Qi, Yunxiao; Liu, Tour
doi: 10.1177/10731911261427877; pmid: 41904786
Traditional item development methods have constrained the advancement of computerized adaptive testing, hindering the achievement of fully intelligent assessments. With the progress of natural language processing technologies, automatic item generation (AIG) based on large language models (LLMs) offers a promising solution to this challenge. This study employed three LLMs to generate Simplified Chinese Big Five personality items and evaluated the effectiveness of the resulting adaptive item bank through two rounds of empirical testing. The goal was to leverage emerging technologies to address one of the key bottlenecks in CAT development and to promote the realization of fully automated, intelligent assessment workflows. Findings indicate that LLM-based AIG can produce high-quality Big Five CAT item banks cost-effectively and efficiently. Moreover, this approach demonstrates robust performance across different LLMs, highlighting its cross-model stability and practical potential.
Development of a Heart Rate Variability-Based Predictive Model for Depressive Symptoms in Chinese University Students
Huo, GuiQuan; Zhao, Ruixue; Li, Jiayi; Wang, Jingjie; Wang, Jiameng
doi: 10.1177/10731911261436691; pmid: 42015377
Current depression assessment tools are limited by subjectivity and potential bias. This study investigated the relationship between heart rate variability (HRV), body composition, and self-reported depressive symptoms to develop an objective depression screening model for Chinese university students. Data from 2,094 students, including demographics, body composition, Self-Rating Depression Scale (SDS) scores, and HRV indicators, were analyzed using SPSS 26.0 to construct a predictive regression model, with accuracy validated in GraphPad Prism 9.4.1. Subsequently, a subgroup of 359 students with depressive symptoms was screened using the model. The results showed no significant differences between the predicted and actual SDS scores (p > .05), with over 91% of the predicted scores falling within the 95% confidence interval of the actual scores. The strong correlation between HRV and SDS scores supports the use of HRV as a reliable indicator for depression screening. Overall, the model demonstrated a prediction accuracy of 92.61%, highlighting its potential for objective mental health assessment among university populations.
Advancing Understanding of Mania/Hypomania Symptoms’ Transdiagnostic, Multimethod Associations
Stanton, Kasey; Towne, Helena; Burke, Jack D.; Levin-Aspenson, Holly F.
doi: 10.1177/10731911251414599; pmid: 41581058
We assessed mania/hypomania and a range of other symptoms, substance use, and personality dimensions across interview and self-report methods to advance understanding of mania/hypomania’s multimethod assessment. Our study, informed by transdiagnostic, dimensional models, examined these issues in a clinically oriented sample (N = 245; mean age = 32, SD = 12; 59% currently accessing psychotherapy and/or medication), and as a secondary aim, we evaluated the structure of lifetime interview ratings of hypomania symptoms. We found evidence for a single-factor structure of interview-rated hypomania, and interview-rated hypomania was strongly associated with psychosis and related variables, consistent with recent classification-focused research. Self-rated mania/hypomania-relevant measures shared these associations to some degree, but there were also key differences in associations across measures and methods (e.g., self-rated euphoria showed distinctive positive associations with extraversion). We draw on these results to provide ideas for modeling mania/hypomania symptom heterogeneity more thoroughly in future studies and for sharpening multimethod clinical assessment.
Defensive or Deceptive? Two Strategies for Detecting Concealed Psychopathology With the Personality Assessment Inventory
Kurtz, John E.; Hughes, F. AnNa; DeCamp, Derick M.; Ghosh, Ashmita
doi: 10.1177/10731911261423123; pmid: 41802933
The Personality Assessment Inventory (PAI) is frequently employed in selection contexts to screen for psychopathology. Although the PAI has several validity scales that effectively measure positive response distortion, the ability of these scales to detect concealed psychopathology has not been fully evaluated. Using data from a previous study and a new sample of 203 undergraduate students, the current study examined whether scores on the Positive Impression Management (PIM), Defensiveness Index (DEF), and Cashel Discriminant Function (CDF) observed under job applicant role-play instructions could detect reports of psychopathology observed under standard instructions. The data from the new sample of students were also used to evaluate the convergent and discriminant validity of PIM-predicted deviation scores from role-play to predict corresponding scores from the standard administration, as originally reported in Kurtz et al. PIM scores from role-play correlated negatively, and CDF from role-play correlated positively, with elevated scores in the standard condition. PIM-predicted deviation scores from role-play showed convergent and discriminant validity with scores from standard administration. Several recommendations are offered based on these results for the effective use of the PAI in assessment contexts with strong incentives for defensiveness and concealment of symptoms and problems.
What Makes It Look Like That? Prototypicality and Color Diagnosticity for the Rorschach
Di Girolamo, Marzia; A. Pimentel, Ruam P. F.; Corgiat-Loia, Andrea; Pasqualini, Sara; Ales, Francesca
doi: 10.1177/10731911261442057; pmid: 42152720
The Rorschach is one of the most widely used tests to assess personality functioning. Among the various coding categories, determinants play a fundamental role, yet their coding sometimes leaves room for subjective judgment and therefore ambiguity. This study investigates prototypicality and color diagnosticity within the Rorschach Performance Assessment System (R-PAS), focusing on how color–object associations influence the test’s administration. Prototypicality (the degree to which an object’s feature aligns with its conventional representation) interacts with color diagnosticity (the extent to which an object is associated with specific colors). We administered a questionnaire to 587 nonclinical participants and examined common color–object associations by asking them to (a) describe the visual characteristics they considered most representative of a given object, (b) read a series of colors and list three to five items commonly associated with a specific color, and (c) read a series of object names and select the color they thought most people would use to color those objects. Responses were analyzed via hierarchical cluster analysis. Results showed that certain objects exhibit high prototypicality in association with specific colors, whereas other items showed more ambiguous associations. The discussion emphasizes the importance of appropriate administration training, as well as more standardized guidelines to mitigate subjective biases in coding.