Biometrics

journal article

Open Access Collection

Uncertainty quantification and multi-stage variable selection for personalized treatment regimes

Bi, Jiefeng; Borrotti, Matteo; Nipoti, Bernardo

2026 Biometrics

doi: 10.1093/biomtc/ujag081 pmid: 42166188

A dynamic treatment regime is a sequence of medical decisions that adapts to the evolving clinical status of a patient over time. To facilitate personalized care, it is crucial to assess the probability of each available treatment option being optimal for a specific patient, while also identifying the key prognostic factors that determine the optimal sequence of treatments. This task has become increasingly challenging due to the growing number of individual prognostic factors typically available. In response to these challenges, we propose a Bayesian model for optimizing dynamic treatment regimes that addresses the uncertainty in identifying optimal decision sequences and incorporates dimensionality reduction to manage high-dimensional individual covariates. The first task is achieved through a suitable augmentation of the model to handle counterfactual variables. For the second, we introduce a novel class of spike-and-slab priors for the multi-stage selection of significant factors, to favor the sharing of information across stages. The effectiveness of the proposed approach is demonstrated through an extensive simulation study and illustrated using clinical trial data on severe acute arterial hypertension.

journal article

LitStream Collection

A zero-inflated hierarchical generalized transformation model to address non-normality in spatially-informed cell-type deconvolution

Melton, Hunter J; Bradley, Jonathan R; Wu, Chong

2026 Biometrics

doi: 10.1093/biomtc/ujag055 pmid: 41994891

Oral squamous cell carcinomas (OSCC), the predominant head and neck cancer, pose significant challenges due to late-stage diagnoses and low five-year survival rates. Spatial transcriptomics offers a promising avenue to decipher the genetic intricacies of OSCC tumor microenvironments. In spatial transcriptomics, cell-type deconvolution is a crucial inferential goal; however, current methods fail to consider the high zero-inflation present in OSCC data. To address this, we develop a novel zero-inflated version of the hierarchical generalized transformation model (ZI-HGT) and apply it to Conditional AutoRegressive Deconvolution (CARD) for cell-type deconvolution. The ZI-HGT serves as an auxiliary Bayesian technique for CARD, reconciling the highly zero-inflated OSCC spatial transcriptomics data with CARD’s normality assumption. The combined ZI-HGT + CARD framework achieves enhanced cell-type deconvolution accuracy and quantifies uncertainty in the estimated cell-type proportions. We demonstrate the superior performance through simulations and analysis of the OSCC data. Furthermore, our approach enables the determination of the locations of the diverse fibroblast population in the tumor microenvironment, critical for understanding tumor growth and immunosuppression in OSCC.

journal article

LitStream Collection

A mixed effect similarity matrix regression model (SMRmix) for integrating multiple microbiome datasets at the community level

He, Mengyu; Zhao, Ni

2026 Biometrics

doi: 10.1093/biomtc/ujag077 pmid: 42127284

Recent studies have highlighted the importance of the human microbiota in health and disease. However, in many areas of research, individual microbiome studies often provide inconsistent results due to limited sample sizes and the heterogeneity in study populations and experimental procedures. This inconsistency underscores the need for integrative analysis of multiple microbiome datasets. Despite the critical need, statistical methods that incorporate multiple microbiome datasets and account for study heterogeneity are not available in the literature. To address this, we propose a mixed effect similarity matrix regression (SMRmix) approach for identifying community-level microbiome shifts associated with outcomes. SMRmix is closely connected to the microbiome kernel association test, which is one of the most popular approaches for such a task but is applicable only to a single study. SMRmix enables researchers to consolidate findings from diverse microbiome studies. Through extensive simulations, we show that SMRmix maintains well-controlled Type I error rates and achieves higher power than competing methods. We further demonstrate its utility on two real-world datasets (17 HIV gut dysbiosis studies and 11 colorectal cancer studies), showing that SMRmix provides consistent results on community-level shifts in both applications.

journal article

LitStream Collection

A novel exact confidence interval for the difference of proportions in paired data using a restricted most probable statistic

Cao, Xingyun; Wang, Weizhen; Xie, Tianfa

2026 Biometrics

doi: 10.1093/biomtc/ujag061 pmid: 42053378

Inference on the difference between two proportions in paired data is a key issue, particularly in biomedical research and clinical trials. Numerous methods exist for constructing confidence intervals for this difference. However, approximate methods that rely on asymptotic normality can be unreliable, underscoring the need for exact confidence intervals to improve reliability. In this paper, we develop a novel interval based on the restricted most probable method, which is further optimized using the h-function method to yield an optimal exact interval, ensuring both reliability and precision. We compare the proposed interval with other exact intervals developed through methodologies such as the score method, two Tang methods, the Wang method, the adjusted Wald method, and the score method with continuity correction. Our comparative analysis, utilizing the infimum coverage probability and total interval length as evaluation metrics, demonstrates the uniformly superior performance of the proposed interval. Additionally, an example illustrates its practical application in real-world scenarios. Supplementary Materials provide another example, numerical results on coverage and non-coverage probabilities, and R code.
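For context, the kind of approximate interval that the abstract describes as potentially unreliable can be sketched as follows. This is the standard Wald interval for a paired difference, not the proposed restricted most probable interval; the function name and the example counts are illustrative assumptions.

```python
import math


def paired_wald_interval(n12, n21, n, z=1.959964):
    """Approximate (Wald) interval for p1 - p2 from paired binary data.

    n12 and n21 are the two discordant-pair counts and n is the total
    number of pairs. The default z gives a nominal 95% level.
    """
    diff = (n12 - n21) / n
    # Estimated variance of the difference under the multinomial model.
    var = ((n12 + n21) - (n12 - n21) ** 2 / n) / n ** 2
    half_width = z * math.sqrt(var)
    # Clip to the parameter space [-1, 1].
    return max(-1.0, diff - half_width), min(1.0, diff + half_width)


# Hypothetical counts: 10 and 5 discordant pairs out of 100 pairs.
lower, upper = paired_wald_interval(n12=10, n21=5, n=100)
print(f"{lower:.4f} to {upper:.4f}")
```

With no discordant pairs (`n12 = n21 = 0`) this interval collapses to the single point 0, which is one simple way to see why normal-approximation intervals can fail in small samples.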

journal article

Open Access Collection

Nonparametric estimation of the total treatment effect with multiple outcomes in the presence of terminal events

Gronsbell, Jessica; McCaw, Zachary R; Nogues, Isabelle-Emmanuella; Kong, Xiangshan; Cai, Tianxi; Tian, Lu; Wei, L J

2026 Biometrics

doi: 10.1093/biomtc/ujag053 pmid: 42166186

As standards of care advance, patients are living longer and once-fatal diseases are becoming manageable. Clinical trials increasingly focus on reducing disease burden, which can be quantified by the timing and occurrence of multiple non-fatal clinical events. Most existing methods for the analysis of multiple event-time data require stringent modeling assumptions that can be difficult to verify empirically, leading to treatment efficacy estimates that forgo interpretability when the underlying assumptions are not met. Moreover, many methods do not appropriately account for informative terminal events, such as premature treatment discontinuation or death, which prevent the occurrence of subsequent events. To address these limitations, we derive and validate estimation and inference procedures for the area under the mean cumulative function (AUMCF), an extension of the restricted mean survival time to the multiple event-time setting. The AUMCF is clinically interpretable, properly accounts for terminal competing risks, and can be estimated nonparametrically. To enable covariate adjustment, we also develop an augmentation estimator whose efficiency at least equals, and often exceeds, that of the unadjusted estimator. The utility and interpretability of the AUMCF are illustrated with extensive simulation studies and through an analysis of multiple heart-failure-related endpoints using data from the Beta-Blocker Evaluation of Survival Trial. Our open-source R package MCC makes conducting AUMCF analyses straightforward and accessible.
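To make the AUMCF concrete, here is a minimal sketch for the simplest possible setting, assuming every subject is followed for the whole window with no censoring. It is not the authors' estimator and does not use the MCC package; the function name and the toy data are hypothetical.

```python
def aumcf_complete_followup(event_times, tau):
    """Area under the empirical mean cumulative function on [0, tau].

    event_times holds one list of event times per subject (empty if the
    subject had no events). Assumption: every subject is observed over the
    whole window [0, tau], so there is no censoring to adjust for; a subject
    who dies simply contributes no further events.
    """
    total = 0.0
    for times in event_times:
        # Integrating a subject's event count N_i(t) over [0, tau] gives,
        # for each event at time t <= tau, the remaining time tau - t.
        total += sum(tau - t for t in times if t <= tau)
    return total / len(event_times)


# Three hypothetical subjects followed for tau = 4 time units.
toy_events = [[1.0, 3.0], [], [2.0]]
print(aumcf_complete_followup(toy_events, tau=4.0))  # prints 2.0
```

Earlier and more frequent events both increase the area, which is why a smaller AUMCF corresponds to a lower event burden over the window.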

journal article

LitStream Collection

Copula Additive Distributional Regression Using R

Kneib, Thomas

2026 Biometrics

doi: 10.1093/biomtc/ujag008 pmid: N/A

journal article

LitStream Collection

Rejoinder to the discussion on “Nonparanormal Adjusted Marginal Inference”

Dandl, Susanne; Hothorn, Torsten

2026 Biometrics

doi: 10.1093/biomtc/ujag069 pmid: 42083908

This rejoinder addresses the three discussions of our manuscript on “Nonparanormal Adjusted Marginal Inference”. We appreciate the thoughtful assessments and provide clarifications and refinements in response to the points raised.

journal article

LitStream Collection

Transfer learning estimation of the accelerated failure time model based on high-dimensional data

Lou, Yichen; Du, Mingyue; Zhao, Hui; Sun, Jianguo

2026 Biometrics

doi: 10.1093/biomtc/ujag103 pmid: 42259651

Motivated by a study on seriously ill hospitalized adults to improve their end-of-life care, we consider estimation of the accelerated failure time model, one of the most commonly used models for regression analysis of failure time data. Although many methods have been developed for the problem, standard approaches may fail or underperform when available information is limited. To address this issue, we propose two transfer learning estimation procedures that leverage auxiliary information from multiple source datasets. The first is a data-driven source detection procedure that classifies the source datasets into positively and negatively transferable groups and performs estimation using only the positively transferable or informative source datasets. The other is an ensemble-based approach that adaptively assigns weights to source datasets based on their relevance to the target dataset. Theoretical justifications are provided for the proposed methods, and an extensive simulation study is performed, indicating that the proposed methods work well in practice. Finally, they are applied to the study above and identify some prognostic factors that the existing methods would not have detected.

journal article

Open Access Collection

Two-phase designs for biomarker studies when disease processes are under intermittent observation

Li, Kecheng; Cook, Richard J

2026 Biometrics

doi: 10.1093/biomtc/ujag088 pmid: 42166187

Multistate models offer an appealing framework for studying the onset and progression of chronic diseases in large cohort studies. Such studies often involve the collection and storage of biospecimens at an initial assessment, and intermittent observation of the disease process at future assessment times. We consider the design of two-phase biomarker studies in such settings where budgetary constraints prohibit assaying all biospecimens. A subsample of individuals is instead chosen to have their biospecimens assayed to facilitate examination of the association between a biomarker of interest and the disease process. Analyses based on likelihood, conditional likelihood, and estimating functions are considered, with the efficiency gains from various subsampling strategies investigated. Pseudo-score residual-dependent sampling strategies are shown to yield highly efficient maximum likelihood estimates of biomarker effects on disease progression. This sampling strategy and competing methods are empirically studied and applied to a motivating study of the relationship between the HLA-B27 marker and joint damage in patients with psoriatic arthritis.

journal article

LitStream Collection

Decentralized EM algorithm for Gaussian mixtures under data heterogeneity and partial labeling

Li, Xuetong; Wu, Shuyuan; Du, Bin; Wang, Hansheng

2026 Biometrics

doi: 10.1093/biomtc/ujag092 pmid: 42201842

We systematically study several network-based Expectation–Maximization (EM) algorithms for the Gaussian mixture model within decentralized federated learning (DFL). Our theoretical investigation reveals that directly extending the classic EM algorithm to DFL leads to a seriously biased estimator if the data are heterogeneously distributed across different sites. To address this issue, we introduce a momentum network EM (MNEM) algorithm, which integrates information from both current and historical estimators from previous DFL iterations. We further develop a semi-supervised MNEM (semi-MNEM) algorithm, which utilizes valuable information provided by partially labeled data. Rigorous theoretical analysis demonstrates that the MNEM estimator can achieve the same asymptotic efficiency as the whole sample estimator under appropriate regularity conditions, even if the data are heterogeneously distributed. Moreover, the semi-MNEM estimator significantly improves the convergence speed of the MNEM algorithm, even if different mixture components are poorly separated. Extensive simulations are conducted, and a widely used chest X-ray dataset is analyzed to demonstrate the finite-sample performance of the proposed methods.
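As background for the abstract above, the classic single-site EM iteration that the decentralized algorithms build on can be sketched as below. This is the textbook algorithm for a one-dimensional Gaussian mixture, not MNEM or semi-MNEM; the function names, starting values, and simulated data are illustrative assumptions.

```python
import numpy as np


def em_step(x, weights, means, variances):
    """One classic EM update for a one-dimensional Gaussian mixture."""
    # E-step: posterior probability of each component for each observation.
    dens = np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
    dens /= np.sqrt(2.0 * np.pi * variances)
    resp = weights * dens
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: weighted updates of mixing weights, means, and variances.
    nk = resp.sum(axis=0)
    new_weights = nk / len(x)
    new_means = (resp * x[:, None]).sum(axis=0) / nk
    new_variances = (resp * (x[:, None] - new_means) ** 2).sum(axis=0) / nk
    return new_weights, new_means, new_variances


def fit_gmm(x, init_means, n_iter=100):
    """Run a fixed number of EM iterations from the given starting means."""
    k = len(init_means)
    weights = np.full(k, 1.0 / k)
    means = np.asarray(init_means, dtype=float)
    variances = np.full(k, np.var(x))
    for _ in range(n_iter):
        weights, means, variances = em_step(x, weights, means, variances)
    return weights, means, variances


# Simulated data: 30% from N(-2, 1) and 70% from N(3, 1).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])
weights, means, variances = fit_gmm(x, init_means=[-1.0, 1.0])
print(weights, means, variances)
```

In a decentralized setting each site would hold only part of `x`, and the abstract's point is that naively running these updates per site and combining them over a network is biased when the sites' data are heterogeneous.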
