Gabriel, Erin E.; Sachs, Michael C.; Follmann, Dean A.; Andersson, Therese M‐L.
doi: 10.1111/biom.13211; pmid: 31868914
Many infectious diseases are well prevented by proper vaccination. However, when a vaccine is not completely efficacious, there is great interest in determining how the vaccine effect differs in subgroups conditional on measured immune responses postvaccination and also according to the type of infecting agent (eg, strain of a virus). The former is often called correlate of protection (CoP) analysis, while the latter has been called sieve analysis. We propose a unified framework for simultaneously assessing CoP and sieve effects of a vaccine in a large Phase III randomized trial. We use flexible parametric models treating times to infection from different agents as competing risks and estimated maximum likelihood to fit the models. The parametric models under competing risks allow for estimation of both cumulative incidence‐based contrasts and instantaneous rates. We outline the assumptions with which we can link the observable data to the causal contrasts of interest, propose hypothesis testing procedures, and evaluate our proposed methods in an extensive simulation study.
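To make the distinction between instantaneous rates and cumulative incidence-based contrasts concrete, here is a minimal sketch under a strong simplifying assumption: each infecting strain has a constant cause-specific hazard. This stands in for the flexible parametric models described in the abstract and is not the authors' method. The function names, strain labels, and hazard values are hypothetical and chosen only for illustration.

```python
import math


def cumulative_incidence(hazards, cause, t):
    """Cumulative incidence of `cause` by time t under competing risks.

    Assumes constant cause-specific hazards (an illustrative simplification),
    which gives CIF_k(t) = (h_k / h_total) * (1 - exp(-h_total * t)).
    """
    total = sum(hazards.values())
    if total == 0:
        return 0.0
    return hazards[cause] / total * (1.0 - math.exp(-total * t))


def strain_specific_ve(placebo_hazards, vaccine_hazards, cause, t):
    """Vaccine efficacy against one strain as 1 minus the cumulative incidence ratio."""
    ci_vaccine = cumulative_incidence(vaccine_hazards, cause, t)
    ci_placebo = cumulative_incidence(placebo_hazards, cause, t)
    return 1.0 - ci_vaccine / ci_placebo


# Hypothetical hazards per person-year for two strains, A and B.
placebo = {"A": 0.04, "B": 0.04}
vaccine = {"A": 0.01, "B": 0.03}

ve_a = strain_specific_ve(placebo, vaccine, "A", t=2.0)
ve_b = strain_specific_ve(placebo, vaccine, "B", t=2.0)
print(f"VE against strain A at 2 years: {ve_a:.3f}")
print(f"VE against strain B at 2 years: {ve_b:.3f}")
```

In this toy setting the hazard-based efficacy against strain A is 1 - 0.01/0.04 = 0.75, while the cumulative incidence-based value at 2 years is slightly different because the two arms deplete their at-risk populations at different overall rates. A difference in efficacy between strains is the kind of pattern a sieve analysis looks for.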
Li, Dateng; Cao, Jing; Zhang, Song
doi: 10.1111/biom.13212; pmid: 31872435
Cluster randomized trials (CRTs) are widely used in different areas of medicine and public health. Recently, with increasing complexity of medical therapies and technological advances in monitoring multiple outcomes, many clinical trials attempt to evaluate multiple co‐primary endpoints. In this study, we present a power analysis method for CRTs with K≥2 binary co‐primary endpoints. It is developed based on the GEE (generalized estimating equation) approach, and three types of correlations are considered: inter‐subject correlation within each endpoint, intra‐subject correlation across endpoints, and inter‐subject correlation across endpoints. A closed‐form joint distribution of the K test statistics is derived, which facilitates the evaluation of power and type I error for arbitrarily constructed hypotheses. We further present a theorem that characterizes the relationship between various correlations and testing power. We assess the performance of the proposed power analysis method based on extensive simulation studies. An application example to a real clinical trial is presented.
Wu, Peng; Zeng, Donglin; Fu, Haoda; Wang, Yuanjia
doi: 10.1111/biom.13288; pmid: 32365232
Individualized treatment rules (ITRs) tailor medical treatments according to patient‐specific characteristics in order to optimize patient outcomes. Data from randomized controlled trials (RCTs) are used to infer valid ITRs using statistical and machine learning methods. However, RCTs are usually conducted under specific inclusion/exclusion criteria, thus limiting their generalizability to a broader patient population in real‐world practice settings. Because electronic health records (EHRs) document treatment prescriptions in the real world, transferring information in EHRs to RCTs, if done appropriately, could potentially improve the performance of ITRs, in terms of precision and generalizability. In this work, we propose a new domain adaptation method to learn ITRs by incorporating information from EHRs. Unless we assume that there is no unmeasured confounding in EHRs, we cannot directly learn the optimal ITR from the combined EHR and RCT data. Instead, we first pretrain “super” features from EHRs that summarize physician treatment decisions and patient observed benefits in the real world, as these are likely to be informative of the optimal ITRs. We then augment the feature space of the RCT and learn the optimal ITRs by stratifying by super features using subjects enrolled in RCT. We adopt Q‐learning and a modified matched‐learning algorithm for estimation. We present heuristic justification of our method and conduct simulation studies to demonstrate the performance of super features. Finally, we apply our method to transfer information learned from EHRs of patients with type 2 diabetes to learn individualized insulin therapies from RCT data.
Shin, Yei Eun; Pfeiffer, Ruth M.; Graubard, Barry I.; Gail, Mitchell H.
doi: 10.1111/biom.13209; pmid: 31863593
Cohort studies provide information on relative hazards and pure risks of disease. For rare outcomes, large cohorts are needed to have sufficient numbers of events, making it costly to obtain covariate information on all cohort members. We focus on nested case‐control designs that are used to estimate relative hazard in the Cox regression model. In 1997, Langholz and Borgan showed that pure risk can also be estimated from nested case‐control data. However, these approaches do not take advantage of some covariates that may be available on all cohort members. Researchers have used weight calibration to increase the efficiency of relative hazard estimates from case‐cohort studies and nested case‐control studies. Our objective is to extend weight calibration approaches to nested case‐control designs to improve precision of estimates of relative hazards and pure risks. We show that calibrating sample weights additionally against follow‐up times multiplied by relative hazards during the risk projection period improves estimates of pure risk. Efficiency improvements for relative hazards for variables that are available on the entire cohort also contribute to improved efficiency for pure risks. We develop explicit variance formulas for the weight‐calibrated estimates. Simulations show how much precision is improved by calibration and confirm the validity of inference based on asymptotic normality. Examples are provided using data from the American Association of Retired Persons Diet and Health Cohort Study.
doi: 10.1111/biom.13222; pmid: 31975369
In large‐scale problems, it is common practice to select important parameters by a procedure such as the Benjamini and Hochberg procedure and construct confidence intervals (CIs) for further investigation while the false coverage‐statement rate (FCR) for the CIs is controlled at a desired level. Although the well‐known BY CIs control the FCR, they are uniformly inflated. In this paper, we propose two methods to construct shorter selective CIs. The first method produces shorter CIs by allowing a reduced number of selective CIs. The second method produces shorter CIs by allowing a prefixed proportion of CIs containing the values of uninteresting parameters. We theoretically prove that the proposed CIs are uniformly shorter than BY CIs and control the FCR asymptotically for independent data. Numerical results confirm our theoretical results and show that the proposed CIs still work for correlated data. We illustrate the advantage of the proposed procedures by analyzing the microarray data from an HIV study.
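For readers unfamiliar with the baseline, the following is a minimal sketch of the standard pipeline the abstract refers to: Benjamini and Hochberg selection followed by BY FCR-adjusted intervals, where each of the R selected parameters gets a marginal interval at level 1 - Rq/m. It illustrates why BY CIs are inflated relative to unadjusted intervals. It does not implement the two shorter-interval methods proposed in the paper. The function names, the normal-estimate assumption, and the toy data are hypothetical.

```python
from statistics import NormalDist


def bh_select(pvalues, q):
    """Indices selected by the Benjamini-Hochberg step-up procedure at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])


def by_intervals(estimates, std_errors, q):
    """BY FCR-adjusted CIs for the BH-selected parameters.

    Assumes each estimate is normal with known standard error and tests
    H0: parameter = 0 with a two-sided z-test (illustrative assumptions).
    """
    m = len(estimates)
    nd = NormalDist()
    pvalues = [2.0 * (1.0 - nd.cdf(abs(e / s))) for e, s in zip(estimates, std_errors)]
    selected = bh_select(pvalues, q)
    r = len(selected)
    if r == 0:
        return {}
    # Each selected interval has marginal coverage 1 - r*q/m.
    z = nd.inv_cdf(1.0 - r * q / (2.0 * m))
    return {
        i: (estimates[i] - z * std_errors[i], estimates[i] + z * std_errors[i])
        for i in selected
    }


# Hypothetical toy data: 10 estimates with unit standard errors.
estimates = [4.0, 3.5, 0.2, -0.1, 0.5, -3.8, 0.0, 1.0, 0.3, -0.4]
std_errors = [1.0] * 10
intervals = by_intervals(estimates, std_errors, q=0.1)
for i, (lo, hi) in intervals.items():
    print(f"parameter {i}: [{lo:.2f}, {hi:.2f}]")
```

With three of ten parameters selected at q = 0.1, each interval is built at the 97% level rather than the 90% level, so its half-width exceeds that of an unadjusted interval. That inflation is what the proposed methods aim to reduce.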
Wang, Yue; Ibrahim, Joseph G.; Zhu, Hongtu
doi: 10.1111/biom.13219; pmid: 32010968
Many biomedical studies have identified important imaging biomarkers that are associated with both repeated clinical measures and a survival outcome. The functional joint model (FJM) framework, proposed by Li and Luo in 2017, investigates the association between repeated clinical measures and survival data, while adjusting for both high‐dimensional images and low‐dimensional covariates based on the functional principal component analysis (FPCA). In this paper, we propose a novel algorithm for the estimation of FJM based on the functional partial least squares (FPLS). Our numerical studies demonstrate that, compared to FPCA, the proposed FPLS algorithm can yield more accurate and robust estimation and prediction performance in many important scenarios. We apply the proposed FPLS algorithm to a neuroimaging study. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
Peterson, Christine B.; Osborne, Nathan; Stingo, Francesco C.; Bourgeat, Pierrick; Doecke, James D.; Vannucci, Marina
doi: 10.1111/biom.13235; pmid: 32026459
Alzheimer's disease is the most common neurodegenerative disease. The aim of this study is to infer structural changes in brain connectivity resulting from disease progression using cortical thickness measurements from a cohort of participants who were healthy controls, had mild cognitive impairment, or had Alzheimer's disease. For this purpose, we develop a novel approach for inference of multiple networks with related edge values across groups. Specifically, we infer a Gaussian graphical model for each group within a joint framework, where we rely on Bayesian hierarchical priors to link the precision matrix entries across groups. Our proposal differs from existing approaches in that it flexibly learns which groups have the most similar edge values, and accounts for the strength of connection (rather than only edge presence or absence) when sharing information across groups. Our results identify key alterations in structural connectivity that may reflect disruptions to the healthy brain, such as decreased connectivity within the occipital lobe with increasing disease severity. We also illustrate the proposed method through simulations, where we demonstrate its performance in structure learning and precision matrix estimation with respect to alternative approaches.
Pereira, Luz Adriana; Taylor‐Rodríguez, Daniel; Gutiérrez, Luis
doi: 10.1111/biom.13234; pmid: 32012223
We propose a Bayesian hypothesis testing procedure for comparing the distributions of paired samples. The procedure is based on a flexible model for the joint distribution of both samples. The flexibility is given by a mixture of Dirichlet processes. Our proposal uses a spike‐slab prior specification for the base measure of the Dirichlet process and a particular parametrization for the kernel of the mixture in order to facilitate comparisons and posterior inference. The joint model allows us to derive the marginal distributions and test whether they differ or not. The procedure exploits the correlation between samples, relaxes the parametric assumptions, and detects possible differences throughout the entire distributions. A Monte Carlo simulation study comparing the performance of this strategy to other traditional alternatives is provided. Finally, we apply the proposed approach to spirometry data collected in the United States to investigate changes in pulmonary function in children and adolescents in response to air polluting factors.
Zhang, Wei; Liu, Aiyi; Li, Qizhai; Albert, Paul S.
doi: 10.1111/biom.13236; pmid: 32083733
This article concerns the problem of estimating a continuous distribution in a diseased or nondiseased population when only group‐based test results on the disease status are available. The problem is challenging in that individual disease statuses are not observed and testing results are often subject to misclassification, with further complication that the misclassification may be differential as the group size and the number of the diseased individuals in the group vary. We propose a method to construct nonparametric estimation of the distribution and obtain its asymptotic properties. The performance of the distribution estimator is evaluated under various design considerations concerning group sizes and classification errors. The method is exemplified with data from the National Health and Nutrition Examination Survey study to estimate the distribution and diagnostic accuracy of C‐reactive protein in blood samples in predicting chlamydia incidence.