Bayesian Phylogenetic Inference via Markov Chain Monte Carlo Methods
Mau, Bob; Newton, Michael A.; Larget, Bret
doi: 10.1111/j.0006-341x.1999.00001.x
pmid: 11318142
Summary. We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a prior distribution on the space of trees. A transformation of the tree into a canonical cophenetic matrix form suggests a simple and effective proposal distribution for selecting candidate trees close to the current tree in the chain. We illustrate the algorithm with restriction site data on 9 plant species, then extend to DNA sequences from 32 species of fish. The algorithm mixes well in both examples from random starting trees, generating reproducible estimates and credible sets for the path of evolution.
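To make the sampling idea concrete, here is a minimal sketch of a generic Metropolis sampler with a symmetric proposal. It is not the authors' tree proposal: the integer state space, the `WEIGHTS` target, and the ring-shaped `propose` function are hypothetical stand-ins for tree space, the posterior, and the cophenetic-matrix move described in the abstract.

```python
import math
import random


def metropolis(log_post, propose, start, n_steps, rng):
    """Metropolis sampler; assumes the proposal is symmetric."""
    state = start
    samples = []
    for _ in range(n_steps):
        candidate = propose(state, rng)
        log_ratio = log_post(candidate) - log_post(state)
        # Accept uphill moves always, downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            state = candidate
        samples.append(state)
    return samples


# Hypothetical unnormalized posterior on the states 0..4 (a toy stand-in for trees).
WEIGHTS = [1.0, 2.0, 4.0, 2.0, 1.0]


def log_post(k):
    return math.log(WEIGHTS[k])


def propose(k, rng):
    # Symmetric move to a neighbouring state on a ring ("close to the current state").
    return (k + rng.choice([-1, 1])) % len(WEIGHTS)


samples = metropolis(log_post, propose, start=0, n_steps=20000, rng=random.Random(0))
freq = [samples.count(k) / len(samples) for k in range(len(WEIGHTS))]
```

The visit frequencies in `freq` should approach the normalized weights (0.1, 0.2, 0.4, 0.2, 0.1) as the chain runs longer, which is the sense in which the chain samples from the target distribution.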
Fundamentals of Survival Data
Hougaard, Philip
doi: 10.1111/j.0006-341x.1999.00013.x
pmid: 11318147
Summary. Survival data stand out as a special statistical field. This paper tries to describe what survival data are and what makes them so special. Survival data concern times to some events. A key point is the successive observation of time, which on the one hand leads to some times not being observed so that all that is known is that they exceed some given times (censoring), and on the other hand implies that predictions regarding the future course should be conditional on the present status (truncation). In the simplest case, this condition is that the individual is alive. The successive conditioning makes the hazard function, which describes the probability of an event happening during a short interval given that the individual is alive today (or more generally able to experience the event), the most relevant concept. Standard distributions available (normal, log‐normal, gamma, inverse Gaussian, and so forth) can account for censoring and truncation, but this is cumbersome. Besides, they fit badly because they are either symmetric or right‐skewed, whereas survival time distributions can easily be left‐skewed positive variables. A few distributions satisfying these requirements are available, but often nonparametric methods are preferable as they account better conceptually for truncation and censoring and give a better fit. Finally, we compare the proportional hazards regression models with accelerated failure time models.
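For reference, the quantities named in this abstract can be written in standard textbook notation. The symbols are assumptions introduced here, not taken from the paper: T is a continuous survival time with density f and survivor function S, x is a covariate vector, h_0 and S_0 are baseline functions, and epsilon is an error term.

```latex
% Hazard: instantaneous event rate given survival up to time t
h(t) = \lim_{\Delta \downarrow 0} \frac{P(t \le T < t + \Delta \mid T \ge t)}{\Delta}
     = \frac{f(t)}{S(t)},
\qquad
S(t) = P(T > t) = \exp\!\left( -\int_0^t h(u)\, du \right)

% Proportional hazards: covariates act multiplicatively on the hazard
h(t \mid x) = h_0(t) \exp(\beta^{\top} x)

% Accelerated failure time: covariates rescale time
S(t \mid x) = S_0\!\left( t \exp(-\beta^{\top} x) \right)
\quad \Longleftrightarrow \quad
\log T = \beta^{\top} x + \varepsilon
```

Under proportional hazards a covariate multiplies the event rate at every time, while under accelerated failure time it stretches or compresses the time axis; this is the contrast behind the comparison mentioned in the last sentence of the summary.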
The Graft Versus Leukemia Effect after Bone Marrow Transplantation: A Case Study Using Structural Nested Failure Time Models
Keiding, Niels; Filiberti, Marusca; Esbjerg, Sille; Robins, James M.; Jacobsen, Niels
doi: 10.1111/j.0006-341x.1999.00023.x
pmid: 11318159
Summary. Over the last decade, J. M. Robins has developed a set of tools for assessing, from observational data, the causal effects of a time‐dependent treatment or exposure in the presence of time‐dependent covariates that may be simultaneously confounders and intermediate variables. This report concerns a case study of the application of one of these techniques, G‐estimation using structural nested failure time models, to the problem of assessing the effect of graft versus host disease on leukemia relapse after bone marrow transplantation.
Hazard Models for Line Transect Surveys with Independent Observers
Skaug, Hans J.; Schweder, Tore
doi: 10.1111/j.0006-341x.1999.00029.x
pmid: 11318171
Summary. The likelihood function for data from independent observer line transect surveys is derived, and a hazard model is proposed for the situation where animals are available for detection only at discrete time points. Under the assumption that the time points of availability follow a Poisson point process, we obtain an analytical expression for the detection function. We discuss different criteria for choosing the hazard function and consider in particular two different parametric families of hazard functions. Discrete and continuous hazard models are compared and the robustness of the discrete model is investigated. Finally, the methodology is applied to data from a survey for minke whales in the northeastern Atlantic.
A Model for Thermograms
Robin, Stéphane
doi: 10.1111/j.0006-341x.1999.00037.x
pmid: 11318177
Summary. Thermograms are curves resulting from thermal analysis and are of great interest in the study of the physical properties of various food and biological products. A method to separate underlying peaks is proposed, and statistical properties of estimates for some characteristic parameters are derived. The total number of peaks can be estimated with a sequential analysis of the residual plots. For each new peak, a statistical criterion is proposed to check whether it is significantly different from the noise of the recording. As an example, the method is applied to a summer milk fat fusion thermogram.
Genetic Models with Reduced Penetrance Related to the Y Chromosome
Darling, R. W. R.; Holt, Tim
doi: 10.1111/j.0006-341x.1999.00055.x
pmid: 11318179
Summary. Classical statistical genetics models of a quantitative trait depending on an autosomal gene indicate that father‐to‐daughter and mother‐to‐son correlations should be the same. If phenotypes are not sex‐dependent, father‐to‐son and mother‐to‐daughter correlations also share this common value. On the other hand, if the gene is sex‐linked, then the father‐to‐son correlation is zero. Such models do not explain genetic variation in pulmonary artery pressure (PAP) of cattle—important because cattle with high PAP are known to develop brisket disease, pulmonary heart disease, and congestive heart failure when taken to high altitudes. Data on 966 calves at a ranch in Colorado showed positive correlation (0.2) between sire PAP and male calf PAP but slightly negative correlation (–0.01) between sire PAP and female calf PAP; the dam‐to‐male calf and dam‐to‐female calf correlations are both about 0.1. The model presented here postulates an autosomal gene with reduced penetrance (i.e., the trait may remain at a normal level even when the genotype suggests abnormality) and that, in males, the rate of penetrance is related to an abnormality in the Y chromosome and is therefore passed on from father to son. Then under plausible selective breeding assumptions, the pairwise correlation between fathers and daughters can become zero or negative. Explicit formulas are computed for the model covariances, and numerical computations indicate that plausible parameter values can be chosen for the model.
Use of Summary Measures to Adjust for Informative Missingness in Repeated Measures Data with Random Effects
Wu, Margaret C.; Follmann, Dean A.
doi: 10.1111/j.0006-341x.1999.00075.x
pmid: 11318181
Summary. We discuss how to apply the conditional informative missing model of Wu and Bailey (1989, Biometrics 45, 939–955) to the setting where the probability of missing a visit depends on the random effects of the primary response in a time‐dependent fashion. This includes the case where the probability of missing a visit depends on the true value of the primary response. Summary measures for missingness that are weighted sums of the indicators of missed visits are derived for these situations. These summary measures are then incorporated as covariates in a random effects model for the primary response. This approach is illustrated by analyzing data collected from a trial of heroin addicts where missed visits are informative about drug test results. Simulations of realistic experiments indicate that these time‐dependent summary measures also work well under a variety of informative censoring models. These summary measures can achieve large reductions in estimation bias and mean squared errors relative to those obtained by using other summary measures.
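As a rough illustration of what "a weighted sum of the indicators of missed visits" means, the sketch below computes such a summary for one subject. The function name `missingness_summary`, the visit pattern, and both weight vectors are hypothetical; the abstract does not give the weights that the paper derives.

```python
def missingness_summary(missed, weights):
    """Weighted sum of missed-visit indicators for one subject.

    missed  -- list of 0/1 indicators, 1 meaning the visit was missed
    weights -- one weight per scheduled visit (hypothetical values here)
    """
    if len(missed) != len(weights):
        raise ValueError("need one weight per scheduled visit")
    return sum(w * m for w, m in zip(weights, missed))


# Hypothetical subject who missed visits 2, 4 and 5 out of 5.
missed = [0, 1, 0, 1, 1]

# Equal weights reduce to a simple count of missed visits.
count_missed = missingness_summary(missed, [1.0] * 5)

# Time-dependent weights (assumed, for illustration) emphasise later visits.
late_weighted = missingness_summary(missed, [1.0, 2.0, 3.0, 4.0, 5.0])
```

A value like `late_weighted` would then enter the random effects model for the primary response as a subject-level covariate, which is the role the summary describes for these measures.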
Mixed Effects Models with Bivariate and Univariate Association Parameters for Longitudinal Bivariate Binary Response Data
Ten Have, Thomas R.; Morabia, Alfredo
doi: 10.1111/j.0006-341x.1999.00085.x
pmid: 11318182
Summary. When two binary responses are measured for each study subject across time, it may be of interest to model how the bivariate associations and marginal univariate risks involving the two responses change across time. To achieve such a goal, marginal models with bivariate log odds ratio and univariate logit components are extended to include random effects for all components. Specifically, separate normal random effects are specified on the log odds ratio scale for bivariate responses and on the logit scale for univariate responses. Assuming conditional independence given the random effects facilitates the modeling of bivariate associations across time with incomplete data that are missing at random. We fit the model to a dataset for which such structures are feasible: a longitudinal randomized trial of a cardiovascular educational program where the responses of interest are change in hypertension and hypercholesterolemia status. The proposed model is compared to a naive bivariate model that assumes independence between time points and to univariate mixed effects logit models.
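The two scales named in this summary can be illustrated for a single time point. The sketch below is not the authors' model (it has no random effects and no time dimension); it only computes the log odds ratio and the two marginal logits from a 2x2 table of joint probabilities. The function name `bivariate_summary` and the example probabilities are hypothetical.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def bivariate_summary(p11, p10, p01, p00):
    """Association and marginal risks for two binary responses.

    pjk is the joint probability that response 1 equals j and response 2 equals k.
    """
    if abs(p11 + p10 + p01 + p00 - 1.0) > 1e-9:
        raise ValueError("joint probabilities must sum to 1")
    return {
        "log_odds_ratio": math.log((p11 * p00) / (p10 * p01)),
        "logit_1": logit(p11 + p10),  # marginal risk of response 1
        "logit_2": logit(p11 + p01),  # marginal risk of response 2
    }


# Independent responses with marginal risks 0.2 and 0.3: log odds ratio near 0.
independent = bivariate_summary(0.06, 0.14, 0.24, 0.56)

# Strongly associated responses: odds ratio (0.4 * 0.4) / (0.1 * 0.1) = 16.
associated = bivariate_summary(0.4, 0.1, 0.1, 0.4)
```

A log odds ratio of zero corresponds to no association between the two responses, so placing a random effect on that scale lets the strength of association vary between subjects separately from the marginal risks.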