Bias due to censoring of deaths when calculating extra length of stay for patients acquiring a hospital infection

Bias due to censoring of deaths when calculating extra length of stay for patients acquiring a... Background: In many studies the information of patients who are dying in the hospital is censored when examining the change in length of hospital stay (cLOS) due to hospital-acquired infections (HIs). While appropriate estimators of cLOS are available in literature, the existence of the bias due to censoring of deaths was neither mentioned nor discussed by the according authors. Methods: Using multi-state models, we systematically evaluate the bias when estimating cLOS in such a way. We first evaluate the bias in a mathematically closed form assuming a setting with constant hazards. To estimate the cLOS due to HIs non-parametrically, we relax the assumption of constant hazards and consider a time-inhomogeneous Markov model. Results: In our analytical evaluation we are able to discuss challenging effects of the bias on cLOS. These are in regard to direct and indirect differential mortality. Moreover, we can make statements about the magnitude and direction of the bias. For real-world relevance, we illustrate the bias on a publicly available prospective cohort study on hospital-acquired pneumonia in intensive-care. Conclusion: Based on our findings, we can conclude that censoring the death cases in the hospital and considering only patients discharged alive should be avoided when estimating cLOS. Moreover, we found that the closed mathematical form can be used to describe the bias for settings with constant hazards. Keywords: Bias, Censored deaths, Extra length of stay, Hospital acquired infection, Multistate model Background also because there are two possible controversy outcomes, Change in length of stay (cLOS) in hospital is a key out- namely in-hospital death and discharge from the hospital come when studying the health impact and economic con- alive. sequences of hospital acquired infections (HIs). A patient Barnettetal. [1] used a multi-state model to show the with an HI is likely to stay longer in the hospital, incur- occurrence of substantial bias in estimating cLOS when ring extra costs. Thus, appropriately quantifying the cLOS studies fail to treat HIs as a time-dependent exposure (this in hospitals (in days) due to HIs is crucial for economical bias is known as ’time-dependent bias’). and policy decision making. However, a correct estima- Brock et al. [2] found that the way in which mor- tion of cLOS is challenging and prone to bias. This is tality is handled while investigating other time-related not only because HIs are time-dependent covariates but outcomes (such as discharge alive) influences the esti- mate of cLOS. They contrasted two ad-hoc approaches. In the first approach they restricted the analysis to the *Correspondence: cube@imbi.uni-freiburg.de patients who survived. In the second approach, individu- Equal contributors Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical als who died were right-censored at the longest possible Center - University of Freiburg, Freiburg, Germany follow-up time. They concluded that the two methods can Freiburg Center of Data Analysis and Modelling, University of Freiburg, potentially give different results for the same data. Brock Eckerstr. 1, 79104 Freiburg, Germany Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 2 of 10 et al. argue that this could lead to conflicting conclu- cLOS by censoring patients that die. Motivated by the sions, unless the investigators are aware of the differences work of Joly et al. [8] and Binder and Schumacher [9], we between the estimators. systematically investigate the bias with respect to “dif- In many studies patients who are dying in the hos- ferential mortality”. In their setting differential mortal- pital are censored at the time of death to study the ity is a term which defines the difference in the rate of mortality of the patients with and without the infec- cLOS in hospitals due to HIs. One recent example is a study by Noll et al. [3]. They calculated cLOS by cen- tion. They consider an illness-death model where HI is soring the outcome of patients who died in the hospital, an intermediate event between admission and death. In had ventilator dependent respiratory failure, or with- our setting we have two competing outcomes (death in drew from the study. Another recent study by Guerra hospital or discharge alive). HI is an intermediate event et al. [4] censored the patients who were not discharged between admission and death or discharge which ever from the hospital to their usual residence within the comes first. Therefore, we have considered two kinds study period, namely death cases or patients that were of “differential mortality” in the time-constant hazards transferred, to investigate the cLOS due to HIs. Zuniga set up, which affects the absolute mortality risk of a et al. [5] censored death cases and analysed the cLOS patient: 1. “direct differential mortality”, when the death considering information only of the patients who were hazards with and without the infection differ while the discharged alive. discharge hazards with and without the infection remain However, death in the hospital is informative censoring the same. 2. “indirect differential mortality”, when dis- and should be treated in a competing risks framework as charge hazard rates with and without the infection differ proposed by Schulgen et al. [6]. In this article, we show that while death hazards with and without the infection remain treating death-cases in the hospital as non-informative the same. The type of differential mortality can be stud- censoring can lead to biased estimates of cLOS. ied with cause-specific Cox proportional hazards mod- It may be argued that the mortality rates in hospitals els for death and discharge with HI as time-dependent are usually not very high, as most of the patients are dis- covariate. charged alive. Thus, using only the information of the Moreover, we compare the estimate of cLOS patients discharged alive might lead to reasonable esti- from the biased model with the cLOS attributed to patients discharged alive. To do so, we use the for- mation of cLOS in many cases. However, the efficiency mula derived by Allignol et al. [10]. They propose a of such an estimator might be questionable. Moreover, in intensive-care units (ICUs) where HIs are a serious prob- simple method to split the extra days due to HIs in lem, the mortality can reach up to 30-50%. This is for the hospital into days attributable to patients that die instance the case for ventilated and critically-ill patients. and attributable to those that are discharged alive. Since cLOS is often used to calculate costs as costs are This can be done for both homogeneous Markov driven by bed days, we argue that the costs of a hospital models and for time-inhomogeneous Markov mod- stay are not affected by the status of the patient at the end els. The methods for the time-inhomogeneous model of stay. are implemented in the R-packge etm,developedby A reason for censoring the death cases may be the Allignol et al. [7]. wish to give cLOS (and cost) estimates for a hospital- In “Methods” section part 1 we shortly discuss the for- population which is discharged alive. Therefore, we pro- mulas to estimate cLOS with the two approaches under pose to follow the approach by Allignol et al. [7]. They the constant hazards assumption. In “Methods”section suggest to first use the combined endpoint ’discharge part 2, we aim to provide a proper analytical expression (dead or alive)’ to calculate the overall cLOS (which can of the potential bias in estimating the cLOS due to HIs also be used for a cost analysis) and second, to distinguish when the information on the death cases in the hospital is the impact of HIs on cLOS between patients discharged censored. Assuming a time-homogeneous Markov model, alive and patients deceased. Based on this approach, stud- where the transition hazards are time-independent, we ies censoring the patients at the time of their death are systematically explore the amount and direction of the pronetobias. bias. In “Results and discussion” section, we illustrate the To understand and quantify the difference of the com- real-world relevance of the bias by analysing a random peting risks and the censoring approach, we assume the subset of the SIR-3 prospective cohort study on hospi- simplified setting of constant hazards. For this setting, we tal acquired pneumonia in ICUs in Berlin, Germany. For derive an analytical expression for the difference of cLOS the real data analysis, we estimate the cLOS by apply- estimated as proposed by Allignol et al. [7]and cLOS ing the method for time-inhomogenuous Markov models estimated by censoring the deceased patients. This ana- developed by Allignol et al. [7], which is based on the lytical expression can be used to investigate and analyse Aalen-Johansen estimator. The paper ends with a short the magnitude of the bias that occurs when estimating discussion in “Conclusion”section. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 3 of 10 Methods status X ∈{0, 1, 2, 3}. By definition, all individuals start Multi-state model for hospital infections in the initial state 0 of being alive in the hospital and free We focus on estimating the cLOS in the hospital due of HI, i.e., X = 0. We denote T as the smallest time at to HIs. We study the amount of bias which can occur which the process is in an absorbing state, T = inf{t : X ∈ when estimating the cLOS by treating patients that die as {2, 3}}. Eventually, end of the hospital stay occurs when censored. X ∈{2, 3}. To evaluate the impact of HIs on the subsequent hos- To do so, we describe the data setting with a multi- state model as proposed by e.g. [7]. Figure 1 displays this pital stay, Schulgen and Schumacher (1996) [6] suggested model (model A), which is a multi-state model with states, to consider the difference of the expected subsequent stay 0 = admission, 1 = infection, 2 = discharge alive and given infectious status at time s, φ(s) = E(T |X = 1) − 3 = death. For simplicity we assume that the hazard rates E(T |X = 0). Schulgen and Schumacher called φ(s) the are constant over time so that we can focus on the key ’expected extra hospitalization time of an infected indi- points concerning the censoring of the death cases. We vidual dependent on time s’. In our setting, the process denote α (t) = α as the hazard of moving from state “i” follows a homogeneous Markov model. Allignol et al. [7] ij ij to state “j”. An example hazard is, studied the cLOS for model A (Fig. 1) mathematically and found that cLOS does not depend on the time s in the α (t)·t ≈P(HI acquired by time t+t|no HI up to time t). homogeneous case. The cLOS can therefore be expressed as The actual hazard α (t) is obtained by taking limits as α + α 1 02 03 t → 0. We define the hazard rates, α = infection haz- CLOS = φ(s) = − 1 × true α + α α + α + α ard rate; α = discharge hazard rate without infection; 12 13 01 02 03 α = death hazard rate without infection; α = discharge (2) 03 12 hazard rate with infection and α = deathhazardrate Furthermore, Allignol et al. provided a formula to sepa- with infection. Under a constant hazards assumption, one rate the estimation of the cLOS for the discharged patients estimates α by using the maximum likelihood estimator ij and the deceased patients under the constant hazard set number of i → j transitions up. This formula is given by α ˆ = .(1) ij person-time in state i CLOS = CLOS(due to discharged alive) Under this model the mean sojourn time of an infected + CLOS(due to deaths) patient in the hospital is and of an uninfected (3) α +α 12 13 α α 12 13 patient it is .Wewrite X for the state occu- = × CLOS + × CLOS α +α +α 01 02 03 α + α α + α 12 13 12 13 pied by the patient at time t. At a time point t,the patient Fig. 1 Model A: The four state Multistate Model; 0 is “Admission” without hospital acquired infection (HI); 1 is hospital acquired “Infection”; 2 is the status of the patients who are “Discharged Alive” and 3 is the “Death” of the patient in the hospital. The constant hazard rates, α is the hazard rate to acquire the hospital infection during the hospital stay; α is the hazard rate to be discharged alive without the HI; α is the hazard rate to dead 02 03 without the HI and α is the hazard rate to be discharged alive after the HI; α is the hazard rate to be dead after the HI 12 13 Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 4 of 10 Hence, we can separately estimate cLOS attributable to No differential mortality patients discharged alive and cLOS attributable to death The bias predominately depends on the hazard rates. In cases by plugging in the estimates of the constant hazards the following we study the magnitude of the bias under obtained with (1). differential mortality. When there is no differential mor- Model B results from model A when treating death cases tality, that is, no difference between the death hazards as censored. In contrast to model A, patients that die with and without infection and no difference between the are assumed to remain under the same risk of being dis- discharge hazards with and without infection,  = α − 1 13 charged alive as patients that are still in the hospital. While α = 0and  = α − α = 0, the bias becomes 0. 03 2 02 12 the discharge hazards of model A and B are the same, the The following formula can be used to obtain an idea of the absolute chance of discharge alive in model A depends on magnitude and the direction of the bias for given values of the competing risk death and therefore differs from the the hazard functions when the death cases are censored. discharge probability modelled in model B. To derive the Direct differential mortality cLOS that results from model B, we apply the formula Under direct differential mortality, there is a non-zero proposed by Allignol et al. which is then difference between the death hazards with and without infection while the discharge hazards with and without α 1 infection are the same, that is  = α − α = 0and CLOS = − 1 .(4) 2 02 12 α α + α 12 01 02 = α − α = 0. Then, the bias can be expressed as 1 13 03 Analytic expression for the bias 1 1 CLOS − CLOS = (α − α ) · · (6) true 13 03 Our focus is on investigating the bias in cLOS when the α α 0· 1· information of the patients that die is censored. Using the 1 1 =  · · . formulas in Eqs. (2)and(4), we deduce that the bias in α α 0· 1· cLOS due to censoring is, 1 1 The bias changes with  .Moreover, as and are α α 0· 1· theaverage sojourntimeinstate0andstate1ofunin- α (α − α ) 03 02 12 CLOS − CLOS = fected and respectively infected patients, the bias also true α (α + α + α )(α + α ) 12 01 02 03 01 02 increases when the average sojourn times increase. (α α − α α ) 02 13 03 12 α (α + α + α )(α + α ) Indirect differential mortality 12 01 02 03 12 13 α (α −α ) (α α −α α ) Under indirect differential mortality, there is a non-zero 03 02 12 02 13 03 12 = + , difference between the discharge hazards with and with- α α α α α α 12 0· 0· 1· 12 0· out infection while the death intensities with and without (5) infection are the same, that is  = α − α = 0and 1 13 03 = α − α = 0. Then, the bias is 2 02 12 where α = α + α + α , α = α + α , α = 0· 01 02 03 01 02 1· 0· 1 1 α (α + α ) α + α and α = α . The formula shows that the bias 03 0· 12 12 13 12 1· CLOS −CLOS = (α − α )· · · true 02 12 depends on the product of the mean LOS in state 0 (α ) α α α 0. 1· 12 0· and a term depending on all hazards. The second term 1 1 α (α + α ) 03 0· 12 =  · · · .(7) determines the direction of the bias which could be pos- ∗ α α α α 0· 1· 12 0· itive or negative. In the following, we study the bias in specific settings which we call differential mortality. We The bias changes with  . The bias also increases with define “direct differential mortality” as the setting where the average waiting time in state 0 and in state 1. Again, the discharge hazards α and α are the same but the in most of the real world situations, we observe  > 0, 02 12 death hazards α and α differ. In contrast, “indirect which means the infected patients have lower discharge 03 13 differential mortality” is described by equal death haz- rates than the uninfected ones. Then, the bias is positive ards but different discharge hazards. Of note, due to the which leads to an overestimation of the cLOS. competing risk situation both settings influence - directly The derived analytical expressions demonstrate for a or indirectly - the overall hospital mortality. We define simplified setting (constant hazards, differential mortal- = α − α and  = α − α and emphasize that ity) how estimation of cLOS is influenced when informa- 1 13 03 2 02 12 both quantities are likely to be positive because infected tion of the death cases is censored. Only in the situation patients often have a higher mortality hazard and a lower where HIs have neither an effect on the death hazards discharge hazard, i.e., they stay longer in the hospital. nor on the discharge hazards, the bias is avoided. Other- A formal mathematical derivation of the bias can be wise, the bias increases with increasing magnitude of the found in Additional file 1. differential mortality. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 5 of 10 Results and discussion cLOS under the “censored model” (model B), the informa- To show the real world relevance of our findings, we tion of the patients who died is censored at the time of apply the method to a data example. The constant hazards their death. Table 1 shows an extract of the dataset under assumption is a facilitating way to compare the estimands each model. of cLOS resulting from model A and model B. However, To obtain first insights into the data structure, we esti- for real data application it is often too restrictive. There- mate the cause-specific cumulative hazards for model A fore, in our data example we compare models A (Fig. 1) (shown in Fig. 3). The graph indicates that the cumula- and B (Fig. 2) both under the constant hazards assumption tive discharge hazards are not straight lines (which implies (time-homogeneous Markov model) and more generally that they are not constant). Moreover, we observe that under a time-inhomogeneous Markov model. the discharge hazard is consistently reduced for patients We consider a subset of the SIR-3 cohort study from with an HI. The cumulative hazards are estimated using the Charite university hospital in Berlin, Germany, with the R-package mvna, developed by Allignol et al. (2008) prospective assessment of data to examine the effect of HIs in intensive care (Beyersmann et al. 2006a) [11]. The Table 1 Extract of the data showing the artificial censoring of the aim of this study was to investigate the effect of pneu- patients who died in the hospital at the time of their death, monia which may be acquired by the patients during denoted by “cens”. It shows the patient identification number their stay in the ICU. The data is publicly available in (“id”), transition state (“from” and “to”), time taken by the patient the format of los.data from the etm R package. Briefly, to move from state 0 to the current state “to” is given by “time”. los.data includes 756 patients who are admitted to the State “1” defines when the patient is infected, state “2” defines ICU between February 2000 and July 2001. After having when the patient is discharged alive and in model A, state “3” been admitted to the ICU, 124 (16.4%) patients acquired defines death of the patient at the hospital while the same pneumonia (infection) in the hospital. Among those who patients are artificially censored in model B got infected, 34 (27.4%) patients died. Overall, 191 patient Id From To Time died after ICU admission which is 25.3%. None of the Model A patients were censored. 22 0 1 4 For the analysis, we first modify the data structure such 22 1 2 16 that it corresponds to model A. Moreover, to analyze the 29 0 1 6 29 1 3 22 .. .. .. .. 245 0 2 28 250 0 2 9 .. .. .. .. 40 3 11 17 0 3 4 .. .. .. .. .. .. .. .. Model B 22 0 1 4 22 1 2 16 29 0 1 6 29 1 cens 22 .. .. .. .. 245 0 2 28 250 0 2 9 Fig. 2 Model B: Multistate Model resulting from censoring the death .. .. .. .. cases; 0 is the “Admission” state; 1 is HI; 2 is the status of the patients who are “Discharged Alive” and information on rest of the patients are 4 0 cens 11 “Censored”. The constant hazard rates that can be calculated from the 17 0 cens 4 model, α is the hazard rate to acquire a HI infection during the .. .. .. .. hospital stay; α is the hazard rate to be discharged alive without the HI; and α is the hazard rate to be discharged alive after the HI .. .. .. .. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 6 of 10 Model A Model B 0 1 0 1 0 2 0 2 0 3 1 2 1 2 1 3 0 20406080 020 40 60 80 Time Time Fig. 3 Estimated Cumulative hazards rates in the first 80 days for the multi-state models in Figs. 1 and 2. The slope of each line corresponds to the actual hazard rate, e.g a straight line would mean a constant hazard rate. The left figure shows the cumulative hazard functions for model A, when death is considered as competing event. The right figure corresponds to that of model B, when the patients are censored at the time of death [12] based on the Nelson-Aalen estimator. We also esti- Data analysis on the effect of censoring on the estimated mated the cumulative hazard rates for model B where extra length of stay the patients are censored at the time of their death (also In this section we estimate the cLOS of the SIR-3 data shown in Fig. 3). We can clearly see that the censoring sample. We first use the non-parametric approach for does not affect the other hazard rates. This means the dis- time-inhomogeneous Markov models followed by the charge hazards as well as the infection hazard of model parametric approach assuming the hazard rates of the A and B are the same. Note that pneumonia appears to dataset are constant. In both approaches we estimate the have no effect on the death hazard. However, this does cLOS by describing the data with model A and model B not imply that pneumonia has no effect on mortality. The respectively. Then we calculated the magnitude of the bias reason is that pneumonia reduces the discharge hazard occurring in model B. as a consequence patient with pneumonia stay longer in Moreover, we distinguish the cLOS obtained from the ICU. As a consequence, more patients with pneumo- model A between patients being discharged alive and nia are observed to die in the ICU than patients without patients that die. This way we can investigate how pneumonia. The effects of HI on the death and discharge manyextraICUdaysareattributabletopatientsbeing rates can be estimated with two cause-specific hazards discharged alive. This quantity is also compared to model (for death and discharge). The indirect effect on the biased model where the cLOS attributable to dis- mortality due to a decreased discharge hazard is com- charged patients is estimated by treating patients that die monly observed for hospital-acquired infections [13, 14]. as censored. Cumulative Hazard Cumulative Hazard 012345 Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 7 of 10 Non-parametric model and discussion” section, we further estimate the cLOS by We estimate the difference in cLOS associated with HIs assuming that the hazard rates are constant. within the framework of model A (no censoring of death We first estimate the constant hazard rates with Eq. (1). cases) and model B (censoring of patients at the time We obtain, α ˆ = 0.019, α ˆ = 0.074, α ˆ = 0.024, 01 02 03 of their death) by using the R-package etm.The pack- α ˆ = 0.059 and α ˆ = 0.022. Under a homogeneous 12 13 age is based on computing the Aalen-Johansen estima- Markov process, this data-situation is similar to indirect tors assuming a time-inhomogeneous Markov model. For differential mortality. Plugging the estimates into the for- model A, the estimated cLOS due to HIs is greater for ear- mulas in Eqs. (2)and(4), the cLOS from model A is lier days (see the lower graphs in Fig. 4). The average cLOS 1.773. The cLOS due to HIs with censoring of the death over all days is calculated by weighting the differences in cases is 2.699 (model B). Thus, censoring of the death length of stay on each day. This gives an estimated cLOS cases is overestimating the cLOS by 0.926 days in the of 1.975 days. The corresponding weight distributions are time-constant hazards set up. Unlike in the case of etm- also illustrated in Fig. 4. The average expected cLOS esti- estimation (time-inhomogenuous Markov model), in the mated after censoring the information of the patients at time-constant hazards set up, model B is overestimating the time of their death is 0.446 days (model B). So the dif- the cLOS with respect to model A. ference in the cLOS estimated from model B and model A Comparing the two estimation methods, we find that using the R package etm is 1.529 days. the cLOS under constant hazards is similar to the value We note in Fig. 4 that for model B the estimate for the obtained with the etm-package for model A (1.773 days cLOS in hospitals with and without infection cross each and 1.975 days respectively). From model B, we obtain other. This implies that the underlying assumption of a 2.699 days under the constant hazards assumption and homogeneous Markov model, i.e, when the hazard rates 0.446 days with etm. Thus, the values obtained from are constant, may not be a viable assumption for the data model B clearly differ. While in the estimation with etm, set. As Allignol et al. noted, these curves should be parallel model B is underestimating the cLOS with respect to for the homogeneous Markov assumption to be plausible. model A, we observe the opposite under the constant hazards assumption. Parametric model with constant hazards This difference in behavior could be attributed to the To compare and investigate the results from the etm pack- consequence of the violation of the constant hazards age with the analytical expressions derived in “Results assumption. As seen in Fig. 4, the cLOS of patients with Fig. 4 Weights and expected LOS for patients with and without an HI in the first 15 daysof los.data, which is a subset of the SIR-3 study. The left figure corresponds to model A (death cases are considered as competing event). The right figure corresponds to model B (death cases are censored). The estimated cLOS due to model A is 1.975 days and that for model B is 0.446 days Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 8 of 10 and without HI cross for model B indicating a much to those who died in the ICU. The difference in the stronger discrepancy to the assumption than for model estimated cLOS for model B and that attributable to A, where the curves rather touch than cross. These cir- patients discharged alive under model A is therefore about cumstances can further be understood when comparing 1.552 days. Thus, model B also underestimates cLOS the combined hazards with and without HI of model A attributable to patients discharged alive. It is further to and B with their time-constant counterparts shown in the be noted that under estimation with etm (model A) over- Additional file 1:FigureS5. all cLOS is similar to that of the cLOS estimated for For a more detail inspection, we estimate the cause- discharged patients (model A). This is due to the circum- specific hazard rates non-parametrically with B-splines stance that most of the patients are discharged alive. using the R-package bshazard. A detailed description of Using the formula (3) for the constant hazards approach, the method is given by Rebora et al. [15]. The estimated we obtain 1.291 days for discharged patients and 0.4815 death and discharge hazards both with and without HI are days for the deceased patients. In Table 2,wesee that shown in the Additional file 1: Figure S6. The plots show model B is overestimating the cLOS with respect to also the hazard rates obtained by using equation (1), where the cLOS (due to discharge alive) by 1.408 days. This we assume that they are constant. Comparing the esti- means model B is clearly overestimating the estimate mated hazard rates with their time-constant analogues we from cLOS using only the discharged patients assuming a clearly see that the data does not correspond to a homo- time-constant hazards set up. geneous Markov model. The discharge hazard before HI When comparing the time-inhomogeneous to the increases strongly in the first 10 days. After a peak at day homogeneous (constant hazards) approach under model 10 it strongly decreases again and remains on a moder- A we observe that the difference between overall cLOS ate level from day 20 onward. The behavior of the death and cLOS due to patients discharged alive is higher for the hazard before HI is similar but on a much lower level. Fur- homogeneous approach. This is due to the circumstance thermore, it remains below the discharge hazard before HI that under constant hazards the effect of HI is averaged at all time-points. In contrast, the discharge hazard after over the complete time-interval to estimate cLOS. Using HI seems to be almost constant and is well approximated the time-inhomogeneous approach by Allignol et al. cLOS by α ˆ from formula (1). The death hazard after HI con- is weighted according to the different lengths of stay. As tinuously slightly decreases and always remains below the most patients are discharged alive within the first few discharge hazard after HI. days, the weights are highest at these time-points (see Fig. 4). When using the homogeneous approach then the While the constant hazards assumption is not plausible, the time-inhomogeneous Markov assumption is. Testing influence of the discharged patients on the estimate of this assumption by including time of HI as covariate in a cLOS is less strong. Cox regression model showed no effect on the death and The complete R code of the data analysis is provided in discharge hazards after HI. The hazard ratios were 0.98 the Additional file 2. ([0.94 ; 1.01]) and 1.03 ([0.96 ; 1.09]) respectively. Conclusion Distinction between discharged (alive) and dead The major innovation of this study was the systematic Using the clos function in the etm-package, we obtain evaluation of the bias due to censoring of death cases 1.998 days as the estimated cLOS attributable to patients when studying cLOS in the hospital due to HIs. While who are discharged alive and −0.0234 days attributable Allignol et al. [7] provided an appropriate estimator, the Table 2 Estimation of cLOS with respect to model A (no censoring of deaths) as well as cLOS (discharged) and cLOS (death) (based on model A but distinguishing between death and discharge). Moreover, cLOS with respect to model B (censoring of deaths). Additionally we calculate the bias between model A and model B and the bias between model B and cLOS based on model A for discharged patients only. The comparison is done for the estimation of cLOS by assuming constant hazard and by using the etm package (assuming time-dependent hazards) CLOS CLOS CLOS CLOS Bias Bias (Model A) discharged (Model A) death (Model A) (Model B) (Model B - Model A) (Model B - discharged only) constant 1.773 1.291 0.482 2.699 0.926 1.408 hazard etm 1.975 1.998 -0.0234 0.446 -1.529 -1.552 package Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 9 of 10 existence of the bias due to censoring of death cases was Acknowledgements We thank Dr. Klaus Kaier and Thomas Heister for fruitful discussions and neither mentioned nor discussed by the authors. comments. We first evaluated the bias in a mathematically closed form assuming a setting with constant hazards. A similar Funding MW has been funded by the German Research Foundation (Deutsche approach in a simpler setting without competing out- Forschungsgemeinschaft) (grant No WO 1746/1-1); MvC has received support comes has been used by Joly et al. [8]. Our analytical from the Innovative Medicines Initiative Joint Undertaking under grant evaluation has the advantage that we are able to discuss agreement no. 115737-2 (Combatting bacterial resistance in Europe - molecules against Gram negative infections [COMBACTE-MAGNET]). The challenging effects regarding direct and indirect differen- article processing charge was funded by the German Research Foundation tial mortality. Moreover, it allows us to make statements (DFG) and the Albert Ludwigs University Freiburg in the funding programme about the magnitude and direction of the bias. Open Access Publishing. The real data application also showed that effects Availability of data and materials regarding direct and indirect differential mortality do exist R code is available in the supplementary material of this paper. The data is and that the bias influences the estimates of cLOS. In publicly available in the Comprehensive R Archive Network package (R) etm. model A, the cLOS estimation via the time-homogeneous The Additional file 1: Figures S5 and S6 as well as the detailed derivation of the formula of the bias is available as separate file in the online supplementary model gave similar estimates as the one which allows material. time-inhomogeneity, whereas it was different for model B where we treated patients that die as censored obser- Authors’ contributions vations. Although a difference in estimation of cLOS has SR and MW calculated the analytical expression of the bias, SR performed the analysis of the data example; MvC critically revised and rephrased the been observed due to censoring of death cases both by manuscript and contributed to the analysis of the data example; MS gave using the “time-dependent hazard” (via etm package) and insightful contributions to conception of the bias, critically revised the the “time-constant hazard” assumptions, the bias shown manuscript and provided major comments; MW formulated the main problem, supervised the research project and critically revised the manuscript. in the two set ups goes in different direction. Thus, our All authors read and approved the final manuscript. closed formula has limitations if the assumptions are not fulfilled. Therefore, a time-dependent hazards model Ethics approval and consent to participate should be considered for future research. However, before Not applicable. dealing with a complicated time-inhomogeneous model Competing interests one must understand the behavior of the bias for the sim- MW is a member of the editorial board of BMC research methodology. SR, pler constant hazards model. Understanding the bias in MvC and MS declare no competing interests. a simple setting was the aim of this paper. To point out Publisher’s Note the presence of bias in a real world situation we have used Springer Nature remains neutral with regard to jurisdictional claims in the publicly available SIR-data. Even thought the constant published maps and institutional affiliations. hazards assumption is not possible for this data set we Author details could demonstrate the existence of the bias. Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical A further limitation of our study was not considering Center - University of Freiburg, Freiburg, Germany. Freiburg Center of Data Analysis and Modelling, University of Freiburg, Eckerstr. 1, 79104 Freiburg, confounding factors as the length of stay of the patients Germany. Department of Statistics, Texas A&M University, 3143 TAMU, may depend on the underlying morbidity of the patient. 77843-3143, College Station, Texas, USA. We emphasize that the bias due to censoring the death Received: 13 October 2017 Accepted: 30 April 2018 cases is a type of survival bias and systematically differ- ent from confounding. Based on our findings, we can conclude that censoring the deaths should be avoided. References Moreover, the formula we derived can be used to describe 1. Barnett AG, Beyersmann J, Allignol A, Rosenthal VD, Graves N, the bias for settings with constant hazards. Wolkewitz M. Value Health. 2011;14(2):381–6. 2. Brock GN, Barnes C, Ramirez JA, Myers J. How to handle mortality when investigating length of hospital stay and time to clinical stability. BMC Med Res Methodol. 2011;11(1):144. 3. Noll DR, Degenhardt BF, Johnson JC. Multicenter osteopathic Additional files pneumonia study in the elderly: Subgroup analysis on hospital length of stay, ventilator-dependent respiratory failure rate, and in-hospital mortality rate. J Am Osteopath Assoc. 2016;116(9):574–87. Additional file 1: The document contains Figures S5 and S6 mentioned 4. Sousa A, Guerra RS, Fonseca I, Pichel F, Amaral T. Sarcopenia and length in section ’Results and discussion’ as well as the detailed mathematical of hospital stay. Eur J Clin Nutr. 2016;70(5):595–601. derivation of the bias formula presented in section ’Methods’. (PDF 192 kb) 5. Zúñiga MFS, Delgado OEC, Merchán-Galvis AM, Caicedo JCC, Calvache Additional file 2: The document is the complete R script used in section JA, Delgado-Noguera M. Factors associated with length of hospital stay 4 for the data analysis. (R 6 kb) in minor and moderate burns at popayan, colombia. analysis of a cohort study. Burns. 2016;42(1):190–5. Abbreviations 6. Schulgen G, Schumacher M. Estimation of prolongation of hospital stay ICU: Intensive care unit; HI: Hospital-acquired infection; cLOS: change in length attributable to nosocomial infections: new approaches based on of hospital/ICU stay multistate models. Lifetime Data Anal. 1996;2(3):219–40. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 10 of 10 7. Allignol A, Schumacher M, Beyersmann J. Estimating summary functionals in multistate models with an application to hospital infection data. Comput Stat. 2011;26(2):181–97. 8. Joly P, Commenges D, Helmer C, Letenneur L. A penalized likelihood approach for an illness–death model with interval-censored data: application to age-specific incidence of dementia. Biostatistics. 2002;3(3): 433–43. 9. Binder N, Schumacher M. Missing information caused by death leads to bias in relative risk estimates. J Clin Epidemiol. 2014;67(10):1111–20. 10. Allignol A, Schumacher M, Beyersmann J, et al. Empirical transition matrix of multi-state models: the etm package. J Stat Softw. 2011;38(4):1–15. 11. Beyersmann J, Gastmeier P, Grundmann H, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infect Control. 2006;27(05):493–9. 12. Allignol A, Beyersmann J, Schumacher M. mvna: An R package for the nelson-aalen estimator in multistate models. R news. 2008;8(2):48–50. 13. Schumacher M, Allignol A, Beyersmann J, Binder N, Wolkewitz M. Hospital-acquired infections—appropriate statistical treatment is urgently needed!. Int J Epidemiol. 2013;42(5):1502–8. 14. Melsen WG, Rovers MM, Groenwold RH, Bergmans DC, Camus C, Bauer TT, Hanisch EW, Klarin B, Koeman M, Krueger WA, et al. Lancet Infect Dis. 2013;13(8):665–71. 15. Rebora P, Salim A, Reilly M. R Journal. 2014;6(2):114–22. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png BMC Medical Research Methodology Springer Journals

Bias due to censoring of deaths when calculating extra length of stay for patients acquiring a hospital infection

Free
10 pages
Loading next page...
 
/lp/springer_journal/bias-due-to-censoring-of-deaths-when-calculating-extra-length-of-stay-DD0MK8zpX0
Publisher
BioMed Central
Copyright
Copyright © 2018 by The Author(s)
Subject
Medicine & Public Health; Theory of Medicine/Bioethics; Statistical Theory and Methods; Statistics for Life Sciences, Medicine, Health Sciences
eISSN
1471-2288
D.O.I.
10.1186/s12874-018-0500-3
Publisher site
See Article on Publisher Site

Abstract

Background: In many studies the information of patients who are dying in the hospital is censored when examining the change in length of hospital stay (cLOS) due to hospital-acquired infections (HIs). While appropriate estimators of cLOS are available in literature, the existence of the bias due to censoring of deaths was neither mentioned nor discussed by the according authors. Methods: Using multi-state models, we systematically evaluate the bias when estimating cLOS in such a way. We first evaluate the bias in a mathematically closed form assuming a setting with constant hazards. To estimate the cLOS due to HIs non-parametrically, we relax the assumption of constant hazards and consider a time-inhomogeneous Markov model. Results: In our analytical evaluation we are able to discuss challenging effects of the bias on cLOS. These are in regard to direct and indirect differential mortality. Moreover, we can make statements about the magnitude and direction of the bias. For real-world relevance, we illustrate the bias on a publicly available prospective cohort study on hospital-acquired pneumonia in intensive-care. Conclusion: Based on our findings, we can conclude that censoring the death cases in the hospital and considering only patients discharged alive should be avoided when estimating cLOS. Moreover, we found that the closed mathematical form can be used to describe the bias for settings with constant hazards. Keywords: Bias, Censored deaths, Extra length of stay, Hospital acquired infection, Multistate model Background also because there are two possible controversy outcomes, Change in length of stay (cLOS) in hospital is a key out- namely in-hospital death and discharge from the hospital come when studying the health impact and economic con- alive. sequences of hospital acquired infections (HIs). A patient Barnettetal. [1] used a multi-state model to show the with an HI is likely to stay longer in the hospital, incur- occurrence of substantial bias in estimating cLOS when ring extra costs. Thus, appropriately quantifying the cLOS studies fail to treat HIs as a time-dependent exposure (this in hospitals (in days) due to HIs is crucial for economical bias is known as ’time-dependent bias’). and policy decision making. However, a correct estima- Brock et al. [2] found that the way in which mor- tion of cLOS is challenging and prone to bias. This is tality is handled while investigating other time-related not only because HIs are time-dependent covariates but outcomes (such as discharge alive) influences the esti- mate of cLOS. They contrasted two ad-hoc approaches. In the first approach they restricted the analysis to the *Correspondence: cube@imbi.uni-freiburg.de patients who survived. In the second approach, individu- Equal contributors Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical als who died were right-censored at the longest possible Center - University of Freiburg, Freiburg, Germany follow-up time. They concluded that the two methods can Freiburg Center of Data Analysis and Modelling, University of Freiburg, potentially give different results for the same data. Brock Eckerstr. 1, 79104 Freiburg, Germany Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 2 of 10 et al. argue that this could lead to conflicting conclu- cLOS by censoring patients that die. Motivated by the sions, unless the investigators are aware of the differences work of Joly et al. [8] and Binder and Schumacher [9], we between the estimators. systematically investigate the bias with respect to “dif- In many studies patients who are dying in the hos- ferential mortality”. In their setting differential mortal- pital are censored at the time of death to study the ity is a term which defines the difference in the rate of mortality of the patients with and without the infec- cLOS in hospitals due to HIs. One recent example is a study by Noll et al. [3]. They calculated cLOS by cen- tion. They consider an illness-death model where HI is soring the outcome of patients who died in the hospital, an intermediate event between admission and death. In had ventilator dependent respiratory failure, or with- our setting we have two competing outcomes (death in drew from the study. Another recent study by Guerra hospital or discharge alive). HI is an intermediate event et al. [4] censored the patients who were not discharged between admission and death or discharge which ever from the hospital to their usual residence within the comes first. Therefore, we have considered two kinds study period, namely death cases or patients that were of “differential mortality” in the time-constant hazards transferred, to investigate the cLOS due to HIs. Zuniga set up, which affects the absolute mortality risk of a et al. [5] censored death cases and analysed the cLOS patient: 1. “direct differential mortality”, when the death considering information only of the patients who were hazards with and without the infection differ while the discharged alive. discharge hazards with and without the infection remain However, death in the hospital is informative censoring the same. 2. “indirect differential mortality”, when dis- and should be treated in a competing risks framework as charge hazard rates with and without the infection differ proposed by Schulgen et al. [6]. In this article, we show that while death hazards with and without the infection remain treating death-cases in the hospital as non-informative the same. The type of differential mortality can be stud- censoring can lead to biased estimates of cLOS. ied with cause-specific Cox proportional hazards mod- It may be argued that the mortality rates in hospitals els for death and discharge with HI as time-dependent are usually not very high, as most of the patients are dis- covariate. charged alive. Thus, using only the information of the Moreover, we compare the estimate of cLOS patients discharged alive might lead to reasonable esti- from the biased model with the cLOS attributed to patients discharged alive. To do so, we use the for- mation of cLOS in many cases. However, the efficiency mula derived by Allignol et al. [10]. They propose a of such an estimator might be questionable. Moreover, in intensive-care units (ICUs) where HIs are a serious prob- simple method to split the extra days due to HIs in lem, the mortality can reach up to 30-50%. This is for the hospital into days attributable to patients that die instance the case for ventilated and critically-ill patients. and attributable to those that are discharged alive. Since cLOS is often used to calculate costs as costs are This can be done for both homogeneous Markov driven by bed days, we argue that the costs of a hospital models and for time-inhomogeneous Markov mod- stay are not affected by the status of the patient at the end els. The methods for the time-inhomogeneous model of stay. are implemented in the R-packge etm,developedby A reason for censoring the death cases may be the Allignol et al. [7]. wish to give cLOS (and cost) estimates for a hospital- In “Methods” section part 1 we shortly discuss the for- population which is discharged alive. Therefore, we pro- mulas to estimate cLOS with the two approaches under pose to follow the approach by Allignol et al. [7]. They the constant hazards assumption. In “Methods”section suggest to first use the combined endpoint ’discharge part 2, we aim to provide a proper analytical expression (dead or alive)’ to calculate the overall cLOS (which can of the potential bias in estimating the cLOS due to HIs also be used for a cost analysis) and second, to distinguish when the information on the death cases in the hospital is the impact of HIs on cLOS between patients discharged censored. Assuming a time-homogeneous Markov model, alive and patients deceased. Based on this approach, stud- where the transition hazards are time-independent, we ies censoring the patients at the time of their death are systematically explore the amount and direction of the pronetobias. bias. In “Results and discussion” section, we illustrate the To understand and quantify the difference of the com- real-world relevance of the bias by analysing a random peting risks and the censoring approach, we assume the subset of the SIR-3 prospective cohort study on hospi- simplified setting of constant hazards. For this setting, we tal acquired pneumonia in ICUs in Berlin, Germany. For derive an analytical expression for the difference of cLOS the real data analysis, we estimate the cLOS by apply- estimated as proposed by Allignol et al. [7]and cLOS ing the method for time-inhomogenuous Markov models estimated by censoring the deceased patients. This ana- developed by Allignol et al. [7], which is based on the lytical expression can be used to investigate and analyse Aalen-Johansen estimator. The paper ends with a short the magnitude of the bias that occurs when estimating discussion in “Conclusion”section. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 3 of 10 Methods status X ∈{0, 1, 2, 3}. By definition, all individuals start Multi-state model for hospital infections in the initial state 0 of being alive in the hospital and free We focus on estimating the cLOS in the hospital due of HI, i.e., X = 0. We denote T as the smallest time at to HIs. We study the amount of bias which can occur which the process is in an absorbing state, T = inf{t : X ∈ when estimating the cLOS by treating patients that die as {2, 3}}. Eventually, end of the hospital stay occurs when censored. X ∈{2, 3}. To evaluate the impact of HIs on the subsequent hos- To do so, we describe the data setting with a multi- state model as proposed by e.g. [7]. Figure 1 displays this pital stay, Schulgen and Schumacher (1996) [6] suggested model (model A), which is a multi-state model with states, to consider the difference of the expected subsequent stay 0 = admission, 1 = infection, 2 = discharge alive and given infectious status at time s, φ(s) = E(T |X = 1) − 3 = death. For simplicity we assume that the hazard rates E(T |X = 0). Schulgen and Schumacher called φ(s) the are constant over time so that we can focus on the key ’expected extra hospitalization time of an infected indi- points concerning the censoring of the death cases. We vidual dependent on time s’. In our setting, the process denote α (t) = α as the hazard of moving from state “i” follows a homogeneous Markov model. Allignol et al. [7] ij ij to state “j”. An example hazard is, studied the cLOS for model A (Fig. 1) mathematically and found that cLOS does not depend on the time s in the α (t)·t ≈P(HI acquired by time t+t|no HI up to time t). homogeneous case. The cLOS can therefore be expressed as The actual hazard α (t) is obtained by taking limits as α + α 1 02 03 t → 0. We define the hazard rates, α = infection haz- CLOS = φ(s) = − 1 × true α + α α + α + α ard rate; α = discharge hazard rate without infection; 12 13 01 02 03 α = death hazard rate without infection; α = discharge (2) 03 12 hazard rate with infection and α = deathhazardrate Furthermore, Allignol et al. provided a formula to sepa- with infection. Under a constant hazards assumption, one rate the estimation of the cLOS for the discharged patients estimates α by using the maximum likelihood estimator ij and the deceased patients under the constant hazard set number of i → j transitions up. This formula is given by α ˆ = .(1) ij person-time in state i CLOS = CLOS(due to discharged alive) Under this model the mean sojourn time of an infected + CLOS(due to deaths) patient in the hospital is and of an uninfected (3) α +α 12 13 α α 12 13 patient it is .Wewrite X for the state occu- = × CLOS + × CLOS α +α +α 01 02 03 α + α α + α 12 13 12 13 pied by the patient at time t. At a time point t,the patient Fig. 1 Model A: The four state Multistate Model; 0 is “Admission” without hospital acquired infection (HI); 1 is hospital acquired “Infection”; 2 is the status of the patients who are “Discharged Alive” and 3 is the “Death” of the patient in the hospital. The constant hazard rates, α is the hazard rate to acquire the hospital infection during the hospital stay; α is the hazard rate to be discharged alive without the HI; α is the hazard rate to dead 02 03 without the HI and α is the hazard rate to be discharged alive after the HI; α is the hazard rate to be dead after the HI 12 13 Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 4 of 10 Hence, we can separately estimate cLOS attributable to No differential mortality patients discharged alive and cLOS attributable to death The bias predominately depends on the hazard rates. In cases by plugging in the estimates of the constant hazards the following we study the magnitude of the bias under obtained with (1). differential mortality. When there is no differential mor- Model B results from model A when treating death cases tality, that is, no difference between the death hazards as censored. In contrast to model A, patients that die with and without infection and no difference between the are assumed to remain under the same risk of being dis- discharge hazards with and without infection,  = α − 1 13 charged alive as patients that are still in the hospital. While α = 0and  = α − α = 0, the bias becomes 0. 03 2 02 12 the discharge hazards of model A and B are the same, the The following formula can be used to obtain an idea of the absolute chance of discharge alive in model A depends on magnitude and the direction of the bias for given values of the competing risk death and therefore differs from the the hazard functions when the death cases are censored. discharge probability modelled in model B. To derive the Direct differential mortality cLOS that results from model B, we apply the formula Under direct differential mortality, there is a non-zero proposed by Allignol et al. which is then difference between the death hazards with and without infection while the discharge hazards with and without α 1 infection are the same, that is  = α − α = 0and CLOS = − 1 .(4) 2 02 12 α α + α 12 01 02 = α − α = 0. Then, the bias can be expressed as 1 13 03 Analytic expression for the bias 1 1 CLOS − CLOS = (α − α ) · · (6) true 13 03 Our focus is on investigating the bias in cLOS when the α α 0· 1· information of the patients that die is censored. Using the 1 1 =  · · . formulas in Eqs. (2)and(4), we deduce that the bias in α α 0· 1· cLOS due to censoring is, 1 1 The bias changes with  .Moreover, as and are α α 0· 1· theaverage sojourntimeinstate0andstate1ofunin- α (α − α ) 03 02 12 CLOS − CLOS = fected and respectively infected patients, the bias also true α (α + α + α )(α + α ) 12 01 02 03 01 02 increases when the average sojourn times increase. (α α − α α ) 02 13 03 12 α (α + α + α )(α + α ) Indirect differential mortality 12 01 02 03 12 13 α (α −α ) (α α −α α ) Under indirect differential mortality, there is a non-zero 03 02 12 02 13 03 12 = + , difference between the discharge hazards with and with- α α α α α α 12 0· 0· 1· 12 0· out infection while the death intensities with and without (5) infection are the same, that is  = α − α = 0and 1 13 03 = α − α = 0. Then, the bias is 2 02 12 where α = α + α + α , α = α + α , α = 0· 01 02 03 01 02 1· 0· 1 1 α (α + α ) α + α and α = α . The formula shows that the bias 03 0· 12 12 13 12 1· CLOS −CLOS = (α − α )· · · true 02 12 depends on the product of the mean LOS in state 0 (α ) α α α 0. 1· 12 0· and a term depending on all hazards. The second term 1 1 α (α + α ) 03 0· 12 =  · · · .(7) determines the direction of the bias which could be pos- ∗ α α α α 0· 1· 12 0· itive or negative. In the following, we study the bias in specific settings which we call differential mortality. We The bias changes with  . The bias also increases with define “direct differential mortality” as the setting where the average waiting time in state 0 and in state 1. Again, the discharge hazards α and α are the same but the in most of the real world situations, we observe  > 0, 02 12 death hazards α and α differ. In contrast, “indirect which means the infected patients have lower discharge 03 13 differential mortality” is described by equal death haz- rates than the uninfected ones. Then, the bias is positive ards but different discharge hazards. Of note, due to the which leads to an overestimation of the cLOS. competing risk situation both settings influence - directly The derived analytical expressions demonstrate for a or indirectly - the overall hospital mortality. We define simplified setting (constant hazards, differential mortal- = α − α and  = α − α and emphasize that ity) how estimation of cLOS is influenced when informa- 1 13 03 2 02 12 both quantities are likely to be positive because infected tion of the death cases is censored. Only in the situation patients often have a higher mortality hazard and a lower where HIs have neither an effect on the death hazards discharge hazard, i.e., they stay longer in the hospital. nor on the discharge hazards, the bias is avoided. Other- A formal mathematical derivation of the bias can be wise, the bias increases with increasing magnitude of the found in Additional file 1. differential mortality. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 5 of 10 Results and discussion cLOS under the “censored model” (model B), the informa- To show the real world relevance of our findings, we tion of the patients who died is censored at the time of apply the method to a data example. The constant hazards their death. Table 1 shows an extract of the dataset under assumption is a facilitating way to compare the estimands each model. of cLOS resulting from model A and model B. However, To obtain first insights into the data structure, we esti- for real data application it is often too restrictive. There- mate the cause-specific cumulative hazards for model A fore, in our data example we compare models A (Fig. 1) (shown in Fig. 3). The graph indicates that the cumula- and B (Fig. 2) both under the constant hazards assumption tive discharge hazards are not straight lines (which implies (time-homogeneous Markov model) and more generally that they are not constant). Moreover, we observe that under a time-inhomogeneous Markov model. the discharge hazard is consistently reduced for patients We consider a subset of the SIR-3 cohort study from with an HI. The cumulative hazards are estimated using the Charite university hospital in Berlin, Germany, with the R-package mvna, developed by Allignol et al. (2008) prospective assessment of data to examine the effect of HIs in intensive care (Beyersmann et al. 2006a) [11]. The Table 1 Extract of the data showing the artificial censoring of the aim of this study was to investigate the effect of pneu- patients who died in the hospital at the time of their death, monia which may be acquired by the patients during denoted by “cens”. It shows the patient identification number their stay in the ICU. The data is publicly available in (“id”), transition state (“from” and “to”), time taken by the patient the format of los.data from the etm R package. Briefly, to move from state 0 to the current state “to” is given by “time”. los.data includes 756 patients who are admitted to the State “1” defines when the patient is infected, state “2” defines ICU between February 2000 and July 2001. After having when the patient is discharged alive and in model A, state “3” been admitted to the ICU, 124 (16.4%) patients acquired defines death of the patient at the hospital while the same pneumonia (infection) in the hospital. Among those who patients are artificially censored in model B got infected, 34 (27.4%) patients died. Overall, 191 patient Id From To Time died after ICU admission which is 25.3%. None of the Model A patients were censored. 22 0 1 4 For the analysis, we first modify the data structure such 22 1 2 16 that it corresponds to model A. Moreover, to analyze the 29 0 1 6 29 1 3 22 .. .. .. .. 245 0 2 28 250 0 2 9 .. .. .. .. 40 3 11 17 0 3 4 .. .. .. .. .. .. .. .. Model B 22 0 1 4 22 1 2 16 29 0 1 6 29 1 cens 22 .. .. .. .. 245 0 2 28 250 0 2 9 Fig. 2 Model B: Multistate Model resulting from censoring the death .. .. .. .. cases; 0 is the “Admission” state; 1 is HI; 2 is the status of the patients who are “Discharged Alive” and information on rest of the patients are 4 0 cens 11 “Censored”. The constant hazard rates that can be calculated from the 17 0 cens 4 model, α is the hazard rate to acquire a HI infection during the .. .. .. .. hospital stay; α is the hazard rate to be discharged alive without the HI; and α is the hazard rate to be discharged alive after the HI .. .. .. .. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 6 of 10 Model A Model B 0 1 0 1 0 2 0 2 0 3 1 2 1 2 1 3 0 20406080 020 40 60 80 Time Time Fig. 3 Estimated Cumulative hazards rates in the first 80 days for the multi-state models in Figs. 1 and 2. The slope of each line corresponds to the actual hazard rate, e.g a straight line would mean a constant hazard rate. The left figure shows the cumulative hazard functions for model A, when death is considered as competing event. The right figure corresponds to that of model B, when the patients are censored at the time of death [12] based on the Nelson-Aalen estimator. We also esti- Data analysis on the effect of censoring on the estimated mated the cumulative hazard rates for model B where extra length of stay the patients are censored at the time of their death (also In this section we estimate the cLOS of the SIR-3 data shown in Fig. 3). We can clearly see that the censoring sample. We first use the non-parametric approach for does not affect the other hazard rates. This means the dis- time-inhomogeneous Markov models followed by the charge hazards as well as the infection hazard of model parametric approach assuming the hazard rates of the A and B are the same. Note that pneumonia appears to dataset are constant. In both approaches we estimate the have no effect on the death hazard. However, this does cLOS by describing the data with model A and model B not imply that pneumonia has no effect on mortality. The respectively. Then we calculated the magnitude of the bias reason is that pneumonia reduces the discharge hazard occurring in model B. as a consequence patient with pneumonia stay longer in Moreover, we distinguish the cLOS obtained from the ICU. As a consequence, more patients with pneumo- model A between patients being discharged alive and nia are observed to die in the ICU than patients without patients that die. This way we can investigate how pneumonia. The effects of HI on the death and discharge manyextraICUdaysareattributabletopatientsbeing rates can be estimated with two cause-specific hazards discharged alive. This quantity is also compared to model (for death and discharge). The indirect effect on the biased model where the cLOS attributable to dis- mortality due to a decreased discharge hazard is com- charged patients is estimated by treating patients that die monly observed for hospital-acquired infections [13, 14]. as censored. Cumulative Hazard Cumulative Hazard 012345 Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 7 of 10 Non-parametric model and discussion” section, we further estimate the cLOS by We estimate the difference in cLOS associated with HIs assuming that the hazard rates are constant. within the framework of model A (no censoring of death We first estimate the constant hazard rates with Eq. (1). cases) and model B (censoring of patients at the time We obtain, α ˆ = 0.019, α ˆ = 0.074, α ˆ = 0.024, 01 02 03 of their death) by using the R-package etm.The pack- α ˆ = 0.059 and α ˆ = 0.022. Under a homogeneous 12 13 age is based on computing the Aalen-Johansen estima- Markov process, this data-situation is similar to indirect tors assuming a time-inhomogeneous Markov model. For differential mortality. Plugging the estimates into the for- model A, the estimated cLOS due to HIs is greater for ear- mulas in Eqs. (2)and(4), the cLOS from model A is lier days (see the lower graphs in Fig. 4). The average cLOS 1.773. The cLOS due to HIs with censoring of the death over all days is calculated by weighting the differences in cases is 2.699 (model B). Thus, censoring of the death length of stay on each day. This gives an estimated cLOS cases is overestimating the cLOS by 0.926 days in the of 1.975 days. The corresponding weight distributions are time-constant hazards set up. Unlike in the case of etm- also illustrated in Fig. 4. The average expected cLOS esti- estimation (time-inhomogenuous Markov model), in the mated after censoring the information of the patients at time-constant hazards set up, model B is overestimating the time of their death is 0.446 days (model B). So the dif- the cLOS with respect to model A. ference in the cLOS estimated from model B and model A Comparing the two estimation methods, we find that using the R package etm is 1.529 days. the cLOS under constant hazards is similar to the value We note in Fig. 4 that for model B the estimate for the obtained with the etm-package for model A (1.773 days cLOS in hospitals with and without infection cross each and 1.975 days respectively). From model B, we obtain other. This implies that the underlying assumption of a 2.699 days under the constant hazards assumption and homogeneous Markov model, i.e, when the hazard rates 0.446 days with etm. Thus, the values obtained from are constant, may not be a viable assumption for the data model B clearly differ. While in the estimation with etm, set. As Allignol et al. noted, these curves should be parallel model B is underestimating the cLOS with respect to for the homogeneous Markov assumption to be plausible. model A, we observe the opposite under the constant hazards assumption. Parametric model with constant hazards This difference in behavior could be attributed to the To compare and investigate the results from the etm pack- consequence of the violation of the constant hazards age with the analytical expressions derived in “Results assumption. As seen in Fig. 4, the cLOS of patients with Fig. 4 Weights and expected LOS for patients with and without an HI in the first 15 daysof los.data, which is a subset of the SIR-3 study. The left figure corresponds to model A (death cases are considered as competing event). The right figure corresponds to model B (death cases are censored). The estimated cLOS due to model A is 1.975 days and that for model B is 0.446 days Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 8 of 10 and without HI cross for model B indicating a much to those who died in the ICU. The difference in the stronger discrepancy to the assumption than for model estimated cLOS for model B and that attributable to A, where the curves rather touch than cross. These cir- patients discharged alive under model A is therefore about cumstances can further be understood when comparing 1.552 days. Thus, model B also underestimates cLOS the combined hazards with and without HI of model A attributable to patients discharged alive. It is further to and B with their time-constant counterparts shown in the be noted that under estimation with etm (model A) over- Additional file 1:FigureS5. all cLOS is similar to that of the cLOS estimated for For a more detail inspection, we estimate the cause- discharged patients (model A). This is due to the circum- specific hazard rates non-parametrically with B-splines stance that most of the patients are discharged alive. using the R-package bshazard. A detailed description of Using the formula (3) for the constant hazards approach, the method is given by Rebora et al. [15]. The estimated we obtain 1.291 days for discharged patients and 0.4815 death and discharge hazards both with and without HI are days for the deceased patients. In Table 2,wesee that shown in the Additional file 1: Figure S6. The plots show model B is overestimating the cLOS with respect to also the hazard rates obtained by using equation (1), where the cLOS (due to discharge alive) by 1.408 days. This we assume that they are constant. Comparing the esti- means model B is clearly overestimating the estimate mated hazard rates with their time-constant analogues we from cLOS using only the discharged patients assuming a clearly see that the data does not correspond to a homo- time-constant hazards set up. geneous Markov model. The discharge hazard before HI When comparing the time-inhomogeneous to the increases strongly in the first 10 days. After a peak at day homogeneous (constant hazards) approach under model 10 it strongly decreases again and remains on a moder- A we observe that the difference between overall cLOS ate level from day 20 onward. The behavior of the death and cLOS due to patients discharged alive is higher for the hazard before HI is similar but on a much lower level. Fur- homogeneous approach. This is due to the circumstance thermore, it remains below the discharge hazard before HI that under constant hazards the effect of HI is averaged at all time-points. In contrast, the discharge hazard after over the complete time-interval to estimate cLOS. Using HI seems to be almost constant and is well approximated the time-inhomogeneous approach by Allignol et al. cLOS by α ˆ from formula (1). The death hazard after HI con- is weighted according to the different lengths of stay. As tinuously slightly decreases and always remains below the most patients are discharged alive within the first few discharge hazard after HI. days, the weights are highest at these time-points (see Fig. 4). When using the homogeneous approach then the While the constant hazards assumption is not plausible, the time-inhomogeneous Markov assumption is. Testing influence of the discharged patients on the estimate of this assumption by including time of HI as covariate in a cLOS is less strong. Cox regression model showed no effect on the death and The complete R code of the data analysis is provided in discharge hazards after HI. The hazard ratios were 0.98 the Additional file 2. ([0.94 ; 1.01]) and 1.03 ([0.96 ; 1.09]) respectively. Conclusion Distinction between discharged (alive) and dead The major innovation of this study was the systematic Using the clos function in the etm-package, we obtain evaluation of the bias due to censoring of death cases 1.998 days as the estimated cLOS attributable to patients when studying cLOS in the hospital due to HIs. While who are discharged alive and −0.0234 days attributable Allignol et al. [7] provided an appropriate estimator, the Table 2 Estimation of cLOS with respect to model A (no censoring of deaths) as well as cLOS (discharged) and cLOS (death) (based on model A but distinguishing between death and discharge). Moreover, cLOS with respect to model B (censoring of deaths). Additionally we calculate the bias between model A and model B and the bias between model B and cLOS based on model A for discharged patients only. The comparison is done for the estimation of cLOS by assuming constant hazard and by using the etm package (assuming time-dependent hazards) CLOS CLOS CLOS CLOS Bias Bias (Model A) discharged (Model A) death (Model A) (Model B) (Model B - Model A) (Model B - discharged only) constant 1.773 1.291 0.482 2.699 0.926 1.408 hazard etm 1.975 1.998 -0.0234 0.446 -1.529 -1.552 package Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 9 of 10 existence of the bias due to censoring of death cases was Acknowledgements We thank Dr. Klaus Kaier and Thomas Heister for fruitful discussions and neither mentioned nor discussed by the authors. comments. We first evaluated the bias in a mathematically closed form assuming a setting with constant hazards. A similar Funding MW has been funded by the German Research Foundation (Deutsche approach in a simpler setting without competing out- Forschungsgemeinschaft) (grant No WO 1746/1-1); MvC has received support comes has been used by Joly et al. [8]. Our analytical from the Innovative Medicines Initiative Joint Undertaking under grant evaluation has the advantage that we are able to discuss agreement no. 115737-2 (Combatting bacterial resistance in Europe - molecules against Gram negative infections [COMBACTE-MAGNET]). The challenging effects regarding direct and indirect differen- article processing charge was funded by the German Research Foundation tial mortality. Moreover, it allows us to make statements (DFG) and the Albert Ludwigs University Freiburg in the funding programme about the magnitude and direction of the bias. Open Access Publishing. The real data application also showed that effects Availability of data and materials regarding direct and indirect differential mortality do exist R code is available in the supplementary material of this paper. The data is and that the bias influences the estimates of cLOS. In publicly available in the Comprehensive R Archive Network package (R) etm. model A, the cLOS estimation via the time-homogeneous The Additional file 1: Figures S5 and S6 as well as the detailed derivation of the formula of the bias is available as separate file in the online supplementary model gave similar estimates as the one which allows material. time-inhomogeneity, whereas it was different for model B where we treated patients that die as censored obser- Authors’ contributions vations. Although a difference in estimation of cLOS has SR and MW calculated the analytical expression of the bias, SR performed the analysis of the data example; MvC critically revised and rephrased the been observed due to censoring of death cases both by manuscript and contributed to the analysis of the data example; MS gave using the “time-dependent hazard” (via etm package) and insightful contributions to conception of the bias, critically revised the the “time-constant hazard” assumptions, the bias shown manuscript and provided major comments; MW formulated the main problem, supervised the research project and critically revised the manuscript. in the two set ups goes in different direction. Thus, our All authors read and approved the final manuscript. closed formula has limitations if the assumptions are not fulfilled. Therefore, a time-dependent hazards model Ethics approval and consent to participate should be considered for future research. However, before Not applicable. dealing with a complicated time-inhomogeneous model Competing interests one must understand the behavior of the bias for the sim- MW is a member of the editorial board of BMC research methodology. SR, pler constant hazards model. Understanding the bias in MvC and MS declare no competing interests. a simple setting was the aim of this paper. To point out Publisher’s Note the presence of bias in a real world situation we have used Springer Nature remains neutral with regard to jurisdictional claims in the publicly available SIR-data. Even thought the constant published maps and institutional affiliations. hazards assumption is not possible for this data set we Author details could demonstrate the existence of the bias. Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical A further limitation of our study was not considering Center - University of Freiburg, Freiburg, Germany. Freiburg Center of Data Analysis and Modelling, University of Freiburg, Eckerstr. 1, 79104 Freiburg, confounding factors as the length of stay of the patients Germany. Department of Statistics, Texas A&M University, 3143 TAMU, may depend on the underlying morbidity of the patient. 77843-3143, College Station, Texas, USA. We emphasize that the bias due to censoring the death Received: 13 October 2017 Accepted: 30 April 2018 cases is a type of survival bias and systematically differ- ent from confounding. Based on our findings, we can conclude that censoring the deaths should be avoided. References Moreover, the formula we derived can be used to describe 1. Barnett AG, Beyersmann J, Allignol A, Rosenthal VD, Graves N, the bias for settings with constant hazards. Wolkewitz M. Value Health. 2011;14(2):381–6. 2. Brock GN, Barnes C, Ramirez JA, Myers J. How to handle mortality when investigating length of hospital stay and time to clinical stability. BMC Med Res Methodol. 2011;11(1):144. 3. Noll DR, Degenhardt BF, Johnson JC. Multicenter osteopathic Additional files pneumonia study in the elderly: Subgroup analysis on hospital length of stay, ventilator-dependent respiratory failure rate, and in-hospital mortality rate. J Am Osteopath Assoc. 2016;116(9):574–87. Additional file 1: The document contains Figures S5 and S6 mentioned 4. Sousa A, Guerra RS, Fonseca I, Pichel F, Amaral T. Sarcopenia and length in section ’Results and discussion’ as well as the detailed mathematical of hospital stay. Eur J Clin Nutr. 2016;70(5):595–601. derivation of the bias formula presented in section ’Methods’. (PDF 192 kb) 5. Zúñiga MFS, Delgado OEC, Merchán-Galvis AM, Caicedo JCC, Calvache Additional file 2: The document is the complete R script used in section JA, Delgado-Noguera M. Factors associated with length of hospital stay 4 for the data analysis. (R 6 kb) in minor and moderate burns at popayan, colombia. analysis of a cohort study. Burns. 2016;42(1):190–5. Abbreviations 6. Schulgen G, Schumacher M. Estimation of prolongation of hospital stay ICU: Intensive care unit; HI: Hospital-acquired infection; cLOS: change in length attributable to nosocomial infections: new approaches based on of hospital/ICU stay multistate models. Lifetime Data Anal. 1996;2(3):219–40. Rahman et al. BMC Medical Research Methodology (2018) 18:49 Page 10 of 10 7. Allignol A, Schumacher M, Beyersmann J. Estimating summary functionals in multistate models with an application to hospital infection data. Comput Stat. 2011;26(2):181–97. 8. Joly P, Commenges D, Helmer C, Letenneur L. A penalized likelihood approach for an illness–death model with interval-censored data: application to age-specific incidence of dementia. Biostatistics. 2002;3(3): 433–43. 9. Binder N, Schumacher M. Missing information caused by death leads to bias in relative risk estimates. J Clin Epidemiol. 2014;67(10):1111–20. 10. Allignol A, Schumacher M, Beyersmann J, et al. Empirical transition matrix of multi-state models: the etm package. J Stat Softw. 2011;38(4):1–15. 11. Beyersmann J, Gastmeier P, Grundmann H, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infect Control. 2006;27(05):493–9. 12. Allignol A, Beyersmann J, Schumacher M. mvna: An R package for the nelson-aalen estimator in multistate models. R news. 2008;8(2):48–50. 13. Schumacher M, Allignol A, Beyersmann J, Binder N, Wolkewitz M. Hospital-acquired infections—appropriate statistical treatment is urgently needed!. Int J Epidemiol. 2013;42(5):1502–8. 14. Melsen WG, Rovers MM, Groenwold RH, Bergmans DC, Camus C, Bauer TT, Hanisch EW, Klarin B, Koeman M, Krueger WA, et al. Lancet Infect Dis. 2013;13(8):665–71. 15. Rebora P, Salim A, Reilly M. R Journal. 2014;6(2):114–22.

Journal

BMC Medical Research MethodologySpringer Journals

Published: May 30, 2018

References

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off