# Modeling the cumulative incidence function of multivariate competing risks data allowing for within-cluster dependence of risk and timing

Modeling the cumulative incidence function of multivariate competing risks data allowing for... Summary We propose to model the cause-specific cumulative incidence function of multivariate competing risks data using a random effects model that allows for within-cluster dependence of both risk and timing. The model contains parameters that makes it possible to assess how the two are connected, e.g. if high-risk is related to early onset. Under the proposed model, the cumulative incidences of all failure causes are modeled and all cause-specific and cross-cause associations specified. Consequently, left-truncation and right-censoring are easily dealt with. The proposed model is assessed using simulation studies and applied in analysis of Danish register-based family data on breast cancer. 1. Introduction Multivariate competing risks data arise often in clinical and epidemiological research, for example in multicenter clinical trials, i.e. clinical trials with participants from more than one medical center, and in family studies examining disease occurrence in related individuals. Study subjects within the same cluster (e.g. medical center or family) cannot be assumed independent and one must take this within-cluster dependence into account when analyzing the data. Focusing on family studies, the within-cluster dependence, which is here a within-family dependence, is often the key point of interest or at least as important as determining the role of different risk factors (Bandeen-Roche, 2014). This is because the within-family dependence can be viewed as an expression of familial aggregation and may reflect both disease heritability and the impact of shared environmental effects (Burton and others, 2005; Bandeen-Roche, 2014). In the competing risks setting, one has the choice between analyzing the multivariate data on the hazard scale focusing on the cause-specific hazard or on the probability scale focusing on the cause-specific cumulative incidence. Both approaches are equally relevant and may complement each other (Andersen and others, 2012). However, in analysis of family studies, there is often a strong interest in describing age at disease onset including within-family dependence. See for example, Brandt and others (2008, 2010) and Kharazmi and others (2012). The distribution of age at disease onset is directly described by the cause-specific cumulative incidence, which is the focus of this article. We propose a model for the cumulative incidence, which provides a framework for exploring and making inference about the distribution of age at disease onset. For example, the model makes it possible to investigate how absolute risk of disease is related to age at onset. We find in our presented application of the model that breast cancer with an early onset is associated with a higher absolute risk, which is consistent with specific genetic risk profiles such as mutations in the BRCA1 and BRCA2 genes (Petrucelli and others, 2010). Thus, we quantify the dependence of both risk and timing for breast cancer and find that these are significantly related. Despite considerable interest, the number of available statistical models for assessing the cumulative incidence and the within-cluster dependence of multivariate competing risks data is limited. The within-cluster dependence of the cause-specific cumulative incidence can be assessed by direct modeling of different dependence measures. For example, the ratio between the bivariate cumulative incidence function and the product of the marginal counterparts (Cheng and others, 2007, 2009), the conditional cumulative incidence function (Shih and Albert, 2010) or the cross-odds ratio (Scheike and Sun, 2012). Alternatively, one can employ a frailty-based two-stage approach, where the marginal cumulative incidence functions are estimated in the first stage and a dependence parameter is estimated in the second stage using an Archimedean copula (Scheike and others, 2010; Cheng and Fine, 2012). The marginal cumulative incidence functions are not generally distribution functions. Consequently, the interpretation of the dependence parameter estimated in the two-stage approach as a type of Kendall’s $$\tau$$ does not apply (Scheike and others, 2010). However, under certain conditions the estimated dependence parameter is closely related to the cross-odds ratio (Scheike and others, 2010; Scheike and Sun, 2012). Under the two-stage approach, it is necessary to adjust for right-censoring. This is done through modeling of the censoring distribution and employment of inverse probability of censoring weights (IPCWs) (Scheike and others, 2010; Cheng and Fine, 2012). If the censoring distribution is misspecified, the weighting may introduce bias. In addition to the models described above, a semiparametric mixture model exists (Naskar and others, 2005). This model allows for within-cluster dependence with regard to timing but not absolute risk. We propose to model the cause-specific cumulative incidence function of multivariate competing risks data using a random effects model that allows for within-cluster dependence with regard to both risk and timing. That is, the model allows the risk level (in terms of absolute risk) and the failure time distribution to vary between clusters. This is in contrast to the model by Naskar and others (2005) mentioned above. Removal of the random effects from the proposed model gives a model that is similar to existing parametric and semiparametric mixture models for competing risks data such as Larson and Dinse (1985), Kuk (1992) and Shi and others (2013). Under the proposed model, the cumulative incidence functions of all causes are modeled. As a consequence, both left-truncation and right-censoring are easily dealt with. Thus, specification of censoring distributions is avoided. Generally, direct regression modeling of cause-specific cumulative incidence functions can be challenging since the sum of all predicted cause-specific cumulative incidences should be less than or equal to $$1$$. This can be difficult to enforce (Gerds and others, 2012). We employ a multinomial logistic regression model with random effects that ensures that the sum of the predicted cumulative incidences do not exceed $$1$$. In addition, application of a pairwise composite likelihood readily enables analysis of data with large clusters of varying size. Unfortunately, no closed form expression for the marginal likelihood exists. Yet, as we show the dimensionality of the integral can be markedly reduced by employing Bayes’ rule and changing the order of integration. The remainder of this article is structured as follows. The proposed model is described in Section 2 including a presentation of the likelihood function in Section 2.2. Simulation results are presented in Section 3. In Section 4, the model is used to analyze Danish register-based family data on breast cancer. A discussion in Section 5 ends the article. 2. The model Let $$n$$ be the number of clusters and $$n_i$$ the number of individuals within the $$i$$th cluster. Let $$T_{ij}^{*}$$, $$\epsilon^{*}_{ij}$$ and $$C_{ij}$$ denote the failure time, the cause of failure and the censoring time, respectively, of the $$j$$th individual in the $$i$$th cluster. For simplicity, we consider a situation with two competing causes of failure, where $$\epsilon^{*}_{ij}=1$$ is the failure cause of primary interest and $$\epsilon^{*}_{ij}=2$$ is the competing failure cause. However, the notation and model generalises to situations with more than two competing causes of failure. Let $$X_{ij}=(1,X_{ij,1},...,X_{ij,p})$$ be a vector of associated covariates. For simplicity, we let the vector $$X_{ij}$$ be the same for all causes of failure and allow the regression coefficients’ vectors to vary. The observed follow-up time of the $$j$$th individual in the $$i$$th cluster is $$T_{ij}=\min(T_{ij}^{*},C_{ij})$$. The observed failure cause is $$\epsilon_{ij}=\Delta_{ij}\epsilon_{ij}^{*}$$, where $$\Delta_{ij}$$ is an indicator of no censoring $$\Delta_{ij}=I(T_{ij}^{*}\leq C_{ij})$$. Thus, $$\epsilon_{ij}=0$$ denotes censoring. The failure times are observed in the time interval $$[0,\tau]$$. We propose to model the cluster-specific cumulative incidence function as a product of a cluster-specific risk level and a cluster-specific failure time trajectory. For two competing causes of failure, let the cluster-specific cumulative incidence functions be given as \begin{eqnarray}\label{eq:cif} F_{1}(t|X,\eta_{1},u_{1},u_{2})&=&\pi_{1}(X,u_{1},u_{2})\Phi\left[\alpha_{1}\{g(t)\}-X^{T}\gamma_{1}-\eta_{1}\right]\nonumber\\ F_{2}(t|X,\eta_{2},u_{1},u_{2})&=&\pi_{2}(X,u_{1},u_{2})\Phi\left[\alpha_{2}\{g(t)\}-X^{T}\gamma_{2}-\eta_{2}\right]\!, \end{eqnarray} (2.1) where $$\boldsymbol{\eta}=\{\eta_{1},\eta_{2}\}$$ and $$\boldsymbol{u}=\{u_{1},u_{2}\}$$ are normally distributed random effects with mean zero and potentially correlated, i.e. \begin{eqnarray}\label{eq:randrist} \begin{pmatrix} \eta_{1} \\ \eta_{2} \\ u_{1}\\ u_{2} \end{pmatrix} \sim \mathcal{N} \left\{ \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \begin{pmatrix} \sigma^2_{\eta_{1}} & \textrm{cov}(\eta_{1},\eta_{2}) & \textrm{cov}(\eta_{1},u_{1}) & \textrm{cov}(\eta_{1},u_{2}) \\ & \sigma^2_{\eta_{2}} & \textrm{cov}(\eta_{2},u_{1}) & \textrm{cov}(\eta_{2},u_{2}) \\ & & \sigma^2_{u_{1}} & \textrm{cov}(u_{1},u_{2}) \\ & & & \sigma^2_{u_{2}} \end{pmatrix} \right\}. \end{eqnarray} (2.2) The cluster-specific survival function is given as $$S(t|X,\boldsymbol{\eta},\boldsymbol{u})=1-F_{1}(t|X,\eta_{1},\boldsymbol{u})-F_{2}(t|X,\eta_{2},\boldsymbol{u})$$. The cluster-specific cumulative incidence functions (2.1) are modeled up to a fixed time point $$\delta$$ at which all individuals still at risk are censored. The value of $$\delta$$ depends on the data and cannot exceed the maximum observed follow-up time, i.e. $$\delta \leq \tau$$. Transformation of the time variable $$t$$ enables separation of the cumulative incidence function (2.1) into a cluster-specific risk level, $$\pi_{h}(X,\boldsymbol{u})$$, and a cluster-specific failure time trajectory, $$\Phi\left[\alpha_{h}\{g(t)\}-X^{T}\gamma_{h}-\eta_{h}\right]$$, where $$\Phi$$ is the cumulative distribution function of the standard normal distribution. The transformation, $$g(t)$$, is given as \begin{eqnarray*}\label{eq:trans} g(t)={\rm arctanh}\left(\frac{t-\delta/2}{\delta/2}\right) \quad \textrm{for} \quad t \in ]0,\delta[. \end{eqnarray*} With this transformation $$g(t)\in ]-\infty,\infty[$$ and the value of the cluster-specific failure time trajectory will equal 1 at time $$\delta$$. Consequently, $$F_{h}(\delta|X,\eta_{h},\boldsymbol{u})=\pi_{h}(X,\boldsymbol{u})$$ and we can therefore interpret $$\pi_{1}(X,\boldsymbol{u})$$ and $$\pi_{2}(X,\boldsymbol{u})$$ as the cause-specific cluster-specific risk levels at time $$\delta$$. The cluster-specific risk levels are modeled using a multinomial logistic regression model with random effects, i.e. \begin{eqnarray}\label{eq:pi} \pi_{1}(X,\boldsymbol{u})&=&\frac{\exp(X^T\beta_{1}+u_{1})}{1+\exp(X^T\beta_{1}+u_{1})+\exp(X^T\beta_{2}+u_{2})}\nonumber\\ \pi_{2}(X,\boldsymbol{u})&=&\frac{\exp(X^T\beta_{2}+u_{2})}{1+\exp(X^T\beta_{1}+u_{1})+\exp(X^T\beta_{2}+u_{2})}, \end{eqnarray} (2.3) where $$\beta_{1}$$ and $$\beta_{2}$$ are regression coefficients’ vectors of the two causes of failure. The random effects in $$\boldsymbol{u}$$ induce within-cluster dependence of the cause-specific risk levels. The cluster-specific failure time trajectories are modeled via the second part of the models in (2.1), i.e. \begin{eqnarray}\label{eq:trac} \Phi\left[\alpha_{h}\{g(t)\}-X^{T}\gamma_{h}-\eta_{h}\right] \quad \textrm{for} \quad h=1,2, \end{eqnarray} (2.4) where $$\Phi$$ is the cumulative distribution function of the standard normal distribution, and $$\alpha_{1}(x)$$ and $$\alpha_{2}(x)$$ are monotonically increasing functions of $$x$$ and both known up to a finite-dimensional parameter vector, $$\omega_{1}$$ and $$\omega_{2}$$, respectively. For example, monotonically increasing B-spline or piecewise linear functions. The random effects in $$\boldsymbol{\eta}$$ induce within-cluster dependence of the cause-specific failure time trajectories. We assume that the random effects are independent across clusters and shared by individuals within the same cluster. Individuals within the same cluster are assumed independent conditional on the random effects and on covariates. Furthermore, we assume that the failure time and the cause of failure are independent of the censoring time given covariates and random effects ($$\{T_{ij}^{*},\epsilon_{ij}^{*}\}\perp C_{ij}|X_{ij},\boldsymbol{\eta}_{i},\boldsymbol{u}_{i}$$) and that $$C_{ij}$$ does not depend on the random effects. 2.1. Interpretation of parameters Modeling the cluster-specific risk levels using a multinomial logistic regression model (2.3) ensures that the sum of the predicted cause-specific cumulative incidences does not exceed 1. Furthermore, at time $$\delta$$, where $$F_{h}(\delta|X,\eta_{h},\boldsymbol{u})=\pi_{h}(X,\boldsymbol{u})$$, the cluster-specific survival function $$S(\delta|X,\boldsymbol{\eta},\boldsymbol{u})$$ equals $$1/\left\{1+\exp(X^T\beta_{1}+u_{1})+\exp(X^T\beta_{2}+u_{2})\right\}$$. This results in the following interpretation of the regression coefficients in $$\beta_{1}$$ and $$\beta_{2}$$. Let $$X_{ij}=(1,X_{ij,1},...,X_{ij,p})$$ and $$\beta_{h}=(\beta_{0,h},...,\beta_{p,h})^{T}$$. Comparing two individuals $$j$$ and $$k$$ from the $$i$$th cluster whose values of $$X_{ij,r}$$ differ by one unit and where all other covariates are held constant, $$\exp(\beta_{r,h})$$ equals the ratio between “odds” of the form $$F_{h}(\delta|X_{ij},\eta_{h,i},\boldsymbol{u}_{i})/S(\delta|X_{ij},\boldsymbol{\eta}_{i},\boldsymbol{u}_{i})$$. That is, the probability of failure from cause $$h$$ at time $$\delta$$ in relation to the probability of no failure at time $$\delta$$ (Gerds and others, 2012). The regression coefficients $$\gamma_{1}=(\gamma_{0,1},...,\gamma_{p,1})^{T}$$ and $$\gamma_{2}=(\gamma_{0,2},...,\gamma_{p,2})^{T}$$ reflect how covariates affect the failure time trajectories (2.4), i.e. the shape of the cumulative incidence, and thereby how quickly the cluster-specific risk levels observed at time $$\delta$$ are reached (2.3). For a covariate with a negative effect, the cause-specific cumulative incidence function makes an advance towards the cluster-specific risk level, whereas a covariate with a positive effect causes a delay. The random effects in $$\boldsymbol{\eta}$$ have a similar interpretation. This is illustrated in Figure 1, where the cluster-specific cumulative incidence function of a given failure cause, denoted failure cause 1, is visualised for different combinations of random effects $$\eta_{1}$$ and $$u_{1}$$. Interpretation of random effects in the general logistic regression model is no easy task (Larsen and others, 2000). This also applies to the random effects $$u_{1}$$ and $$u_{2}$$ in (2.3). In contrast to $$\eta_{1}$$ and $$\eta_{2}$$, the random effects $$u_{1}$$ and $$u_{2}$$ always appear together and have a joint effect on the cumulative incidence of both causes, cf. (2.3). However, keeping everything else fixed, an increase in $$u_{h}$$ will increase the risk of failure from cause $$h$$ and vice versa. This is also illustrated in Figure 1. The interpretation of the covariance of $$\eta_{1}$$ and $$\eta_{2}$$ and the covariance of $$u_{1}$$ and $$u_{2}$$ is more or less straightforward. With regard to the interpretation of the covariance of the random effects in $$\boldsymbol{\eta}$$ and the random effects in $$\boldsymbol{u}$$, $$\textrm{cov}(\eta_{h},u_{1})$$ and $$\textrm{cov}(\eta_{h},u_{2})$$, they should be considered simultaneously. Again keeping everything else fixed, a negative correlation between $$\eta_{h}$$ and $$u_{h}$$ imply that when $$\eta_{h}$$ decreases, $$u_{h}$$ increases and conversely when $$\eta_{h}$$ increases, $$u_{h}$$ decreases. Broadly speaking, this corresponds to an increased risk level that is reached quickly and a decreased risk level that is reached later, respectively. We find it difficult to think of a situation with a positive within-cause correlation between $$\eta$$ and $$u$$, i.e. where an increased risk level is associated with a late onset and vice versa. However, a positive cross-cause correlation between $$\eta$$ and $$u$$ sounds more plausible. i.e. where late onset of one failure cause is associated with a high absolute risk of another failure cause. Fig. 1. View largeDownload slide Illustration of the proposed model showing the cluster-specific cumulative incidences of failure cause 1. From a model with $$X=1$$ for all subjects and with $$\beta_{1}=-1.9$$, $$\beta_{2}=-0.2$$, $$\gamma_{1}=1$$, $$\alpha_{1}\{g(t)\}=3g(t)$$ and $$u_{2}=0$$. In plot A, $$u_{1}=-1$$. In plot B, $$u_{1}=-0.5$$. In plot C, $$u_{1}=0.5$$. In plot D, $$u_{1}=1$$. With regard to the curves, ($$\Huge {-}$$) corresponds to $$\eta_{1}=-2$$, (− − − −) corresponds to $$\eta_{1}=-1$$, ($$\cdots\cdots\cdots$$) corresponds to $$\eta_{1}=0$$, ($$-\cdot -\cdot -\cdot$$) corresponds to $$\eta_{1}=1$$, and (— — —) corresponds to $$\eta_{1}=2$$. Fig. 1. View largeDownload slide Illustration of the proposed model showing the cluster-specific cumulative incidences of failure cause 1. From a model with $$X=1$$ for all subjects and with $$\beta_{1}=-1.9$$, $$\beta_{2}=-0.2$$, $$\gamma_{1}=1$$, $$\alpha_{1}\{g(t)\}=3g(t)$$ and $$u_{2}=0$$. In plot A, $$u_{1}=-1$$. In plot B, $$u_{1}=-0.5$$. In plot C, $$u_{1}=0.5$$. In plot D, $$u_{1}=1$$. With regard to the curves, ($$\Huge {-}$$) corresponds to $$\eta_{1}=-2$$, (− − − −) corresponds to $$\eta_{1}=-1$$, ($$\cdots\cdots\cdots$$) corresponds to $$\eta_{1}=0$$, ($$-\cdot -\cdot -\cdot$$) corresponds to $$\eta_{1}=1$$, and (— — —) corresponds to $$\eta_{1}=2$$. 2.2. Likelihood In the estimation of the parameters of interest, we propose to use a pairwise composite likelihood (Cox and Reid, 2004; Varin and others, 2011) given as \begin{eqnarray}\label{eq:complik} \mathcal{L}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})&=& \prod_{i=1}^{n}\prod_{j=1}^{n_{i}-1}\prod_{k=j+1}^{n_{i}}\mathcal{L}_{i,jk}(\theta;T_{ij},\epsilon_{ij},X_{ij},T_{ik},\epsilon_{ik},X_{ik},\boldsymbol{\eta}_{i},\boldsymbol{u}_{i}), \end{eqnarray} (2.5) where $$\theta=\{\beta_{1},\beta_{2},\gamma_{1},\gamma_{2},\omega_{1},\omega_{2},\Sigma_{\boldsymbol{\eta}\boldsymbol{u}}\}^{T}$$ are the model parameters of interest and $$\Sigma_{\boldsymbol{\eta}\boldsymbol{u}}$$ denotes the variance-covariance matrix of the random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$ (2.2). The pairwise composite likelihood readily enables analysis of data with large clusters of varying size. Furthermore, it is less complex and less computationally intensive than a high-dimensional full cluster likelihood. For parameter estimation we need the marginal pairwise composite likelihood where the unobserved random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$ have been integrated out. i.e., \begin{eqnarray*} \mathcal{L}_{M}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X}) &=&\int\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})p^{\theta}(\boldsymbol{\eta},\boldsymbol{u})\textrm{d}\boldsymbol{\eta}\textrm{d}\boldsymbol{u}, \end{eqnarray*} where $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})$$ is the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given covariates and the random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$, and $$p^{\theta}(\boldsymbol{\eta},\boldsymbol{u})$$ is the density of the random effects (2.2). Unfortunately, no closed form expression for the marginal likelihood exists. Yet, we can reduce the dimensionality of the integral. This is done by employing Bayes’ rule, utilising that $$p^{\theta}(\boldsymbol{\eta},\boldsymbol{u})=p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})p^{\theta}(\boldsymbol{u})$$, changing the order of integration and integrating with respect to $$\boldsymbol{\eta}$$ over the conditional density $$p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})$$. A closed form expression for the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given covariates and the random effects in $$\boldsymbol{u}$$, i.e. $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})=\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})\textrm{d}\boldsymbol{\eta}$$, exists. Thus, \begin{eqnarray}\label{eq:marlik} \mathcal{L}_{M}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X}) =\int\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})p^{\theta}(\boldsymbol{u})\textrm{d}\boldsymbol{\eta}\textrm{d}\boldsymbol{u} =\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})p^{\theta}(\boldsymbol{u})\textrm{d}\boldsymbol{u}. \quad \end{eqnarray} (2.6) In the following, we show how the expression for $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})$$ is derived. For ease of notation, we ignore the cluster subscript $$i$$. The likelihood contribution of the pair $$j,k$$ to the pairwise composite likelihood (2.5) is given as \begin{eqnarray}\label{eq:complik_det} \mathcal{L}_{jk}(\theta;T_{j},\epsilon_{j},X_{j},T_{k},\epsilon_{k},X_{k},\boldsymbol{\eta},\boldsymbol{u})&=&\left[\prod_{h=1}^{2} \prod_{l=1}^{2}\left\{f_{h}^{\theta}(T_{j}|X_{j},\eta_{h},\boldsymbol{u})f_{l}^{\theta}(T_{k}|X_{k},\eta_{l},\boldsymbol{u})\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times\left[\prod_{h=1}^{2}\left\{ f_{h}^{\theta}(T_{j}|X_{j},\eta_{h},\boldsymbol{u})S^{\theta}(T_{k}|X_{k},\boldsymbol{\eta},\boldsymbol{u})\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=0)}\right]\nonumber\\ &&\times\left[\prod_{l=1}^{2} \left\{S^{\theta}(T_{j}|X_{j},\boldsymbol{\eta},\boldsymbol{u})f_{l}^{\theta}(T_{k}|X_{k},\eta_{l},\boldsymbol{u})\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times \left[\left\{S^{\theta}(T_{j}|X_{j},\boldsymbol{\eta},\boldsymbol{u})S^{\theta}(T_{k}|X_{k},\boldsymbol{\eta},\boldsymbol{u})\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=0)}\right], \end{eqnarray} (2.7) where \begin{eqnarray*} f_{h}^{\theta}(t|X,\eta_{h},\boldsymbol{u})=\frac{\partial F_{h}^{\theta}(t|X,\eta_{h},\boldsymbol{u})}{\partial t}=\pi_{h}^{\theta}(X,\boldsymbol{u})\frac{d\alpha_{h}^{\theta}\{g(t)\}}{dt}\phi\left[\alpha_{h}^{\theta}\{g(t)\}-X^{T}\gamma_{h}-\eta_{h}\right], \end{eqnarray*} $$\phi$$ being the probability density function of the standard univariate normal distribution and $$S^{\theta}(t{\,|\,}X,\boldsymbol{\eta},\boldsymbol{u})=1-\sum_{h=1}^{2}F_{h}^{\theta}(t|X,\eta_{h},\boldsymbol{u})$$. In the following, let $$\alpha_{h}^{*}(t)=\alpha_{h}^{\theta}\{g(t)\}$$ and $$\alpha_{h}^{*\boldsymbol{\prime}}(t)=d\alpha_{h}^{\theta}\{g(t)\}/dt$$. The first term in the likelihood (2.7) equals the likelihood contribution from pairs where both individuals experience failure (either cause). We have that \begin{eqnarray*} f_{h}^{\theta}(t|X_{j},\eta_{h},\boldsymbol{u})f_{l}^{\theta}(s|X_{k},\eta_{l},\boldsymbol{u})&=& \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\} \nonumber \\ &&\times \pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{l}^{*\boldsymbol{\prime}}(s)\phi\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}. \qquad \end{eqnarray*} The right-hand side of the expression above can be written as \begin{eqnarray*} \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\alpha_{l}^{*\boldsymbol{\prime}}(s)\boldsymbol{\phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}, \end{eqnarray*} where $$\boldsymbol{\phi}$$ is the probability density function of the standard bivariate normal distribution with correlation coefficient $$\rho=0$$. Let $$\boldsymbol{\eta}_{h,l}=\{\eta_{h},\eta_{l}\}$$ and let $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and $$\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ denote the mean and the variance-covariance matrix, respectively, of the conditional distribution of $$\boldsymbol{\eta}_{h,l}$$ given $$\boldsymbol{u}$$. To obtain an expression for the contribution to the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given the random effects in $$\boldsymbol{u}$$, $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})$$, we integrate with respect to $$\boldsymbol{\eta}_{h,l}$$ over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$, i.e. \begin{eqnarray*} \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\alpha_{l}^{*\boldsymbol{\prime}}(s)\int \boldsymbol{\phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\} p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})\textrm{d}\boldsymbol{\eta}_{h,l}. \end{eqnarray*} Since the random effects in $$\boldsymbol{\eta}_{h,l}$$ are normally distributed, the integral is straightforward to evaluate and simplifies to \begin{eqnarray*} \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\alpha_{l}^{*\boldsymbol{\prime}}(s)\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}\right\}, \end{eqnarray*} where $$\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}$$ is the probability density function of the bivariate normal distribution with mean $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and variance-covariance matrix $$I_{2}+\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$. The second and third term in the likelihood (2.7) are the contributions from pairs where one of the two individuals experience failure. As above, we integrate over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$ to obtain the contribution to the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given the random effects in $$\boldsymbol{u}$$. We only show the expression for the integral of the second term. The expression for the integral of the third term is equivalent as they are symmetrical. We have that \begin{eqnarray*} f_{h}^{\theta}(t|X_{j},\eta_{h},\boldsymbol{u})S^{\theta}(s|X_{k},\boldsymbol{\eta},\boldsymbol{u})&=& \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\} \nonumber \\ &&\times \left[1-\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\Phi\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}\right],\qquad \end{eqnarray*} where $$\Phi$$ is the cumulative distribution function of the standard univariate normal distribution. Utilising that $$\Phi(x)=\int_{-\infty}^{x}\phi(w)\textrm{d}w$$, we can write the right-hand side as \begin{eqnarray*} &&\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\} \\ &&\quad-\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\int_{-\infty}^{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}}\boldsymbol{\phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},w \right\}\textrm{d}w, \end{eqnarray*} where $$\boldsymbol{\phi}$$ is the probability density function of the standard bivariate normal distribution with correlation coefficient $$\rho=0$$. We then integrate with respect to $$\boldsymbol{\eta}_{h,l}$$ over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$ and obtain \begin{eqnarray*} &&\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi_{\eta_{h}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}\right\} \\ &&\quad -\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{l},\boldsymbol{u})\int_{-\infty}^{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}}\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h},w\right\}\textrm{d}w, \end{eqnarray*} where $$\phi_{\eta_{h}|\boldsymbol{u}}^{\theta}$$ is the probability density function of the univariate normal distribution with mean $$\mu_{\eta_{h}|\boldsymbol{u}}$$ and variance $$1+\sigma_{\eta_{h}|\boldsymbol{u}}^{2}$$, and $$\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}$$ is the probability density function of the bivariate normal distribution with mean $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and variance-covariance matrix $$I_{2}+\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$. The last term in the likelihood (2.7) is the contribution from pairs where neither experience failure. It can be written as \begin{eqnarray*} &&1-\sum_{h=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\Phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\}-\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\Phi\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}\\ &&\quad+\sum_{h=1}^{2}\sum_{l=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\boldsymbol{\Phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}, \end{eqnarray*} where $$\boldsymbol{\Phi}$$ is the cumulative distribution function of the standard bivariate normal distribution with correlation coefficient $$\rho=0$$. Following along the same lines as above, we integrate with respect to $$\boldsymbol{\eta}_{h,l}$$ over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$ and obtain \begin{eqnarray*} && 1-\sum_{h=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\Phi_{\eta_{h}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}\right\}-\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\Phi_{\eta_{l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}\right\}\\ &&\quad+\sum_{h=1}^{2}\sum_{l=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\boldsymbol{\Phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}\right\}, \end{eqnarray*} where $$\Phi_{\eta_{h}|\boldsymbol{u}}^{\theta}$$ is the cumulative distribution function of the univariate normal distribution with mean $$\mu_{\eta_{h}|\boldsymbol{u}}$$ and variance $$1+\sigma_{\eta_{h}|\boldsymbol{u}}^{2}$$, $$\Phi_{\eta_{l}|\boldsymbol{u}}^{\theta}$$ is the cumulative distribution function of the univariate normal distribution with mean $$\mu_{\eta_{l}|\boldsymbol{u}}$$ and variance $$1+\sigma_{\eta_{l}|\boldsymbol{u}}^{2}$$, and $$\boldsymbol{\Phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}$$ is the cumulative distribution function of the bivariate normal distribution with mean $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and variance-covariance matrix $$I_{2}+\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$. 2.3. Estimation Since no closed form expression for the marginal likelihood with respect to the random effects in $$\boldsymbol{u}$$ (2.6) exists, we propose that adaptive Gaussian quadrature (AGQ) approximation is used in parameter estimation. In AGQ approximation, Gaussian quadrature rules are used to approximate the integral by a weighted average of the integrand evaluated at $$Q$$ predetermined quadrature points. As the random effects are normally distributed, we use Gauss-Hermite quadrature rules (Pinheiro and Chao, 2006; Hartzel and others, 2001). The accuracy of the AGQ approximation increases with the number of quadrature points. Yet, so does the computational intensity. This is because the $$Q$$ quadrature points are used to estimate the AGQ approximation of each pair in the pairwise composite likelihood. Usually, five or less quadrature points are adequate but it depends on the likelihood (Pinheiro and Bates, 1995). As the approximation of the derivatives may be less precise than the approximation of the marginal likelihood (Lesaffre and Spiessens, 2001), we propose numerical derivatives for estimation of the variance of the parameter estimates. Furthermore, as the likelihood contributions to the composite likelihood (2.5) from pairs within the same cluster are not independent, the Fisher information needs to be substituted by the so-called sandwich estimator when estimating the variance of the parameter estimates (Varin and others, 2011). Let $$\mathcal{U}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$ denote the numerically derived score function of the marginal log-likelihood and $$H(\hat{\theta})=\frac{\partial}{\partial\theta} \mathcal{U}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$. And let $$V(\hat{\theta})=\sum_{i}^{n}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})^{T}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$ where $$\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})=\sum_{j=1}^{n_{i}-1}\sum_{k=j+1}^{n_{i}}\mathcal{U}_{i,jk}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$, i.e. the score contribution from all pairs in the $$i$$th family. The variance of the parameter estimates can be estimated by $$\{-H(\hat{\theta})\}^{-1}V(\hat{\theta})\{-H(\hat{\theta})\}^{-1}$$. 2.4. Left-truncation and right-censoring Under the proposed model, all cause-specific cumulative incidence functions are specified. As a consequence, left-truncation and right-censoring are easily dealt with. The proposed model readily adjusts for right-censoring by means of the second, third and fourth term of the likelihood (2.7). With regard to left-truncation, let $$V_{ij}$$ denote the truncation time of the $$j$$th individual in the $$i$$th cluster. The truncation probability of the pair $$j,k$$ within the $$i$$th cluster equals the joint survival at time $$(V_{ij},V_{ik})$$, i.e. $$S^{\theta}(V_{ij},V_{ik}|X_{ij},X_{ik})$$. When adjusting for left-truncation, the likehood contribution of the pair $$j,k$$ is given as (ignoring the cluster subscript $$i$$) \begin{eqnarray*} \mathcal{L}_{\textrm{T},jk}(\theta;T_{j},\epsilon_{j},V_{j},X_{j},T_{k},\epsilon_{k},V_{k},X_{k})&=&\left[\prod_{h=1}^{2} \prod_{l=1}^{2}\left\{\frac{f_{hl}^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times\left[\prod_{h=1}^{2}\left\{\frac{ f_{h0}^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=0)}\right]\nonumber\\ &&\times\left[\prod_{l=1}^{2} \left\{\frac{f_{0l}^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times \left[\left\{\frac{S^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=0)}\right], \end{eqnarray*} where $$f_{hl}^{\theta}(T_{j},T_{k}|X_{j},X_{k})=\mathbb{P}(T_{j}=t,\epsilon_{j}=h,T_{k}=s,\epsilon_{k}=l|X_{j},X_{k})$$, $$f_{h0}^{\theta}(T_{j},T_{k}|X_{j},X_{k})=\mathbb{P}(T_{j}=t,\epsilon_{j}=h,T_{k}>s|X_{j},X_{k})$$ and $$f_{0l}^{\theta}(T_{j},T_{k}|X_{j},X_{k})=\mathbb{P}(T_{j}>t,T_{k}=s,\epsilon_{k}=l|X_{j},X_{k})$$. 2.5. Sampling In our analysis of the Danish register-based family data on breast cancer presented in Section 4, we analyze a type of case-cohort sample of all available families. This is because of the size of the cohort (1 292 051 families) and the computational intensity involved in parameter estimation using AGQ approximation. The applied sampling approach follows the method presented by Moger and others (2008). We divide the clusters into $$G$$ disjoint strata and then sample $$m_{g}$$ clusters from stratum $$g$$. Sampling is done without replacement. Under the applied sampling approach, the estimator of the pairwise composite log-likelihood is given as (Kalbfleisch and Lawless, 1988) \begin{eqnarray}\label{eq:stratll} \log\mathcal{L}_{G}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X})=\sum_{g=1}^{G}\frac{n_{g}}{m_{g}} \sum_{i \in D_{g}}\log\mathcal{L}_{i}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X}), \end{eqnarray} (2.8) where $$\log\mathcal{L}_{i}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X})=\sum_{j=1}^{n_{i}-1} \sum_{k=j+1}^{n_{1}}\log\mathcal{L}_{i,jk}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X})$$ is the log-likelihood contribution to the pairwise composite log-likelihood from the $$i$$th cluster, $$n_{g}$$ is the number of clusters in stratum $$g$$ and $$D_{g}$$ is the set of clusters sampled from stratum $$g$$. Due to the sampling, we cannot estimate the variance of the parameters as described in Section 2.3. Instead, the following approach is applied (Kalbfleisch and Lawless, 1988). The score function of the log-likelihood in (2.8) can be written as \begin{eqnarray*} \mathcal{U}_{G}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta)&=&\sum_{g=1}^{G}\frac{n_{g}}{m_{g}} \sum_{i \in D_{g}}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta)\\ &=&\sum_{i=1}^{n} \mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta)+\sum_{g=1}^{G}\left(\frac{n_{g}}{m_{g}}-1\right)\sum_{i \in D_{g}} \mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta). \end{eqnarray*} Because sampling is done without replacement, families sampled from a given stratum are not independent. In this case, the variance of the parameter estimates are calculated by means of the sandwich estimator $$H_{G}(\hat{\theta})^{-1}\left\{V_{G}(\hat{\theta})+V_{G}^{*}(\hat{\theta})\right\}H_{G}(\hat{\theta})^{-1}$$, where \begin{eqnarray*} H_{G}(\hat{\theta})&=&\sum_{g=1}^{G}\frac{n_{g}}{m_{g}}\sum_{i \in D_{g}}\frac{\partial}{\partial\theta} \mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta}), \quad V_{G}(\hat{\theta})=\sum_{g=1}^{G}\frac{n_{g}}{m_{g}}\sum_{i \in D_{g}}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})^{\otimes2} \quad \textrm{and} \quad\\ V_{G}^{*}(\hat{\theta})&=&\sum_{g=1}^{G}\frac{n_{g}(n_{g}-m_{g})}{m_{g}^{2}}\sum_{i\in D_{g}}\left\{\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})-\bar{\mathcal{U}}_{g}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})\right\}^{\otimes2} \end{eqnarray*} with $$\bar{\mathcal{U}}_{g}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$ being the estimated average value of the score in stratum $$g$$. 2.6. Extension of random effects structure We focus on a random effects structure, where the random effects are shared by all individuals within a cluster. However, it is possible to extend the random effects structure to allow for different levels of dependence within a cluster. This may be relevant in analysis of family studies. In family studies, the dependence patterns between different types of relatives (siblings, parent-offspring, etc.) may differ. One way of taking this into account is by decomposing the random effect into a genetic effect and a shared environmental effect following the work of Fisher (1918). Typically, it is assumed that the genetic effect and the shared environmental effect are independent (Khoury and others, 1993; Thomas, 2004). As a result, the variance of the random effect may be decomposed into the corresponding separate components of variance. These variance components can be estimated from the covariance between different types of relatives (Khoury and others, 1993; Thomas, 2004). Under the proposed model, several random effects are involved. In theory all of these can be decomposed, however specification of the covariance of decomposed cross-cause random effects is challenging and the model quickly becomes large and the interpretation of the parameters complicated. Consequently, one may choose to decompose only a few of the random effects. For example, the random effects affecting the failure cause of primary interest. 3. Simulation studies Simulation studies have been used to assess the finite-sample properties of the proposed model and the estimation procedure. We simulated multivariate competing risks data similar to the Danish register-based family data on breast cancer analyzed in Section 4. We simulated 1000 populations of 50 000 clusters of size three. The number of competing causes of failure was two, which gave rise to four cluster-specific random effects. We ran simulations for two different scenarios: (1) a scenario with positive covariance between all random effects and (2) a scenario with zero covariance between the random effects of the two competing causes of failure. The variance-covariance matrices of the random effects in the two scenarios were given as \begin{gather*} \Sigma_{\boldsymbol{\eta}\boldsymbol{u},1}= \begin{pmatrix} 1 & 0.4 & 0.5 & 0.4 \\ & 1 & 0.4 & 0.3 \\ & & 1 & 0.4 \\ & & & 1 \\ \end{pmatrix} \qquad \textrm{and} \qquad \Sigma_{\boldsymbol{\eta}\boldsymbol{u},2}= \begin{pmatrix} 1 & 0.0 & 0.5 & 0.0 \\ & 1 & 0.0 & 0.5 \\ & & 1 & 0.0 \\ & & & 1 \\ \end{pmatrix}. \end{gather*} We used $$X=1$$ for all individuals and set $$\beta_{1}=-1.9$$, $$\beta_{2}=-0.2$$, $$\gamma_{1}=1$$, $$\gamma_{2}=1.2$$, and $$\alpha_{1}(x)=\omega_{1}x$$ and $$\alpha_{2}(x)=\omega_{2}x$$ with $$\omega_{1}=3$$ and $$\omega_{2}=2$$, respectively. The cause of failure was simulated from a multinomial distribution. The probability of failure due to cause $$h$$ was given as $$p_{h}=\exp(X\beta_{h}+u_{h})/\{1+\exp(X\beta_{1}+u_{1})+\exp(X\beta_{2}+u_{2})\}$$ and the probability of no failure at the end of follow-up (censored at end of follow-up) was given as $$p_{0}=1-p_{1}-p_{2}$$. For failure cause 1 and 2, a random number $$\zeta$$ from the standard uniform distribution $$U(0,1)$$ was sampled and the failure time found by isolating $$t$$ in the expression $$\zeta=\Phi\left[\alpha_{h}\{g(t)\}-X\gamma_{h}-\eta_{h}\right]$$. We set $$\delta=80$$. Any failure at time $$\delta$$ was censored. In addition, we let 80% of the simulated data be subject to censoring in the interval $$[0,\delta]$$. Censoring times were drawn from the uniform distribution $$U(0,\delta+30)$$. This resulted in approximately 50% censoring. The clusters were divided into six disjoint strata in a stepwise manner according to pair types within each cluster following the same procedure that was applied in the analysis of the register-based data presented in Section 4. See Figure 2 for an illustration of the stratification procedure. The sampling probabilities of the clusters in the six strata were 1, 0.6, 0.1, 0.05, 0.005, and 0.02, respectively. Sampling was done without replacement. The average distribution of pair types in the simulated populations and analyzed samples is shown in Appendix A of the supplementary material available at Biostatistics online. Parameter estimation was based on the weighted log-likelihood (2.8) and the variance of the parameter estimates was calculated as described in Section 2.5. We used AGQ approximation with five quadrature points in parameter estimation. All computations were performed using the statistical software R (R Core Team, 2017). Fig. 2. View largeDownload slide Stratification procedure applied in the data example. The same procedure is used in the simulations though the numbers differ. Fig. 2. View largeDownload slide Stratification procedure applied in the data example. The same procedure is used in the simulations though the numbers differ. The results from the simulation studies are shown in Table 1. Overall, the model performed well, the parameter estimates were unbiased and the coverage rates were good. However, the averages of the standard errors were too large compared to the standard deviations of the parameter estimates. Further investigation revealed that a number of large values distorted the mean for all parameters. Consequently, we have chosen to report the median of the standard errors in Table 1. We suspect that the irregularities were caused by the numerical derivative of the AGQ approximation that was used for estimation of the variance of the parameter estimates or the number of quadrature points. Table 1. Results from the two simulation studies. Shown are the average of estimates (Av.), the standard deviations (SD), the median of the standard errors $$($$Median SE$$)$$, and the coverage probabilities of the 95% confidence interval $$($$CI$$)$$ True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 Table 1. Results from the two simulation studies. Shown are the average of estimates (Av.), the standard deviations (SD), the median of the standard errors $$($$Median SE$$)$$, and the coverage probabilities of the 95% confidence interval $$($$CI$$)$$ True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 4. Data example We have applied the proposed model in analysis of Danish register-based family data on breast cancer among women. Breast cancer is the most common malignancy in women (McPherson and others, 2000) and it is well-established that women with a family history of the disease have an increased risk, especially women with a family history of early disease onset (Claus and others, 1990; Brandt and others, 2010). The aim of the analysis was to investigate the cumulative incidence of breast cancer and the dependence between mothers and daughters. Hence, we did not look at the dependence between sisters. The study cohort was formed by merging data from the Danish Civil Registration System (Pedersen, 2011) and data from the Danish Cancer Registry (Gjerstorff, 2011) by means of the unique personal identifier, which all Danish residents have received at birth or when obtaining residential permit since 1968. Information on death, disappearance and emigration as well as parental information is contained within the Danish Civil Registration System. The study cohort consisted of mothers and up to three daughters born between 1 January 1960 and 31 December 2012. If the mother had more than three daughters born in that time period, the three oldest were included. Women born before 1968 had to be alive in 1968 to be included. This applied to both mothers and daughters. Start of follow-up was 1 January 1968 or at date of birth, whichever came last. Accordingly, data were subject to left-truncation. End of follow-up was date of breast cancer diagnosis, date of other cancer diagnosis, date of emigration or disappearance, date of 80th birthday, date of death or 31 December 2013, whichever came first. Age was used as timescale. The failure cause of interest was breast cancer and competing causes of failures were death and other cancers, which were grouped together. The cohort consisted of 1 292 051 families and 3 029 653 individuals: 908 002 (70.3%) families with a mother and a single daughter, 322 547 (25.0%) families with a mother and two daughters, and 61 502 (4.8%) families with a mother and three daughters. Due to the size of the cohort and the computational intensity involved in parameter estimation using AGQ approximation, we analyzed a sample of the cohort of families. Our sampling followed the method described in Section 2.5. Based on the occurrence of breast cancer and the competing cause of failure, we divided the families of mothers and daughters into six disjoint strata. The strata were formed in a stepwise manner as illustrated in Figure 2. The sampling probabilities of the families in the six strata were 1, 0.5, 0.25, 0.05, 0.025, and 0.0025, respectively. In estimation, we used AGQ approximation with five quadrature points. We chose $$\alpha_{1}(x)=\omega_{1}x$$ and $$\alpha_{2}(x)=\omega_{2}x$$, respectively, with the restriction $$\omega_{1},\omega_{2}\geq0$$. We adjusted for left-truncation and set $$X=1$$ for all individuals. Parameter estimation was based on the weighted log-likelihood (2.8) and the variance of the parameter estimates was calculated as described in Section 2.5. In Appendix B of the supplementary material available at Biostatistics online, the cohort is summarised and the distribution of pair types entering the model shown. The results are shown in Table 2. We found within-cluster dependence of breast cancer with regard to both risk and timing. The variance of the random effect affecting timing was 0.07 [95% confidence interval (CI) 0.01; 0.35]. The variance of the random effect affecting risk of breast cancer was 1.16 (95% CI 0.60; 2.23). The variance of the random effect affecting risk level was quite high, indicating a great risk heterogeneity of breast cancer among families. The estimated covariances between the random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$ were insignificant except for the covariance between $$\eta_{1}$$ and $$u_{1}$$. The two random effects $$\eta_{1}$$ and $$u_{1}$$ were negatively correlated. Their covariance was $$-0.27$$ (95% CI $$-0.41$$; $$-0.12$$), corresponding to a correlation of approximately $$-0.95$$. This corresponds to an increased absolute risk of breast cancer that is reached quickly and vice versa. We did a visual comparison of the cumulative incidence of breast cancer based on the parameter estimates to that of the non-parametric Aalen–Johansen estimator (Aalen and Johansen, 1978), which is shown in Appendix B of the supplementary material available at Biostatistics online. Overall, the estimates agreed. We expect that had we used a more flexible form of $$\alpha_{1}(x)$$, they would have been more alike. The estimated marginal cumulative incidence of breast cancer reached at age 80 was 9.3% (95% CI 8.7; 9.9). Based on the parameter estimates, we calculated the marginal cumulative incidence of breast cancer as well as the casewise concordance, i.e. the cumulative incidence of breast cancer at a given age given that a mother or a daughter has been diagnosed with breast cancer before or at the same age, and the relative recurrence risk ratio, i.e. the ratio between the casewise concordance and the marginal cumulative incidence of breast cancer. These are shown in Figure 3A–C. The estimated casewise concordance and relative recurrence risk ratio of breast cancer were in concordance with that of dizygotic female twins estimated in a recent study (Møller and others, 2016). Furthermore, to investigate the variation in risk level as well as failure time distribution of breast cancer caused by the estimated variance and covariance of the random effects in our model, we drew a sample of 1 000 000 observations from the estimated multivariate normal distribution of $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$. The risk level of breast cancer at age 80 for each of the observations was estimated as $$p_{1}=\exp(X\hat{\beta}_{1}+u_{1})/\{1+\exp(X\hat{\beta}_{1}+u_{1})+\exp(X\hat{\beta}_{2}+u_{2})\}$$. The mean of the estimated probabilities was 9.3% corresponding to the estimated marginal cumulative incidence at age 80. In addition, 95% of the probabilities were within the interval $$[1.3; 29.9]$$. For each observation, we estimated the cumulative incidence in the interval $$[0; 80]$$ using the proposed model. Based on the estimated risk levels of breast cancer at age 80, we divided our observations into quartiles and calculated for each quartile the average cumulative incidence in the interval $$[0; 80]$$. These are shown in Figure 3D. To compare the failure time distributions of the quartiles, we identified the time points where the cumulative incidences were equal to half of the risk levels reached at age 80. These are marked with a black dot in Figure 3D. As seen, a high absolute risk of breast cancer at age 80 was associated with an earlier onset, which resulted from the negative correlation between $$\eta_{1}$$ and $$u_{1}$$. Fig. 3. View largeDownload slide The estimated marginal cumulative incidence of breast cancer (A), the estimated casewise concordance (B) and the relative recurrence risk ratio (C) as well as the average cumulative incidence of 1 000 000 observations simulated from a model based on the parameter estimates (D). The observations in plot D have been divided into quartiles based on the estimated risk levels reached at age 80. The black dots mark the time points where the cumulative incidences equal half of the risk levels reached at age 80. In plots A–C, ($$\Huge {-}$$) corresponds to the estimates and (− − − −) corresponds to 95% confidence intervals. In plot D, ($$\Huge {-}$$) corresponds to the 1st quartile of the estimated risk level, (− − − −) corresponds to the 2nd quartile of the estimated risk level, ($$\cdots\cdots\cdots$$) corresponds to the 3rd quartile of the estimated risk level, and ($$-\cdot -\cdot -\cdot$$) corresponds to the 4th quartile of the estimated risk level. Fig. 3. View largeDownload slide The estimated marginal cumulative incidence of breast cancer (A), the estimated casewise concordance (B) and the relative recurrence risk ratio (C) as well as the average cumulative incidence of 1 000 000 observations simulated from a model based on the parameter estimates (D). The observations in plot D have been divided into quartiles based on the estimated risk levels reached at age 80. The black dots mark the time points where the cumulative incidences equal half of the risk levels reached at age 80. In plots A–C, ($$\Huge {-}$$) corresponds to the estimates and (− − − −) corresponds to 95% confidence intervals. In plot D, ($$\Huge {-}$$) corresponds to the 1st quartile of the estimated risk level, (− − − −) corresponds to the 2nd quartile of the estimated risk level, ($$\cdots\cdots\cdots$$) corresponds to the 3rd quartile of the estimated risk level, and ($$-\cdot -\cdot -\cdot$$) corresponds to the 4th quartile of the estimated risk level. Table 2. Results from the analysis of data on breast cancer among Danish women. Shown are the estimates and the lower and upper limits of the 95% CIs. Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ Table 2. Results from the analysis of data on breast cancer among Danish women. Shown are the estimates and the lower and upper limits of the 95% CIs. Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ 5. Discussion We have proposed a new model for the cause-specific cumulative incidence of multivariate competing risks data. The model is a random effects model that allows for within-cluster dependence with regard to both risk and timing. We have assumed that the random effects are shared by all individuals in a cluster. Consequently, the model in its present form only captures positive dependence. However, extensions to different settings with different random effects structures are possible. Moreover, we have assumed that the random effects are constant over time but they may time-dependent. This is another extension that could be considered. Under the proposed model, all cause-specific and cross-cause associations are specified. Thus, left-truncation and right-censoring are easily dealt and no specification of a censoring distribution is required. In addition, the model has the potential to be applied in analysis of family studies focusing on coaggregation of diseases, e.g. breast and ovarian cancer. The proposed model calls for parameter estimation by numerical integration since no closed form expression for the marginal likelihood exists. We propose that AGQ approximation is used. In the presented simulations and data example, we used Gauss-Hermite quadrature rules with five quadrature points. We have assessed the proposed model through simulation studies. The model performed well and the parameter estimates were unbiased. However, a number of the estimated standard errors were too big. This may be caused by the use of the numerical derivative of the AGQ approximation in the estimation of the variance of the parameter estimates or the number of quadrature points and is being further investigated. The proposed model provides a framework for exploring and making inference about the distribution of age at disease onset and to investigate how absolute risk of disease is related to age at onset. We have used the proposed model to investigate the cumulative risk of breast cancer and the within-family dependence among women in Denmark. We found a within-family dependence with regard to both absolute risk and failure time distribution. The two were negatively correlated suggesting that an increased absolute risk was associated with an early onset and vice versa. We found no significant association between breast cancer and the competing cause of failure which included death and other cancers. However, these dependence measures may be difficult to identify. In general, development of procedures for model checking and selection is needed. Statistical tests for model selection may be based on the composite likelihood; in this respect, it is important to note that statistical tests concerning the variance of the random effects, i.e. $$\sigma^{2}_{\eta_{h}}=0$$ and $$\sigma^{2}_{u_{h}}=0$$, are on the boundary of the parameter space. This should be taken into account. For example, following along the lines of Self and Liang (1987), Liang and Self (1996), or Greven and others (2008). If the variance of any of the random effects in $$\boldsymbol{u}$$ is equal to zero, the model becomes much simpler and parameter estimation easier. If $$\sigma_{u_1}^{2}=\sigma_{u_2}^{2}=0$$, parameter estimation can be done without the use of numerical methods. 6. Code Availability The proposed model has been implemented in R (R Core Team, 2017) in the package mcif, which is available for download at https://github.com/kkholst. An example of how to apply the code is provided in Appendix C of the supplementary material available at Biostatistics online. Supplementary material Supplementary material is available at http://biostatistics.oxfordjournals.org. Acknowledgments This work was supported by the Danish Childhood Cancer Foundation (grant number 2011-46), the Danish Health Foundation (grant number 20128011), and the Danish Agency for Science, Technology and Innovation (grant number 1355-00053B). Conflict of Interest: None declared. References Aalen O. O. and Johansen S. ( 1978 ). An empirical transition matrix for non-homogeneous markov chains based on censored observations. Scandinavian Journal of Statistics 5 , 141 – 150 . Andersen P. K. , Geskus R. B. , de Witte T. and Putter H. ( 2012 ). Competing risks in epidemiology: possibilities and pitfalls. International Journal of Epidemiology 31 , 861 – 870 . Google Scholar CrossRef Search ADS Bandeen-Roche K. ( 2014 ). Familial studies. In: Klein J. P. , van Houwelingen H. C. , Ibrahim J. G. and Scheike T. H. (editors), Handbook of Survival Analysis , Chapter 27 . Boca Raton : Chapman & Hall/CRC , pp. 549 – 568 . Brandt A. , Bermejo J. L. , Sundquist J. and Hemminki K. ( 2008 ). Age of onset in familial cancer. Annals of Oncology 19 , 2084 – 2088 . Google Scholar CrossRef Search ADS PubMed Brandt A. , Bermejo J. L. , Sundquist J. and Hemminki K. ( 2010 ). Age of onset in familial breast cancer as background data for medical surveillance. British Journal of Cancer 102 , 42 – 47 . Google Scholar CrossRef Search ADS PubMed Burton Paul R. , Tobin Martin D. and Hopper John L. ( 2005 ). Key concepts in genetic epidemiology. Lancet 366 , 941 – 951 . Google Scholar CrossRef Search ADS PubMed Cheng Y. and Fine J. ( 2012 ). Cumulative incidence association models for bivariate competing risks data. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 74 , 183 – 202 . Google Scholar CrossRef Search ADS Cheng Y. , Fine J. P. and Kosorok M. R. ( 2007 ). Nonparametric association analysis of bivariate competing-risks data. Journal of the American Statistical Association 102 , 1407 – 1415 . Google Scholar CrossRef Search ADS Cheng Y. , Fine J. P. and Kosorok M. R. ( 2009 ). Nonparametric association analysis of exchangeable clustered competing risks data. Biometrics 65 , 385 – 393 . Google Scholar CrossRef Search ADS PubMed Claus E. B. , Risch N. J. and Thompson W. D. ( 1990 ). Age at onset as an indicator of familial risk of breast cancer. American Journal of Epidemiology 131 , 961 – 972 . Google Scholar CrossRef Search ADS PubMed Cox D. R. and Reid N. ( 2004 ). A note on pseudolikelihood constructed from marginal densities. Biometrika 91 , 729 – 737 . Google Scholar CrossRef Search ADS Fisher R. A. ( 1918 ). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh 52 , 399 – 433 . Google Scholar CrossRef Search ADS Gerds T. A. , Scheike T. H. and Andersen P. K. ( 2012 ). Absolute risk regression for competing risks: interpretation, link functions, and prediction. Statistics in Medicine 31 , 3921 – 3930 . Google Scholar CrossRef Search ADS PubMed Gjerstorff M. L. ( 2011 ). The Danish Cancer Registry. Scandinavian Journal of Public Health 39 , 42 – 45 . Google Scholar CrossRef Search ADS PubMed Greven S. , Crainiceanu C. M. , Küchenhoff H. and Peters A. ( 2008 ). Restricted likelihood ratio testing for zero variance components in linear mixed models. Journal of Computational and Graphical Statistics 17 , 870 – 891 . Google Scholar CrossRef Search ADS Hartzel J. , Agresti A. and Caffo B. ( 2001 ). Multinomial logit random effects models. Statistical Modelling 1 , 81 – 102 . Google Scholar CrossRef Search ADS Kalbfleisch J. D. and Lawless J. F. ( 1988 ). Likelihood analysis of multi-state models for disease incidence and mortality. Statistics in Medicine 7 , 149 – 160 . Google Scholar CrossRef Search ADS PubMed Kharazmi E. , Fallah M. , Sundquist K. and Hemminki K. ( 2012 ). Familial risk of early and late onset cancer: nationwide prospective cohort study. BMJ 345 , 1 – 8 . Google Scholar CrossRef Search ADS Khoury M. J. , Beaty T. H. and Cohen B. H. ( 1993 ). Fundamentals of Genetic Epidemiology . New York : Oxford University Press . Kuk A. Y.C. ( 1992 ). A semiparametric mixture model for the analysis of competing risks data. Australian Journal of Statistics 34 , 169 – 180 . Google Scholar CrossRef Search ADS Larsen K. , Petersen J. H. , Budtz-Jørgensen E. and Endahl L. ( 2000 ). Interpreting parameters in the logistic regression model with random effects. Biometrics 56 , 909 – 914 . Google Scholar CrossRef Search ADS PubMed Larson M. G. and Dinse G. E. ( 1985 ). A mixture model for the regression analysis of competing risks data. Journal of the Royal Statistical Society. Series C (Applied Statistics) 34 , 201 – 211 . Lesaffre E. and Spiessens B. ( 2001 ). On the effect of the number of quadrature points in a logistic random-effects model: an example. Applied Statistics 50 , 325 – 335 . Liang K.-Y. and Self S. G. ( 1996 ). On the asymptotic behaviour of the pseudolikelihood ratio test statistic. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 58 , 785 – 796 . McPherson K. , Steel C. M. and Dixon J. M. ( 2000 ). ABC of breast diseases. Breast cancer–epidemiology, risk factors, and genetics. BMJ 321 , 624 – 628 . Google Scholar CrossRef Search ADS PubMed Moger T. A. , Pawitan Y. and Borgan Ø. ( 2008 ). Case-cohort methods for survival data on families from routine registers. Statistics in Medicine 27 , 1062 – 1074 . Google Scholar CrossRef Search ADS PubMed Møller S. , Mucci L. A. , Harris J. R. , Scheike T. , Holst K. , Halekoh U. , Adami H.-O. , Czene K. , Christensen K. , Holm N. V. , and others. ( 2016 ). The heritability of breast cancer among women in the nordic twin study of cancer. Cancer Epidemiology Biomarkers & Prevention 25 , 145 – 150 . Google Scholar CrossRef Search ADS Naskar M. , Das K. and Ibrahim J. G. ( 2005 ). A semiparametric mixture model for analyzing clustered competing risks data. Biometrics 61 , 729 – 737 . Google Scholar CrossRef Search ADS PubMed Pedersen C. B. ( 2011 ). The Danish Civil Registration System. Scandinavian Journal of Public Health 39 , 22 – 25 . Google Scholar CrossRef Search ADS PubMed Petrucelli N. , Daly M. B. and Feldman G. L. ( 2010 ). Hereditary breast and ovarian cancer due to mutations in BRCA1 and BRCA2. Genetics in Medicine 12 , 245 – 259 . Google Scholar CrossRef Search ADS PubMed Pinheiro J. C. and Bates D. M. ( 1995 ). Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics 4 , 12 – 35 . Pinheiro J. C. and Chao E. C. ( 2006 ). Efficient Laplacian and adaptive Gaussian quadrature algorithms for multilevel generalized linear mixed models. Journal of Computational and Graphical Statistics 15 , 58 – 81 . Google Scholar CrossRef Search ADS R Core Team. ( 2017 ). R: A Language and Environment for Statistical Computing . R Foundation for Statistical Computing , Vienna, Austria . Scheike T. H. and Sun Y. ( 2012 ). On cross-odds ratio for multivariate competing risks data. Biostatistics 13 , 680 – 694 . Google Scholar CrossRef Search ADS PubMed Scheike T. H. , Sun Y. , Zhang M.-J. and Jensen T. K. ( 2010 ). A semiparametric random effects model for multivariate competing risks data. Biometrika 97 , 133 – 145 . Google Scholar CrossRef Search ADS PubMed Self S. G. and Liang K.-Y. ( 1987 ). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 82 , 605 – 610 . Google Scholar CrossRef Search ADS Shi H. , Cheng Y. and Jeong J.-H. ( 2013 ). Constrained parametric model for simultaneous inference of two cumulative incidence functions. Biometrical Journal 55 , 82 – 96 . Google Scholar CrossRef Search ADS PubMed Shih J. H. and Albert P. S. ( 2010 ). Modeling familial association of ages at onset of disease in the presence of competing risk. Biometrics 66 , 1012 – 1023 . Google Scholar CrossRef Search ADS PubMed Thomas D. C. ( 2004 ). Statistical methods in genetic epidemiology . New York : Oxford University Press . Varin C. , Reid N. and Firth D. ( 2011 ). An overview of composite likelihood methods. Statistica Sinica 21 , 5 – 42 . © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Biostatistics Oxford University Press

# Modeling the cumulative incidence function of multivariate competing risks data allowing for within-cluster dependence of risk and timing

, Volume Advance Article – Jan 4, 2018
19 pages

/lp/ou_press/modeling-the-cumulative-incidence-function-of-multivariate-competing-DyA4TE2Va1
Publisher
Oxford University Press
ISSN
1465-4644
eISSN
1468-4357
D.O.I.
10.1093/biostatistics/kxx072
Publisher site
See Article on Publisher Site

### Abstract

Summary We propose to model the cause-specific cumulative incidence function of multivariate competing risks data using a random effects model that allows for within-cluster dependence of both risk and timing. The model contains parameters that makes it possible to assess how the two are connected, e.g. if high-risk is related to early onset. Under the proposed model, the cumulative incidences of all failure causes are modeled and all cause-specific and cross-cause associations specified. Consequently, left-truncation and right-censoring are easily dealt with. The proposed model is assessed using simulation studies and applied in analysis of Danish register-based family data on breast cancer. 1. Introduction Multivariate competing risks data arise often in clinical and epidemiological research, for example in multicenter clinical trials, i.e. clinical trials with participants from more than one medical center, and in family studies examining disease occurrence in related individuals. Study subjects within the same cluster (e.g. medical center or family) cannot be assumed independent and one must take this within-cluster dependence into account when analyzing the data. Focusing on family studies, the within-cluster dependence, which is here a within-family dependence, is often the key point of interest or at least as important as determining the role of different risk factors (Bandeen-Roche, 2014). This is because the within-family dependence can be viewed as an expression of familial aggregation and may reflect both disease heritability and the impact of shared environmental effects (Burton and others, 2005; Bandeen-Roche, 2014). In the competing risks setting, one has the choice between analyzing the multivariate data on the hazard scale focusing on the cause-specific hazard or on the probability scale focusing on the cause-specific cumulative incidence. Both approaches are equally relevant and may complement each other (Andersen and others, 2012). However, in analysis of family studies, there is often a strong interest in describing age at disease onset including within-family dependence. See for example, Brandt and others (2008, 2010) and Kharazmi and others (2012). The distribution of age at disease onset is directly described by the cause-specific cumulative incidence, which is the focus of this article. We propose a model for the cumulative incidence, which provides a framework for exploring and making inference about the distribution of age at disease onset. For example, the model makes it possible to investigate how absolute risk of disease is related to age at onset. We find in our presented application of the model that breast cancer with an early onset is associated with a higher absolute risk, which is consistent with specific genetic risk profiles such as mutations in the BRCA1 and BRCA2 genes (Petrucelli and others, 2010). Thus, we quantify the dependence of both risk and timing for breast cancer and find that these are significantly related. Despite considerable interest, the number of available statistical models for assessing the cumulative incidence and the within-cluster dependence of multivariate competing risks data is limited. The within-cluster dependence of the cause-specific cumulative incidence can be assessed by direct modeling of different dependence measures. For example, the ratio between the bivariate cumulative incidence function and the product of the marginal counterparts (Cheng and others, 2007, 2009), the conditional cumulative incidence function (Shih and Albert, 2010) or the cross-odds ratio (Scheike and Sun, 2012). Alternatively, one can employ a frailty-based two-stage approach, where the marginal cumulative incidence functions are estimated in the first stage and a dependence parameter is estimated in the second stage using an Archimedean copula (Scheike and others, 2010; Cheng and Fine, 2012). The marginal cumulative incidence functions are not generally distribution functions. Consequently, the interpretation of the dependence parameter estimated in the two-stage approach as a type of Kendall’s $$\tau$$ does not apply (Scheike and others, 2010). However, under certain conditions the estimated dependence parameter is closely related to the cross-odds ratio (Scheike and others, 2010; Scheike and Sun, 2012). Under the two-stage approach, it is necessary to adjust for right-censoring. This is done through modeling of the censoring distribution and employment of inverse probability of censoring weights (IPCWs) (Scheike and others, 2010; Cheng and Fine, 2012). If the censoring distribution is misspecified, the weighting may introduce bias. In addition to the models described above, a semiparametric mixture model exists (Naskar and others, 2005). This model allows for within-cluster dependence with regard to timing but not absolute risk. We propose to model the cause-specific cumulative incidence function of multivariate competing risks data using a random effects model that allows for within-cluster dependence with regard to both risk and timing. That is, the model allows the risk level (in terms of absolute risk) and the failure time distribution to vary between clusters. This is in contrast to the model by Naskar and others (2005) mentioned above. Removal of the random effects from the proposed model gives a model that is similar to existing parametric and semiparametric mixture models for competing risks data such as Larson and Dinse (1985), Kuk (1992) and Shi and others (2013). Under the proposed model, the cumulative incidence functions of all causes are modeled. As a consequence, both left-truncation and right-censoring are easily dealt with. Thus, specification of censoring distributions is avoided. Generally, direct regression modeling of cause-specific cumulative incidence functions can be challenging since the sum of all predicted cause-specific cumulative incidences should be less than or equal to $$1$$. This can be difficult to enforce (Gerds and others, 2012). We employ a multinomial logistic regression model with random effects that ensures that the sum of the predicted cumulative incidences do not exceed $$1$$. In addition, application of a pairwise composite likelihood readily enables analysis of data with large clusters of varying size. Unfortunately, no closed form expression for the marginal likelihood exists. Yet, as we show the dimensionality of the integral can be markedly reduced by employing Bayes’ rule and changing the order of integration. The remainder of this article is structured as follows. The proposed model is described in Section 2 including a presentation of the likelihood function in Section 2.2. Simulation results are presented in Section 3. In Section 4, the model is used to analyze Danish register-based family data on breast cancer. A discussion in Section 5 ends the article. 2. The model Let $$n$$ be the number of clusters and $$n_i$$ the number of individuals within the $$i$$th cluster. Let $$T_{ij}^{*}$$, $$\epsilon^{*}_{ij}$$ and $$C_{ij}$$ denote the failure time, the cause of failure and the censoring time, respectively, of the $$j$$th individual in the $$i$$th cluster. For simplicity, we consider a situation with two competing causes of failure, where $$\epsilon^{*}_{ij}=1$$ is the failure cause of primary interest and $$\epsilon^{*}_{ij}=2$$ is the competing failure cause. However, the notation and model generalises to situations with more than two competing causes of failure. Let $$X_{ij}=(1,X_{ij,1},...,X_{ij,p})$$ be a vector of associated covariates. For simplicity, we let the vector $$X_{ij}$$ be the same for all causes of failure and allow the regression coefficients’ vectors to vary. The observed follow-up time of the $$j$$th individual in the $$i$$th cluster is $$T_{ij}=\min(T_{ij}^{*},C_{ij})$$. The observed failure cause is $$\epsilon_{ij}=\Delta_{ij}\epsilon_{ij}^{*}$$, where $$\Delta_{ij}$$ is an indicator of no censoring $$\Delta_{ij}=I(T_{ij}^{*}\leq C_{ij})$$. Thus, $$\epsilon_{ij}=0$$ denotes censoring. The failure times are observed in the time interval $$[0,\tau]$$. We propose to model the cluster-specific cumulative incidence function as a product of a cluster-specific risk level and a cluster-specific failure time trajectory. For two competing causes of failure, let the cluster-specific cumulative incidence functions be given as \begin{eqnarray}\label{eq:cif} F_{1}(t|X,\eta_{1},u_{1},u_{2})&=&\pi_{1}(X,u_{1},u_{2})\Phi\left[\alpha_{1}\{g(t)\}-X^{T}\gamma_{1}-\eta_{1}\right]\nonumber\\ F_{2}(t|X,\eta_{2},u_{1},u_{2})&=&\pi_{2}(X,u_{1},u_{2})\Phi\left[\alpha_{2}\{g(t)\}-X^{T}\gamma_{2}-\eta_{2}\right]\!, \end{eqnarray} (2.1) where $$\boldsymbol{\eta}=\{\eta_{1},\eta_{2}\}$$ and $$\boldsymbol{u}=\{u_{1},u_{2}\}$$ are normally distributed random effects with mean zero and potentially correlated, i.e. \begin{eqnarray}\label{eq:randrist} \begin{pmatrix} \eta_{1} \\ \eta_{2} \\ u_{1}\\ u_{2} \end{pmatrix} \sim \mathcal{N} \left\{ \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \begin{pmatrix} \sigma^2_{\eta_{1}} & \textrm{cov}(\eta_{1},\eta_{2}) & \textrm{cov}(\eta_{1},u_{1}) & \textrm{cov}(\eta_{1},u_{2}) \\ & \sigma^2_{\eta_{2}} & \textrm{cov}(\eta_{2},u_{1}) & \textrm{cov}(\eta_{2},u_{2}) \\ & & \sigma^2_{u_{1}} & \textrm{cov}(u_{1},u_{2}) \\ & & & \sigma^2_{u_{2}} \end{pmatrix} \right\}. \end{eqnarray} (2.2) The cluster-specific survival function is given as $$S(t|X,\boldsymbol{\eta},\boldsymbol{u})=1-F_{1}(t|X,\eta_{1},\boldsymbol{u})-F_{2}(t|X,\eta_{2},\boldsymbol{u})$$. The cluster-specific cumulative incidence functions (2.1) are modeled up to a fixed time point $$\delta$$ at which all individuals still at risk are censored. The value of $$\delta$$ depends on the data and cannot exceed the maximum observed follow-up time, i.e. $$\delta \leq \tau$$. Transformation of the time variable $$t$$ enables separation of the cumulative incidence function (2.1) into a cluster-specific risk level, $$\pi_{h}(X,\boldsymbol{u})$$, and a cluster-specific failure time trajectory, $$\Phi\left[\alpha_{h}\{g(t)\}-X^{T}\gamma_{h}-\eta_{h}\right]$$, where $$\Phi$$ is the cumulative distribution function of the standard normal distribution. The transformation, $$g(t)$$, is given as \begin{eqnarray*}\label{eq:trans} g(t)={\rm arctanh}\left(\frac{t-\delta/2}{\delta/2}\right) \quad \textrm{for} \quad t \in ]0,\delta[. \end{eqnarray*} With this transformation $$g(t)\in ]-\infty,\infty[$$ and the value of the cluster-specific failure time trajectory will equal 1 at time $$\delta$$. Consequently, $$F_{h}(\delta|X,\eta_{h},\boldsymbol{u})=\pi_{h}(X,\boldsymbol{u})$$ and we can therefore interpret $$\pi_{1}(X,\boldsymbol{u})$$ and $$\pi_{2}(X,\boldsymbol{u})$$ as the cause-specific cluster-specific risk levels at time $$\delta$$. The cluster-specific risk levels are modeled using a multinomial logistic regression model with random effects, i.e. \begin{eqnarray}\label{eq:pi} \pi_{1}(X,\boldsymbol{u})&=&\frac{\exp(X^T\beta_{1}+u_{1})}{1+\exp(X^T\beta_{1}+u_{1})+\exp(X^T\beta_{2}+u_{2})}\nonumber\\ \pi_{2}(X,\boldsymbol{u})&=&\frac{\exp(X^T\beta_{2}+u_{2})}{1+\exp(X^T\beta_{1}+u_{1})+\exp(X^T\beta_{2}+u_{2})}, \end{eqnarray} (2.3) where $$\beta_{1}$$ and $$\beta_{2}$$ are regression coefficients’ vectors of the two causes of failure. The random effects in $$\boldsymbol{u}$$ induce within-cluster dependence of the cause-specific risk levels. The cluster-specific failure time trajectories are modeled via the second part of the models in (2.1), i.e. \begin{eqnarray}\label{eq:trac} \Phi\left[\alpha_{h}\{g(t)\}-X^{T}\gamma_{h}-\eta_{h}\right] \quad \textrm{for} \quad h=1,2, \end{eqnarray} (2.4) where $$\Phi$$ is the cumulative distribution function of the standard normal distribution, and $$\alpha_{1}(x)$$ and $$\alpha_{2}(x)$$ are monotonically increasing functions of $$x$$ and both known up to a finite-dimensional parameter vector, $$\omega_{1}$$ and $$\omega_{2}$$, respectively. For example, monotonically increasing B-spline or piecewise linear functions. The random effects in $$\boldsymbol{\eta}$$ induce within-cluster dependence of the cause-specific failure time trajectories. We assume that the random effects are independent across clusters and shared by individuals within the same cluster. Individuals within the same cluster are assumed independent conditional on the random effects and on covariates. Furthermore, we assume that the failure time and the cause of failure are independent of the censoring time given covariates and random effects ($$\{T_{ij}^{*},\epsilon_{ij}^{*}\}\perp C_{ij}|X_{ij},\boldsymbol{\eta}_{i},\boldsymbol{u}_{i}$$) and that $$C_{ij}$$ does not depend on the random effects. 2.1. Interpretation of parameters Modeling the cluster-specific risk levels using a multinomial logistic regression model (2.3) ensures that the sum of the predicted cause-specific cumulative incidences does not exceed 1. Furthermore, at time $$\delta$$, where $$F_{h}(\delta|X,\eta_{h},\boldsymbol{u})=\pi_{h}(X,\boldsymbol{u})$$, the cluster-specific survival function $$S(\delta|X,\boldsymbol{\eta},\boldsymbol{u})$$ equals $$1/\left\{1+\exp(X^T\beta_{1}+u_{1})+\exp(X^T\beta_{2}+u_{2})\right\}$$. This results in the following interpretation of the regression coefficients in $$\beta_{1}$$ and $$\beta_{2}$$. Let $$X_{ij}=(1,X_{ij,1},...,X_{ij,p})$$ and $$\beta_{h}=(\beta_{0,h},...,\beta_{p,h})^{T}$$. Comparing two individuals $$j$$ and $$k$$ from the $$i$$th cluster whose values of $$X_{ij,r}$$ differ by one unit and where all other covariates are held constant, $$\exp(\beta_{r,h})$$ equals the ratio between “odds” of the form $$F_{h}(\delta|X_{ij},\eta_{h,i},\boldsymbol{u}_{i})/S(\delta|X_{ij},\boldsymbol{\eta}_{i},\boldsymbol{u}_{i})$$. That is, the probability of failure from cause $$h$$ at time $$\delta$$ in relation to the probability of no failure at time $$\delta$$ (Gerds and others, 2012). The regression coefficients $$\gamma_{1}=(\gamma_{0,1},...,\gamma_{p,1})^{T}$$ and $$\gamma_{2}=(\gamma_{0,2},...,\gamma_{p,2})^{T}$$ reflect how covariates affect the failure time trajectories (2.4), i.e. the shape of the cumulative incidence, and thereby how quickly the cluster-specific risk levels observed at time $$\delta$$ are reached (2.3). For a covariate with a negative effect, the cause-specific cumulative incidence function makes an advance towards the cluster-specific risk level, whereas a covariate with a positive effect causes a delay. The random effects in $$\boldsymbol{\eta}$$ have a similar interpretation. This is illustrated in Figure 1, where the cluster-specific cumulative incidence function of a given failure cause, denoted failure cause 1, is visualised for different combinations of random effects $$\eta_{1}$$ and $$u_{1}$$. Interpretation of random effects in the general logistic regression model is no easy task (Larsen and others, 2000). This also applies to the random effects $$u_{1}$$ and $$u_{2}$$ in (2.3). In contrast to $$\eta_{1}$$ and $$\eta_{2}$$, the random effects $$u_{1}$$ and $$u_{2}$$ always appear together and have a joint effect on the cumulative incidence of both causes, cf. (2.3). However, keeping everything else fixed, an increase in $$u_{h}$$ will increase the risk of failure from cause $$h$$ and vice versa. This is also illustrated in Figure 1. The interpretation of the covariance of $$\eta_{1}$$ and $$\eta_{2}$$ and the covariance of $$u_{1}$$ and $$u_{2}$$ is more or less straightforward. With regard to the interpretation of the covariance of the random effects in $$\boldsymbol{\eta}$$ and the random effects in $$\boldsymbol{u}$$, $$\textrm{cov}(\eta_{h},u_{1})$$ and $$\textrm{cov}(\eta_{h},u_{2})$$, they should be considered simultaneously. Again keeping everything else fixed, a negative correlation between $$\eta_{h}$$ and $$u_{h}$$ imply that when $$\eta_{h}$$ decreases, $$u_{h}$$ increases and conversely when $$\eta_{h}$$ increases, $$u_{h}$$ decreases. Broadly speaking, this corresponds to an increased risk level that is reached quickly and a decreased risk level that is reached later, respectively. We find it difficult to think of a situation with a positive within-cause correlation between $$\eta$$ and $$u$$, i.e. where an increased risk level is associated with a late onset and vice versa. However, a positive cross-cause correlation between $$\eta$$ and $$u$$ sounds more plausible. i.e. where late onset of one failure cause is associated with a high absolute risk of another failure cause. Fig. 1. View largeDownload slide Illustration of the proposed model showing the cluster-specific cumulative incidences of failure cause 1. From a model with $$X=1$$ for all subjects and with $$\beta_{1}=-1.9$$, $$\beta_{2}=-0.2$$, $$\gamma_{1}=1$$, $$\alpha_{1}\{g(t)\}=3g(t)$$ and $$u_{2}=0$$. In plot A, $$u_{1}=-1$$. In plot B, $$u_{1}=-0.5$$. In plot C, $$u_{1}=0.5$$. In plot D, $$u_{1}=1$$. With regard to the curves, ($$\Huge {-}$$) corresponds to $$\eta_{1}=-2$$, (− − − −) corresponds to $$\eta_{1}=-1$$, ($$\cdots\cdots\cdots$$) corresponds to $$\eta_{1}=0$$, ($$-\cdot -\cdot -\cdot$$) corresponds to $$\eta_{1}=1$$, and (— — —) corresponds to $$\eta_{1}=2$$. Fig. 1. View largeDownload slide Illustration of the proposed model showing the cluster-specific cumulative incidences of failure cause 1. From a model with $$X=1$$ for all subjects and with $$\beta_{1}=-1.9$$, $$\beta_{2}=-0.2$$, $$\gamma_{1}=1$$, $$\alpha_{1}\{g(t)\}=3g(t)$$ and $$u_{2}=0$$. In plot A, $$u_{1}=-1$$. In plot B, $$u_{1}=-0.5$$. In plot C, $$u_{1}=0.5$$. In plot D, $$u_{1}=1$$. With regard to the curves, ($$\Huge {-}$$) corresponds to $$\eta_{1}=-2$$, (− − − −) corresponds to $$\eta_{1}=-1$$, ($$\cdots\cdots\cdots$$) corresponds to $$\eta_{1}=0$$, ($$-\cdot -\cdot -\cdot$$) corresponds to $$\eta_{1}=1$$, and (— — —) corresponds to $$\eta_{1}=2$$. 2.2. Likelihood In the estimation of the parameters of interest, we propose to use a pairwise composite likelihood (Cox and Reid, 2004; Varin and others, 2011) given as \begin{eqnarray}\label{eq:complik} \mathcal{L}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})&=& \prod_{i=1}^{n}\prod_{j=1}^{n_{i}-1}\prod_{k=j+1}^{n_{i}}\mathcal{L}_{i,jk}(\theta;T_{ij},\epsilon_{ij},X_{ij},T_{ik},\epsilon_{ik},X_{ik},\boldsymbol{\eta}_{i},\boldsymbol{u}_{i}), \end{eqnarray} (2.5) where $$\theta=\{\beta_{1},\beta_{2},\gamma_{1},\gamma_{2},\omega_{1},\omega_{2},\Sigma_{\boldsymbol{\eta}\boldsymbol{u}}\}^{T}$$ are the model parameters of interest and $$\Sigma_{\boldsymbol{\eta}\boldsymbol{u}}$$ denotes the variance-covariance matrix of the random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$ (2.2). The pairwise composite likelihood readily enables analysis of data with large clusters of varying size. Furthermore, it is less complex and less computationally intensive than a high-dimensional full cluster likelihood. For parameter estimation we need the marginal pairwise composite likelihood where the unobserved random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$ have been integrated out. i.e., \begin{eqnarray*} \mathcal{L}_{M}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X}) &=&\int\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})p^{\theta}(\boldsymbol{\eta},\boldsymbol{u})\textrm{d}\boldsymbol{\eta}\textrm{d}\boldsymbol{u}, \end{eqnarray*} where $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})$$ is the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given covariates and the random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$, and $$p^{\theta}(\boldsymbol{\eta},\boldsymbol{u})$$ is the density of the random effects (2.2). Unfortunately, no closed form expression for the marginal likelihood exists. Yet, we can reduce the dimensionality of the integral. This is done by employing Bayes’ rule, utilising that $$p^{\theta}(\boldsymbol{\eta},\boldsymbol{u})=p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})p^{\theta}(\boldsymbol{u})$$, changing the order of integration and integrating with respect to $$\boldsymbol{\eta}$$ over the conditional density $$p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})$$. A closed form expression for the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given covariates and the random effects in $$\boldsymbol{u}$$, i.e. $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})=\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})\textrm{d}\boldsymbol{\eta}$$, exists. Thus, \begin{eqnarray}\label{eq:marlik} \mathcal{L}_{M}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X}) =\int\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{\eta},\boldsymbol{u})p^{\theta}(\boldsymbol{\eta}|\boldsymbol{u})p^{\theta}(\boldsymbol{u})\textrm{d}\boldsymbol{\eta}\textrm{d}\boldsymbol{u} =\int p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})p^{\theta}(\boldsymbol{u})\textrm{d}\boldsymbol{u}. \quad \end{eqnarray} (2.6) In the following, we show how the expression for $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})$$ is derived. For ease of notation, we ignore the cluster subscript $$i$$. The likelihood contribution of the pair $$j,k$$ to the pairwise composite likelihood (2.5) is given as \begin{eqnarray}\label{eq:complik_det} \mathcal{L}_{jk}(\theta;T_{j},\epsilon_{j},X_{j},T_{k},\epsilon_{k},X_{k},\boldsymbol{\eta},\boldsymbol{u})&=&\left[\prod_{h=1}^{2} \prod_{l=1}^{2}\left\{f_{h}^{\theta}(T_{j}|X_{j},\eta_{h},\boldsymbol{u})f_{l}^{\theta}(T_{k}|X_{k},\eta_{l},\boldsymbol{u})\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times\left[\prod_{h=1}^{2}\left\{ f_{h}^{\theta}(T_{j}|X_{j},\eta_{h},\boldsymbol{u})S^{\theta}(T_{k}|X_{k},\boldsymbol{\eta},\boldsymbol{u})\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=0)}\right]\nonumber\\ &&\times\left[\prod_{l=1}^{2} \left\{S^{\theta}(T_{j}|X_{j},\boldsymbol{\eta},\boldsymbol{u})f_{l}^{\theta}(T_{k}|X_{k},\eta_{l},\boldsymbol{u})\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times \left[\left\{S^{\theta}(T_{j}|X_{j},\boldsymbol{\eta},\boldsymbol{u})S^{\theta}(T_{k}|X_{k},\boldsymbol{\eta},\boldsymbol{u})\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=0)}\right], \end{eqnarray} (2.7) where \begin{eqnarray*} f_{h}^{\theta}(t|X,\eta_{h},\boldsymbol{u})=\frac{\partial F_{h}^{\theta}(t|X,\eta_{h},\boldsymbol{u})}{\partial t}=\pi_{h}^{\theta}(X,\boldsymbol{u})\frac{d\alpha_{h}^{\theta}\{g(t)\}}{dt}\phi\left[\alpha_{h}^{\theta}\{g(t)\}-X^{T}\gamma_{h}-\eta_{h}\right], \end{eqnarray*} $$\phi$$ being the probability density function of the standard univariate normal distribution and $$S^{\theta}(t{\,|\,}X,\boldsymbol{\eta},\boldsymbol{u})=1-\sum_{h=1}^{2}F_{h}^{\theta}(t|X,\eta_{h},\boldsymbol{u})$$. In the following, let $$\alpha_{h}^{*}(t)=\alpha_{h}^{\theta}\{g(t)\}$$ and $$\alpha_{h}^{*\boldsymbol{\prime}}(t)=d\alpha_{h}^{\theta}\{g(t)\}/dt$$. The first term in the likelihood (2.7) equals the likelihood contribution from pairs where both individuals experience failure (either cause). We have that \begin{eqnarray*} f_{h}^{\theta}(t|X_{j},\eta_{h},\boldsymbol{u})f_{l}^{\theta}(s|X_{k},\eta_{l},\boldsymbol{u})&=& \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\} \nonumber \\ &&\times \pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{l}^{*\boldsymbol{\prime}}(s)\phi\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}. \qquad \end{eqnarray*} The right-hand side of the expression above can be written as \begin{eqnarray*} \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\alpha_{l}^{*\boldsymbol{\prime}}(s)\boldsymbol{\phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}, \end{eqnarray*} where $$\boldsymbol{\phi}$$ is the probability density function of the standard bivariate normal distribution with correlation coefficient $$\rho=0$$. Let $$\boldsymbol{\eta}_{h,l}=\{\eta_{h},\eta_{l}\}$$ and let $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and $$\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ denote the mean and the variance-covariance matrix, respectively, of the conditional distribution of $$\boldsymbol{\eta}_{h,l}$$ given $$\boldsymbol{u}$$. To obtain an expression for the contribution to the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given the random effects in $$\boldsymbol{u}$$, $$p^{\theta}(\boldsymbol{T},\boldsymbol{\epsilon}|\boldsymbol{X},\boldsymbol{u})$$, we integrate with respect to $$\boldsymbol{\eta}_{h,l}$$ over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$, i.e. \begin{eqnarray*} \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\alpha_{l}^{*\boldsymbol{\prime}}(s)\int \boldsymbol{\phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\} p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})\textrm{d}\boldsymbol{\eta}_{h,l}. \end{eqnarray*} Since the random effects in $$\boldsymbol{\eta}_{h,l}$$ are normally distributed, the integral is straightforward to evaluate and simplifies to \begin{eqnarray*} \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\alpha_{l}^{*\boldsymbol{\prime}}(s)\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}\right\}, \end{eqnarray*} where $$\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}$$ is the probability density function of the bivariate normal distribution with mean $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and variance-covariance matrix $$I_{2}+\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$. The second and third term in the likelihood (2.7) are the contributions from pairs where one of the two individuals experience failure. As above, we integrate over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$ to obtain the contribution to the conditional density of $$\boldsymbol{T}$$ and $$\boldsymbol{\epsilon}$$ given the random effects in $$\boldsymbol{u}$$. We only show the expression for the integral of the second term. The expression for the integral of the third term is equivalent as they are symmetrical. We have that \begin{eqnarray*} f_{h}^{\theta}(t|X_{j},\eta_{h},\boldsymbol{u})S^{\theta}(s|X_{k},\boldsymbol{\eta},\boldsymbol{u})&=& \pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\} \nonumber \\ &&\times \left[1-\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\Phi\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}\right],\qquad \end{eqnarray*} where $$\Phi$$ is the cumulative distribution function of the standard univariate normal distribution. Utilising that $$\Phi(x)=\int_{-\infty}^{x}\phi(w)\textrm{d}w$$, we can write the right-hand side as \begin{eqnarray*} &&\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\} \\ &&\quad-\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\int_{-\infty}^{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}}\boldsymbol{\phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},w \right\}\textrm{d}w, \end{eqnarray*} where $$\boldsymbol{\phi}$$ is the probability density function of the standard bivariate normal distribution with correlation coefficient $$\rho=0$$. We then integrate with respect to $$\boldsymbol{\eta}_{h,l}$$ over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$ and obtain \begin{eqnarray*} &&\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\phi_{\eta_{h}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}\right\} \\ &&\quad -\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\alpha_{h}^{*\boldsymbol{\prime}}(t)\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{l},\boldsymbol{u})\int_{-\infty}^{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}}\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h},w\right\}\textrm{d}w, \end{eqnarray*} where $$\phi_{\eta_{h}|\boldsymbol{u}}^{\theta}$$ is the probability density function of the univariate normal distribution with mean $$\mu_{\eta_{h}|\boldsymbol{u}}$$ and variance $$1+\sigma_{\eta_{h}|\boldsymbol{u}}^{2}$$, and $$\boldsymbol{\phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}$$ is the probability density function of the bivariate normal distribution with mean $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and variance-covariance matrix $$I_{2}+\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$. The last term in the likelihood (2.7) is the contribution from pairs where neither experience failure. It can be written as \begin{eqnarray*} &&1-\sum_{h=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\Phi\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h}\right\}-\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\Phi\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}\\ &&\quad+\sum_{h=1}^{2}\sum_{l=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\boldsymbol{\Phi}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}-\eta_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}-\eta_{l}\right\}, \end{eqnarray*} where $$\boldsymbol{\Phi}$$ is the cumulative distribution function of the standard bivariate normal distribution with correlation coefficient $$\rho=0$$. Following along the same lines as above, we integrate with respect to $$\boldsymbol{\eta}_{h,l}$$ over the conditional distribution $$p^{\theta}(\boldsymbol{\eta}_{h,l}|\boldsymbol{u})$$ and obtain \begin{eqnarray*} && 1-\sum_{h=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\Phi_{\eta_{h}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h}\right\}-\sum_{l=1}^{2}\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\Phi_{\eta_{l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}\right\}\\ &&\quad+\sum_{h=1}^{2}\sum_{l=1}^{2}\pi_{h}^{\theta}(X_{j},\boldsymbol{u})\pi_{l}^{\theta}(X_{k},\boldsymbol{u})\boldsymbol{\Phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}\left\{\alpha_{h}^{*}(t)-X_{j}^{T}\gamma_{h},\alpha_{l}^{*}(s)-X_{k}^{T}\gamma_{l}\right\}, \end{eqnarray*} where $$\Phi_{\eta_{h}|\boldsymbol{u}}^{\theta}$$ is the cumulative distribution function of the univariate normal distribution with mean $$\mu_{\eta_{h}|\boldsymbol{u}}$$ and variance $$1+\sigma_{\eta_{h}|\boldsymbol{u}}^{2}$$, $$\Phi_{\eta_{l}|\boldsymbol{u}}^{\theta}$$ is the cumulative distribution function of the univariate normal distribution with mean $$\mu_{\eta_{l}|\boldsymbol{u}}$$ and variance $$1+\sigma_{\eta_{l}|\boldsymbol{u}}^{2}$$, and $$\boldsymbol{\Phi}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}^{\theta}$$ is the cumulative distribution function of the bivariate normal distribution with mean $$\boldsymbol{\mu}_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$ and variance-covariance matrix $$I_{2}+\Sigma_{\boldsymbol{\eta}_{h,l}|\boldsymbol{u}}$$. 2.3. Estimation Since no closed form expression for the marginal likelihood with respect to the random effects in $$\boldsymbol{u}$$ (2.6) exists, we propose that adaptive Gaussian quadrature (AGQ) approximation is used in parameter estimation. In AGQ approximation, Gaussian quadrature rules are used to approximate the integral by a weighted average of the integrand evaluated at $$Q$$ predetermined quadrature points. As the random effects are normally distributed, we use Gauss-Hermite quadrature rules (Pinheiro and Chao, 2006; Hartzel and others, 2001). The accuracy of the AGQ approximation increases with the number of quadrature points. Yet, so does the computational intensity. This is because the $$Q$$ quadrature points are used to estimate the AGQ approximation of each pair in the pairwise composite likelihood. Usually, five or less quadrature points are adequate but it depends on the likelihood (Pinheiro and Bates, 1995). As the approximation of the derivatives may be less precise than the approximation of the marginal likelihood (Lesaffre and Spiessens, 2001), we propose numerical derivatives for estimation of the variance of the parameter estimates. Furthermore, as the likelihood contributions to the composite likelihood (2.5) from pairs within the same cluster are not independent, the Fisher information needs to be substituted by the so-called sandwich estimator when estimating the variance of the parameter estimates (Varin and others, 2011). Let $$\mathcal{U}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$ denote the numerically derived score function of the marginal log-likelihood and $$H(\hat{\theta})=\frac{\partial}{\partial\theta} \mathcal{U}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$. And let $$V(\hat{\theta})=\sum_{i}^{n}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})^{T}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$ where $$\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})=\sum_{j=1}^{n_{i}-1}\sum_{k=j+1}^{n_{i}}\mathcal{U}_{i,jk}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$, i.e. the score contribution from all pairs in the $$i$$th family. The variance of the parameter estimates can be estimated by $$\{-H(\hat{\theta})\}^{-1}V(\hat{\theta})\{-H(\hat{\theta})\}^{-1}$$. 2.4. Left-truncation and right-censoring Under the proposed model, all cause-specific cumulative incidence functions are specified. As a consequence, left-truncation and right-censoring are easily dealt with. The proposed model readily adjusts for right-censoring by means of the second, third and fourth term of the likelihood (2.7). With regard to left-truncation, let $$V_{ij}$$ denote the truncation time of the $$j$$th individual in the $$i$$th cluster. The truncation probability of the pair $$j,k$$ within the $$i$$th cluster equals the joint survival at time $$(V_{ij},V_{ik})$$, i.e. $$S^{\theta}(V_{ij},V_{ik}|X_{ij},X_{ik})$$. When adjusting for left-truncation, the likehood contribution of the pair $$j,k$$ is given as (ignoring the cluster subscript $$i$$) \begin{eqnarray*} \mathcal{L}_{\textrm{T},jk}(\theta;T_{j},\epsilon_{j},V_{j},X_{j},T_{k},\epsilon_{k},V_{k},X_{k})&=&\left[\prod_{h=1}^{2} \prod_{l=1}^{2}\left\{\frac{f_{hl}^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times\left[\prod_{h=1}^{2}\left\{\frac{ f_{h0}^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=h)(\epsilon_{k}=0)}\right]\nonumber\\ &&\times\left[\prod_{l=1}^{2} \left\{\frac{f_{0l}^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=l)}\right]\nonumber\\ &&\times \left[\left\{\frac{S^{\theta}(T_{j},T_{k}|X_{j},X_{k})}{S^{\theta}(V_{j},V_{k}|X_{j},X_{k})}\right\}^{(\epsilon_{j}=0)(\epsilon_{k}=0)}\right], \end{eqnarray*} where $$f_{hl}^{\theta}(T_{j},T_{k}|X_{j},X_{k})=\mathbb{P}(T_{j}=t,\epsilon_{j}=h,T_{k}=s,\epsilon_{k}=l|X_{j},X_{k})$$, $$f_{h0}^{\theta}(T_{j},T_{k}|X_{j},X_{k})=\mathbb{P}(T_{j}=t,\epsilon_{j}=h,T_{k}>s|X_{j},X_{k})$$ and $$f_{0l}^{\theta}(T_{j},T_{k}|X_{j},X_{k})=\mathbb{P}(T_{j}>t,T_{k}=s,\epsilon_{k}=l|X_{j},X_{k})$$. 2.5. Sampling In our analysis of the Danish register-based family data on breast cancer presented in Section 4, we analyze a type of case-cohort sample of all available families. This is because of the size of the cohort (1 292 051 families) and the computational intensity involved in parameter estimation using AGQ approximation. The applied sampling approach follows the method presented by Moger and others (2008). We divide the clusters into $$G$$ disjoint strata and then sample $$m_{g}$$ clusters from stratum $$g$$. Sampling is done without replacement. Under the applied sampling approach, the estimator of the pairwise composite log-likelihood is given as (Kalbfleisch and Lawless, 1988) \begin{eqnarray}\label{eq:stratll} \log\mathcal{L}_{G}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X})=\sum_{g=1}^{G}\frac{n_{g}}{m_{g}} \sum_{i \in D_{g}}\log\mathcal{L}_{i}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X}), \end{eqnarray} (2.8) where $$\log\mathcal{L}_{i}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X})=\sum_{j=1}^{n_{i}-1} \sum_{k=j+1}^{n_{1}}\log\mathcal{L}_{i,jk}(\theta;\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X})$$ is the log-likelihood contribution to the pairwise composite log-likelihood from the $$i$$th cluster, $$n_{g}$$ is the number of clusters in stratum $$g$$ and $$D_{g}$$ is the set of clusters sampled from stratum $$g$$. Due to the sampling, we cannot estimate the variance of the parameters as described in Section 2.3. Instead, the following approach is applied (Kalbfleisch and Lawless, 1988). The score function of the log-likelihood in (2.8) can be written as \begin{eqnarray*} \mathcal{U}_{G}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta)&=&\sum_{g=1}^{G}\frac{n_{g}}{m_{g}} \sum_{i \in D_{g}}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta)\\ &=&\sum_{i=1}^{n} \mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta)+\sum_{g=1}^{G}\left(\frac{n_{g}}{m_{g}}-1\right)\sum_{i \in D_{g}} \mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\theta). \end{eqnarray*} Because sampling is done without replacement, families sampled from a given stratum are not independent. In this case, the variance of the parameter estimates are calculated by means of the sandwich estimator $$H_{G}(\hat{\theta})^{-1}\left\{V_{G}(\hat{\theta})+V_{G}^{*}(\hat{\theta})\right\}H_{G}(\hat{\theta})^{-1}$$, where \begin{eqnarray*} H_{G}(\hat{\theta})&=&\sum_{g=1}^{G}\frac{n_{g}}{m_{g}}\sum_{i \in D_{g}}\frac{\partial}{\partial\theta} \mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta}), \quad V_{G}(\hat{\theta})=\sum_{g=1}^{G}\frac{n_{g}}{m_{g}}\sum_{i \in D_{g}}\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})^{\otimes2} \quad \textrm{and} \quad\\ V_{G}^{*}(\hat{\theta})&=&\sum_{g=1}^{G}\frac{n_{g}(n_{g}-m_{g})}{m_{g}^{2}}\sum_{i\in D_{g}}\left\{\mathcal{U}_{i}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})-\bar{\mathcal{U}}_{g}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})\right\}^{\otimes2} \end{eqnarray*} with $$\bar{\mathcal{U}}_{g}(\boldsymbol{T},\boldsymbol{\epsilon},\boldsymbol{X};\hat{\theta})$$ being the estimated average value of the score in stratum $$g$$. 2.6. Extension of random effects structure We focus on a random effects structure, where the random effects are shared by all individuals within a cluster. However, it is possible to extend the random effects structure to allow for different levels of dependence within a cluster. This may be relevant in analysis of family studies. In family studies, the dependence patterns between different types of relatives (siblings, parent-offspring, etc.) may differ. One way of taking this into account is by decomposing the random effect into a genetic effect and a shared environmental effect following the work of Fisher (1918). Typically, it is assumed that the genetic effect and the shared environmental effect are independent (Khoury and others, 1993; Thomas, 2004). As a result, the variance of the random effect may be decomposed into the corresponding separate components of variance. These variance components can be estimated from the covariance between different types of relatives (Khoury and others, 1993; Thomas, 2004). Under the proposed model, several random effects are involved. In theory all of these can be decomposed, however specification of the covariance of decomposed cross-cause random effects is challenging and the model quickly becomes large and the interpretation of the parameters complicated. Consequently, one may choose to decompose only a few of the random effects. For example, the random effects affecting the failure cause of primary interest. 3. Simulation studies Simulation studies have been used to assess the finite-sample properties of the proposed model and the estimation procedure. We simulated multivariate competing risks data similar to the Danish register-based family data on breast cancer analyzed in Section 4. We simulated 1000 populations of 50 000 clusters of size three. The number of competing causes of failure was two, which gave rise to four cluster-specific random effects. We ran simulations for two different scenarios: (1) a scenario with positive covariance between all random effects and (2) a scenario with zero covariance between the random effects of the two competing causes of failure. The variance-covariance matrices of the random effects in the two scenarios were given as \begin{gather*} \Sigma_{\boldsymbol{\eta}\boldsymbol{u},1}= \begin{pmatrix} 1 & 0.4 & 0.5 & 0.4 \\ & 1 & 0.4 & 0.3 \\ & & 1 & 0.4 \\ & & & 1 \\ \end{pmatrix} \qquad \textrm{and} \qquad \Sigma_{\boldsymbol{\eta}\boldsymbol{u},2}= \begin{pmatrix} 1 & 0.0 & 0.5 & 0.0 \\ & 1 & 0.0 & 0.5 \\ & & 1 & 0.0 \\ & & & 1 \\ \end{pmatrix}. \end{gather*} We used $$X=1$$ for all individuals and set $$\beta_{1}=-1.9$$, $$\beta_{2}=-0.2$$, $$\gamma_{1}=1$$, $$\gamma_{2}=1.2$$, and $$\alpha_{1}(x)=\omega_{1}x$$ and $$\alpha_{2}(x)=\omega_{2}x$$ with $$\omega_{1}=3$$ and $$\omega_{2}=2$$, respectively. The cause of failure was simulated from a multinomial distribution. The probability of failure due to cause $$h$$ was given as $$p_{h}=\exp(X\beta_{h}+u_{h})/\{1+\exp(X\beta_{1}+u_{1})+\exp(X\beta_{2}+u_{2})\}$$ and the probability of no failure at the end of follow-up (censored at end of follow-up) was given as $$p_{0}=1-p_{1}-p_{2}$$. For failure cause 1 and 2, a random number $$\zeta$$ from the standard uniform distribution $$U(0,1)$$ was sampled and the failure time found by isolating $$t$$ in the expression $$\zeta=\Phi\left[\alpha_{h}\{g(t)\}-X\gamma_{h}-\eta_{h}\right]$$. We set $$\delta=80$$. Any failure at time $$\delta$$ was censored. In addition, we let 80% of the simulated data be subject to censoring in the interval $$[0,\delta]$$. Censoring times were drawn from the uniform distribution $$U(0,\delta+30)$$. This resulted in approximately 50% censoring. The clusters were divided into six disjoint strata in a stepwise manner according to pair types within each cluster following the same procedure that was applied in the analysis of the register-based data presented in Section 4. See Figure 2 for an illustration of the stratification procedure. The sampling probabilities of the clusters in the six strata were 1, 0.6, 0.1, 0.05, 0.005, and 0.02, respectively. Sampling was done without replacement. The average distribution of pair types in the simulated populations and analyzed samples is shown in Appendix A of the supplementary material available at Biostatistics online. Parameter estimation was based on the weighted log-likelihood (2.8) and the variance of the parameter estimates was calculated as described in Section 2.5. We used AGQ approximation with five quadrature points in parameter estimation. All computations were performed using the statistical software R (R Core Team, 2017). Fig. 2. View largeDownload slide Stratification procedure applied in the data example. The same procedure is used in the simulations though the numbers differ. Fig. 2. View largeDownload slide Stratification procedure applied in the data example. The same procedure is used in the simulations though the numbers differ. The results from the simulation studies are shown in Table 1. Overall, the model performed well, the parameter estimates were unbiased and the coverage rates were good. However, the averages of the standard errors were too large compared to the standard deviations of the parameter estimates. Further investigation revealed that a number of large values distorted the mean for all parameters. Consequently, we have chosen to report the median of the standard errors in Table 1. We suspect that the irregularities were caused by the numerical derivative of the AGQ approximation that was used for estimation of the variance of the parameter estimates or the number of quadrature points. Table 1. Results from the two simulation studies. Shown are the average of estimates (Av.), the standard deviations (SD), the median of the standard errors $$($$Median SE$$)$$, and the coverage probabilities of the 95% confidence interval $$($$CI$$)$$ True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 Table 1. Results from the two simulation studies. Shown are the average of estimates (Av.), the standard deviations (SD), the median of the standard errors $$($$Median SE$$)$$, and the coverage probabilities of the 95% confidence interval $$($$CI$$)$$ True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 True value Av. SD Median SE Cov Scenario 1 $$\beta_{1}$$ $$-$$1.90 $$-$$1.90 0.08 0.09 0.96 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.00 0.11 0.12 0.96 $$\gamma_{2}$$ 1.20 1.20 0.14 0.14 0.94 $$\omega_{1}$$ 3.00 3.01 0.08 0.09 0.97 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.95 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.13 0.14 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.14 0.14 0.94 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.19 0.20 0.96 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.32 0.30 0.94 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.40 0.40 0.07 0.07 0.95 $$\textrm{cov}(u_{1},u_{2})$$ 0.40 0.42 0.18 0.19 0.97 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.16 0.16 0.95 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.40 0.40 0.16 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.40 0.40 0.19 0.19 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.30 0.31 0.23 0.23 0.94 Scenario 2 $$\beta_{1}$$ $$-$$1.90 $$-$$1.89 0.08 0.09 0.98 $$\beta_{2}$$ $$-$$0.20 $$-$$0.20 0.04 0.06 0.99 $$\gamma_{1}$$ 1.00 1.01 0.11 0.12 0.97 $$\gamma_{2}$$ 1.20 1.21 0.14 0.14 0.93 $$\omega_{1}$$ 3.00 3.01 0.08 0.08 0.96 $$\omega_{2}$$ 2.00 2.01 0.06 0.06 0.96 $$\sigma_{\eta_{1}}^{2}$$ 1.00 1.01 0.12 0.12 0.96 $$\sigma_{\eta_{2}}^{2}$$ 1.00 1.01 0.15 0.15 0.93 $$\sigma_{u_{1}}^{2}$$ 1.00 1.01 0.18 0.20 0.97 $$\sigma_{u_{2}}^{2}$$ 1.00 1.04 0.31 0.30 0.93 $$\textrm{cov}(\eta_{1},\eta_{2})$$ 0.00 0.00 0.06 0.07 0.98 $$\textrm{cov}(u_{1},u_{2})$$ 0.00 0.02 0.15 0.17 0.95 $$\textrm{cov}(\eta_{1},u_{1})$$ 0.50 0.50 0.14 0.15 0.97 $$\textrm{cov}(\eta_{1},u_{2})$$ 0.00 0.00 0.13 0.15 0.96 $$\textrm{cov}(\eta_{2},u_{1})$$ 0.00 0.01 0.18 0.18 0.94 $$\textrm{cov}(\eta_{2},u_{2})$$ 0.50 0.50 0.23 0.23 0.94 4. Data example We have applied the proposed model in analysis of Danish register-based family data on breast cancer among women. Breast cancer is the most common malignancy in women (McPherson and others, 2000) and it is well-established that women with a family history of the disease have an increased risk, especially women with a family history of early disease onset (Claus and others, 1990; Brandt and others, 2010). The aim of the analysis was to investigate the cumulative incidence of breast cancer and the dependence between mothers and daughters. Hence, we did not look at the dependence between sisters. The study cohort was formed by merging data from the Danish Civil Registration System (Pedersen, 2011) and data from the Danish Cancer Registry (Gjerstorff, 2011) by means of the unique personal identifier, which all Danish residents have received at birth or when obtaining residential permit since 1968. Information on death, disappearance and emigration as well as parental information is contained within the Danish Civil Registration System. The study cohort consisted of mothers and up to three daughters born between 1 January 1960 and 31 December 2012. If the mother had more than three daughters born in that time period, the three oldest were included. Women born before 1968 had to be alive in 1968 to be included. This applied to both mothers and daughters. Start of follow-up was 1 January 1968 or at date of birth, whichever came last. Accordingly, data were subject to left-truncation. End of follow-up was date of breast cancer diagnosis, date of other cancer diagnosis, date of emigration or disappearance, date of 80th birthday, date of death or 31 December 2013, whichever came first. Age was used as timescale. The failure cause of interest was breast cancer and competing causes of failures were death and other cancers, which were grouped together. The cohort consisted of 1 292 051 families and 3 029 653 individuals: 908 002 (70.3%) families with a mother and a single daughter, 322 547 (25.0%) families with a mother and two daughters, and 61 502 (4.8%) families with a mother and three daughters. Due to the size of the cohort and the computational intensity involved in parameter estimation using AGQ approximation, we analyzed a sample of the cohort of families. Our sampling followed the method described in Section 2.5. Based on the occurrence of breast cancer and the competing cause of failure, we divided the families of mothers and daughters into six disjoint strata. The strata were formed in a stepwise manner as illustrated in Figure 2. The sampling probabilities of the families in the six strata were 1, 0.5, 0.25, 0.05, 0.025, and 0.0025, respectively. In estimation, we used AGQ approximation with five quadrature points. We chose $$\alpha_{1}(x)=\omega_{1}x$$ and $$\alpha_{2}(x)=\omega_{2}x$$, respectively, with the restriction $$\omega_{1},\omega_{2}\geq0$$. We adjusted for left-truncation and set $$X=1$$ for all individuals. Parameter estimation was based on the weighted log-likelihood (2.8) and the variance of the parameter estimates was calculated as described in Section 2.5. In Appendix B of the supplementary material available at Biostatistics online, the cohort is summarised and the distribution of pair types entering the model shown. The results are shown in Table 2. We found within-cluster dependence of breast cancer with regard to both risk and timing. The variance of the random effect affecting timing was 0.07 [95% confidence interval (CI) 0.01; 0.35]. The variance of the random effect affecting risk of breast cancer was 1.16 (95% CI 0.60; 2.23). The variance of the random effect affecting risk level was quite high, indicating a great risk heterogeneity of breast cancer among families. The estimated covariances between the random effects in $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$ were insignificant except for the covariance between $$\eta_{1}$$ and $$u_{1}$$. The two random effects $$\eta_{1}$$ and $$u_{1}$$ were negatively correlated. Their covariance was $$-0.27$$ (95% CI $$-0.41$$; $$-0.12$$), corresponding to a correlation of approximately $$-0.95$$. This corresponds to an increased absolute risk of breast cancer that is reached quickly and vice versa. We did a visual comparison of the cumulative incidence of breast cancer based on the parameter estimates to that of the non-parametric Aalen–Johansen estimator (Aalen and Johansen, 1978), which is shown in Appendix B of the supplementary material available at Biostatistics online. Overall, the estimates agreed. We expect that had we used a more flexible form of $$\alpha_{1}(x)$$, they would have been more alike. The estimated marginal cumulative incidence of breast cancer reached at age 80 was 9.3% (95% CI 8.7; 9.9). Based on the parameter estimates, we calculated the marginal cumulative incidence of breast cancer as well as the casewise concordance, i.e. the cumulative incidence of breast cancer at a given age given that a mother or a daughter has been diagnosed with breast cancer before or at the same age, and the relative recurrence risk ratio, i.e. the ratio between the casewise concordance and the marginal cumulative incidence of breast cancer. These are shown in Figure 3A–C. The estimated casewise concordance and relative recurrence risk ratio of breast cancer were in concordance with that of dizygotic female twins estimated in a recent study (Møller and others, 2016). Furthermore, to investigate the variation in risk level as well as failure time distribution of breast cancer caused by the estimated variance and covariance of the random effects in our model, we drew a sample of 1 000 000 observations from the estimated multivariate normal distribution of $$\boldsymbol{\eta}$$ and $$\boldsymbol{u}$$. The risk level of breast cancer at age 80 for each of the observations was estimated as $$p_{1}=\exp(X\hat{\beta}_{1}+u_{1})/\{1+\exp(X\hat{\beta}_{1}+u_{1})+\exp(X\hat{\beta}_{2}+u_{2})\}$$. The mean of the estimated probabilities was 9.3% corresponding to the estimated marginal cumulative incidence at age 80. In addition, 95% of the probabilities were within the interval $$[1.3; 29.9]$$. For each observation, we estimated the cumulative incidence in the interval $$[0; 80]$$ using the proposed model. Based on the estimated risk levels of breast cancer at age 80, we divided our observations into quartiles and calculated for each quartile the average cumulative incidence in the interval $$[0; 80]$$. These are shown in Figure 3D. To compare the failure time distributions of the quartiles, we identified the time points where the cumulative incidences were equal to half of the risk levels reached at age 80. These are marked with a black dot in Figure 3D. As seen, a high absolute risk of breast cancer at age 80 was associated with an earlier onset, which resulted from the negative correlation between $$\eta_{1}$$ and $$u_{1}$$. Fig. 3. View largeDownload slide The estimated marginal cumulative incidence of breast cancer (A), the estimated casewise concordance (B) and the relative recurrence risk ratio (C) as well as the average cumulative incidence of 1 000 000 observations simulated from a model based on the parameter estimates (D). The observations in plot D have been divided into quartiles based on the estimated risk levels reached at age 80. The black dots mark the time points where the cumulative incidences equal half of the risk levels reached at age 80. In plots A–C, ($$\Huge {-}$$) corresponds to the estimates and (− − − −) corresponds to 95% confidence intervals. In plot D, ($$\Huge {-}$$) corresponds to the 1st quartile of the estimated risk level, (− − − −) corresponds to the 2nd quartile of the estimated risk level, ($$\cdots\cdots\cdots$$) corresponds to the 3rd quartile of the estimated risk level, and ($$-\cdot -\cdot -\cdot$$) corresponds to the 4th quartile of the estimated risk level. Fig. 3. View largeDownload slide The estimated marginal cumulative incidence of breast cancer (A), the estimated casewise concordance (B) and the relative recurrence risk ratio (C) as well as the average cumulative incidence of 1 000 000 observations simulated from a model based on the parameter estimates (D). The observations in plot D have been divided into quartiles based on the estimated risk levels reached at age 80. The black dots mark the time points where the cumulative incidences equal half of the risk levels reached at age 80. In plots A–C, ($$\Huge {-}$$) corresponds to the estimates and (− − − −) corresponds to 95% confidence intervals. In plot D, ($$\Huge {-}$$) corresponds to the 1st quartile of the estimated risk level, (− − − −) corresponds to the 2nd quartile of the estimated risk level, ($$\cdots\cdots\cdots$$) corresponds to the 3rd quartile of the estimated risk level, and ($$-\cdot -\cdot -\cdot$$) corresponds to the 4th quartile of the estimated risk level. Table 2. Results from the analysis of data on breast cancer among Danish women. Shown are the estimates and the lower and upper limits of the 95% CIs. Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ Table 2. Results from the analysis of data on breast cancer among Danish women. Shown are the estimates and the lower and upper limits of the 95% CIs. Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ Estimate Lower Upper $$\beta_{1}$$ $$-1.89$$ $$-2.11$$ $$-1.66$$ $$\beta_{2}$$ $$-0.13$$ $$-0.29$$ $$-0.03$$ $$\gamma_{1}$$ $$1.82$$ $$1.59$$ $$2.06$$ $$\gamma_{2}$$ $$1.42$$ $$1.33$$ $$1.50$$ $$\omega_{1}$$ $$2.44$$ $$2.27$$ $$2.61$$ $$\omega_{2}$$ $$1.12$$ $$1.05$$ $$1.18$$ $$\sigma_{\eta_{1}}^{2}$$ $$0.07$$ $$0.01$$ $$0.35$$ $$\sigma_{\eta_{2}}^{2}$$ $$0.01$$ $$0.00$$ $$0.35$$ $$\sigma_{u_{1}}^{2}$$ $$1.16$$ $$0.60$$ $$2.23$$ $$\sigma_{u_{2}}^{2}$$ $$0.52$$ $$0.10$$ $$2.79$$ $$\textrm{cov}(\eta_{1},\eta_{2})$$ $$0.01$$ $$-0.06$$ $$0.07$$ $$\textrm{cov}(\eta_{1},u_{1})$$ $$-0.27$$ $$-0.41$$ $$-0.12$$ $$\textrm{cov}(\eta_{1},u_{2})$$ $$-0.07$$ $$-0.42$$ $$0.27$$ $$\textrm{cov}(\eta_{2},u_{1})$$ $$-0.06$$ $$-0.15$$ $$0.04$$ $$\textrm{cov}(\eta_{2},u_{2})$$ $$-0.07$$ $$-0.29$$ $$0.01$$ $$\textrm{cov}(u_{1},u_{2})$$ $$0.51$$ $$-0.29$$ $$1.30$$ 5. Discussion We have proposed a new model for the cause-specific cumulative incidence of multivariate competing risks data. The model is a random effects model that allows for within-cluster dependence with regard to both risk and timing. We have assumed that the random effects are shared by all individuals in a cluster. Consequently, the model in its present form only captures positive dependence. However, extensions to different settings with different random effects structures are possible. Moreover, we have assumed that the random effects are constant over time but they may time-dependent. This is another extension that could be considered. Under the proposed model, all cause-specific and cross-cause associations are specified. Thus, left-truncation and right-censoring are easily dealt and no specification of a censoring distribution is required. In addition, the model has the potential to be applied in analysis of family studies focusing on coaggregation of diseases, e.g. breast and ovarian cancer. The proposed model calls for parameter estimation by numerical integration since no closed form expression for the marginal likelihood exists. We propose that AGQ approximation is used. In the presented simulations and data example, we used Gauss-Hermite quadrature rules with five quadrature points. We have assessed the proposed model through simulation studies. The model performed well and the parameter estimates were unbiased. However, a number of the estimated standard errors were too big. This may be caused by the use of the numerical derivative of the AGQ approximation in the estimation of the variance of the parameter estimates or the number of quadrature points and is being further investigated. The proposed model provides a framework for exploring and making inference about the distribution of age at disease onset and to investigate how absolute risk of disease is related to age at onset. We have used the proposed model to investigate the cumulative risk of breast cancer and the within-family dependence among women in Denmark. We found a within-family dependence with regard to both absolute risk and failure time distribution. The two were negatively correlated suggesting that an increased absolute risk was associated with an early onset and vice versa. We found no significant association between breast cancer and the competing cause of failure which included death and other cancers. However, these dependence measures may be difficult to identify. In general, development of procedures for model checking and selection is needed. Statistical tests for model selection may be based on the composite likelihood; in this respect, it is important to note that statistical tests concerning the variance of the random effects, i.e. $$\sigma^{2}_{\eta_{h}}=0$$ and $$\sigma^{2}_{u_{h}}=0$$, are on the boundary of the parameter space. This should be taken into account. For example, following along the lines of Self and Liang (1987), Liang and Self (1996), or Greven and others (2008). If the variance of any of the random effects in $$\boldsymbol{u}$$ is equal to zero, the model becomes much simpler and parameter estimation easier. If $$\sigma_{u_1}^{2}=\sigma_{u_2}^{2}=0$$, parameter estimation can be done without the use of numerical methods. 6. Code Availability The proposed model has been implemented in R (R Core Team, 2017) in the package mcif, which is available for download at https://github.com/kkholst. An example of how to apply the code is provided in Appendix C of the supplementary material available at Biostatistics online. Supplementary material Supplementary material is available at http://biostatistics.oxfordjournals.org. Acknowledgments This work was supported by the Danish Childhood Cancer Foundation (grant number 2011-46), the Danish Health Foundation (grant number 20128011), and the Danish Agency for Science, Technology and Innovation (grant number 1355-00053B). Conflict of Interest: None declared. References Aalen O. O. and Johansen S. ( 1978 ). An empirical transition matrix for non-homogeneous markov chains based on censored observations. Scandinavian Journal of Statistics 5 , 141 – 150 . Andersen P. K. , Geskus R. B. , de Witte T. and Putter H. ( 2012 ). Competing risks in epidemiology: possibilities and pitfalls. International Journal of Epidemiology 31 , 861 – 870 . Google Scholar CrossRef Search ADS Bandeen-Roche K. ( 2014 ). Familial studies. In: Klein J. P. , van Houwelingen H. C. , Ibrahim J. G. and Scheike T. H. (editors), Handbook of Survival Analysis , Chapter 27 . Boca Raton : Chapman & Hall/CRC , pp. 549 – 568 . Brandt A. , Bermejo J. L. , Sundquist J. and Hemminki K. ( 2008 ). Age of onset in familial cancer. Annals of Oncology 19 , 2084 – 2088 . Google Scholar CrossRef Search ADS PubMed Brandt A. , Bermejo J. L. , Sundquist J. and Hemminki K. ( 2010 ). Age of onset in familial breast cancer as background data for medical surveillance. British Journal of Cancer 102 , 42 – 47 . Google Scholar CrossRef Search ADS PubMed Burton Paul R. , Tobin Martin D. and Hopper John L. ( 2005 ). Key concepts in genetic epidemiology. Lancet 366 , 941 – 951 . Google Scholar CrossRef Search ADS PubMed Cheng Y. and Fine J. ( 2012 ). Cumulative incidence association models for bivariate competing risks data. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 74 , 183 – 202 . Google Scholar CrossRef Search ADS Cheng Y. , Fine J. P. and Kosorok M. R. ( 2007 ). Nonparametric association analysis of bivariate competing-risks data. Journal of the American Statistical Association 102 , 1407 – 1415 . Google Scholar CrossRef Search ADS Cheng Y. , Fine J. P. and Kosorok M. R. ( 2009 ). Nonparametric association analysis of exchangeable clustered competing risks data. Biometrics 65 , 385 – 393 . Google Scholar CrossRef Search ADS PubMed Claus E. B. , Risch N. J. and Thompson W. D. ( 1990 ). Age at onset as an indicator of familial risk of breast cancer. American Journal of Epidemiology 131 , 961 – 972 . Google Scholar CrossRef Search ADS PubMed Cox D. R. and Reid N. ( 2004 ). A note on pseudolikelihood constructed from marginal densities. Biometrika 91 , 729 – 737 . Google Scholar CrossRef Search ADS Fisher R. A. ( 1918 ). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh 52 , 399 – 433 . Google Scholar CrossRef Search ADS Gerds T. A. , Scheike T. H. and Andersen P. K. ( 2012 ). Absolute risk regression for competing risks: interpretation, link functions, and prediction. Statistics in Medicine 31 , 3921 – 3930 . Google Scholar CrossRef Search ADS PubMed Gjerstorff M. L. ( 2011 ). The Danish Cancer Registry. Scandinavian Journal of Public Health 39 , 42 – 45 . Google Scholar CrossRef Search ADS PubMed Greven S. , Crainiceanu C. M. , Küchenhoff H. and Peters A. ( 2008 ). Restricted likelihood ratio testing for zero variance components in linear mixed models. Journal of Computational and Graphical Statistics 17 , 870 – 891 . Google Scholar CrossRef Search ADS Hartzel J. , Agresti A. and Caffo B. ( 2001 ). Multinomial logit random effects models. Statistical Modelling 1 , 81 – 102 . Google Scholar CrossRef Search ADS Kalbfleisch J. D. and Lawless J. F. ( 1988 ). Likelihood analysis of multi-state models for disease incidence and mortality. Statistics in Medicine 7 , 149 – 160 . Google Scholar CrossRef Search ADS PubMed Kharazmi E. , Fallah M. , Sundquist K. and Hemminki K. ( 2012 ). Familial risk of early and late onset cancer: nationwide prospective cohort study. BMJ 345 , 1 – 8 . Google Scholar CrossRef Search ADS Khoury M. J. , Beaty T. H. and Cohen B. H. ( 1993 ). Fundamentals of Genetic Epidemiology . New York : Oxford University Press . Kuk A. Y.C. ( 1992 ). A semiparametric mixture model for the analysis of competing risks data. Australian Journal of Statistics 34 , 169 – 180 . Google Scholar CrossRef Search ADS Larsen K. , Petersen J. H. , Budtz-Jørgensen E. and Endahl L. ( 2000 ). Interpreting parameters in the logistic regression model with random effects. Biometrics 56 , 909 – 914 . Google Scholar CrossRef Search ADS PubMed Larson M. G. and Dinse G. E. ( 1985 ). A mixture model for the regression analysis of competing risks data. Journal of the Royal Statistical Society. Series C (Applied Statistics) 34 , 201 – 211 . Lesaffre E. and Spiessens B. ( 2001 ). On the effect of the number of quadrature points in a logistic random-effects model: an example. Applied Statistics 50 , 325 – 335 . Liang K.-Y. and Self S. G. ( 1996 ). On the asymptotic behaviour of the pseudolikelihood ratio test statistic. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 58 , 785 – 796 . McPherson K. , Steel C. M. and Dixon J. M. ( 2000 ). ABC of breast diseases. Breast cancer–epidemiology, risk factors, and genetics. BMJ 321 , 624 – 628 . Google Scholar CrossRef Search ADS PubMed Moger T. A. , Pawitan Y. and Borgan Ø. ( 2008 ). Case-cohort methods for survival data on families from routine registers. Statistics in Medicine 27 , 1062 – 1074 . Google Scholar CrossRef Search ADS PubMed Møller S. , Mucci L. A. , Harris J. R. , Scheike T. , Holst K. , Halekoh U. , Adami H.-O. , Czene K. , Christensen K. , Holm N. V. , and others. ( 2016 ). The heritability of breast cancer among women in the nordic twin study of cancer. Cancer Epidemiology Biomarkers & Prevention 25 , 145 – 150 . Google Scholar CrossRef Search ADS Naskar M. , Das K. and Ibrahim J. G. ( 2005 ). A semiparametric mixture model for analyzing clustered competing risks data. Biometrics 61 , 729 – 737 . Google Scholar CrossRef Search ADS PubMed Pedersen C. B. ( 2011 ). The Danish Civil Registration System. Scandinavian Journal of Public Health 39 , 22 – 25 . Google Scholar CrossRef Search ADS PubMed Petrucelli N. , Daly M. B. and Feldman G. L. ( 2010 ). Hereditary breast and ovarian cancer due to mutations in BRCA1 and BRCA2. Genetics in Medicine 12 , 245 – 259 . Google Scholar CrossRef Search ADS PubMed Pinheiro J. C. and Bates D. M. ( 1995 ). Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics 4 , 12 – 35 . Pinheiro J. C. and Chao E. C. ( 2006 ). Efficient Laplacian and adaptive Gaussian quadrature algorithms for multilevel generalized linear mixed models. Journal of Computational and Graphical Statistics 15 , 58 – 81 . Google Scholar CrossRef Search ADS R Core Team. ( 2017 ). R: A Language and Environment for Statistical Computing . R Foundation for Statistical Computing , Vienna, Austria . Scheike T. H. and Sun Y. ( 2012 ). On cross-odds ratio for multivariate competing risks data. Biostatistics 13 , 680 – 694 . Google Scholar CrossRef Search ADS PubMed Scheike T. H. , Sun Y. , Zhang M.-J. and Jensen T. K. ( 2010 ). A semiparametric random effects model for multivariate competing risks data. Biometrika 97 , 133 – 145 . Google Scholar CrossRef Search ADS PubMed Self S. G. and Liang K.-Y. ( 1987 ). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 82 , 605 – 610 . Google Scholar CrossRef Search ADS Shi H. , Cheng Y. and Jeong J.-H. ( 2013 ). Constrained parametric model for simultaneous inference of two cumulative incidence functions. Biometrical Journal 55 , 82 – 96 . Google Scholar CrossRef Search ADS PubMed Shih J. H. and Albert P. S. ( 2010 ). Modeling familial association of ages at onset of disease in the presence of competing risk. Biometrics 66 , 1012 – 1023 . Google Scholar CrossRef Search ADS PubMed Thomas D. C. ( 2004 ). Statistical methods in genetic epidemiology . New York : Oxford University Press . Varin C. , Reid N. and Firth D. ( 2011 ). An overview of composite likelihood methods. Statistica Sinica 21 , 5 – 42 . © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

### Journal

BiostatisticsOxford University Press

Published: Jan 4, 2018

## You’re reading a free preview. Subscribe to read the entire article.

### DeepDyve is your personal research library

It’s your single place to instantly
that matters to you.

over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month ### Explore the DeepDyve Library ### Search Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly ### Organize Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place. ### Access Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals. ### Your journals are on DeepDyve Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more. All the latest content is available, no embargo periods. DeepDyve ### Freelancer DeepDyve ### Pro Price FREE$49/month
\$360/year

Save searches from
PubMed

Create lists to

Export lists, citations