Comparison of three statistical methods for analysis of fall predictors in
people with dementia: Negative binomial regression (NBR), regression
tree (RT), and partial least squares regression (PLSR)
Staffan Eriksson a,c,*, Anders Lundquist b, Yngve Gustafson c, Lillemor Lundin-Olsson a
a Department of Community Medicine and Rehabilitation, Physiotherapy, Umeå University, SE-901 87 Umeå, Sweden
b Department of Mathematics and Mathematical Statistics, Umeå University, SE-901 87 Umeå, Sweden
c Department of Community Medicine and Rehabilitation, Geriatric Medicine, Umeå University, SE-901 87 Umeå, Sweden
1. Introduction
Falls and their consequences are a major health problem among
the elderly population (Masud and Morris, 2001). Reported fall
rates and odds of falling are considerably higher for people with
dementia or impaired cognition than for people without impaired
cognition (Tinetti et al., 1988; Myers et al., 1991; De Carle and
Kohn, 2001; Van Doorn et al., 2003; Kallin et al., 2004).
In studies concerning falls, methodological difficulties and
properties of the data must be considered when choosing
statistical methods. Logistic regression has often been used.
However, in a logistic regression only the proportion of fallers is
analyzed; by ignoring repeated falls, relevant information is
discarded (Robertson et al., 2005). In studies of falls, the
observation time of study participants often varies due to different
amounts of time spent at a medical ward or to drop-outs from the
study. Different observation times are difficult to take into account
using logistic regression. On the other hand, Cox regression models
can handle the problem of different observation periods (Katz,
2003). Poisson regression, NBR and some Cox regression models
can utilize information from multiple falls and handle different
observation times (Robertson et al., 2005). In Cox regression, it is
assumed that the ratio of the risk of an event in two groups is
constant over time, but this assumption has been questioned. The
Poisson distribution assumes that the variance and mean of the
outcome variable (discrete, count of falls) are equal, but this
assumption is often violated in fall investigations. Due to these
considerations, NBR has been recommended for fall prevention
trials (Robertson et al., 2005).
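
The equal-variance-and-mean assumption can be checked against the cohort totals reported in the abstract (192 patients, 78 of whom fell, 238 falls in total). The sketch below is illustrative only: per-patient fall counts are not given here, so the hypothetical helper `most_even_counts` builds the most even allocation of falls among the fallers, which yields the smallest variance compatible with those totals. It also ignores differences in observation time, and the method-of-moments estimate assumes the common NB2 form, variance = mean + alpha * mean^2.

```python
from statistics import mean, variance

# Cohort totals as reported in the abstract.
N_PATIENTS = 192
N_FALLERS = 78
N_FALLS = 238


def most_even_counts(n_patients, n_fallers, n_falls):
    """Spread n_falls as evenly as possible over n_fallers; the rest get 0.

    Hypothetical helper: for a fixed number of fallers and a fixed total,
    this allocation gives the smallest variance the counts can have.
    """
    base, extra = divmod(n_falls, n_fallers)
    fallers = [base + 1] * extra + [base] * (n_fallers - extra)
    return fallers + [0] * (n_patients - n_fallers)


counts = most_even_counts(N_PATIENTS, N_FALLERS, N_FALLS)

# What a binary (faller / non-faller) outcome retains.
proportion_fallers = N_FALLERS / N_PATIENTS

# What a count outcome retains.
mean_falls = mean(counts)
var_falls = variance(counts)  # sample variance (n - 1 denominator)

# A Poisson outcome implies variance == mean, i.e. an index of 1.
dispersion_index = var_falls / mean_falls

# Method-of-moments dispersion under the assumed NB2 form
# variance = mean + alpha * mean**2 (alpha == 0 recovers Poisson).
alpha_mom = (var_falls - mean_falls) / mean_falls**2

print(f"proportion of fallers:   {proportion_fallers:.3f}")
print(f"mean falls per patient:  {mean_falls:.3f}")
print(f"variance (lower bound):  {var_falls:.3f}")
print(f"variance / mean:         {dispersion_index:.3f}")
print(f"NB2 alpha (moments):     {alpha_mom:.3f}")
```

Even under this most favorable allocation the variance-to-mean ratio is about 1.8, so the counts are overdispersed relative to a Poisson model; the true ratio can only be equal or larger. A fitted NBR model would estimate the dispersion by maximum likelihood rather than by moments, and would account for observation time, typically through an offset.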
Regression techniques are used to determine the mean effect of
a predictor variable on an outcome variable, that is, to extract
predictor variables that are important to the entire population
(Lemon et al., 2003). Consequently, predictor variables of
importance to a subgroup of a heterogeneous population are
difficult to detect. Furthermore, when using regression analysis to
analyze a heterogeneous population, the dispersion of the outcome
variable is high. Hence, it is more difficult to detect predictor
variables of importance to the entire population. This is because a
large dispersion is usually attributed to a non-representative
sample, but if the population is heterogeneous, the sample may be
representative despite a large dispersion. In the context of fall
investigations, a large dispersion is common (Robertson et al.,
Archives of Gerontology and Geriatrics 49 (2009) 383–389
ARTICLE INFO
Article history:
Received 15 July 2008
Received in revised form 1 December 2008
Accepted 4 December 2008
Available online 24 January 2009
Keywords:
Partial least squares regression
Regression tree
Negative binomial regression
Accidental falls
Fall predictors
Dementia and falls
ABSTRACT
Searching for background factors associated with falls in people with dementia is difficult because the
population is heterogeneous. The aim of this study was to compare the efficacies of three statistical
methods for analysis of fall predictors in people with dementia. NBR, RT and PLSR analyses were
compared. Data used for the comparison were from a prospective cohort study of 192 patients at a
psychogeriatric ward, specializing in patients with cognitive impairment and related behavioral and
psychological symptoms. Seventy-eight of these patients fell a total of 238 times. PLSR and RT analyses
are directed at finding patterns among predictor variables related to outcome, whereas an NBR model is
directed at finding predictor variables that, independent of other variables, are related to the outcome.
The NBR analysis explained an additional 10–15% variation compared with the PLSR and RT analyses. The
results of PLSR and RT show a similar plausible pattern of predictor variables. However, none of these
techniques appears to be sufficient in itself. To identify patterns among explanatory variables, RT would
be a good complement to NBR for analysis of fall predictors.
© 2008 Elsevier Ireland Ltd. All rights reserved.
* Corresponding author at: Department of Community Medicine and Rehabilitation, Physiotherapy, Umeå University, SE-901 87 Umeå, Sweden.
Tel.: +46 90 786 90 89; fax: +46 90 786 92 67.
E-mail address: staffan.eriksson@germed.umu.se (S. Eriksson).
0167-4943/$ – see front matter © 2008 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.archger.2008.12.004