Bagging tree classifiers for laser scanning images:
a data- and simulation-based strategy
Torsten Hothorn¹, Berthold Lausen*
Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität
Erlangen-Nürnberg, Waldstraße 6, D-91054 Erlangen, Germany
Received 15 May 2002; received in revised form 16 August 2002; accepted 27 September 2002
Abstract
Diagnosis based on medical image data is common in medical decision making and clinical
routine. We discuss a strategy to derive a classifier with good performance on clinical image data and
to justify the properties of the classifier by an adapted simulation model of image data. We focus on
the problem of classifying eyes as normal or glaucomatous based on 62 routine explanatory variables
derived from laser scanning images of the optic nerve head. As the learning sample, we use a
case-control study of 98 normal and 98 glaucomatous subjects matched by age and sex.
Aggregating multiple unstable classifiers allows substantial reduction of misclassification error in
many applications and benchmark problems. We investigate the performance of various classifiers
for the clinical learning sample as well as for a simulation model of eye morphologies. Bagged
classification trees (bagged-CTREE) are compared to single classification trees and linear
discriminant analysis (LDA). We additionally compare three estimators of misclassification error:
10-fold cross-validation, the 0.632+ bootstrap and the out-of-bag estimate. In summary, the application of
our strategy for knowledge-based decision support shows that bagged classification trees perform
best for glaucoma classification.
© 2002 Elsevier Science B.V. All rights reserved.
Keywords: Discriminant analysis; Laser scanning; Images; Bagging; Error rate; Estimation; Simulation
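As an illustration of the bagging and out-of-bag ideas named in the abstract, the following is a minimal sketch and not the procedure used in this paper. It assumes depth-one trees (stumps) as a stand-in for full classification trees, a synthetic two-class data set in place of the clinical learning sample, and hypothetical helper names (make_data, fit_stump, predict_stump, bagging_oob_error). Each tree is fitted to a bootstrap sample, and every observation is classified only by the trees whose bootstrap sample did not contain it.

```python
import random


def make_data(n_per_class, seed=0):
    """Synthetic stand-in data: feature 0 separates the classes, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y


def fit_stump(X, y):
    """Fit a depth-one tree by exhaustive search over features and thresholds."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[j] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    return best[1:]  # (feature index, threshold, left label, right label)


def predict_stump(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right


def bagging_oob_error(X, y, n_trees=50, seed=0):
    """Bag stumps and return the out-of-bag misclassification rate."""
    rng = random.Random(seed)
    n = len(X)
    votes = [[0, 0] for _ in range(n)]  # per-observation votes for class 0 and 1
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, with replacement
        in_bag = set(idx)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # only out-of-bag observations receive a vote
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [(i, v) for i, v in enumerate(votes) if sum(v) > 0]
    if not scored:
        raise ValueError("no observation was ever out-of-bag; increase n_trees")
    wrong = sum((1 if v[1] > v[0] else 0) != y[i] for i, v in scored)
    return wrong / len(scored)
```

Calling bagging_oob_error(*make_data(40)) returns an error estimate obtained without a separate test sample, because each bootstrap sample omits roughly a third of the observations. Ties in the vote are resolved in favour of class 0, which is an arbitrary choice in this sketch.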
1. Introduction
The quest for an efficient and robust classifier is an important issue in medical decision
making. The possibilities to use diagnostic image data or other high throughput
Artificial Intelligence in Medicine 27 (2003) 65–79
* Corresponding author. Tel.: +49-9131-85-25739; fax: +49-9131-85-25740.
E-mail addresses: torsten.hothorn@rzmail.uni-erlangen.de (T. Hothorn),
berthold.lausen@rzmail.uni-erlangen.de (B. Lausen).
¹ Tel.: +49-9131-85-22707; fax: +49-9131-85-25740.
0933-3657/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S 0933-3657(02)00085-4