# CLASSIFICATION AND REGRESSION TREES: A POWERFUL YET SIMPLE TECHNIQUE FOR ECOLOGICAL DATA ANALYSIS



15 pages


### Abstract

Classification and regression trees are ideally suited for the analysis of complex ecological data. For such data, we require flexible and robust analytical methods, which can deal with nonlinear relationships, high-order interactions, and missing values. Despite such difficulties, the methods should be simple to understand and give easily interpretable results. Trees explain variation of a single response variable by repeatedly splitting the data into more homogeneous groups, using combinations of explanatory variables that may be categorical and/or numeric. Each group is characterized by a typical value of the response variable, the number of observations in the group, and the values of the explanatory variables that define it. The tree is represented graphically, and this aids exploration and understanding. Trees can be used for interactive exploration and for description and prediction of patterns and processes. Advantages of trees include: (1) the flexibility to handle a broad range of response types, including numeric, categorical, ratings, and survival data; (2) invariance to monotonic transformations of the explanatory variables; (3) ease and robustness of construction; (4) ease of interpretation; and (5) the ability to handle missing values in both response and explanatory variables. Thus, trees complement or represent an alternative to many traditional statistical techniques, including multiple regression, analysis of variance, logistic regression, log-linear models, linear discriminant analysis, and survival models. We use classification and regression trees to analyze survey data from the Australian central Great Barrier Reef, comprising abundances of soft coral taxa (Cnidaria: Octocorallia) and physical and spatial environmental information. Regression tree analyses showed that dense aggregations, typically formed by three taxa, were restricted to distinct habitat types, each of which was defined by combinations of 3–4 environmental variables. The habitat definitions were consistent with known experimental findings on the nutrition of these taxa. When used separately, physical and spatial variables were similarly strong predictors of abundances and lost little in comparison with their joint use. The spatial variables are thus effective surrogates for the physical variables in this extensive reef complex, where information on the physical environment is often not available. Finally, we compare the use of regression trees and linear models for the analysis of these data and show how linear models fail to find patterns uncovered by the trees.
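The splitting step described in the abstract can be illustrated with a minimal sketch. It assumes a numeric response, a single numeric explanatory variable, and the within-group sum of squares as the measure of homogeneity; the function names and the depth/abundance values are invented for illustration and are not taken from the article. A full regression tree would apply the same search recursively to each resulting group and across all explanatory variables.

```python
import math
from typing import List, Optional, Set, Tuple


def sum_of_squares(values: List[float]) -> float:
    """Sum of squared deviations from the group mean (lower = more homogeneous)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Find the binary split 'x <= threshold' that minimizes the total
    within-group sum of squares of y. Returns (threshold, cost), or None
    if all x values are tied and no split is possible."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # tied x values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        cost = sum_of_squares(left) + sum_of_squares(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


def split_groups(x: List[float], y: List[float]) -> Tuple[Set[int], Set[int]]:
    """Row indices of the two groups produced by the best split on x."""
    found = best_split(x, y)
    if found is None:
        return set(range(len(x))), set()
    threshold = found[0]
    left = {i for i, v in enumerate(x) if v <= threshold}
    return left, set(range(len(x))) - left


# Hypothetical data: abundance is high at shallow sites and low at deep ones.
depth = [1.0, 2.0, 3.0, 10.0, 12.0, 15.0]
abundance = [9.0, 10.0, 11.0, 1.0, 2.0, 0.0]

print(best_split(depth, abundance))

# A monotonic transformation of the explanatory variable (here, log) moves
# the threshold but leaves the two groups unchanged.
log_depth = [math.log(d) for d in depth]
print(split_groups(depth, abundance) == split_groups(log_depth, abundance))
```

Because the search depends only on the ordering of the explanatory variable, any order-preserving transformation yields the same groups, which is the invariance property listed as advantage (2).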

### Journal

Ecology, Ecological Society of America

Published: Nov 1, 2000

Keywords: analysis of variance; CART; classification tree; coral reef; Great Barrier Reef; habitat characteristic; Octocorallia; regression tree; soft coral; surrogate
