Consensus Clustering of U.S. Temperature and Precipitation Data

Journal of Climate, Volume 10 (6) – Oct 12, 1995

Publisher: American Meteorological Society
Copyright: Copyright © 1995 American Meteorological Society
ISSN: 1520-0442
DOI: 10.1175/1520-0442(1997)010<1405:CCOUST>2.0.CO;2

Abstract

A “consensus clustering” strategy is applied to long-term temperature and precipitation time series data for the purpose of delineating climate zones of the conterminous United States in a “data-driven” (as opposed to “rule-driven”) fashion. Cluster analysis simplifies a dataset by arranging “objects” (here, climate divisions or stations) into a smaller number of relatively homogeneous groups or clusters on the basis of interobject dissimilarities computed using the identified “attributes” (here, temperature and precipitation measurements recorded for the objects). The results demonstrate the spatial scales associated with climatic variability and may suggest climatically justified ways in which the number of objects in a dataset may be reduced. Implicit in this work is the arguable contention that temperature and precipitation data are both necessary and sufficient for the delineation of climatic zones. In prior work, the temperature and precipitation data were mixed during the computation of the interobject dissimilarities. This allowed the clusters to jointly reflect temperature and precipitation distinctions, but also had inherent problems relating to arbitrary attribute scaling and information redundancy that proved difficult to resolve. In the present approach, the temperature and precipitation data are clustered separately and then categorically intersected to forge consensus clusters. The consensus outcome may be viewed as having identified the temperature subzones of precipitation clusters (or vice versa) or as representing distinct groupings that are relatively homogeneous with respect to both attribute types simultaneously. The dissimilarity measure employed herein is the Euclidean distance.
As it employs only continuous time series data representing a single information type (temperature or precipitation), the consensus approach has the advantage of allowing an attractively simple interpretation of the total Euclidean distance between object pairs. The total squared distance may be subdivided into three components representing object dissimilarity with respect to temporal mean (level), seasonality (variability), and coseasonality (relative temporal phasing). Therefore, concerns about redundancy or arbitrary scaling problems are neutralized. This is seen as the chief advantage of consensus clustering. The consensus strategy has several disadvantages. It is possible for two (or more) relatively general, undetailed clusterings to produce a very complex and fragmented clustering following categorical intersection. Further, the fact that the analyst chooses the clustering levels of the separate, contributing clusterings means that he or she has considerable freedom in fashioning the consensus outcome, which makes it difficult (if not impossible) to argue that true, “natural” clusters have been identified. The latter often applies to cluster analysis in general, however. It is believed that the consensus approach merits consideration owing to its advantages. Two consensus outcomes are presented: a lower-order solution with 14 clusters and a higher-order solution with 26 clusters. The sensitivity of these clusterings to perturbations in the input data is assessed. The regionalizations are compared with those presented in prior work.
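
The categorical intersection described above can be illustrated with a minimal sketch. It assumes each object already carries one label from a temperature clustering and one from a precipitation clustering; the function name `intersect_clusterings` and the eight labels are hypothetical and are not taken from the article. Two objects land in the same consensus cluster only when they agree in both input clusterings.

```python
def intersect_clusterings(temp_labels, precip_labels):
    """Categorically intersect two clusterings of the same objects.

    Each distinct (temperature label, precipitation label) pair becomes
    one consensus cluster, numbered in order of first appearance.
    """
    if len(temp_labels) != len(precip_labels):
        raise ValueError("both clusterings must label the same objects")
    pair_to_id = {}
    consensus = []
    for pair in zip(temp_labels, precip_labels):
        if pair not in pair_to_id:
            pair_to_id[pair] = len(pair_to_id)
        consensus.append(pair_to_id[pair])
    return consensus, pair_to_id


# Hypothetical labels for eight objects (illustrative only, not the paper's data).
temp_labels = ["T1", "T1", "T1", "T2", "T2", "T2", "T3", "T3"]
precip_labels = ["P1", "P1", "P2", "P2", "P2", "P3", "P3", "P1"]

consensus, pair_to_id = intersect_clusterings(temp_labels, precip_labels)
print(consensus)        # [0, 0, 1, 2, 2, 3, 4, 5]
print(len(pair_to_id))  # 6
```

In this toy case two three-cluster inputs intersect into six consensus clusters, which shows how intersection can yield a finer partition than either input.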
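
The abstract names the three components of the squared Euclidean distance but does not give a formula. One standard identity that matches those names, offered here as an assumption about the form rather than as the article's own equation, is: sum((x - y)^2) = n(mean_x - mean_y)^2 + n(s_x - s_y)^2 + 2 n s_x s_y (1 - r), where s is the population standard deviation and r is the Pearson correlation. The sketch below checks the identity numerically on two hypothetical 12-month series; the function name and the data are illustrative only.

```python
import numpy as np


def decompose_squared_distance(x, y):
    """Split sum((x - y)**2) into level, seasonality and coseasonality terms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D series of equal length")
    n = x.size
    sx, sy = x.std(), y.std()  # population standard deviations (ddof=0)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    level = n * (x.mean() - y.mean()) ** 2  # difference in temporal mean
    seasonality = n * (sx - sy) ** 2        # difference in variability
    # Equals 2*n*sx*sy*(1 - r); written with the covariance so that a
    # constant series (zero standard deviation) does not divide by zero.
    coseasonality = 2 * n * (sx * sy - cov)
    return level, seasonality, coseasonality


# Hypothetical monthly series differing in mean, amplitude and phase.
months = np.arange(12)
a = 10 + 12 * np.cos(2 * np.pi * (months - 6) / 12)
b = 18 + 6 * np.cos(2 * np.pi * (months - 7) / 12)

total = np.sum((a - b) ** 2)
level, seasonality, coseasonality = decompose_squared_distance(a, b)
print(total, level + seasonality + coseasonality)
```

Because all three terms come from the same series in the same units, none of them depends on an arbitrary choice of scaling between different attribute types.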

Journal

Journal of Climate, American Meteorological Society

Published: Oct 12, 1995
