Reliable Computing 9: 487–509, 2003.
2003 Kluwer Academic Publishers. Printed in the Netherlands.
Tree-Based Credal Networks for Classiﬁcation
IDSIA, Galleria 2, CH–6928 Manno (Lugano), Switzerland, e-mail: firstname.lastname@example.org
a degli Studi di Milano-Bicocca, Via Bicocca degli Arcimboldi 8, I–20126 Milano,
Italy, e-mail: email@example.com
(Received: 10 October 2002; accepted: 28 April 2003)
Abstract. Bayesian networks are models for uncertain reasoning which are achieving a growing
importance also for the data mining task of classiﬁcation. Credal networks extend Bayesian nets to
sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classiﬁcation,
called tree-augmented naive Bayes classiﬁer, to credal sets originated from probability intervals. This
extension is a basis to address the fundamental problem of prior ignorance about the distribution that
generates the data, which is a commonplace in data mining applications. This issue is often neglected,
but addressing it properly is a key to ultimately draw reliable conclusions from the inferred models.
In this paper we formalize the new model, develop an exact linear-time classiﬁcation algorithm, and
evaluate the credal net-based classiﬁer on a number of real data sets. The empirical analysis shows
that the new classiﬁer is good and reliable, and raises a problem of excessive caution that is discussed
in the paper. Overall, given the favorable trade-off between expressiveness and efﬁcient computation,
the newly proposed classiﬁer appears to be a good candidate for the wide-scale application of reliable
classiﬁers based on credal networks, to real and complex tasks.
Classiﬁcation has a long tradition in statistics and machine learning . The
purpose of classiﬁers is to predict the unknown categorical class C of objects
described by a vector of features (or attributes), A
. Classiﬁers capture the
relationship between C and (A
) by examining past joint realizations of the
class and the features. Such abstract description of classiﬁcation is easily mapped to
a number of concrete applications. For example, medical diagnosis can be regarded
as classiﬁcation, the symptoms and the medical tests being the features and the
class representing the possible diseases. Many other application domains exist, as
image recognition, fraud detection, user proﬁling, text classiﬁcation, etc.
Traditional classiﬁcation is typically represented within Bayesian decision the-
ory, which formally solves the prediction problem if the distribution that generates
the data is known. Unfortunately, in practice there is usually little information about
the distribution, apart from that carried by the data. This is even more radical when
This paper extends work published in the Proceedings of the 8th Information Processing and
Management of Uncertainty in Knowledge-Based Systems Conference .