Quality & Quantity 32: 229–245, 1998.
© 1998 Kluwer Academic Publishers. Printed in the Netherlands.
Goodness of Fit in Regression Analysis – R² and G² Reconsidered
CURT HAGQUIST & MAGNUS STENBECK
Centre for Public Health Research, University of Karlstad, S-65188 Karlstad, Sweden;
Epidemiology, National Board of Health and Welfare, S-106 30 Stockholm, Sweden
Abstract. There has been considerable debate on how important goodness of fit is as a tool in
regression analysis, especially with regard to the controversy on R² in linear regression. This article
reviews some of the arguments of this debate and its relationship to other goodness of fit measures.
It attempts to clarify the distinction between goodness of fit measures and other model evaluation
tools as well as the distinction between model test statistics and descriptive measures used to make
decisions on the agreement between models and data. It also argues that the utility of goodness of fit
measures depends on whether the analysis focuses on explaining the outcome (model orientation) or
explaining the effect(s) of some regressor(s) on the outcome (factor orientation).
In some situations a decisive goodness of fit test statistic exists and is a central tool in the analysis.
In other situations, where the goodness of fit measure is not a test statistic but a descriptive measure,
it can be used as a heuristic device along with other evidence whenever appropriate. The availability
of goodness of fit test statistics depends on whether the variability in the observations is restricted, as
in table analysis, or whether it is unrestricted, as in OLS and logistic regression on individual data.
Hence, G² is a decisive tool for measuring goodness of fit, whereas R²
and SEE are heuristic tools.
In sciences where causal model building holds a central position there is often
a need to evaluate theoretical models empirically. In other contexts, such as in
measurement models, the quality of data needs to be evaluated against a theoretical
model. In both situations, this is done by analysing the agreement between the data
generated by the model (predicted data, fitted or expected values) and the collected
empirical data (observed data/values). The degree of agreement is a measure of the
goodness of fit between data and model.
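As a concrete illustration of this agreement between observed and fitted values, the following minimal Python sketch computes two descriptive measures for a linear regression: R² and the standard error of estimate (reading SEE as that quantity is an assumption made here). The function names, the toy data and the parameter count `n_params` are hypothetical and chosen only for illustration; the formulas are the standard textbook definitions, not necessarily the exact ones used in the analyses discussed in this article.

```python
import math


def r_squared(observed, fitted):
    """Share of the variation in the observed values that the fitted values reproduce."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))  # residual sum of squares
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)           # total sum of squares
    return 1.0 - ss_res / ss_tot


def see(observed, fitted, n_params):
    """Standard error of estimate: residual standard deviation in the units of the outcome.

    n_params is the number of estimated coefficients, intercept included (assumption).
    """
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return math.sqrt(ss_res / (len(observed) - n_params))


# Toy data (hypothetical): four observed outcomes and the values fitted by some model.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]

print("R^2:", round(r_squared(observed, fitted), 4))
print("SEE:", round(see(observed, fitted, n_params=2), 4))
```

With these toy values R² is about 0.98 and SEE about 0.22. R² is unit-free, whereas SEE is expressed in the units of the outcome variable.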
The usefulness of evaluating the goodness of fit of models is not undisputed.
Opinions also differ as to how best to do this, in other words what specific statistical
measure should be used for the purpose. An intense discussion has been taking
place in recent years about measures of goodness of fit for structural equation
models with latent variables (Bollen and Long, 1993). Another example is the
debate about the usefulness of R² in linear regression analysis which took place
a few years ago among methodologists in the USA (see Achen, 1990; King, 1990;
Lewis-Beck and Skalaban, 1990).