Quality & Quantity 33: 97–116, 1999.
© 1999 Kluwer Academic Publishers. Printed in the Netherlands.
Replicating Text: The Cumulation of Knowledge
in Social Science
Methodology of Simulation for Content Analysis
and DEAN P. MCKENZIE
Department of Psychology, Catholic University of Louvain, Louvain-la-Neuve, Belgium;
Victorian Transcultural Psychiatry Unit, St. Vincent’s Hospital and Department of Psychiatry,
University of Melbourne, Melbourne, Australia
Abstract. Obtaining a statistically significant result does not necessarily tell us whether we would
obtain significant results in other, similar studies, particularly if the original sample sizes were small.
This is why we are supposed to replicate experiments. The present study concerns social science
events that cannot be repeated by virtue of their being historically situated. Among social science
events, many textual data are datable and, by definition, unrepeatable. One solution to this quandary
lies in bootstrap replications, which are based on the original data. A case in point is that of founding
political speeches such as those that sustain the construction of Europe. We analyze and compare 82
speeches made by President Delors over the period 1988–1994, and 28 by President Santer over the
period 1995–1997. All these speeches (N = 110) were concorded, with the help of a computer-aided
content analysis package, as to which words are used, how often, where, and when. We then test
various hypotheses using replication bootstrap estimates; that is, we replicate the original sample a
large number of times and draw several thousand samples from the population so created.
Abbreviations: EU = European Union, RID = Regressive Imagery Dictionary.
Key words: computer-supported content analysis, replication bootstrap estimates, time-series.
The position we defend here concerns the sense in which results of content analyses, performed
on textual data that cannot be repeated by virtue of their being time-dated, can still be generalized
through simulated replications based on the original data. We argue that this answer to the quandary
of unrepeatable textual data is all the more appropriate because treating each new sample as an
alternative distribution, as statistical resampling does, is consistent with conceiving of textual data
as themselves made up of alternative responses. Three considerations must be advanced if we are
to make sense of replicating unrepeatable textual data. The first is a reminder of the difficulties
surrounding hypothesis significance testing. The second concerns possible sources of variability that
warrant replicating textual data: if textual data were shown to be invariable, fixed, unchangeable, in
brief, unavoidable as by a force majeure (Berlin, 1974: 70), it would be difficult to show that
replications based on the original data can produce a valid distribution of estimates. The third
consideration concerns counterfactuals.