Abstract

Frank Ramsey in his paper ‘Truth and Probability’ was the first to develop a theory of utility based on a representation theorem, and a theory of partial belief based on utility-valued odds. But his proof of the multiplication theorem, on which in his system the law of addition depends, contains a step for which there seems to be no justification, and Ramsey provided no clue as to how to supply one. I conjecture that the missing justification appeals naturally to a three-valued logic that de Finetti later described in his own discussion of the multiplication theorem. If this is the case then Ramsey’s paper is even more strikingly original than has been thought.

1. Introduction

Frank Ramsey is celebrated for his path-breaking work in philosophy, economics, decision theory, mathematics and probability. Sadly, in his tragically short life, those paths were broken with the larger world largely unconscious of the fact. Only long after his death have his seminal contributions to all the fields mentioned become widely appreciated. It is one of the most original of those that I will be concerned with here, his work in probability; more specifically his derivation in his 1926 paper of the laws of (finitely additive) probability out of what appears to be nothing save the notion of a bet where the odds are utility-based. One of those laws is the multiplication rule, P(B|A)P(A) = P(A&B), on which in Ramsey’s presentation the proof of the addition law depends. Surprisingly, given that Ramsey’s professional occupation was that of mathematician at Cambridge, the mathematical heart of the UK, and that all that is involved in Ramsey’s proofs is elementary algebra, that of the multiplication rule involves a line for which there seems to be no justification. If there is one, Ramsey failed to give any hint as to what it might be.
In what follows I will argue that Ramsey’s proof is a sketch with some missing steps, which can indeed be identified and justified, though as we shall see the justification raises its own problems. But first a brief introduction to Ramsey’s theory of utility and probability will be useful.

2. From utility-values to (consistent) degrees of belief

Ramsey’s theory of (consistent) degrees of belief is a subtheory of his theory of value, a theory based on the assumption of a complete preference ordering over possible ‘worlds’ (his word: ‘states of affairs’ might be better), assumed to be closed under the formation of conditional options of the form [α if A, β if B, γ if C, …] where A, B, C, … are propositions forming a finite partition of possibility. Since his aim was to define degrees of belief in A, B, C, etc. in terms roughly of the least amount a subject would risk as a proportion of a variable return, the diminishing marginal utility of money renders money-stakes unsuitable (and as Ramsey (1926: 176) pointed out, if they are very small the outcome of a bet will not matter much to the agent anyway). At the very least a scale of value seems needed in which proportionality is guaranteed.1 Moreover, since betting odds are defined in terms of the difference between how much you risk for any given return, and these differences have to be normalised to lie within the closed unit interval, ratios of value-differences should be invariant under changes of scale and origin.
As is well known, Ramsey gave axioms for equality of value-differences and stated a theorem showing that there is a one-one mapping u of values to real numbers which preserves such ratios (1926: 179): in other words, value is measurable on what is called an interval scale.2 Ramsey is now in a position to realise his aim of defining what it means for a subject to have consistent degrees of belief in terms of their willingness to accept bets with value-based odds, expressed within an abstract setting suitable for generalization to any context where there are preferences among actions whose outcomes depend on which proposition in some partition is true.

In fact, Ramsey’s definition of a consistent unconditional degree of belief requires only consideration of a binary partition {A, ∼A}. For suppose that the subject is indifferent between the option of α for certain and [β if A, γ if ∼A]. Subject to that being the case, the subject’s degree of belief P(A) in A is defined to be equal to the quotient (u(α) − u(γ))/(u(β) − u(γ)), which we have to assume is independent of any particular α, β, γ. This ‘amounts roughly to defining the degree of belief in [A] by the odds at which the subject would bet on [A], the bet being conducted in terms of differences of value as defined’ (1926: 180). The representation theorem tells us that the quotient is independent of the particular u chosen, and so we have an invariant measure.

Implicit in this definition of partial belief is the expected utility principle. Indifference between α and [β if A, γ if ∼A] is equivalent to u(α) = u([β if A, γ if ∼A]). By Ramsey’s definition, we have P(A) = (u(α) − u(γ))/(u(β) − u(γ)). Rearranging terms and cancelling, we quickly find that u([β if A, γ if ∼A]) = P(A)u(β) + P(∼A)u(γ). But we can take α itself to be [β if A, γ if ∼A]: the subject would certainly be inconsistent in not being indifferent between α and α, in which case we can infer unconditionally that u([β if A, γ if ∼A]) = P(A)u(β) + P(∼A)u(γ).
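The definition of P(A) and the two properties just claimed for it (invariance under a change of scale and origin, and agreement with expected utility) can be illustrated with a small numerical sketch. The utility values and the function names `degree_of_belief` and `expected_utility` are hypothetical choices made for illustration; nothing here is taken from Ramsey’s text beyond the quotient itself.

```python
from fractions import Fraction

def degree_of_belief(u_alpha, u_beta, u_gamma):
    """Ramsey's quotient (u(α) − u(γ)) / (u(β) − u(γ)); needs u(β) ≠ u(γ)."""
    return Fraction(u_alpha - u_gamma) / Fraction(u_beta - u_gamma)

def expected_utility(p, u_beta, u_gamma):
    """Value of [β if A, γ if ∼A] when the degree of belief in A is p."""
    return p * u_beta + (1 - p) * u_gamma

# Hypothetical utilities for α, β, γ (illustrative numbers only).
u_alpha, u_beta, u_gamma = 4, 10, 2
p = degree_of_belief(u_alpha, u_beta, u_gamma)

# A positive affine rescaling u -> 3u - 7, i.e. a change of unit and origin.
rescaled = [3 * u - 7 for u in (u_alpha, u_beta, u_gamma)]
p_rescaled = degree_of_belief(*rescaled)

print(p, p_rescaled)                         # the two quotients agree
print(expected_utility(p, u_beta, u_gamma))  # recovers u(α)
```

Exact rational arithmetic is used so that the equalities are checked without rounding error.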
‘It should be remembered, in judging my system, that in it value is actually defined by means of mathematical expectation in the case of beliefs of degree 1/2, and so may be expected to be scaled suitably for the valid application of the mathematical expectation in the case of other degrees of belief also.’ (1926: 183)

We also quickly infer that if the agent is consistent also in the sense of identifying options with identical payoff conditions, then P(∼A) must be equal to 1 − P(A). For, setting B = ∼A, we have that [β if A, γ if ∼A] is the same option, extensionally speaking, as [γ if B, β if ∼B]. Hence P(∼A) = P(B) = (u(α) − u(β))/(u(γ) − u(β)) = 1 − (u(α) − u(γ))/(u(β) − u(γ)) = 1 − P(A).

The degree of belief function P(A) so defined is an unconditional one, but Ramsey also showed how to define ‘a very useful new idea’ (1926: 180), that of a conditional degree of belief in B given A. Suppose the subject is indifferent between the options (1) [α if A, β if ∼A] and (2) [γ if B&A, δ if ∼B&A, β if ∼A]. Then their degree of belief in B given A is defined to be equal to (u(α) − u(δ))/(u(γ) − u(δ)), where again this is assumed to be independent of the particular choice of α, β, γ and δ. Call this quotient P(B|A) (this is not Ramsey’s notation).

As in the case of Ramsey’s definition of an unconditional degree of belief, the definition of a conditional degree of belief can be understood as determining odds in a type of bet, only in this case one which only goes ahead if A is true; such a bet is called a conditional bet. For suppose that the subject is willing to exchange (1) for (2). Restricting the payoff table to just the ‘A true’ part of the joint truth table for B and A, we see that it represents a bet on B with betting quotient (u(α) − u(δ))/(u(γ) − u(δ)), i.e. the definition of P(B|A).
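The reading of the exchange of (1) for (2) as a conditional bet can be made concrete with a short sketch. It tabulates the net gain in value from the exchange and reads the betting quotient off the ‘A true’ rows. The numbers are hypothetical, the names `net_gain` and `conditional_belief` are introduced only for illustration, and the sketch assumes u(γ) > u(α) > u(δ) so that the exchange really is a bet with a positive stake and a positive prize.

```python
from fractions import Fraction

def net_gain(a, b, u_alpha, u_gamma, u_delta):
    """Gain in value from exchanging option (1) for option (2).

    Both options pay β when A is false, so the exchange is void there
    and u(β) drops out of the calculation.
    """
    if not a:
        return Fraction(0)
    return Fraction(u_gamma - u_alpha) if b else Fraction(u_delta - u_alpha)

def conditional_belief(u_alpha, u_gamma, u_delta):
    """The quotient (u(α) − u(δ)) / (u(γ) − u(δ)) defining P(B|A)."""
    return Fraction(u_alpha - u_delta) / Fraction(u_gamma - u_delta)

# Hypothetical utilities for α, γ, δ (illustrative numbers only).
u_alpha, u_gamma, u_delta = 4, 10, 2

win = net_gain(True, True, u_alpha, u_gamma, u_delta)     # gain if B&A
loss = -net_gain(True, False, u_alpha, u_gamma, u_delta)  # stake lost if ∼B&A
void = net_gain(False, True, u_alpha, u_gamma, u_delta)   # nothing changes if ∼A

# Betting quotient: amount risked as a proportion of the total at stake.
quotient_from_table = loss / (win + loss)
print(quotient_from_table, conditional_belief(u_alpha, u_gamma, u_delta))
```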
Ramsey then sketched proofs that so defined the conditional and unconditional probabilities satisfy the following rules:

(i) P(A) + P(∼A) = 1 (already demonstrated)
(ii) P(B|A) + P(∼B|A) = 1
(iii) P(B|A)P(A) = P(A&B) (multiplication rule)
(iv) P(A) = P(A&B) + P(A&∼B),

with the addition rule P(A∨B) + P(A&B) = P(A) + P(B) following as an easy consequence.

3. The multiplication rule

We have already seen how consistency demands that P(∼A) = 1 − P(A), and it is also straightforward to show that P(∼B|A) = 1 − P(B|A). And so we come to (iii) above, the multiplication rule on which, in Ramsey’s system, the addition law depends. But even allowing for its highly concise character, Ramsey’s proof seems to have a gaping hole, as we shall now see. Like Ramsey, I will use the symbol ‘≡’ to mean ‘is indifferent between’ (for some presumptively consistent agent), with all the payoffs henceforward assumed to be in units of value. Ramsey’s proof starts with the claims that, where x = P(A) and y = P(B|A), then for any t, u,

(1) ξ for certain ≡ [ξ + (1−x)t if A, ξ − xt if ∼A]
(2) [ξ + (1−x)t if A] ≡ [ξ + (1−x)t + (1−y)u if B&A, ξ + (1−x)t − yu if ∼B&A].

The proof of (1) is a straightforward exercise in equating expected values which I will leave to the reader. But (2) is a quite different matter, for it has no justification in terms of equality of expectations that does not already assume the multiplication rule (again, I leave the easy proof to the reader). I will flag (2) and return to it later, but for now grant it and, following Ramsey, assume that y ≠ 0, and choose u so that ξ + (1−x)t − yu = ξ − xt, i.e. u = t/y. Then

[ξ + (1−x)t if A] ≡ [ξ + (1−x)t + (1−y)t/y if B&A, ξ − xt if ∼B&A],

whence

[ξ + (1−x)t if A, ξ − xt if ∼A] ≡ [ξ + (1−x)t + (1−y)t/y if B&A, ξ − xt if ∼B&A, ξ − xt if ∼A].

But by (1), ξ for certain ≡ [ξ + (1−x)t if A, ξ − xt if ∼A]. So

ξ for certain ≡ [ξ + (1−x)t + (1−y)t/y if B&A, ξ − xt if ∼(B&A)],

and so, by definition,

P(B&A) = (ξ − (ξ − xt))/((ξ + (1−x)t + (1−y)t/y) − (ξ − xt)) = xy = P(B|A)P(A).

This was however proved on the assumption that y = P(B|A) ≠ 0.
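The algebra of the y ≠ 0 case can be checked mechanically. The sketch below grants step (2), exactly as the argument above does, so it verifies only the arithmetic that follows from (1) and (2) and says nothing about whether (2) is justified. The function name `joint_quotient`, the grid of values for x and y, and the particular ξ and t are hypothetical choices for illustration; t is assumed non-zero.

```python
from fractions import Fraction
from itertools import product

def joint_quotient(x, y, xi, t):
    """Betting quotient on B&A read off the final indifference (y ≠ 0, t ≠ 0)."""
    u = t / y                                  # chosen so that the ∼B&A payoff is ξ − xt
    payoff_b_and_a = xi + (1 - x) * t + (1 - y) * u
    payoff_not_b_and_a = xi + (1 - x) * t - y * u
    payoff_not_a = xi - x * t
    # The choice u = t/y makes the two losing payoffs coincide.
    assert payoff_not_b_and_a == payoff_not_a
    # Ramsey's definition applied to: ξ for certain ≡ [win if B&A, lose otherwise].
    return (xi - payoff_not_a) / (payoff_b_and_a - payoff_not_a)

grid = [Fraction(n, 10) for n in range(1, 10)]  # x, y in {0.1, ..., 0.9}
xi, t = Fraction(7), Fraction(3)                # illustrative values only
checks = [joint_quotient(x, y, xi, t) == x * y for x, y in product(grid, grid)]
print(all(checks))
```

The denominator in the final quotient simplifies to t/y, which is why the result is xt divided by t/y, that is, xy.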
If y = 0, take t to be 0 and u non-zero, so that from (1) and (2) we infer

ξ for certain ≡ [ξ if A, ξ if ∼A] ≡ [ξ + u if A&B, ξ if A&∼B, ξ if ∼A] ≡ [ξ + u if A&B, ξ if ∼(A&B)].

Therefore P(A&B) = (ξ − ξ)/(ξ + u − ξ) = 0 = P(B|A)P(A).

QED? Certainly not as it stands. I have pointed out that (2) seems not to be justified by equating expectations without assuming what was to be proved, that P(B|A)P(A) = P(A&B). Ramsey certainly would have known this, yet puzzlingly failed to provide anything by way of explanation or justification for (2). In what follows I will offer a conjecture as to the nature of what I believe Ramsey omitted, and why he omitted it. If correct, it will put Ramsey’s remark that he had ‘not worked out the mathematical logic of this in detail’ (1926: 180) in a fascinating new perspective. For if I am right what lies behind that omission is an expansion of classical propositional logic into what for Ramsey was uncharted territory, with no guarantee of consistency, and involving a class of assertions that he denied are propositions at all; for these reasons a reluctance on his part to make public any such step would of course be quite understandable. Remarkably, however, just such an extension of classical logic involving just such a class of assertions was proposed by de Finetti just over a decade later, in the same context of the multiplication theorem, starting from just the same analysis of conditional subjective probability as had Ramsey. I will come to that later.

First, here is my hypothetical reconstruction of Ramsey’s proof. It starts by placing after (1) an analogous claim that for any t, u,

(1a) ξ + (1−x)t ≡ [ξ + (1−x)t + (1−y)u if ?, ξ + (1−x)t − yu if ∼?],

where I have placed a question-mark for the statement which, if (1a) is also to be justified by taking expectations, should appear on the right-hand side.
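What it would take for (1a) to be justified by taking expectations can be spelled out numerically. The sketch assumes, as a hypothesis of the illustration only, that whatever statement fills the question-mark receives degree of belief y and its negation 1 − y. Under that assumption the expected value of the right-hand side equals the left-hand side for every t and u. The function name `sides_of_1a` and the grids of test values are hypothetical.

```python
from fractions import Fraction
from itertools import product

def sides_of_1a(x, y, xi, t, u):
    """Return (left, expected right) for (1a), weighting '?' by y and '∼?' by 1 − y."""
    left = xi + (1 - x) * t
    right = y * (left + (1 - y) * u) + (1 - y) * (left - y * u)
    return left, right

probs = [Fraction(n, 4) for n in range(0, 5)]      # 0, 1/4, ..., 1
stakes = [Fraction(-2), Fraction(0), Fraction(5)]  # illustrative values of t and u
xi = Fraction(7)

balanced = all(
    left == right
    for left, right in (
        sides_of_1a(x, y, xi, t, u)
        for x, y, t, u in product(probs, probs, stakes, stakes)
    )
)
print(balanced)
```

The terms y(1−y)u and (1−y)yu cancel, which is why the equality holds whatever the stake u.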
There seems little alternative to supposing that it is a ‘proposition’ B|A whose interpretation can (hopefully) be made consistent with the operational betting interpretation of P(B|A) (we shall see shortly that this is indeed possible). Some observations of Ramsey (1929: 246, 247), developed into so-called possible-world analyses of conditionals (for example by Stalnaker), as well as a betting-based semantics based on Ramsey’s definition of P(B|A) which I will develop in the next section, now sanction replacing ∼(B|A) by ∼B|A. Noting that the preceding theorem in Ramsey’s list is that P(∼B|A) = 1 − P(B|A), we easily infer that, relative to the distribution (y, 1−y) over B|A, ∼B|A, the expectations on both sides of (1a) are equal. It follows that

(1b) [ξ + (1−x)t if A] ≡ [[ξ + (1−x)t + (1−y)u if B|A, ξ + (1−x)t − yu if ∼B|A] if A],

and given that conditional options [α if C, β if ∼C] are naturally parsed as conjunctions of conditional sentences ‘if C then α & if ∼C then β’ (reading α, β now as ‘you will get α, β’), we can use standard conditional reasoning to infer

(1c) [ξ + (1−x)t if A] ≡ [ξ + (1−x)t + (1−y)u if (B|A)&A, ξ + (1−x)t − yu if (∼B|A)&A].

Clearly, all that is now needed is a modus-ponens type inference allowing (X|Y)&Y to be replaced by X&Y, giving

[ξ + (1−x)t if A] ≡ [ξ + (1−x)t + (1−y)u if B&A, ξ + (1−x)t − yu if ∼B&A],

i.e. (2).

4. Ramsey, de Finetti and three-valued logic

My reconstruction is a minimal extension of Ramsey’s own, where the additional steps seem not only intuitively warranted but, as we shall see, were endorsed by some later developments. But what actually could B|A mean for Ramsey? He is quite clear in his 1926 paper and later work that it is not what he called a proposition, a factual statement determinately true or false (he pointed out that it could not be a material conditional3), on the ground that it has a purely hypothetical character, only employed lacking knowledge of whether A is true or false (1929: 246, 247).
My conjecture nevertheless is that Ramsey did have in mind (1a), where the term ‘B|A’ replacing the question mark is such a hypothetical assertion, but subject to the rules of probability he had so far established. The evidence for this conjecture seems to me overwhelming, since I can think of no other way of supplementing Ramsey’s proof of (iii), at least without radically altering its structure, that does not assume what it sets out to prove. We can also note that like any mathematician Ramsey proves his theorems (i)–(iv) cumulatively, each relying on its predecessor, and we have seen how my suggested addition to Ramsey’s proof depends on (ii), P(∼B|A) = 1 − P(B|A).4 If my conjecture is correct it would however explain Ramsey’s reluctance to write the interpolated steps out explicitly, since the adjunction of non-material conditionals B|A to the domain of a probability function was a move for which there was no precedent, and which could well turn out to generate inconsistency. As it turned out, of course, such an apprehension might seem to have been prescient: David Lewis’s so-called Triviality theorems (Lewis 1976) show that if the domain of a probability function P is a Boolean algebra closed under the adjunction of such conditionals then unacceptable consequences follow. One such is that P(B|A) = P(B) for any factual A, B.5 Lewis’s target was of course the theories developed by Stalnaker and others incorporating the idea, now widely known as Adams’s Principle after Ernest W. Adams who first enunciated it, that the probability of a conditional is, or should be, a conditional probability, and these theories certainly satisfied the conditions for Lewis’s theorems. 
To avoid his results it would clearly be necessary to characterise the conditional items B|A in such a way that their adjunction to an ordinary Boolean algebra is not itself a Boolean algebra (more strictly, factored by equivalence it isn’t a Lindenbaum algebra), generating a non-classical probability theory. In fact, there is a very straightforward way to do this. Recall that for Ramsey a bet on B given A wins if A and B are both true, loses if A is true and B false, and is annulled if A is false. Bets are of course typically won or lost depending on the established truth-values of the assertions bet on, so if we take B|A as one such, this would imply that B|A should be true when A and B are both deemed true and false when A is deemed true and B false. But the bet has no winning or losing outcome if A is deemed false; it is void, which should equally mean that B|A itself is undefined in truth-value. Enter de Finetti:

‘We have to consider the definition of conditional probabilities and the demonstration of the multiplication theorem for probabilities. Let there be two events E′ and E″; we can bet on E′ and condition this bet on E″: if E″ does not occur, the bet will be annulled; if E″ does occur, it will be won or lost according to whether E′ does or does not occur. One can consider, then, the ‘conditional events’ (or ‘tri-events’), which are the events of a three-valued logic: this ‘tri-event’, E′ conditioned on E″, E′|E″, is the logical entity capable of having three values: true if E″ and E′ are true; false if E″ is true and E′ false; void if E″ is false.’ (1937: 108)

That passage is from a translation of a series of lectures given by de Finetti in Paris in 1936. A year earlier, in a paper6 delivered to the Congrès international de philosophie scientifique in Paris, he had worked out the details of the three-valued logic arising from adjoining such tri-events to classical two-valued propositional logic.
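The tri-event described in de Finetti’s passage can be sketched as a three-valued truth table, and the two replacements used in the reconstruction of Ramsey’s proof (∼(B|A) by ∼B|A, and (X|Y)&Y by X&Y) can then be checked by brute force over factual X and Y. The connectives used here are an assumption of the sketch rather than a quotation from the passage: negation swaps true and false and leaves void fixed, and conjunction takes the lower of two values in the order false < void < true. All names (`given`, `neg`, `conj`, `fact`) are hypothetical.

```python
from itertools import product

TRUE, VOID, FALSE = "true", "void", "false"
RANK = {FALSE: 0, VOID: 1, TRUE: 2}  # ordering assumed for conjunction

def fact(b):
    """Embed an ordinary two-valued truth-value."""
    return TRUE if b else FALSE

def given(x, y):
    """Tri-event x|y: takes the value of x when y is true, void otherwise."""
    return x if y == TRUE else VOID

def neg(v):
    """Negation: swaps true and false, leaves void unchanged."""
    return {TRUE: FALSE, FALSE: TRUE, VOID: VOID}[v]

def conj(v, w):
    """Conjunction: the lower of the two values, so false dominates void."""
    return min(v, w, key=RANK.get)

factual = [fact(b) for b in (True, False)]

# ∼(X|Y) has the same value as ∼X|Y in every row.
negation_ok = all(
    neg(given(x, y)) == given(neg(x), y) for x, y in product(factual, factual)
)
# (X|Y)&Y has the same value as X&Y in every row.
detachment_ok = all(
    conj(given(x, y), y) == conj(x, y) for x, y in product(factual, factual)
)
print(negation_ok, detachment_ok)
```

The second check depends on false dominating void in a conjunction: when Y is false, X|Y is void but (X|Y)&Y is false, matching X&Y.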
The suitably extended truth-functional definitions of negation, conjunction and disjunction he gave involving such propositions are now more widely known as the Strong Kleene rules, relative to which the equivalences ∼(X|Y) ⇔ ∼X|Y and (X|Y)&Y ⇔ X&Y, required in my reconstruction of Ramsey’s proof of the multiplication rule, hold for factual X, Y. For arbitrary X, Y, X|Y is void except where X is true and Y is true, or X is false and Y is true. Lewis (1976: 304–5) was famously sceptical that the developer of a non-classical theory of probability could justify the claim that the probability it dealt in warranted the name. The challenge seems to be met here, since a fundamental theorem of de Finetti shows that ‘ordinary’ probabilities defined on a Boolean algebra extend uniquely to probabilities on the corresponding algebra of tri-events.7

But although de Finetti briefly discussed his theory of tri-events immediately prior to proving the multiplication theorem in his 1937, the proof itself appealed only to the criterion (‘coherence’) of avoiding certain loss in setting odds in conditional bets with small money stakes.8 In any case he would not have considered proving the theorem in the manner of my reconstruction of Ramsey’s proof, appealing as that does to the expected utility principle; de Finetti eschewed a utility-based approach on the grounds of its complexity compared with his coherence-based theory. Nor, despite attracting the attention of logicians and AI workers, did the logic of tri-events ever find a foothold in mainstream Bayesianism. The theory of tri-events may be an elegant way of satisfying Adams’s Principle, but it seems likely to remain only an interesting curiosity in the development of subjective probability theory.

5. Conclusion

I cannot prove that my reconstruction of Ramsey’s proof of the multiplication law is what he himself had in mind and simply omitted, though the evidence seems to me very strong.
If so, given his view of conditionals generally, my guess is that he most likely would have regarded terms of the form ‘B|A’ as akin to Hilbert’s ‘ideal elements’ in classical mathematics, there to facilitate (valid) proofs but lacking intrinsic significance.9 Be that as it may, I am content if I am seen merely to announce a tantalising mystery and offer what I take to be a plausible dissolution of it, one which, given their probabilistic role, leads directly to a three-valued logic for conditionals developed shortly after Ramsey wrote by de Finetti.10

Footnotes

1 Risk-aversion is reflected in the utility function being concave.

2 The use of representation theorems like this to show that real numbers under suitable operations faithfully represent a rich enough algebraic structure was completely new to discussions of probability at the time, though it was already an established focus of mathematical research, eventually to become the discipline called measurement theory.

3 Only in exceptional circumstances is P(B|A) = P(A → B), where → is the material conditional.

4 Richard Bradley offers what he calls a reconstruction of Ramsey’s proof (2001: 280–3), but it has a very different structure to Ramsey’s, and requires in addition to Ramsey’s a further axiom (Bradley’s R.10; he argues that in addition yet another axiom, which he labels R.9, is presupposed by Ramsey). Though certainly valid, Bradley’s reconstruction fails to explain the curious lacuna in Ramsey’s own proof.

5 The proof is an elementary application of the theorem of total probability given one lemma, that P((B|A)|C) = P(B|A&C) if P(A&C) > 0. In de Finetti’s three-valued logic (see below) (B|A)|C ⇔ B|A&C, where ⇔ signifies three-valued equivalence, but the logic is non-Boolean (as is standard I am taking partial functions to be the same if they take the same values wherever they are defined).

6 ‘La logique de la probabilité’. Though there are remarkable coincidences between Ramsey’s and de Finetti’s ideas about probability – the definition of conditional probability in terms of conditional bets; the probability axioms characterised as quasi-logical consistency constraints; vulnerability to a Dutch book if those axioms are violated – it was only in 1937 that de Finetti learned of Ramsey’s work, from Fréchet (Galavotti 2017: 163).

7 De Finetti proved that for the expanded algebra T of tri-events obtained from some initial Boolean algebra B every member of T is representable in the form B|A where A, B ∈ B, and by coherence P(B|A) = P(A&B)/P(A) where P(A) > 0. Thus Mura: ‘The conclusion we may draw is that Lewis’ result was really ante litteram refuted by de Finetti’s approach to tri-events in 1935’ (2009: 217).

8 ‘We could establish for the tri-events a three-valued logic perfectly analogous to ordinary logic, but this is not necessary for the goal we are pursuing’ – namely, the proof of the multiplication theorem (1937: 109).

9 Bradley’s own proof of the multiplication theorem (see footnote 4) shows that an appeal to conditionals B|A is not necessary in Ramsey’s system.

10 I would like to thank Richard Bradley, Maria Carla Galavotti, Donald Gillies and an anonymous referee for their helpful comments.

References

Bradley, R. 2001. Ramsey and the measurement of belief. In Foundations of Bayesianism, eds. D. Corfield and J. Williamson, 263–91. Dordrecht: Kluwer.

de Finetti, B. 1937. Foresight: its logical laws, its subjective sources. In Studies in Subjective Probability, eds. H. Kyburg and H. Smokler, 93–159. New York: Wiley (1964) (translation of ‘La prévision: ses lois logiques, ses sources subjectives’, Annales de l’institut Henri Poincaré, 7.1 (1937): 1–68).

Galavotti, M.C. 2017. On some French probabilists of the 20th century: Fréchet, Borel, Lévy. In Proceedings of the 15th International Congress in Logic, Methodology and Philosophy of Science, eds. H. Leitgeb, I. Niiniluoto, P. Seppälä and E. Sober, 155–73. London: College Publications.

Lewis, D. 1976. Probabilities of conditionals and conditional probabilities. Philosophical Review 85: 297–315.

Mura, A. 2009. Probability and the logic of de Finetti’s tri-events. In Bruno de Finetti: Radical Probabilist, ed. M.C. Galavotti, 201–42. London: College Publications.

Ramsey, F.P. 1926. Truth and probability. In The Foundations of Mathematics and Other Logical Essays: Frank Plumpton Ramsey, ed. R.B. Braithwaite, 156–98. London: Kegan Paul (1931).

Ramsey, F.P. 1929. General propositions and causality. In The Foundations of Mathematics and Other Logical Essays: Frank Plumpton Ramsey, ed. R.B. Braithwaite, 237–55. London: Kegan Paul (1931).

© The Author(s) 2018. Published by Oxford University Press on behalf of The Analysis Trust. All rights reserved. For Permissions, please email: [email protected]. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)
Analysis – Oxford University Press
Published: Jul 1, 2018