# Proof and the Virtues of Shared Enquiry

ABSTRACT

This paper investigates an important aspect of mathematical practice: that proof is required for a finished piece of mathematics. It follows that non-deductive arguments, however convincing, are never sufficient. I explore four aspects of mathematical research that have facilitated the impressive success of the discipline. These I call the ‘Practical Virtues’: Permanence, Reliability, Autonomy, and Consensus (PRAC). I then argue that permitting results to become established on the basis of non-deductive evidence alone would lead to their deterioration (with some possible exceptions). This furnishes us with a partial rational justification for mathematicians’ strict insistence on proof.

1. PROOF AND PUBLICATION

Even in the mathematical sciences, our principal instruments to discover the truth are induction and analogy. Laplace (as quoted in [Pólya, 1954, p. 35])

In the Theory of Numbers, it happens rather frequently that by some unexpected luck, the most elegant new truths spring up by induction. Gauss (as quoted in [Pólya, 1954, p. 59])

[T]he properties of the numbers known today have been mostly discovered by observation, and discovered long before their truth has been confirmed by rigid demonstrations. Euler (as quoted in [Pólya, 1954, p. 3])

Non-deductive techniques of various kinds (inductive, experimental, visual, analogical) have always had a number of important roles within mathematics. One central use is in the discovery of plausible conjectures that are later proved, and below we will see that they can help us discover proofs too. Non-deductive techniques are also a useful check on attempts at proof. Focusing on finished pieces of mathematics may lead us to overlook this importance, however: usually only the deductive proof is published and the heuristic work is simply discarded. Indeed, I will claim that mathematicians always require proof for the acceptance of mathematical claims.
But what is meant by ‘acceptance’ here? We distinguish two meanings.

Private Acceptance. Personal belief on the part of individual mathematicians.

Public Acceptance. An established theorem that is now eligible for unqualified assertion in peer-reviewed journals and other serious mathematical publications.

In Section 4 I discuss the famous Goldbach Conjecture, which, as we shall see, many mathematicians believe to be true on the basis of the vast amount of inductive evidence in its favour, even though a proof has not been found. So mathematicians do sometimes Privately Accept results in the absence of proof. However, proof is required for Public Acceptance, which is therefore the intended sense.

This insistence on proof has long been reflective of the feelings of mathematicians. Euler writes, ‘we should take great care not to accept as true such properties of the numbers which we have discovered by observation and which are supported by induction alone’ (as quoted in [Pólya, 1954, p. 3]). Frege later comments that ‘in mathematics a mere moral conviction, supported by a mass of successful applications, is not good enough’ [Frege, 1980, p. 1]. More recently, Michael De Villiers writes emphatically that ‘Nobody, today, can really be considered mathematically educated or literate, if he or she is not aware of the insufficiency of quasi-empirical evidence to guarantee truth in mathematics, no matter how convincing that evidence may seem’ [De Villiers, 2004, p. 412].

It is clear that having a proof provides us with a special kind of a priori warrant for believing its conclusion. However, Private Acceptance in the absence of proof is in fact fairly common, both historically and in contemporary mathematics. So why not extend this to Public Acceptance, too? After all, mathematicians Publicly Accept results for which only very long and complex proofs are available.
The first proof of the Group Classification Theorem totaled over 5,000 pages, for example; Solomon writes ‘Is the database correct? Is there a 27th sporadic simple group? I seriously doubt it, but it would be chutzpahdich to assert that a 5000-page 40-year human endeavor is beyond the possibility of human error’ [Solomon, 2001, p. 347]. The epistemic position of a mathematician who has amassed a very strong body of non-deductive evidence might therefore compare favourably. Moreover, this suggestion is not of interest only to philosophers: some mathematicians have actually recommended that the standards for Public Acceptance should be relaxed in this way. Consider the remarks of mathematician Branko Grünbaum, who used the computer application Mathematica to ‘explore and verify’ some geometric results: Do we start trusting numerical evidence (or other evidence produced by computers) as proofs of mathematics theorems? ... if we have no doubt — do we call it a theorem? ... I do think my assertions are theorems ... the mathematical community needs to come to grips with new modes of investigation that have been opened up by computers [Grünbaum, 1993, p. 8]. In the rest of this paper I will, however, present reasons why non-deductive arguments are unsuitable as justification in the context of Public Acceptance (with some possible exceptions, to be discussed in the conclusion). Rather than beginning from the individual epistemic state of an enquirer I will instead take a broadly Aristotelian approach, focusing on the practical virtues: four sociological features of mathematical research and the relationships researchers enjoy with the wider community. These we explore in the following section. I later argue that if mathematicians were to Publicly Accept results on the basis of most kinds of non-deductive evidence, these valuable features of practice would be undermined. 2. 
THE PRACTICAL VIRTUES In this section I will outline the four practical virtues: Permanence, Reliability, Autonomy, and Consensus (PRAC). These are empirical descriptive traits that the practice of mathematics exhibits to a high degree — though not absolutely. Later I shall argue that non-deductive evidence is not in general conducive to these virtues, and as we shall see in this section they are supported by mathematicians’ practice of insisting on proof for Public Acceptance. In order to move towards a complete rational justification of this insistence, I also sketch out how these practical virtues facilitate the flourishing of mathematics as a discipline: the progress of mathematical enquiry and the enormous success of the field. Permanence. When a result becomes Publicly Accepted, it retains this status indefinitely. In the natural sciences, the status of any hypothesis is always to some extent provisional. Researchers must always be open to the possibility that their most fundamental results will have to be revised in the light of new evidence. Historical examples abound: perhaps the most celebrated is Einstein’s discovery of the merely approximate and local character of Newtonian mechanics, the most outstanding scientific achievement of its age, and long revered as a paradigmatic instance of the certainty that scientific work could aspire to. Some current scientific theories, such as the central causal role of natural selection driving evolutionary change in biology, may seem so secure that the chances of their displacement are negligible. Yet we can always imagine future discoveries or experiments that might lead to this occurring. The status of an established mathematical result, on the other hand, is not like this: results that have become Publicly Accepted are expected to remain part of mathematics on a permanent basis. Moreover, this expectation is not mere hubris, but is grounded in strong historical precedent. 
The theorems of Eudoxus and Archimedes are still our theorems, though the justifications we give for them may be quite different, and ‘In most sciences one generation tears down what another has built, and what one has established, another undoes. In mathematics alone each generation adds a new storey to the old structure’ Herman Hankel (as quoted in [Debnath and Bhatta, 2006, p. 315]). One benefit of Permanence concerns the dependence on previously established theorems: mathematical arguments usually rely on many subsidiary results in deriving their conclusions. Of course, both philosophers and scientists of all kinds also cite other authors in support of their claims. But in mathematics this practice is extraordinarily effective, and without this reliance on auxiliary results research would be severely slowed down. However, if results were Publicly Accepted only for a limited time, it would be much harder to keep track of which articles are currently acceptable to cite. The Permanence of mathematics is also part of what enables a ‘deeper penetration into the subject-matter’ than in other fields of enquiry: because mathematical edifices are largely Permanent, over time they can develop to an immense complexity as every aspect of a given structure is scrupulously examined. Reliability. Publicly Accepted results are always true. This feature expresses the claim that the overwhelming majority of results mathematicians Publicly Accept are true. Reliability — often under the guise of the more traditional term ‘certainty’ — has long been part of the self-image of mathematics: Norbert Wiener writes ‘The place where most people would look for absolute certainty is in pure mathematics or logic. Indeed, “mathematical certainty” has become a byword’ [Wiener, 1915, p. 568]. But can we be sure that this belief in the Reliability of mathematics is warranted? 
We might initially think the observed fact of Permanence gives us sufficient grounds for Reliability, and that Publicly Accepted claims are rarely overturned simply because such results are rarely false. Taken together, Reliability and Permanence thus present an attractive picture of mathematical practice: due to the restriction to deductive proof, only true results are Publicly Accepted, and so over time only true — and hence incontrovertible — results are added to the structure. This view might be disputed, however. Although mathematical results that have been Publicly Accepted for a substantial period of time are almost never overturned, errors are actually quite often found in early drafts of attempted proof presentations given by mathematicians. So perhaps the Permanence of established mathematical results is not a sure guide to its Reliability and is merely because such errors as might now remain are unlikely to be revealed by the checking process, or even because interest in checking them has simply subsided. Indeed, we might worry whether it is even possible to establish Reliability using only sociological modes of investigation: though it is a descriptive claim whose content concerns the realities of practice, it is also one that requires appeal to a non-empirical objective standard of truth in its application. Though Reliability is suggested by the observation of Permanence (and also Consensus, the fourth virtue, to be discussed below), to establish it in practice we must therefore go beyond sociological observation and consider the powerful mathematical arguments available for Publicly Accepted results. Not only is insistence on proof responsible for maintaining Reliability; it is also precisely the availability of proof that enables us as inquirers to see that Publicly Accepted mathematics is indeed Reliable. 
For instance, it is unreasonable to claim that there might still be errors in every single discourse purporting to present what we surely all believe is a genuine proof of Euclid showing that there is an infinity of primes. Too many thinkers have internalised the proof and come up with their own novel presentations. This is not mere rote checking, but deep and intuitive understanding. For mathematics that has been widely circulated, internalised, and reconstituted, the denial of Reliability therefore gives expression only to a rather extreme form of scepticism. The benefits of a highly Reliable literature are of course immense. Consider again the reliance on previously proved theorems. Suppose we are reading an article in a reputable mathematical journal and come across the line ‘From [17], we know that all $$A$$s that are $$B$$s are also $$C$$s.’ The level of vigilance required by the reader here is very small even in comparison to the natural sciences: most of the time we would not fall into error if we simply accepted the result and moved on with the article without even looking up citation [17] in the bibliography (though if our interest is professional and we want to go on to use the result to prove something else this might be somewhat irresponsible). Conversely, if we are instead merely told that a claim has been made in print in a philosophy journal nothing like the same level of conviction on the part of the reader will follow. The Reliability of mathematics also means that natural scientists and others can take mathematical results ‘off the shelf’ without worrying about their veracity. They can regard the results they find in mathematics journals and textbooks as true simpliciter, and if their theories fail will rarely look to locate errors there. Autonomy. Competent researchers can always come to know Publicly Accepted results in an intellectually independent way, and publication is never permitted on the basis of trust or authority alone. 
This feature of mathematical practice has two components. Firstly, any mathematician can (without much effort) find his or her own explicit reasons for believing any Publicly Accepted claim. It is clear that insistence on proof helps to maintain this condition, as journals rightly believe the results they publish are Reliable and expect the arguments for them to be found convincing by all. Though proof discourses come in many forms, understanding a proof always enables any competent reader to know the conclusion in an intellectually independent way. It must be stressed that this aspect of the condition is essentially modal. It may be that no mathematician has personally scrutinised proof presentations for all or even a large proportion of the mathematical results they believe and regard as Publicly Accepted: indeed, having every mathematician check every new result every year would slow research down to a standstill. But even if this were possible it would be highly inefficient and hence undesirable.1 The second aspect of Autonomy is that generally speaking mathematicians do not publish results on the basis of personal authority alone. When journal referees judge that a printed argument constitutes a sufficient basis for publication, they never rely upon trust in the testimony of the publishing mathematician. It is also hard to see how this aspect of Autonomy could be adequately maintained without the restriction to deductive proof: if a result were supported by inductive evidence alone, we would be required to accept the judgement of the author that this evidence is conclusive. (This point will be discussed in more detail in Section 5.) Autonomy is clearly intellectually valuable for its own sake, and also helps to maintain the other virtues — particularly Consensus, which we discuss in the next section. Russell writes, ‘Have no respect for the authority of others, for there are always contrary authorities to be found’ [2009, p. 534]. 
Whilst a student is still learning to think mathematically, or to understand the use of some new concept, some reliance on the authority of a teacher may be necessary; results or techniques must be presented in some particular order, and the justification of a principle might take a lesson too far afield at the stage when it is first needed. But at some point in their intellectual development towards becoming independent mathematicians, students usually begin to insist on their own reasons for believing results they are taught. Of course, this is even truer of professional mathematicians. Consensus. There is a shared agreement as to which statements are Publicly Accepted. As we have just said, Consensus is enhanced by Autonomy: the realization that convincing arguments are available leads mathematicians to have faith in the published literature. This Consensus is thus spontaneous rather than coerced; this is what the philosopher Jody Azzouni has called ‘The benign fixation of mathematical practice’ [2006, p. 208]. It is also connected to the other three practical virtues as well. Consensus does not exactly imply Reliability — we could imagine a high level of agreement within a field whose findings were in fact highly suspect. But the high degree to which mathematicians exhibit Consensus is clearly connected to it: it is the acknowledgement of the Reliability of published research. And Consensus depends on Permanence, too: if results tended to move in and out of Public Acceptance over time this would likely lead to widespread disagreements during the transitional periods. We can also see how Consensus is in part a consequence of insistence on proof. Mathematics journals maintain very high standards for publication, with the warranted expectation that the articles they publish will be found convincing by the entire readership. Later we will see that this would be difficult to achieve with non-deductive evidence alone. 
To put into perspective the high degree of Consensus within mathematics, consider the range of disagreement amongst contemporary philosophers. In the analytic tradition alone today there are dualists, reductive and non-reductive materialists, and idealists in the philosophy of mind; consequentialists, Kantians, contractarians, virtue and natural-rights theorists, and non-cognitivists in ethics; formalists, constructivists, nominalists, and Platonists in the philosophy of mathematics, and so on for each subdiscipline. Moreover, these disagreements constitute the permanent condition of academic philosophy [MacIntyre, 2013, p. 18]. This lack of Consensus does not imply that philosophers do not have the resources to discover true answers to the questions they pursue. However, it does suggest that philosophers find it difficult to formulate arguments that are able to establish lasting agreement amongst all of their academic colleagues about the truth and rational justifiability of their particular philosophical positions. This remains the case even though consensus may be achieved in a negative direction; Gettier’s paper [1963] on the justified true belief account of knowledge is perhaps a notable example of this. In mathematics, however, such disagreement is rare. Consider the difference between an undergraduate course in moral philosophy on the one hand, where one can expect to learn about key historical figures such as Mill, Bentham, Kant, Aquinas, and Aristotle, the different conceptions of morality they have each articulated, and the arguments supporting their positions; and an undergraduate course in mathematical analysis on the other, where one can expect a concise and ahistorical presentation of what everyone agrees are the established theorems and true results pertaining to the central questions that arise in that area. 
Moreover, unlike in natural science where consensus is perhaps maintained only temporarily by a group of researchers operating under a shared paradigm in the course of what Thomas Kuhn [1996, p. 10] has called ‘normal science’, in mathematics the agreement reached is Permanent. The benefits of Consensus are again clear. Like the other virtues, it is important for the practice of citing other articles when invoking subsidiary results. If the auxiliary results needed were not agreed to be Publicly Accepted by all researchers, then the effectiveness of collaborative mathematical research would be reduced. It is not sufficient that the literature is highly Reliable and cited results are always in fact all true: this also needs to be acknowledged in practice. The fact that there is now a more or less permanent Consensus about which results are Publicly Accepted also means that mathematicians are free to pursue intensively specialised research and are not forced to spend time and energy warring with competing schools within the discipline. This is again similar to Kuhn’s account of the necessary conditions for ‘normal science’ to progress, although in mathematics such agreement is Permanent and not periodically fragmented by internal crises [Kuhn, 1996, p. 66]. In closing the discussion of Consensus, I will add a brief caveat to what has been said so far. Throughout history there have been disagreements of sorts: questions of how best to formulate concepts, which theorems are interesting, which objects or problems are the most important to study, aesthetic judgements and questions of explanatory value, which systems of axioms it is appropriate to work within, and so on. Consider also the rival attempts to rigorise the calculus: geometric, algebraic, arithmetic. 
However, apart from occasional exceptions, such as the disagreements with classical mathematics of intuitionists led by Brouwer about what constitutes an acceptable mathematical argument, which were in any case fringe, such disagreements are not directly about which results should be Publicly Accepted: in contrast to moral philosophy, rival geometric systems (Euclidean, hyperbolic, elliptic) can coexist peacefully. Once the starting points are clearly articulated, mathematicians working within one paradigm do of course recognise the results of other approaches as genuine theorems. Before moving on to the next section, I will briefly discuss the interdependence of the four practical virtues — a direct parallel of the well-trodden issue of the unity of the virtues in ancient ethical theory (bravery, temperance, justice, honesty), with many thinkers following Aristotle in opting for the strong thesis that one cannot fully have any individual virtue without possessing all the rest (for an excellent discussion, see [Annas, 1993, pp. 73–84]). In this context, we should distinguish purely logical relations between the practical virtues understood as abstract ideals, and empirical tendencies of perhaps only partially realised virtues to be mutually supporting at the level of practice (the sense in which they are intended to apply). Throughout this section, it has been suggested that the virtues are related — for instance, that Permanence is a consequence of Reliability. However, for our present purposes we actually do not need to decide whether any one virtue implies the others. It is clear that the benefits they bring to mathematical practice are distinct and separable, and so appeal to each of them will furnish mathematicians with a distinct justification for insistence on proof. 3. A CASE STUDY In this section we consider a concrete example of a computer-based heuristic approach to a simple mathematical problem. 
The computational power supplied by modern computers is in many respects vast: mathematicians can now easily perform calculations that would have been intractable a few decades ago. This has greatly enhanced the potential of non-deductive methods, undeniably moving us beyond mere discovery and into the context of justification. I nevertheless argue that yielding Public Acceptance to results solely on the basis of the kind of non-deductive evidence I will give here would diminish the extent to which the four Practical Virtues are realised in practice.

Problem 1.² Does there exist a positive integer, $$n$$, such that $$(2+\surd2)^{n}$$ differs from an integer by no more than $$10^{-6}$$?

Let us begin by calculating the first few values of the expression and then look for a pattern. Examining these data (see Table 1, columns 1 and 3), we see that $$(2+\surd2)^{n}$$ is getting progressively closer to an integer from below at every step. Let us call the difference to the next integer up $$f(n)$$ and calculate a few values explicitly (see column 4). Looking at these figures, we notice that they seem to decrease by around the same proportion each time. This suggests we calculate the ratio between terms (see column 5). Now, examining column 5 we see that the ratio is always equal to the first term $$0.585786\dots$$.

At this stage, I spotted that $$0.585786$$ is the beginning of the decimal expansion of $$(2-\surd2)$$. If this is in fact the exact value of both the first term in column 4 and the ratio of each pair of adjacent terms, then $$f(n)$$ must be equal to $$(2-\surd2)^{n}$$ for each positive integer $$n$$. We have arrived at the following conjecture: $$(2+\surd2)^{n}+(2-\surd2)^{n}$$ is an integer for all positive integers $$n$$.

Table 1.
Numbers for Problem 1

| $$n$$ | exact $$(2+\surd2)^{n}$$ | decimal $$(2+\surd2)^{n}$$ | $$f(n)$$ | $$\frac{f(n)}{f(n-1)}$$ |
|---|---|---|---|---|
| 1 | $$2+\surd2$$ | 3.414214 | 0.585786 | |
| 2 | $$6+4\surd2$$ | 11.656854 | 0.343146 | 0.585786 |
| 3 | $$20+14\surd2$$ | 39.798990 | 0.201010 | 0.585786 |
| 4 | $$68+48\surd2$$ | 135.882251 | 0.117749 | 0.585786 |
| 5 | $$232+164\surd2$$ | 463.931024 | 0.068976 | 0.585786 |
| 6 | $$792+560\surd2$$ | 1583.959595 | 0.040405 | 0.585786 |
| 7 | $$2704+1912\surd2$$ | 5407.976331 | 0.023669 | 0.585786 |
| 8 | $$9232+6528\surd2$$ | 18463.986135 | 0.013865 | 0.585786 |
| 9 | $$31520+22288\surd2$$ | 63039.991878 | 0.008122 | 0.585786 |
| 10 | $$107616+76096\surd2$$ | 215231.995242 | 0.004758 | 0.585786 |
| 11 | $$367424+259808\surd2$$ | 734847.997213 | 0.002787 | 0.585786 |
| 12 | $$1254464+887040\surd2$$ | 2508927.998367 | 0.001633 | 0.585786 |
| 13 | $$4283008+3028544\surd2$$ | 8566015.999044 | 0.000956 | 0.585786 |
| 14 | $$14623104+10340096\surd2$$ | 29246207.999440 | 0.000560 | 0.585786 |
| 15 | $$49926400+35303296\surd2$$ | 99852799.999672 | 0.000328 | 0.585786 |

This claim can now be proved by
considering the binomial expansion of the two bracketed expressions. Terms (column 2) with an even power of $$\surd2$$ will themselves be integers, and terms with an odd power of $$\surd2$$ in the first expression will be cancelled out by corresponding terms in the second expression, which will be identical except for having a negative sign. So all we need now do to solve Problem 1 is to pick $$n$$ such that $$(2-\surd2)^{n}<10^{-6}$$. This can easily be done with logarithms ($$n=26$$ is the first solution). Using a computer, finding the solution to Problem 1 was rather easy and indeed almost mechanical. The only interesting parts were deciding to look at the ratio of successive terms and spotting the decimal expansion of $$2-\surd2$$. The quasi-empirical work also led us to discover the proof as well as the solution. This is reminiscent of a famous quote by Riemann, suggesting that getting to the answer is often the most difficult part: ‘If only I had the theorems — then I should find the proofs easily enough!’ (as quoted in [Lakatos, 1976]). However, this non-deductive work need not feature in the final presentation of the argument, which may simply begin from the observation that $$(2+\surd2)^{n}+(2-\surd2)^{n}$$ is an integer for all positive integers $$n$$. The non-deductive evidence also moved us beyond merely discovering a conjecture that was sufficiently plausible to warrant further investigation. The fact that $$f(n)$$ appeared to decrease by a factor of roughly $$(2-\surd2)$$ for all of the values checked seemed like compelling evidence that the conjecture was true — and even more so given that the similar expression $$(2+\surd2)$$ occurs prominently in the question. Until we reach the point where a specific integer with the requisite property has actually been constructed, however, the confidence that can be rationally induced by such data still falls short of complete certainty. 
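The exact column of Table 1 and the value $$n=26$$ can be rechecked without any floating-point arithmetic. The following is only an illustrative sketch, not the procedure used in the paper: the function names are hypothetical, and it assumes the representation $$(2+\surd2)^{n}=a+b\surd2$$ with integers $$a$$ and $$b$$, so that $$(2+\surd2)^{n}+(2-\surd2)^{n}=2a$$ and $$f(n)=a-b\surd2$$.

```python
def power_pair(n):
    """Return integers (a, b) with (2 + sqrt(2))**n == a + b*sqrt(2)."""
    a, b = 1, 0
    for _ in range(n):
        # (a + b*sqrt2) * (2 + sqrt2) = (2a + 2b) + (a + 2b)*sqrt2
        a, b = 2 * a + 2 * b, a + 2 * b
    return a, b


def conjugate_sum(n):
    """(2 + sqrt2)^n + (2 - sqrt2)^n; the sqrt2 parts cancel, leaving 2a."""
    a, _ = power_pair(n)
    return 2 * a


def gap_is_below(n, digits=6):
    """Decide f(n) = a - b*sqrt2 < 10**-digits using integers only.

    The inequality is equivalent to scale*a - 1 < scale*b*sqrt2, and both
    sides are positive for n >= 1, so squaring preserves it.
    """
    a, b = power_pair(n)
    scale = 10 ** digits
    return (scale * a - 1) ** 2 < 2 * (scale * b) ** 2


def smallest_exponent(digits=6):
    """Smallest n >= 1 with (2 - sqrt2)^n < 10**-digits."""
    n = 1
    while not gap_is_below(n, digits):
        n += 1
    return n


if __name__ == "__main__":
    print(power_pair(15))        # (49926400, 35303296)
    print(smallest_exponent())   # 26
```

Such a check remains quasi-empirical for any finite range of $$n$$; it is the binomial argument that covers every positive integer.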
It is possible (though highly unlikely) that such patterns arise simply by chance, and this calls the Reliability of the corresponding inference into question. Such non-deductive evidence thus fails to compel assent rationally with the same authority as the proof that was later supplied: if someone did not share our intuitions about the conclusiveness of the data we would be at a loss as to how to convince them. So if a result were to be published on this kind of evidence alone there might be issues with Consensus as well as Reliability. Though often I must settle on a fixed conjecture before looking for a proof, in the context of Private Acceptance such challenges do not arise. If I am wrong, only my own time and energy will be wasted; if I am correct, only the final proof will be given and the heuristic work will go unpublished.

4. THE GOLDBACH CONJECTURE

... every even integer is a sum of two primes. I regard this as a completely certain theorem, although I cannot prove it. Euler (as quoted in [Narkiewicz, 2000, p. 333])

Our second case study is a claim of some historical significance that has fascinated mathematicians for centuries.³

Goldbach Conjecture (GC). Let $$n$$ be an even integer greater than 2. Then $$n$$ can be expressed as the sum of two (not necessarily distinct) primes.

Since its emergence from the Goldbach-Euler correspondence, GC has been subject to quasi-empirical investigation on a massive scale. Several mathematicians have checked it for large initial segments of the natural numbers, and in 2013 a lower bound of $$4\times10^{18}$$ for any counterexample was given [Silva et al., 2013]. Despite all of this inductive evidence it seems that a proof is not forthcoming: number theorist and Fields medalist Alan Baker stated in a 2000 interview that ‘It is unlikely that we will get any further without a big breakthrough. Unfortunately there isn’t such a big idea on the horizon’ [Ahuja, 2000].
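To make the notion of checking GC on an initial segment concrete, here is a minimal sketch. It is not the method of the large-scale verifications cited above; it is a plain sieve-based search with hypothetical function names, suitable only for small bounds.

```python
from math import isqrt


def prime_sieve(limit):
    """Return a bytearray s with s[k] == 1 exactly when k is prime (0 <= k <= limit)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime (assumes limit >= 1)
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # Strike out the multiples of p, starting from p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve


def goldbach_witness(n, sieve):
    """Return primes (p, q) with p <= q and p + q == n, or None if none exist."""
    for p in range(2, n // 2 + 1):
        if sieve[p] and sieve[n - p]:
            return p, n - p
    return None


def first_failure(limit):
    """Smallest even n in [4, limit] with no witness, or None if all pass."""
    sieve = prime_sieve(limit)
    for n in range(4, limit + 1, 2):
        if goldbach_witness(n, sieve) is None:
            return n
    return None


if __name__ == "__main__":
    print(goldbach_witness(100, prime_sieve(100)))  # (3, 97)
    print(first_failure(10_000))                    # None
```

A result of `None` from `first_failure` says only that no counterexample lies below the chosen bound, which is precisely the kind of inductive evidence under discussion.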
As mentioned earlier, some mathematicians have nevertheless yielded Private Acceptance to GC. Indeed, Echeverria [1996] has claimed that ‘the certainty of mathematicians about the truth of GC is complete’. And in 1922 the eminent number theorists John Littlewood and G. H. Hardy were willing to assert that ‘there is no reasonable doubt that the theorem is correct’ before the conjecture had even been checked up to $$10^{5}$$. GC thus provides another example of the extension of non-deductive methods to the context of justification, one endorsed by a substantial number of mathematicians.

The data gathered by Silva and others do seem convincing, but we should consider how they might be worked into a more coherent and systematic argument to establish the truth of GC: as we have noted already, attempts to prove a false conjecture may result in wasted time and energy. We know that an admittedly huge initial run of the even natural numbers has been checked and is consistent with GC. What general principle could be invoked to move us from these data to the truth of the conjecture itself? One line of thought which might come naturally here is that if there were a special property belonging to some integers, possession of which by an integer entailed that it could not be split into the sum of two primes, then integers with this property would have already been encountered by now: surely a sample of size $$4\times10^{18}$$ is sufficiently representative. However, consider the following problem.⁴

Problem 2. Is it true that $$991n^{2}+1$$ is never a perfect square?

Suppose that we had calculated this expression for a huge initial segment of the natural numbers, well beyond Silva’s investigation for GC (every value up to $$10^{25}$$, say) and the expression was never a square. According to the same reasoning as above, this would suffice to demonstrate the conjecture that no such numbers exist.
But this conjecture is actually false: the first counterexample is $$n=12,055,735,790,331,359,447,442,538,767\approx1.2\times10^{28}$$. Moreover, this is no isolated example: historically there have been many other plausible-seeming claims in number theory that have turned out to have only huge smallest counterexamples. Here are three others:

In 1769, Euler conjectured that for all integers $$n$$ and $$k$$ greater than 1, if the sum of $$n$$ $$k^{\text{th}}$$ powers of non-zero integers is itself a $$k^{\text{th}}$$ power, then $$n$$ is greater than or equal to $$k$$ [Dickson, 1952, Vol. 2, pp. 658 f.]. A counterexample for $$k=5$$ was found in 1966: it is now easy to verify via computer that $$144^{5}=27^{5}+84^{5}+110^{5}+133^{5}$$ [Lander and Parkin, 1966]. A counterexample for $$k=4$$ is $$422481^{4}=95800^{4}+217519^{4}+414560^{4}$$ [Frye, 1988].

In 1885, Thomas Joannes Stieltjes conjectured in a letter to Hermite that the following claim was true. Define the Mertens function as $$M(n)=\sum_{1\le k\le n}\mu(k)$$, where $$\mu(k)$$ is the Möbius function⁵ [Stieltjes, 1885]. Then for all $$n>1$$, $$|M(n)|<\sqrt{n}$$. This conjecture is known to imply the Riemann Hypothesis, but in 1985 the existence of a counterexample between $$10^{14}$$ and $$e^{(3.21\times10^{64})}$$ was proved by Andrew Odlyzko and Herman te Riele [1985].

In 1919, George Pólya conjectured that at least half of the natural numbers less than any given natural number $$n$$ have an odd number of prime factors when counted with multiplicity. The conjecture was disproved by Colin Brian Haselgrove [1958, p. 145]. The first explicit counterexample $$n=906,180,359$$ was given by Russell Sherman Lehman [1960]. The smallest counterexample is $$n=906,150,257$$, found by Minoru Tanaka [1980].

Some of these conjectures or others like them could be relevant to the truth of GC: the suggestion is at least not obviously implausible.
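The arithmetic behind two of these counterexamples can be checked with exact integers. The following Python sketch is an illustration only and is not taken from the paper. The function names `is_perfect_square` and `pell_fundamental_solution` are hypothetical, and the continued-fraction method for the Pell equation $$x^{2}-991n^{2}=1$$ is a standard textbook technique, assumed here as one way of locating the smallest positive $$n$$ for which $$991n^{2}+1$$ is a square.

```python
from math import isqrt
from typing import Tuple


def is_perfect_square(m: int) -> bool:
    """Exact integer test: True when m is a perfect square."""
    if m < 0:
        return False
    r = isqrt(m)
    return r * r == m


def pell_fundamental_solution(d: int) -> Tuple[int, int]:
    """Smallest positive (x, y) with x*x - d*y*y == 1, for non-square d > 0.

    Walks through the continued-fraction convergents h/k of sqrt(d),
    using exact integer arithmetic throughout.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0
    h_prev, h = 1, a0  # numerators of successive convergents
    k_prev, k = 0, 1   # denominators of successive convergents
    while h * h - d * k * k != 1:
        m = q * a - m
        q = (d - m * m) // q
        a = (a0 + m) // q
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


if __name__ == "__main__":
    # Problem 2: no small n makes 991*n^2 + 1 a square ...
    assert not any(is_perfect_square(991 * n * n + 1) for n in range(1, 10_000))
    # ... yet the Pell equation x^2 - 991*n^2 = 1 does have a solution.
    x, n = pell_fundamental_solution(991)
    assert is_perfect_square(991 * n * n + 1)
    print("smallest n:", n)

    # The two counterexamples to Euler's conjecture quoted in the text.
    assert 27**5 + 84**5 + 110**5 + 133**5 == 144**5
    assert 95800**4 + 217519**4 + 414560**4 == 422481**4
```

A check of this kind confirms only that particular numbers are counterexamples. It does nothing to show how far one would have had to search in order to find them by enumeration.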
These examples and others raise a concern about the Reliability of conjectures established on such a basis alone and make it seem unlikely that there is any such principle that will enable us to complete the argument. Indeed, because number-theoretic properties of the integers are not in general uniformly distributed (the prime numbers have asymptotically zero density, for example), they are not conducive to this kind of treatment, and there is prima facie no special reason to assume the property of satisfying GC to be an exception. We can find arbitrarily long sequences of consecutive integers with no primes: $$(n+1)!+2,(n+1)!+3,\ldots,(n+1)!+(n+1)$$ is a run of $$n$$ composite integers for any natural number $$n$$. Yet knowing the existence of such a run of consecutive composite numbers clearly would not warrant the conclusion that no prime numbers exist at all. Here it is suggested that a run of consecutive integers that are not counterexamples to GC warrants the belief that no such counterexamples exist. In this case, the run of integers also forms an initial segment of the natural numbers, but it is not clear why this should be particularly indicative: our data may again be highly biased.

One potential objection to what I have said so far is to note that the account I have given of the evidence is unfair. Confidence in the result may actually have increased when it was pointed out that the number of ways of expressing $$n$$ as the sum of two primes, the Goldbach partition function $$G(n)$$, appears to increase with $$n$$. Indeed, if we plot $$G(n)$$ for the first $$10^{6}$$ values of $$n$$ a beautiful pattern emerges (see Figure 1). But this evidence is really no more conclusive, despite the visually pleasing and compelling regularity of the pattern. It is not clear that $$G(n)$$’s being large mitigates the possibility that $$G(n+1)=0$$, and the claim that the pattern will continue is itself only supported inductively.
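To make the two constructions in this passage concrete, here is a minimal Python sketch, offered as an illustration rather than as the method behind Figure 1 or the computations of Silva and others. It assumes that $$G(n)$$ counts unordered pairs of primes (the text does not say whether order matters), and the names `prime_sieve`, `goldbach_partitions`, `first_goldbach_failure`, and `composite_run` are hypothetical.

```python
from math import factorial, isqrt
from typing import List, Optional


def prime_sieve(limit: int) -> List[bool]:
    """Sieve of Eratosthenes: entry i is True exactly when i is prime (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime


def goldbach_partitions(n: int, is_prime: List[bool]) -> int:
    """G(n): number of unordered prime pairs (p, q), p <= q, with p + q == n.

    Assumes the sieve covers every integer up to n.
    """
    return sum(1 for p in range(2, n // 2 + 1) if is_prime[p] and is_prime[n - p])


def first_goldbach_failure(limit: int) -> Optional[int]:
    """Smallest even n in [4, limit] with G(n) == 0, or None if there is none."""
    is_prime = prime_sieve(limit)
    for n in range(4, limit + 1, 2):
        if goldbach_partitions(n, is_prime) == 0:
            return n
    return None


def composite_run(n: int) -> List[int]:
    """The n consecutive integers (n+1)!+2, ..., (n+1)!+(n+1).

    Each term (n+1)!+j is divisible by j, since 2 <= j <= n+1 divides (n+1)!.
    """
    base = factorial(n + 1)
    return [base + j for j in range(2, n + 2)]


if __name__ == "__main__":
    sieve = prime_sieve(100)
    print([goldbach_partitions(n, sieve) for n in range(4, 31, 2)])
    print(first_goldbach_failure(10_000))
    print(composite_run(5))
```

The search in `first_goldbach_failure` is exactly the kind of enumeration whose evidential force is in question here: it can report a counterexample, but an empty result says nothing about integers beyond `limit`.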
It appears, then, that this additional evidence will be of no help in constructing a convincing argument for GC from the available data. Yet the opinions of Littlewood and Hardy on matters concerning number theory are not to be dismissed lightly. How best can we account for their confidence? As an attempt at a complete elucidation of the grounds upon which Littlewood, Hardy, and other mathematicians have come to believe GC, the presentation of the evidence I have given so far is highly misleading, and indeed necessarily so. For Littlewood’s and Hardy’s confidence will not have been grounded in explicit inference from inductive evidence based on some general theoretical principle. Rather, they will have come to an intuitive judgement about the plausibility of the conjecture in light of all the available evidence they possessed: a judgement which will itself have been informed and influenced by many other things they had known, believed and experienced up to that point in their mathematical careers. And the same is true for any contemporary number theorist who now believes the truth of GC. We give three examples of related theorems, prior knowledge of which might affect our assessment of the import of the quasi-empirical data.

Fig. 1. A graph of $$G(n)$$ for values of $$n$$ between 4 and 1,000,000. The x-axis labels ‘$$a$$e+0b’ mean $$a\times 10^b$$. (By ‘Reddish’ at the English-language Wikipedia)

Theorem 1. (Ternary Goldbach Conjecture) Let $$n$$ be an odd integer with $$n>5$$. Then $$n$$ is the sum of three (not necessarily distinct) primes.
[Helfgott, 2013]

This is sometimes also known as the Weak Goldbach Conjecture, because (as Euler knew) it is a simple corollary of GC (if $$n$$ is odd and $$n>5$$, and if GC is true, then $$(n-3)$$ must be the sum of two primes). Although the proof was not completed until 2013, Littlewood and Hardy were themselves able to show in 1923 that given the generalized Riemann Hypothesis the conjecture was true for sufficiently large numbers, and it is likely that this achievement would have increased their confidence in the stronger result GC. [Hardy and Littlewood, 1923]

Theorem 2. Let $$n$$ be an even integer greater than 2. Then $$n$$ is the sum of a prime and a second number that is the product of at most two primes. [Chen, 1966]

Clearly, this result by Chen Jingrun is progress from another direction: it seems that mathematicians are homing in on the full result GC.

Theorem 3. The set of even integers that cannot be represented as the sum of two primes has asymptotic density $$0$$ [Pomerance and Crandal, 2001, p. 17].

This claim shows that counterexamples must eventually become very sparse over a long enough initial segment. Yet its exact implications remain unclear: have we now checked a sufficient number of cases for the evidence to be compelling? Viewing mathematicians’ attitudes towards non-deductive evidence as stemming from intuitive judgements based on professional experience means that we need not commit them to any dubious general principle about all number-theoretic claims inviting belief when they are verified for a large enough initial segment of the natural numbers. Yet it also seems to preclude the presentation of the available evidence in the form of an explicit argument that is compelling unto itself. It is impossible for the whole evidential and psychological history of an intuitive judgement to be externalised from the mathematician making it and published alongside the conjecture.
The Public Acceptance of GC on this basis alone would therefore undermine Autonomy. We have also seen a glimpse of the wide range of different factors that might affect our intuitive judgements when a sizeable number of researchers direct their attention to a problem. So again there may be issues with Consensus as different experts weigh in about how convincing the data are. There will also always be further insights available (related conjectures, more data, or theoretical considerations), and these may cause judgements about the evidence to shift back and forth over time, damaging Permanence. In conclusion, though some specialist mathematicians have yielded Private Acceptance to the conjecture on the basis of the non-deductive evidence alone, consideration of the practical virtues reveals that Public Acceptance of the result is not yet appropriate, as has indeed been borne out in practice.

5. PUBLIC ACCEPTANCE WITHOUT PROOF

In this section I argue for the more general conclusion that the use of non-deductive methods in the context of Public Acceptance would (with some possible exceptions) likewise lead to a general decline in the extent to which mathematics exhibits the four practical virtues. I have argued for the importance of these ideals for mathematical research in general. If I am right about this, mathematicians have available good reasons for their collective decision to reject these non-deductive methods in the context of Public Acceptance, through appealing to the need for mathematical enquiry as a whole to continue in good working order. This remains true even though they may in some cases be warranted in Privately Accepting claims on such grounds alone. Many conjectures for which apparently convincing evidence is available have turned out to be false, and we have seen some examples above.
If these conjectures had been Publicly Accepted, then clearly the mathematical literature would have become less Reliable as a result, and Permanence would have suffered too when counterexamples or disproofs were uncovered. If journals were to permit mathematicians to assert results unqualifiedly on the basis of non-deductive evidence alone, it would be critically important that the non-deductive data supplied provide sufficient evidence to justify the mathematical results they are to support. We have also seen, however, that judgements about whether a body of evidence warrants regarding a result as established are affected by both experience and background knowledge. Such judgements are therefore always liable to be disputed, even once all the available information is taken into account, and this will damage Consensus. Similarly, as time goes on and new evidence appears, individual mathematicians may change their minds about whether a conjecture has been adequately supported, which could damage Permanence. A related point concerns Autonomy. Davis et al. [2012, p. 411] claim that the non-deductive evidence available for the Riemann Hypothesis is ‘so strong that it carries conviction even without rigorous proof’. Perhaps they are warranted in making this claim, but how could a non-specialist decide either way? In these kinds of cases, the explicit mathematical justification given is not in and of itself sufficient to compel assent, and the arguments are completed only through the intuitive judgements of the specialists. But whether such intuitive judgements are justified in a particular case is never ascertainable by an external observer, because the ultimate evidential, experiential, and psychological basis of the judgement is always ultimately inscrutable. Justification of the claim therefore rests in part on their authority, undermining both aspects of Autonomy. It is therefore important to render the means by which such evidence is assessed clear and explicit. 
In order to preserve the four practical virtues, publishing mathematicians would need to provide a systematic, repeatable way of ascertaining that the body of non-deductive evidence they have supplied is in fact adequate. Such techniques must be sufficiently stringent to remove any issues about Reliability, and the verdicts they render must be stable over time so as to maintain Permanence. To preserve Autonomy, it must be such that any competent mathematician can apply it: this will require it to be fully transparent and given by a definite procedure. Lastly, in order to be suitably precise and to maintain Consensus, the verdict rendered must be precise and unequivocal. Such quantitative relations between evidence and hypothesis are of course the province of the theory of probability. It is also clear that the existence of a technique of assessment meeting all of these requirements will only be possible if our non-deductive methods themselves are delineated sufficiently sharply. Otherwise it is impossible to be systematic and we are back to the kind of skill embedded in specialists’ intuition. Yet non-deductive techniques are often merely opportunistic investigations that yield data construed as evidence only in hindsight: not really methods as such, in the sense of a definite procedure that can be applied to many problems. What would be required for adequate justification, then, is a definite algorithm whose reliability can be determinately evaluated using probabilistic techniques. Such methods do exist and find application in many different areas of mathematics. One such example — a procedure for finding prime numbers based on the Rabin-Miller algorithm — is discussed by the philosopher Don Fallis [2002]. He argues that mathematicians have not presented sufficient grounds for the rational rejection of such algorithms. 
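For readers unfamiliar with the algorithm mentioned here, the following Python sketch shows the core of a Rabin-Miller (Miller-Rabin) primality test. It is a generic textbook version and is not claimed to be the exact procedure Fallis discusses. The function name `is_probable_prime` and the parameters `rounds` and `rng` are hypothetical choices made for illustration. The relevant standard fact is that a composite input survives a single random round with probability at most 1/4, so it survives `rounds` independent rounds with probability at most $$4^{-\text{rounds}}$$, while a prime input is never rejected.

```python
import random
from typing import Optional

_SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n: int, rounds: int = 20,
                      rng: Optional[random.Random] = None) -> bool:
    """Miller-Rabin test: False means n is certainly not prime,
    True means n passed every round and is prime with high probability."""
    if n < 2:
        return False
    # Trial division by a few small primes settles all small inputs.
    for p in _SMALL_PRIMES:
        if n == p:
            return True
        if n % p == 0:
            return False
    # Here n is odd and larger than 37. Write n - 1 = 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    rng = rng or random.Random()
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base with 2 <= a <= n - 2
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # this base reveals nothing; try another
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # the base a witnesses that n is composite
    return True


if __name__ == "__main__":
    print(is_probable_prime(2**61 - 1))  # a known Mersenne prime
    print(is_probable_prime(41 * 43))    # composite with no small prime factor
```

Raising `rounds` tightens the error bound, and anyone can rerun the test with fresh random bases, which is what makes the procedure explicit and repeatable in the sense required above.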
And indeed, with this procedure the probability of error can be made less than any preassigned bound, and the testing phase repeated by any mathematician who might doubt the veracity of the result. It therefore seems we can imagine an alternative but equally robust mathematical culture where such non-deductive techniques are permitted in the context of Public Acceptance, and yet where the practical virtues flourish. Responding to this suggestion is a challenge for another time.

6. CONCLUSION

In this paper I have argued that mathematicians can cite reasons for their strict insistence on proof and consequent rejection of (nearly all) non-deductive techniques in the context of Public Acceptance, even when the available evidence seems to leave no room for doubt. But are these sufficient reasons? Is it not true that other fields manage adequately without the practical virtues? Could not mathematical enquiry better flourish by relaxing them in order to make room for a wider range of permitted techniques? And what is it for an intellectual discipline to ‘flourish’, anyway? It is clear there is more work to be done here: this brief sketch will need to be supplemented with historical enquiry into the emergence of the practical virtues and more detailed sociological enquiry into their role in facilitating the success of mathematics (which I take to outstrip other disciplines where they are not present or present only to a lesser degree, including the natural sciences). One useful case study may be the collapse of the Italian school of algebraic geometry, which moved from a conventional period of insistence on proof under the leadership of Castelnuovo, through a shift towards intuitive arguments and consequent loss of Autonomy and then Consensus under the authority of Enriques, culminating in a catastrophic loss of Reliability under Severi.
Another would be to investigate within this framework the motivations for, and consequences for practice of, the rigorisation of the calculus in the nineteenth century, whereby the powerful collection of techniques and results developed by Newton, Gauss, Lagrange, Euler, and others was put on a firm intellectual foundation by Cauchy, Bolzano, and Weierstrass, so that a return to Euclidean standards of proof for Public Acceptance could take place (for an interesting theory, see [Grabiner, 1974]). I do however hope to have made a start in the right direction, and further that the considerations I have appealed to in this paper will be recognisable to working mathematicians themselves. Likewise, though I have offered no systematic account of what it is for mathematical research to flourish, the practical virtues themselves provide a starting point here. Again in direct parallel to moral theory, where Aquinas in his discussion of Augustine’s account of the virtues explains how they are both a means to and a constitutive part of the good life for human beings, our practical virtues also express part of what it is for a discipline to be in good working order (see [MacIntyre, 2011, p. 215]). The practical virtues are not sufficient for a research community to flourish, however: a research group that fails to prove or discover any important conjectures can be said to enjoy only a modest degree of success.⁶

Through investigating practical and communal aspects of research rather than direct evidential value alone, I have concluded that different standards are appropriate in the context of Public Acceptance compared with what is required for rational justification of Private Acceptance. This I also hope to be a distinction with which practitioners are familiar.
In practice, mathematicians often rely on a disparate variety of evidence in coming to believe they have found the correct answer before even beginning to look for explicit argumentation suitable for public consumption. Through non-deductive reasoning they arrive, though not by mathematics, at the truth.

REFERENCES

Ahuja Anjana [2000]: ‘A million-dollar maths question’, The Times, March 16, 2000.
Annas J. [1993]: The Morality of Happiness. Oxford University Press.
Azzouni J. [2006]: ‘How and why mathematics is unique as a social practice’, in Hersh R., ed., 18 Unconventional Essays on the Nature of Mathematics, pp. 201–219. Springer.
Baker A. [2009]: ‘Non-deductive methods in mathematics’, Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/mathematics-nondeductive/ Accessed August 2015.
Chen J. [1966]: ‘On the representation of a large even integer as the sum of a prime and the product of at most two primes’, Kexue Tongbao 11, 385–386.
Davis P., Hersh R., and Marchisotto E. [2012]: The Mathematical Experience: Study Edition. Boston: Birkhäuser.
De Villiers M. [2004]: ‘The role and function of quasi-empirical methods in mathematics’, Canadian Journal of Science, Mathematics and Technology Education 4, 397–418.
Debnath L. and Bhatta D. [2006]: Integral Transforms and Their Applications. London: Chapman and Hall/CRC.
Dickson L.E. [1952]: History of the Theory of Numbers. New York: Chelsea.
Echeverria J. [1996]: ‘Empirical methods in mathematics: A case-study: Goldbach’s conjecture’, in Munévar G., ed., Spanish Studies in the Philosophy of Science, pp. 19–55. Boston: Kluwer.
Fallis D. [2002]: ‘What do mathematicians want? Probabilistic proofs and the epistemic goals of mathematicians’, Logique et Analyse 45, 373–388.
Frege G. [1980]: Foundations of Arithmetic. John Austin, trans. Oxford: Basil Blackwell.
Frye R. [1988]: ‘Finding $$(95800)^{4}+(217519)^{4}+(414560)^{4}=(422481)^{4}$$ on the connection machine’, Proceedings of Supercomputing 88, 106–116.
Gettier E. [1963]: ‘Is justified true belief knowledge?’ Analysis 23, 121–123.
Grabiner J. [1974]: ‘Is mathematical truth time-dependent?’ American Mathematical Monthly 81, 354–365.
Grünbaum B. [1993]: ‘Quadrangles, pentagons, and computers’, Geombinatorics 3, 4–9.
Hardy G.H. and Littlewood J. [1923]: ‘On some problems of “Partitio Numerorum” III: On the expression of a number as the sum of primes’, Acta Mathematica 44, 1–70.
Haselgrove C. [1958]: ‘A disproof of a conjecture of Pólya’, Mathematika 5, 141–145.
Helfgott H. [2013]: ‘The ternary Goldbach conjecture is true’, arXiv:1312.7748.
Kuhn T. [1996]: The Structure of Scientific Revolutions. 3rd ed. Chicago: University of Chicago Press.
Lakatos I. [1976]: Proofs and Refutations. Worrall J. and Zohar E., eds. Cambridge University Press.
Lander L.J. and Parkin T.R. [1966]: ‘Counterexample to Euler’s conjecture on sums of like powers’, Bulletin of the American Mathematical Society 72, 1079.
Lehman R. [1960]: ‘On Liouville’s function’, Mathematics of Computation 14, 311–320.
MacIntyre A. [2011]: After Virtue. London: Bloomsbury.
MacIntyre A. [2013]: ‘On having survived the academic moral philosophy of the twentieth century’, in O’Rourke F., ed., What Happened in and to Moral Philosophy in the Twentieth Century, pp. 17–34. Notre Dame: Notre Dame University Press.
Narkiewicz W. [2000]: The Development of Prime Number Theory: From Euclid to Hardy and Littlewood. Springer-Verlag.
Odlyzko A. and te Riele H. [1985]: ‘Disproof of the Mertens conjecture’, Journal für die reine und angewandte Mathematik 357, 138–160.
Pólya G. [1919]: ‘Verschiedene Bemerkungen zur Zahlentheorie’, Jahresbericht der Deutschen Mathematiker-Vereinigung 28, 31–40.
Pólya G. [1954]: Mathematics and Plausible Reasoning. Vol. I. Princeton University Press.
Pomerance C. and Crandal R. [2001]: Prime Numbers: A Computational Perspective. Springer.
Rotman J. [1998]: Journey into Mathematics: An Introduction to Proofs. Upper Saddle River, New Jersey: Prentice Hall. Reprinted by Dover, 2006.
Russell B. [2009]: ‘A liberal decalogue’, in Autobiography. London: Routledge.
Silva T.O., Herzog S., and Pardi S. [2013]: ‘Empirical verification of the even Goldbach conjecture and computation of prime gaps up to $$4\times10^{18}$$’, Mathematics of Computation 83, 2033–2060.
Solomon R. [2001]: ‘A brief history of the classification of finite simple groups’, Bulletin of the American Mathematical Society 38, 315–352.
Stieltjes T.J. [1885]: ‘Lettre à Hermite de 11 Juillet 1885’, in Baillaud B. and Bourget H., eds, Correspondance d’Hermite et de Stieltjes, pp. 160–164. Paris: Gauthier-Villars, 1905.
Tanaka M. [1980]: ‘A numerical investigation on cumulative sum of the Liouville function’, Tokyo Journal of Mathematics 3, 187–189.
Thurston W. [1995]: ‘On proof and progress in mathematics’, For the Learning of Mathematics 15, 29–37.
Ulam S. [1992]: Adventures of a Mathematician. Oakland: University of California Press.
Wiener N. [1915]: ‘Is mathematical certainty absolute?’ The Journal of Philosophy, Psychology and Scientific Methods 12, 568–574.

Footnotes

† I would like to thank Marcus Giaquinto for his patience and dedication as my supervisor at UCL, and for the benefit of his knowledge and experience since. Without his help the exposition of my argument would have been far less clear. Thanks are also due to the friends and acquaintances who looked over the manuscript in some form, including Jody Azzouni, David Corfield, James Cranch, Ernest Davis, Luke Fenton-Glynn, Elvijs Sarkans, and José Zalabardo. Lastly, I would like to thank Susi Fritscher for her continued love and support.
1 In the 1970s, Stanislaw Ulam [1992, p. 288] estimated that around 100,000 theorems were proved each year, a figure later refined to nearer 200,000.
2 Taken from the United Kingdom Mathematical Trust Senior Mentoring Scheme, 2013–2014, Sheet 1, Problem 8.
3 This section is indebted to a discussion by Alan Baker [2009], though we differ on some points. By coincidence, the author of that article (a philosopher) has the same name as the mathematician quoted below.
4 Adapted from [Rotman, 1998, p. 20]. See also [De Villiers, 2004, p. 412].
5 I.e., $$\mu(k)=0$$ if $$k$$ is a multiple of a square number other than 1, $$\mu(k)=1$$ if $$k$$ is square-free with an even number of prime factors, and $$\mu(k)=-1$$ if $$k$$ is square-free with an odd number of prime factors.
6 Conversely, research programmes may go into decline for reasons that are not related to them: for instance, the collapse of investigation into foliations in geometric topology following William Thurston’s [1995, p. 35] single-handedly proving all the interesting results in that area.

© The Author [2016]. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Philosophia Mathematica, Oxford University Press

Philosophia Mathematica, Volume 26 (1) – Feb 1, 2018
19 pages

Publisher
Oxford University Press
ISSN
0031-8019
eISSN
1744-6406
D.O.I.
10.1093/philmat/nkw022

### Abstract

ABSTRACT This paper investigates an important aspect of mathematical practice: that proof is required for a finished piece of mathematics. If follows that non-deductive arguments — however convincing — are never sufficient. I explore four aspects of mathematical research that have facilitated the impressive success of the discipline. These I call the ‘Practical Virtues’: Permanence, Reliability, Autonomy, and Consensus (PRAC). I then argue that permitting results to become established on the basis of non-deductive evidence alone would lead to their deterioration (with some possible exceptions). This furnishes us with a partial rational justification for mathematicians’ strict insistence on proof. 1. PROOF AND PUBLICATION Even in the mathematical sciences, our principal instruments to discover the truth are induction and analogy. Laplace (as quoted in [Pólya, 1954, p. 35]) In the Theory of Numbers, it happens rather frequently that by some unexpected luck, the most elegant new truths spring up by induction. Gauss (as quoted in [Pólya, 1954, p. 59]) [T]he properties of the numbers known today have been mostly discovered by observation, and discovered long before their truth has been confirmed by rigid demonstrations. Euler (quoted in [Pólya, 1954, p. 3]]). Non-deductive techniques of various kinds — inductive, experimental, visual, analogical — have always had a number of important roles within mathematics. One central use is in the discovery of plausible conjectures that are later proved, and below we will see that they can help us discover proofs too. Non-deductive techniques are also a useful check on attempts at proof. Focusing on finished pieces of mathematics may lead us to overlook this importance, however: usually only the deductive proof is published and the heuristic work is simply discarded. Indeed, I will claim that mathematicians always require proof for the acceptance of mathematical claims. But what is meant by ‘acceptance’ here? 
We distinguish two distinct meanings. Private Acceptance. Personal belief on behalf of individual mathematicians. Public Acceptance. An established theorem that is now eligible for unqualified assertion in peer-reviewed journals and other serious mathematical publications. In Section 4 I discuss the famous Goldbach Conjecture, which we shall see that many mathematicians believe to be true on the basis of the vast amount of inductive evidence in its favour, even though a proof has not been found. So mathematicians do sometimes Privately Accept results in the absence of proof. However, proof is required for Public Acceptance, which is therefore the intended sense. This insistence on proof has long been reflective of the feelings of mathematicians. Euler writes, ‘we should take great care not to accept as true such properties of the numbers which we have discovered by observation and which are supported by induction alone’ (as quoted in [Pólya, 1954, p. 3]. Frege later comments that ‘in mathematics a mere moral conviction, supported by a mass of successful applications, is not good enough’ [Frege, 1980, p. 1]. More recently, Michael De Villiers writes emphatically that ‘Nobody, today, can really be considered mathematically educated or literate, if he or she is not aware of the insufficiency of quasi-empirical evidence to guarantee truth in mathematics, no matter how convincing that evidence may seem’ [De Villiers, 2004, p. 412]. It is clear that having a proof provides us with a special kind of a priori warrant for believing its conclusion. However, Private Acceptance in the absence of proof is in fact fairly common, both historically and in contemporary mathematics. So why not extend this to Public Acceptance, too? After all, mathematicians Publicly Accept results for which only very long and complex proofs are available. The first proof of the Group Classification Theorem totaled over 5,000 pages, for example; Solomon writes ‘Is the database correct? 
Is there a 27th sporadic simple group? I seriously doubt it, but it would be chutzpahdich to assert that a 5000-page 40-year human endeavor is beyond the possibility of human error’ [Solomon, 2001, p. 347]. The epistemic position of a mathematician who has amassed a very strong body of non-deductive evidence might therefore compare favourably. Moreover, this suggestion is not of interest only to philosophers: some mathematicians have actually recommended that the standards for Public Acceptance should be relaxed in this way. Consider the remarks of mathematician Branko Grünbaum, who used the computer application Mathematica to ‘explore and verify’ some geometric results: Do we start trusting numerical evidence (or other evidence produced by computers) as proofs of mathematics theorems? ... if we have no doubt — do we call it a theorem? ... I do think my assertions are theorems ... the mathematical community needs to come to grips with new modes of investigation that have been opened up by computers [Grünbaum, 1993, p. 8]. In the rest of this paper I will, however, present reasons why non-deductive arguments are unsuitable as justification in the context of Public Acceptance (with some possible exceptions, to be discussed in the conclusion). Rather than beginning from the individual epistemic state of an enquirer I will instead take a broadly Aristotelian approach, focusing on the practical virtues: four sociological features of mathematical research and the relationships researchers enjoy with the wider community. These we explore in the following section. I later argue that if mathematicians were to Publicly Accept results on the basis of most kinds of non-deductive evidence, these valuable features of practice would be undermined. 2. THE PRACTICAL VIRTUES In this section I will outline the four practical virtues: Permanence, Reliability, Autonomy, and Consensus (PRAC). 
These are empirical descriptive traits that the practice of mathematics exhibits to a high degree — though not absolutely. Later I shall argue that non-deductive evidence is not in general conducive to these virtues, and as we shall see in this section they are supported by mathematicians’ practice of insisting on proof for Public Acceptance. In order to move towards a complete rational justification of this insistence, I also sketch out how these practical virtues facilitate the flourishing of mathematics as a discipline: the progress of mathematical enquiry and the enormous success of the field. Permanence. When a result becomes Publicly Accepted, it retains this status indefinitely. In the natural sciences, the status of any hypothesis is always to some extent provisional. Researchers must always be open to the possibility that their most fundamental results will have to be revised in the light of new evidence. Historical examples abound: perhaps the most celebrated is Einstein’s discovery of the merely approximate and local character of Newtonian mechanics, the most outstanding scientific achievement of its age, and long revered as a paradigmatic instance of the certainty that scientific work could aspire to. Some current scientific theories, such as the central causal role of natural selection driving evolutionary change in biology, may seem so secure that the chances of their displacement are negligible. Yet we can always imagine future discoveries or experiments that might lead to this occurring. The status of an established mathematical result, on the other hand, is not like this: results that have become Publicly Accepted are expected to remain part of mathematics on a permanent basis. Moreover, this expectation is not mere hubris, but is grounded in strong historical precedent. 
The theorems of Eudoxus and Archimedes are still our theorems, though the justifications we give for them may be quite different, and ‘In most sciences one generation tears down what another has built, and what one has established, another undoes. In mathematics alone each generation adds a new storey to the old structure’ Herman Hankel (as quoted in [Debnath and Bhatta, 2006, p. 315]). One benefit of Permanence concerns the dependence on previously established theorems: mathematical arguments usually rely on many subsidiary results in deriving their conclusions. Of course, both philosophers and scientists of all kinds also cite other authors in support of their claims. But in mathematics this practice is extraordinarily effective, and without this reliance on auxiliary results research would be severely slowed down. However, if results were Publicly Accepted only for a limited time, it would be much harder to keep track of which articles are currently acceptable to cite. The Permanence of mathematics is also part of what enables a ‘deeper penetration into the subject-matter’ than in other fields of enquiry: because mathematical edifices are largely Permanent, over time they can develop to an immense complexity as every aspect of a given structure is scrupulously examined. Reliability. Publicly Accepted results are always true. This feature expresses the claim that the overwhelming majority of results mathematicians Publicly Accept are true. Reliability — often under the guise of the more traditional term ‘certainty’ — has long been part of the self-image of mathematics: Norbert Wiener writes ‘The place where most people would look for absolute certainty is in pure mathematics or logic. Indeed, “mathematical certainty” has become a byword’ [Wiener, 1915, p. 568]. But can we be sure that this belief in the Reliability of mathematics is warranted? 
We might initially think the observed fact of Permanence gives us sufficient grounds for Reliability, and that Publicly Accepted claims are rarely overturned simply because such results are rarely false. Taken together, Reliability and Permanence thus present an attractive picture of mathematical practice: due to the restriction to deductive proof, only true results are Publicly Accepted, and so over time only true — and hence incontrovertible — results are added to the structure. This view might be disputed, however. Although mathematical results that have been Publicly Accepted for a substantial period of time are almost never overturned, errors are actually quite often found in early drafts of attempted proof presentations given by mathematicians. So perhaps the Permanence of established mathematical results is not a sure guide to their Reliability, and obtains merely because such errors as might now remain are unlikely to be revealed by the checking process, or even because interest in checking them has simply subsided. Indeed, we might worry whether it is even possible to establish Reliability using only sociological modes of investigation: though it is a descriptive claim whose content concerns the realities of practice, it is also one that requires appeal to a non-empirical objective standard of truth in its application. Though Reliability is suggested by the observation of Permanence (and also Consensus, the fourth virtue, to be discussed below), to establish it in practice we must therefore go beyond sociological observation and consider the powerful mathematical arguments available for Publicly Accepted results. Not only is insistence on proof responsible for maintaining Reliability; it is also precisely the availability of proof that enables us as inquirers to see that Publicly Accepted mathematics is indeed Reliable.
For instance, it is unreasonable to claim that there might still be errors in every single discourse purporting to present what we surely all believe is a genuine proof of Euclid showing that there is an infinity of primes. Too many thinkers have internalised the proof and come up with their own novel presentations. This is not mere rote checking, but deep and intuitive understanding. For mathematics that has been widely circulated, internalised, and reconstituted, the denial of Reliability therefore gives expression only to a rather extreme form of scepticism. The benefits of a highly Reliable literature are of course immense. Consider again the reliance on previously proved theorems. Suppose we are reading an article in a reputable mathematical journal and come across the line ‘From [17], we know that all $$A$$s that are $$B$$s are also $$C$$s.’ The level of vigilance required by the reader here is very small even in comparison to the natural sciences: most of the time we would not fall into error if we simply accepted the result and moved on with the article without even looking up citation [17] in the bibliography (though if our interest is professional and we want to go on to use the result to prove something else this might be somewhat irresponsible). Conversely, if we are instead merely told that a claim has been made in print in a philosophy journal nothing like the same level of conviction on the part of the reader will follow. The Reliability of mathematics also means that natural scientists and others can take mathematical results ‘off the shelf’ without worrying about their veracity. They can regard the results they find in mathematics journals and textbooks as true simpliciter, and if their theories fail will rarely look to locate errors there. Autonomy. Competent researchers can always come to know Publicly Accepted results in an intellectually independent way, and publication is never permitted on the basis of trust or authority alone. 
This feature of mathematical practice has two components. Firstly, any mathematician can (without much effort) find his or her own explicit reasons for believing any Publicly Accepted claim. It is clear that insistence on proof helps to maintain this condition, as journals rightly believe the results they publish are Reliable and expect the arguments for them to be found convincing by all. Though proof discourses come in many forms, understanding a proof always enables any competent reader to know the conclusion in an intellectually independent way. It must be stressed that this aspect of the condition is essentially modal. It may be that no mathematician has personally scrutinised proof presentations for all or even a large proportion of the mathematical results they believe and regard as Publicly Accepted: indeed, having every mathematician check every new result every year would slow research down to a standstill. But even if this were possible it would be highly inefficient and hence undesirable.1 The second aspect of Autonomy is that generally speaking mathematicians do not publish results on the basis of personal authority alone. When journal referees judge that a printed argument constitutes a sufficient basis for publication, they never rely upon trust in the testimony of the publishing mathematician. It is also hard to see how this aspect of Autonomy could be adequately maintained without the restriction to deductive proof: if a result were supported by inductive evidence alone, we would be required to accept the judgement of the author that this evidence is conclusive. (This point will be discussed in more detail in Section 5.) Autonomy is clearly intellectually valuable for its own sake, and also helps to maintain the other virtues — particularly Consensus, which we discuss in the next section. Russell writes, ‘Have no respect for the authority of others, for there are always contrary authorities to be found’ [2009, p. 534]. 
Whilst a student is still learning to think mathematically, or to understand the use of some new concept, some reliance on the authority of a teacher may be necessary; results or techniques must be presented in some particular order, and the justification of a principle might take a lesson too far afield at the stage when it is first needed. But at some point in their intellectual development towards becoming independent mathematicians, students usually begin to insist on their own reasons for believing results they are taught. Of course, this is even truer of professional mathematicians. Consensus. There is a shared agreement as to which statements are Publicly Accepted. As we have just said, Consensus is enhanced by Autonomy: the realization that convincing arguments are available leads mathematicians to have faith in the published literature. This Consensus is thus spontaneous rather than coerced; this is what the philosopher Jody Azzouni has called ‘The benign fixation of mathematical practice’ [2006, p. 208]. It is also connected to the other three practical virtues as well. Consensus does not exactly imply Reliability — we could imagine a high level of agreement within a field whose findings were in fact highly suspect. But the high degree to which mathematicians exhibit Consensus is clearly connected to it: it is the acknowledgement of the Reliability of published research. And Consensus depends on Permanence, too: if results tended to move in and out of Public Acceptance over time this would likely lead to widespread disagreements during the transitional periods. We can also see how Consensus is in part a consequence of insistence on proof. Mathematics journals maintain very high standards for publication, with the warranted expectation that the articles they publish will be found convincing by the entire readership. Later we will see that this would be difficult to achieve with non-deductive evidence alone. 
To put into perspective the high degree of Consensus within mathematics, consider the range of disagreement amongst contemporary philosophers. In the analytic tradition alone today there are dualists, reductive and non-reductive materialists, and idealists in the philosophy of mind; consequentialists, Kantians, contractarians, virtue and natural-rights theorists, and non-cognitivists in ethics; formalists, constructivists, nominalists, and Platonists in the philosophy of mathematics, and so on for each subdiscipline. Moreover, these disagreements constitute the permanent condition of academic philosophy [MacIntyre, 2013, p. 18]. This lack of Consensus does not imply that philosophers do not have the resources to discover true answers to the questions they pursue. However, it does suggest that philosophers find it difficult to formulate arguments that are able to establish lasting agreement amongst all of their academic colleagues about the truth and rational justifiability of their particular philosophical positions. This remains the case even though consensus may be achieved in a negative direction; Gettier’s paper [1963] on the justified true belief account of knowledge is perhaps a notable example of this. In mathematics, however, such disagreement is rare. Consider the difference between an undergraduate course in moral philosophy on the one hand, where one can expect to learn about key historical figures such as Mill, Bentham, Kant, Aquinas, and Aristotle, the different conceptions of morality they have each articulated, and the arguments supporting their positions; and an undergraduate course in mathematical analysis on the other, where one can expect a concise and ahistorical presentation of what everyone agrees are the established theorems and true results pertaining to the central questions that arise in that area. 
Moreover, unlike in natural science where consensus is perhaps maintained only temporarily by a group of researchers operating under a shared paradigm in the course of what Thomas Kuhn [1996, p. 10] has called ‘normal science’, in mathematics the agreement reached is Permanent. The benefits of Consensus are again clear. Like the other virtues, it is important for the practice of citing other articles when invoking subsidiary results. If the auxiliary results needed were not agreed to be Publicly Accepted by all researchers, then the effectiveness of collaborative mathematical research would be reduced. It is not sufficient that the literature is highly Reliable and cited results are always in fact all true: this also needs to be acknowledged in practice. The fact that there is now a more or less permanent Consensus about which results are Publicly Accepted also means that mathematicians are free to pursue intensively specialised research and are not forced to spend time and energy warring with competing schools within the discipline. This is again similar to Kuhn’s account of the necessary conditions for ‘normal science’ to progress, although in mathematics such agreement is Permanent and not periodically fragmented by internal crises [Kuhn, 1996, p. 66]. In closing the discussion of Consensus, I will add a brief caveat to what has been said so far. Throughout history there have been disagreements of sorts: questions of how best to formulate concepts, which theorems are interesting, which objects or problems are the most important to study, aesthetic judgements and questions of explanatory value, which systems of axioms it is appropriate to work within, and so on. Consider also the rival attempts to rigorise the calculus: geometric, algebraic, arithmetic. 
However, apart from occasional exceptions, such as the disagreements with classical mathematics of intuitionists led by Brouwer about what constitutes an acceptable mathematical argument, which were in any case fringe, such disagreements are not directly about which results should be Publicly Accepted: in contrast to moral philosophy, rival geometric systems (Euclidean, hyperbolic, elliptic) can coexist peacefully. Once the starting points are clearly articulated, mathematicians working within one paradigm do of course recognise the results of other approaches as genuine theorems. Before moving on to the next section, I will briefly discuss the interdependence of the four practical virtues — a direct parallel of the well-trodden issue of the unity of the virtues in ancient ethical theory (bravery, temperance, justice, honesty), with many thinkers following Aristotle in opting for the strong thesis that one cannot fully have any individual virtue without possessing all the rest (for an excellent discussion, see [Annas, 1993, pp. 73–84]). In this context, we should distinguish purely logical relations between the practical virtues understood as abstract ideals, and empirical tendencies of perhaps only partially realised virtues to be mutually supporting at the level of practice (the sense in which they are intended to apply). Throughout this section, it has been suggested that the virtues are related — for instance, that Permanence is a consequence of Reliability. However, for our present purposes we actually do not need to decide whether any one virtue implies the others. It is clear that the benefits they bring to mathematical practice are distinct and separable, and so appeal to each of them will furnish mathematicians with a distinct justification for insistence on proof. 3. A CASE STUDY In this section we consider a concrete example of a computer-based heuristic approach to a simple mathematical problem. 
The computational power supplied by modern computers is in many respects vast: mathematicians can now easily perform calculations that would have been intractable a few decades ago. This has greatly enhanced the potential of non-deductive methods, undeniably moving us beyond mere discovery and into the context of justification. I nevertheless argue that yielding Public Acceptance to results solely on the basis of the kind of non-deductive evidence I will give here would diminish the extent to which the four Practical Virtues are realised in practice. Problem 1.2 Does there exist a positive integer, $$n$$, such that $$(2+\surd2)^{n}$$ differs from an integer by no more than $$10^{-6}$$? Let us begin by calculating the first few values of the expression and then look for a pattern. Examining these data (see Table 1, columns 1 and 3), we see that $$(2+\surd2)^{n}$$ is getting progressively closer to an integer from below at every step. Let us call the difference to the next integer up $$f(n)$$ and calculate a few values explicitly (see column 4). Looking at these figures, we notice that they seem to decrease by around the same proportion each time. This suggests we calculate the ratio between terms (see column 5). Now, examining column 5 we see that the ratio is always equal to the first term $$0.585786\dots$$. At this stage, I spotted that $$0.585786$$ is the beginning of the decimal expansion of $$(2-\surd2)$$. If this is in fact the exact value of both the first term in column 4 and the ratio of each pair of adjacent terms, then $$f(n)$$ must be equal to $$(2-\surd2)^{n}$$ for each positive integer $$n$$. We have arrived at the following conjecture: $$(2+\surd2)^{n}+(2-\surd2)^{n}$$ is an integer for all positive integers $$n$$. Table 1. 
Numbers for Problem 1

| $$n$$ | exact $$(2+\surd2)^{n}$$ | decimal $$(2+\surd2)^{n}$$ | $$f(n)$$ | $$\frac{f(n)}{f(n-1)}$$ |
| --- | --- | --- | --- | --- |
| 1 | $$2+\surd2$$ | 3.414214 | 0.585786 | |
| 2 | $$6+4\surd2$$ | 11.656854 | 0.343146 | 0.585786 |
| 3 | $$20+14\surd2$$ | 39.798990 | 0.201010 | 0.585786 |
| 4 | $$68+48\surd2$$ | 135.882251 | 0.117749 | 0.585786 |
| 5 | $$232+164\surd2$$ | 463.931024 | 0.068976 | 0.585786 |
| 6 | $$792+560\surd2$$ | 1583.959595 | 0.040405 | 0.585786 |
| 7 | $$2704+1912\surd2$$ | 5407.976331 | 0.023669 | 0.585786 |
| 8 | $$9232+6528\surd2$$ | 18463.986135 | 0.013865 | 0.585786 |
| 9 | $$31520+22288\surd2$$ | 63039.991878 | 0.008122 | 0.585786 |
| 10 | $$107616+76096\surd2$$ | 215231.995242 | 0.004758 | 0.585786 |
| 11 | $$367424+259808\surd2$$ | 734847.997213 | 0.002787 | 0.585786 |
| 12 | $$1254464+887040\surd2$$ | 2508927.998367 | 0.001633 | 0.585786 |
| 13 | $$4283008+3028544\surd2$$ | 8566015.999044 | 0.000956 | 0.585786 |
| 14 | $$14623104+10340096\surd2$$ | 29246207.999440 | 0.000560 | 0.585786 |
| 15 | $$49926400+35303296\surd2$$ | 99852799.999672 | 0.000328 | 0.585786 |

This claim can now be proved by
considering the binomial expansion of the two bracketed expressions. Terms (column 2) with an even power of $$\surd2$$ will themselves be integers, and terms with an odd power of $$\surd2$$ in the first expression will be cancelled out by corresponding terms in the second expression, which will be identical except for having a negative sign. So all we need now do to solve Problem 1 is to pick $$n$$ such that $$(2-\surd2)^{n}<10^{-6}$$. This can easily be done with logarithms ($$n=26$$ is the first solution). Using a computer, finding the solution to Problem 1 was rather easy and indeed almost mechanical. The only interesting parts were deciding to look at the ratio of successive terms and spotting the decimal expansion of $$2-\surd2$$. The quasi-empirical work also led us to discover the proof as well as the solution. This is reminiscent of a famous quote by Riemann, suggesting that getting to the answer is often the most difficult part: ‘If only I had the theorems — then I should find the proofs easily enough!’ (as quoted in [Lakatos, 1976]). However, this non-deductive work need not feature in the final presentation of the argument, which may simply begin from the observation that $$(2+\surd2)^{n}+(2-\surd2)^{n}$$ is an integer for all positive integers $$n$$. The non-deductive evidence also moved us beyond merely discovering a conjecture that was sufficiently plausible to warrant further investigation. The fact that $$f(n)$$ appeared to decrease by a factor of roughly $$(2-\surd2)$$ for all of the values checked seemed like compelling evidence that the conjecture was true — and even more so given that the similar expression $$(2+\surd2)$$ occurs prominently in the question. Until we reach the point where a specific integer with the requisite property has actually been constructed, however, the confidence that can be rationally induced by such data still falls short of complete certainty. 
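The exploration behind Problem 1 can be reproduced with exact integer arithmetic. What follows is a minimal Python sketch rather than the procedure actually used to compile Table 1: the function names are illustrative assumptions, and the recurrence rests on the identity $$(a+b\surd2)(2+\surd2)=(2a+2b)+(a+2b)\surd2$$, which tracks the integer coefficients of $$(2+\surd2)^{n}$$. The gap $$f(n)$$ is then evaluated as $$a-b\surd2$$ using high-precision decimals.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60          # far more digits than the problem needs
SQRT2 = Decimal(2).sqrt()


def power_coefficients(n):
    """Return integers (a, b) with (2 + sqrt 2)**n == a + b * sqrt 2."""
    a, b = 1, 0                  # (2 + sqrt 2)**0 == 1
    for _ in range(n):
        # (a + b sqrt2)(2 + sqrt2) = (2a + 2b) + (a + 2b) sqrt2
        a, b = 2 * a + 2 * b, a + 2 * b
    return a, b


def gap_to_next_integer(n):
    """f(n): distance from (2 + sqrt 2)**n up to the integer 2a.

    Since 2a - (a + b sqrt2) = a - b sqrt2, this equals (2 - sqrt 2)**n.
    """
    a, b = power_coefficients(n)
    return Decimal(a) - Decimal(b) * SQRT2


def first_n_within(tolerance):
    """Smallest positive n whose gap f(n) does not exceed the tolerance."""
    n = 1
    while gap_to_next_integer(n) > tolerance:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, power_coefficients(n), gap_to_next_integer(n))
    print("first n:", first_n_within(Decimal("1e-6")))
```

Keeping the coefficients as integers means that the integrality of $$(2+\surd2)^{n}+(2-\surd2)^{n}=2a$$ is visible directly in the data. The sketch still only exhibits the pattern for the values computed; it is the binomial argument that establishes it for every $$n$$.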
It is possible — though highly unlikely — that such patterns arise simply by chance, and this calls the Reliability of the corresponding inference into question. Such non-deductive evidence thus fails to compel assent rationally with the same authority as the proof that was later supplied: if someone did not share our intuitions about the conclusiveness of the data we would be at a loss as to how to convince them. So if a result were to be published on this kind of evidence alone there might be issues with Consensus as well as Reliability. Though often I must settle on a fixed conjecture before looking for a proof, in the context of Private Acceptance such challenges do not arise. If I am wrong, only my own time and energy will be wasted; if I am correct, only the final proof will be given and the heuristic work will go unpublished. 4. THE GOLDBACH CONJECTURE ... every even integer is a sum of two primes. I regard this as a completely certain theorem, although I cannot prove it. Euler (as quoted in [Narkiewicz, 2000, p. 333]) Our second case study is a claim of some historical significance that has fascinated mathematicians for centuries.3 Goldbach Conjecture (GC). Let $$n$$ be an even integer greater than 2. Then $$n$$ can be expressed as the sum of two (not necessarily distinct) primes. Since its emergence from the Goldbach-Euler correspondence, GC has been subject to quasi-empirical investigation on a massive scale. Several mathematicians have checked it for large initial segments of the natural numbers, and in 2013 a lower bound of $$4\times10^{18}$$ for any counterexample was given [Silva et al., 2013]. Despite all of this inductive evidence it seems that a proof is not forthcoming: number theorist and Fields medalist Alan Baker stated in a 2000 interview that ‘It is unlikely that we will get any further without a big breakthrough. Unfortunately there isn’t such a big idea on the horizon’ [Ahuja, 2000]. 
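The kind of checking described here can be illustrated on a very small scale. The following Python sketch is a toy under stated assumptions: the function names are hypothetical, the bound is tiny, and it makes no attempt to reflect the methods used in the large-scale verification cited above. It simply searches for a prime decomposition of each even number up to a chosen limit.

```python
def prime_sieve(limit):
    """Return a bytearray s with s[m] == 1 exactly when m is prime (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"                 # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            multiples = range(p * p, limit + 1, p)
            sieve[p * p :: p] = bytearray(len(multiples))
    return sieve


def goldbach_witness(n, is_prime):
    """Return primes (p, q) with p <= q and p + q == n, or None if none exist."""
    for p in range(2, n // 2 + 1):
        if is_prime[p] and is_prime[n - p]:
            return p, n - p
    return None


def first_failure(limit):
    """Smallest even n in [4, limit] with no decomposition, or None."""
    is_prime = prime_sieve(limit)
    for n in range(4, limit + 1, 2):
        if goldbach_witness(n, is_prime) is None:
            return n
    return None


if __name__ == "__main__":
    print(goldbach_witness(100, prime_sieve(100)))
    print(first_failure(10_000))
```

A result of `None` from `first_failure` reports only that no counterexample lies below the chosen limit, which is exactly the inductive character of the evidence under discussion.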
As mentioned earlier, some mathematicians have nevertheless yielded Private Acceptance to GC. Indeed, Echeverria [1996] has claimed that ‘the certainty of mathematicians about the truth of GC is complete’. And in 1922 the eminent number theorists John Littlewood and G.H. Hardy were willing to assert that ‘there is no reasonable doubt that the theorem is correct’ before the conjecture had even been checked up to $$10^{5}$$. GC thus provides another example of the extension of non-deductive methods to the context of justification — one endorsed by a substantial number of mathematicians. The data gathered by Silva and others do seem convincing, but we should consider how they might be worked into a more coherent and systematic argument to establish the truth of GC: as we have noted already, attempts to prove a false conjecture may result in wasted time and energy. We know that an admittedly huge initial run of the even natural numbers has been checked and is consistent with GC. What general principle could be invoked to move us from these data to the truth of the conjecture itself? One line of thought which might come naturally here is that if there were a special property belonging to some integers, possession of which by an integer entailed that it could not be split into the sum of two primes, then integers with this property would have already been encountered by now: surely a sample of size $$4\times10^{18}$$ is sufficiently representative. However, consider the following problem.4 Problem 2. Is it true that $$991n^{2}+1$$ is never a perfect square? Suppose that we had calculated this expression for a huge initial segment of the natural numbers, well beyond Silva’s investigation for GC — every value up to $$10^{25}$$, say — and the expression was never a square. According to the same reasoning as above, this would suffice to demonstrate the conjecture that no such numbers exist.
But this conjecture is actually false: the first counterexample is $$n=12,055,735,790,331,359,447,442,538,767\approx1.2\times10^{28}$$. Moreover, this is no isolated example: historically there have been many other plausible-seeming claims in number theory that have turned out to have only huge smallest counterexamples. Here are three others:

In 1769, Euler conjectured that for all integers $$n$$ and $$k$$ greater than 1, if the sum of $$n$$ $$k^{\text{th}}$$ powers of non-zero integers is itself a $$k^{\text{th}}$$ power, then $$n$$ is greater than or equal to $$k$$ [Dickson, 1952, Vol. 2, pp. 658 f.]. A counterexample for $$k=5$$ was found in 1966: it is now easy to verify via computer that $$144^{5}=27^{5}+84^{5}+110^{5}+133^{5}$$ [Lander and Parkin, 1966]. A counterexample for $$k=4$$ is $$422481^{4}=95800^{4}+217519^{4}+414560^{4}$$ [Frye, 1988].

In 1885, Thomas Joannes Stieltjes conjectured in a letter to Hermite that the following claim was true. Define the Mertens function as $$M(n)=\sum_{1\le k\le n}\mu(k)$$, where $$\mu(k)$$ is the Möbius function5 [Stieltjes, 1885]. Then for all $$n>1$$, $$|M(n)|<\surd n$$. This conjecture is known to imply the Riemann Hypothesis, but in 1985 the existence of a counterexample between $$10^{14}$$ and $$e^{(3.21\times10^{64})}$$ was proved by Andrew Odlyzko and Herman te Riele [1985].

In 1919, George Pólya conjectured that at least half of the natural numbers less than any given natural number $$n$$ have an odd number of prime factors when counted with multiplicity. The conjecture was disproved by Colin Brian Haselgrove [1958, p. 145]. The first explicit counterexample $$n=906,180,359$$ was given by Russell Sherman Lehman [1960]. The smallest counterexample is $$n=906,150,257$$, found by Minoru Tanaka [1980].

Some of these conjectures or others like them could be relevant to the truth of GC: the suggestion is at least not obviously implausible.
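Several of the counterexamples just listed are easy to confirm once they are in hand, even though finding them was hard. The Python sketch below checks the two Euler identities and the counterexample to Problem 2 with exact integer arithmetic. The helper names are illustrative assumptions, and the small brute-force search is included only to show how uninformative a modest initial segment is.

```python
from math import isqrt


def is_perfect_square(m):
    """Exact test for non-negative integers of any size."""
    r = isqrt(m)
    return r * r == m


# Counterexamples to Euler's sum-of-powers conjecture, as quoted in the text.
euler_k5 = 27**5 + 84**5 + 110**5 + 133**5 == 144**5
euler_k4 = 95800**4 + 217519**4 + 414560**4 == 422481**4

# The counterexample to Problem 2 quoted in the text.
N = 12_055_735_790_331_359_447_442_538_767
pell_hit = is_perfect_square(991 * N**2 + 1)


def squares_below(limit):
    """All n in [1, limit) for which 991 n^2 + 1 is a perfect square."""
    return [n for n in range(1, limit) if is_perfect_square(991 * n * n + 1)]


if __name__ == "__main__":
    print(euler_k5, euler_k4, pell_hit)
    print(squares_below(10**5))
```

The asymmetry is the point: verifying a given witness takes a few multiplications, whereas the empty list returned by the brute-force search says nothing about what happens near $$10^{28}$$.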
These examples and others raise a concern about the Reliability of conjectures established on such a basis alone and make it seem unlikely that there is any such principle that will enable us to complete the argument. Indeed, because number-theoretic properties of the integers are not in general uniformly distributed — the prime numbers have asymptotically zero density, for example — they are not conducive to this kind of treatment, and there is prima facie no special reason to assume the property of satisfying GC to be an exception. We can find arbitrarily long sequences of consecutive integers with no primes: $$(n+1)!+2,(n+1)!+3,\ldots,(n+1)!+(n+1)$$ is a run of $$n$$ composite integers for any natural number $$n$$. Yet knowing the existence of such a run of consecutive composite numbers clearly would not warrant the conclusion that no prime numbers exist at all. Here it is suggested that a run of consecutive integers that are not counterexamples to GC warrants the belief that no such counterexamples exist. In this case, the run of integers also forms an initial segment of the natural numbers, but it is not clear why this should be particularly indicative: our data may again be highly biased. One potential objection to what I have said so far is to note that the account I have given of the evidence is unfair. Confidence in the result may actually have increased when it was pointed out that the number of ways of expressing $$n$$ as the sum of two primes, the Goldbach partition function $$G(n)$$, appears to increase with $$n$$. Indeed, if we plot $$G(n)$$ for the first $$10^{6}$$ values of $$n$$ a beautiful pattern emerges (see Figure 1). But this evidence is really no more conclusive, despite the visually pleasing and compelling regularity of the pattern. It is not clear that $$G(n)$$’s being large mitigates the possibility that $$G(n+1)=0$$, and the claim that the pattern will continue is itself only supported inductively.
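Both constructions in this passage are simple to compute for small inputs. The following Python sketch builds the run of composites $$(n+1)!+2,\ldots,(n+1)!+(n+1)$$ and counts Goldbach partitions by trial division. The function names are hypothetical and the primality test is deliberately naive, so it is suitable only for small arguments.

```python
from math import factorial


def is_prime(m):
    """Naive trial division; adequate only for small m."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def composite_run(n):
    """The n consecutive integers (n+1)! + 2, ..., (n+1)! + (n+1).

    Each (n+1)! + k with 2 <= k <= n+1 is divisible by k, hence composite.
    """
    base = factorial(n + 1)
    return [base + k for k in range(2, n + 2)]


def goldbach_partitions(n):
    """G(n): number of unordered pairs of primes (p, q) with p + q == n."""
    return sum(
        1 for p in range(2, n // 2 + 1) if is_prime(p) and is_prime(n - p)
    )


if __name__ == "__main__":
    run = composite_run(5)
    print(run, any(is_prime(m) for m in run))
    print([(n, goldbach_partitions(n)) for n in range(4, 41, 2)])
```

Printing $$G(n)$$ for a short range already shows how irregular the individual values are, even though the overall trend is upward.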
It appears, then, that this additional evidence will be of no help in constructing a convincing argument for GC from the available data. Yet the opinions of Littlewood and Hardy on matters concerning number theory are not to be dismissed lightly. How best can we account for their confidence? As an attempt at a complete elucidation of the grounds upon which Littlewood, Hardy, and other mathematicians have come to believe GC, the presentation of the evidence I have given so far is highly misleading, and indeed necessarily so. For Littlewood’s and Hardy’s confidence will not have been grounded in explicit inference from inductive evidence based on some general theoretical principle. Rather, they will have come to an intuitive judgement about the plausibility of the conjecture in light of all the available evidence they possessed: a judgement which will itself have been informed and influenced by many other things they had known, believed and experienced up to that point in their mathematical careers. And the same is true for any contemporary number theorist who now believes the truth of GC. We give three examples of related theorems, prior knowledge of which might affect our assessment of the import of the quasi-empirical data.

Fig. 1. A graph of $$G(n)$$ for values of $$n$$ between 4 and 1,000,000. The x-axis labels ‘$$a$$e+0b’ mean $$a\times 10^b$$ (By ‘Reddish’ at the English-language Wikipedia).

Theorem 1. (Ternary Goldbach Conjecture) Let $$n$$ be an odd integer with $$n>5$$. Then $$n$$ is the sum of three (not necessarily distinct) primes.
[Helfgott, 2013] This is sometimes also known as the Weak Goldbach Conjecture, because (as Euler knew) it is a simple corollary of GC (if $$n$$ is odd and $$n>5$$, and if GC is true, then $$(n-3)$$ must be the sum of two primes). Although the proof was not completed until 2013, Littlewood and Hardy were themselves able to show in 1923 that given the generalized Riemann Hypothesis the conjecture was true for sufficiently large numbers, and it is likely that this achievement would have increased their confidence in the stronger result GC. [Hardy and Littlewood, 1923] Theorem 2. Let $$n$$ be an even integer greater than 2. Then $$n$$ is the sum of a prime and a second number that is the product of at most two primes. [Chen, 1966] Clearly, this result by Chen Jingrun is progress from another direction: it seems that mathematicians are homing in on the full result GC. Theorem 3. The set of even integers that cannot be represented as the sum of two primes has asymptotic density $$0$$ [Pomerance and Crandal, 2001, p. 17]. This claim shows that counterexamples must eventually become very sparse over a long enough initial segment. Yet its exact implications remain unclear: have we now checked a sufficient number of cases for the evidence to be compelling? Viewing mathematicians’ attitudes towards non-deductive evidence as stemming from intuitive judgements based on professional experience means that we need not commit them to any dubious general principle about all number-theoretic claims inviting belief when they are verified for a large enough initial segment of the natural numbers. Yet it also seems to preclude the presentation of the available evidence in the form of an explicit argument that is compelling unto itself. It is impossible for the whole evidential and psychological history of an intuitive judgement to be externalised from the mathematician making it and published alongside the conjecture. 
The Public Acceptance of GC on this basis alone would therefore undermine Autonomy. We have also seen a glimpse of the wide range of different factors that might affect our intuitive judgements when a sizeable number of researchers direct their attention to a problem. So again there may be issues with Consensus as different experts weigh in about how convincing the data are. There will also always be further insights available — related conjectures, more data, or theoretical considerations — and these may cause judgements about the evidence to shift back and forth over time, damaging Permanence.

In conclusion, though some specialist mathematicians have yielded Private Acceptance to the conjecture on the basis of the non-deductive evidence alone, consideration of the practical virtues reveals that Public Acceptance of the result is not yet appropriate — as has indeed been borne out in practice.

5. PUBLIC ACCEPTANCE WITHOUT PROOF

In this section I argue for the more general conclusion that the use of non-deductive methods in the context of Public Acceptance would (with some possible exceptions) likewise lead to a general decline in the extent to which mathematics exhibits the four practical virtues. I have argued for the importance of these ideals for mathematical research in general. If I am right about this, mathematicians have available good reasons for their collective decision to reject these non-deductive methods in the context of Public Acceptance, through appealing to the need for mathematical enquiry as a whole to continue in good working order. This remains true even though they may in some cases be warranted in Privately Accepting claims on such grounds alone.

Many conjectures for which apparently convincing evidence is available have turned out to be false, and we have seen some examples above.
If these conjectures had been Publicly Accepted, then clearly the mathematical literature would have become less Reliable as a result, and Permanence would have suffered too when counterexamples or disproofs were uncovered.

If journals were to permit mathematicians to assert results unqualifiedly on the basis of non-deductive evidence alone, it would be critically important that the non-deductive data supplied provide sufficient evidence to justify the mathematical results they are to support. We have also seen, however, that judgements about whether a body of evidence warrants regarding a result as established are affected by both experience and background knowledge. Such judgements are therefore always liable to be disputed, even once all the available information is taken into account, and this will damage Consensus. Similarly, as time goes on and new evidence appears, individual mathematicians may change their minds about whether a conjecture has been adequately supported, which could damage Permanence.

A related point concerns Autonomy. Davis et al. [2012, p. 411] claim that the non-deductive evidence available for the Riemann Hypothesis is ‘so strong that it carries conviction even without rigorous proof’. Perhaps they are warranted in making this claim, but how could a non-specialist decide either way? In these kinds of cases, the explicit mathematical justification given is not in and of itself sufficient to compel assent, and the arguments are completed only through the intuitive judgements of the specialists. But whether such intuitive judgements are justified in a particular case is never ascertainable by an external observer, because the evidential, experiential, and psychological basis of the judgement is ultimately inscrutable. Justification of the claim therefore rests in part on the specialists’ authority, undermining both aspects of Autonomy.

It is therefore important to render the means by which such evidence is assessed clear and explicit.
In order to preserve the four practical virtues, publishing mathematicians would need to provide a systematic, repeatable way of ascertaining that the body of non-deductive evidence they have supplied is in fact adequate. Such a technique must be sufficiently stringent to remove any issues about Reliability, and the verdicts it renders must be stable over time so as to maintain Permanence. To preserve Autonomy, the technique must be such that any competent mathematician can apply it: this will require it to be fully transparent and given by a definite procedure. Lastly, to maintain Consensus, the verdict rendered must be precise and unequivocal. Such quantitative relations between evidence and hypothesis are of course the province of the theory of probability.

It is also clear that the existence of a technique of assessment meeting all of these requirements will only be possible if our non-deductive methods themselves are delineated sufficiently sharply. Otherwise it is impossible to be systematic and we are back to the kind of skill embedded in specialists’ intuition. Yet non-deductive techniques are often merely opportunistic investigations that yield data construed as evidence only in hindsight: not really methods as such, in the sense of a definite procedure that can be applied to many problems. What would be required for adequate justification, then, is a definite algorithm whose reliability can be determinately evaluated using probabilistic techniques.

Such methods do exist and find application in many different areas of mathematics. One such example — a procedure for finding prime numbers based on the Rabin-Miller algorithm — is discussed by the philosopher Don Fallis [2002]. He argues that mathematicians have not presented sufficient grounds for the rational rejection of such algorithms.
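As a rough illustration of the kind of algorithm at issue, the following Python sketch implements the standard Miller-Rabin probabilistic primality test. The function name `is_probable_prime`, the `rounds` parameter, and the optional `rng` argument are illustrative choices made here and are not taken from Fallis’s discussion. A verdict of `False` is conclusive (the input is composite), whereas a verdict of `True` could be mistaken for a composite input with probability at most `4**(-rounds)`.

```python
import random


def is_probable_prime(n, rounds=20, rng=None):
    """Miller-Rabin test (illustrative sketch; names are assumptions).

    Returns False only if n is certainly not prime. Returns True if n
    passed `rounds` independent random tests; for a composite n this
    happens with probability at most 4 ** (-rounds).
    """
    if n < 2:
        return False
    # Dispose of small cases and small prime factors by trial division.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n == p:
            return True
        if n % p == 0:
            return False
    # Write n - 1 = 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    rng = rng or random.Random()
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base with 2 <= a <= n - 2
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # this base gives no evidence of compositeness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # the base a is a witness that n is composite
    return True
```

Raising `rounds` lowers the error bound, and anyone who doubts a verdict can rerun the function with fresh random bases, which is one way of picturing the repeatability described in the text.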
And indeed, with this procedure the probability of error can be made less than any preassigned bound, and the testing phase can be repeated by any mathematician who might doubt the veracity of the result. It therefore seems we can imagine an alternative but equally robust mathematical culture where such non-deductive techniques are permitted in the context of Public Acceptance, and yet where the practical virtues flourish. Responding to this suggestion is a challenge for another time.

6. CONCLUSION

In this paper I have argued that mathematicians can cite reasons for their strict insistence on proof and consequent rejection of (nearly all) non-deductive techniques in the context of Public Acceptance, even when the available evidence seems to leave no room for doubt. But are these sufficient reasons? Is it not true that other fields manage adequately without the practical virtues? Could not mathematical enquiry better flourish by relaxing them in order to make room for a wider range of permitted techniques? And what is it for an intellectual discipline to ‘flourish’, anyway?

It is clear there is more work to be done here: this brief sketch will need to be supplemented with historical enquiry into the emergence of the practical virtues and more detailed sociological enquiry into their role in facilitating the success of mathematics (which I take to outstrip other disciplines where they are not present or present only to a lesser degree — including the natural sciences). One useful case study may be the collapse of the Italian school of algebraic geometry, which moved from a conventional period of insistence on proof under the leadership of Castelnuovo, through a shift towards intuitive arguments and consequent loss of Autonomy and then Consensus under the authority of Enriques, culminating in a catastrophic loss of Reliability under Severi.
Another would be to investigate within this framework the motivations for — and consequences for practice of — the rigorisation of the calculus in the nineteenth century, whereby the powerful collection of techniques and results developed by Newton, Gauss, Lagrange, Euler, and others was put on a firm intellectual foundation by Cauchy, Bolzano, and Weierstrass, so that a return to Euclidean standards of proof for Public Acceptance could take place (for an interesting theory, see [Grabiner, 1974]).

I do however hope to have made a start in the right direction, and further that the considerations I have appealed to in this paper will be recognisable to working mathematicians themselves. Likewise, though I have offered no systematic account of what it is for mathematical research to flourish, the practical virtues themselves provide a starting point here. Again in direct parallel to moral theory, where Aquinas in his discussion of Augustine’s account of the virtues explains how they are both a means to and a constitutive part of the good life for human beings, our practical virtues also express part of what it is for a discipline to be in good working order (see [MacIntyre, 2011, p. 215]). The practical virtues are not sufficient for a research community to flourish, however: a research group that fails to prove or discover any important conjectures can be said to enjoy only a modest degree of success.6

Through investigating practical and communal aspects of research rather than direct evidential value alone, I have concluded that different standards are appropriate in the context of Public Acceptance compared with what is required for rational justification of Private Acceptance. This I also hope to be a distinction with which practitioners are familiar.
In practice, mathematicians often rely on a disparate variety of evidence in coming to believe they have found the correct answer before even beginning to look for explicit argumentation suitable for public consumption. Through non-deductive reasoning they arrive — though not by mathematics — at the truth.

REFERENCES

Ahuja A. [2000]: ‘A million-dollar maths question’, The Times, March 16, 2000.

Annas J. [1993]: The Morality of Happiness. Oxford University Press.

Azzouni J. [2006]: ‘How and why mathematics is unique as a social practice’, in Hersh R. ed., 18 Unconventional Essays on the Nature of Mathematics, pp. 201–219. Springer.

Baker A. [2009]: ‘Non-deductive methods in mathematics’, Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/mathematics-nondeductive/ Accessed August 2015.

Chen J. [1966]: ‘On the representation of a large even integer as the sum of a prime and the product of at most two primes’, Kexue Tongbao 11, 385–386.

Davis P., Hersh R., and Marchisotto E. [2012]: The Mathematical Experience: Study Edition. Boston: Birkhäuser.

De Villiers M. [2004]: ‘The role and function of quasi-empirical methods in mathematics’, Canadian Journal of Science, Mathematics and Technology Education 4, 397–418.

Debnath L. and Bhatta D. [2006]: Integral Transforms and Their Applications. London: Chapman and Hall/CRC.

Dickson L.E. [1952]: History of the Theory of Numbers. New York: Chelsea.

Echeverria J. [1996]: ‘Empirical methods in mathematics: A case-study: Goldbach’s conjecture’, in Munévar G. ed., Spanish Studies in the Philosophy of Science, pp. 19–55. Boston: Kluwer.

Fallis D. [2002]: ‘What do mathematicians want? Probabilistic proofs and the epistemic goals of mathematicians’, Logique et Analyse 45, 373–388.

Frege G. [1980]: Foundations of Arithmetic. John Austin, trans.
Oxford: Basil Blackwell.

Frye R. [1988]: ‘Finding $$(95800)^{4}+(217519)^{4}+(414560)^{4}=(422481)^{4}$$ on the connection machine’, Proceedings of Supercomputing 88, 106–116.

Gettier E. [1963]: ‘Is justified true belief knowledge?’ Analysis 23, 121–123.

Grabiner J. [1974]: ‘Is mathematical truth time-dependent?’ American Mathematical Monthly 81, 354–365.

Grünbaum B. [1993]: ‘Quadrangles, pentagons, and computers’, Geombinatorics 3, 4–9.

Hardy G.H. and Littlewood J. [1923]: ‘On some problems of “Partitio Numerorum” III: On the expression of a number as the sum of primes’, Acta Mathematica 44, 1–70.

Haselgrove C. [1958]: ‘A disproof of a conjecture of Pólya’, Mathematika 5, 141–145.

Helfgott H. [2013]: ‘The ternary Goldbach conjecture is true’, arXiv:1312.7748.

Kuhn T. [1996]: The Structure of Scientific Revolutions. 3rd ed. Chicago: University of Chicago Press.

Lakatos I. [1976]: Proofs and Refutations. Worrall J. and Zahar E. eds. Cambridge University Press.

Lander L.J. and Parkin T.R. [1966]: ‘Counterexample to Euler’s conjecture on sums of like powers’, Bulletin of the American Mathematical Society 72, 1079.

Lehman R. [1960]: ‘On Liouville’s function’, Mathematics of Computation 14, 311–320.

MacIntyre A. [2011]: After Virtue. London: Bloomsbury.

MacIntyre A. [2013]: ‘On having survived the academic moral philosophy of the twentieth century’, in O’Rourke F. ed., What Happened in and to Moral Philosophy in the Twentieth Century, pp. 17–34. Notre Dame: Notre Dame University Press.

Narkiewicz W. [2000]: The Development of Prime Number Theory: From Euclid to Hardy and Littlewood. Springer-Verlag.

Odlyzko A. and te Riele H.
[1985]: ‘Disproof of the Mertens conjecture’, Journal für die reine und angewandte Mathematik 357, 138–160.

Pólya G. [1919]: ‘Verschiedene Bemerkungen zur Zahlentheorie’, Jahresbericht der Deutschen Mathematiker-Vereinigung 28, 31–40.

Pólya G. [1954]: Mathematics and Plausible Reasoning. Vol. I. Princeton University Press.

Pomerance C. and Crandal R. [2001]: Prime Numbers: A Computational Perspective. Springer.

Rotman J. [1998]: Journey into Mathematics: An Introduction to Proofs. Upper Saddle River, New Jersey: Prentice Hall. Reprinted by Dover, 2006.

Russell B. [2009]: ‘A liberal decalogue’, in Autobiography. London: Routledge.

Silva T.O., Herzog S., and Pardi S. [2013]: ‘Empirical verification of the even Goldbach conjecture and computation of prime gaps up to $$4\times10^{18}$$’, Mathematics of Computation 83, 2033–2060.

Solomon R. [2001]: ‘A brief history of the classification of finite simple groups’, Bulletin of the American Mathematical Society 38, 315–352.

Stieltjes T.J. [1885]: ‘Lettre à Hermite de 11 Juillet 1885’, in Baillaud B. and Bourget H. eds, Correspondance d’Hermite et de Stieltjes, pp. 160–164. Paris: Gauthier-Villars, 1905.

Tanaka M. [1980]: ‘A numerical investigation on cumulative sum of the Liouville function’, Tokyo Journal of Mathematics 3, 187–189.

Thurston W. [1995]: ‘On proof and progress in mathematics’, For the Learning of Mathematics 15, 29–37.

Ulam S. [1992]: Adventures of a Mathematician. Oakland: University of California Press.

Wiener N. [1915]: ‘Is mathematical certainty absolute?’ The Journal of Philosophy, Psychology and Scientific Methods 12, 568–574.

Footnotes

† I would like to thank Marcus Giaquinto for his patience and dedication as my supervisor at UCL, and for the benefit of his knowledge and experience since.
Without his help the exposition of my argument would have been far less clear. Thanks are also due to the friends and acquaintances who looked over the manuscript in some form, including Jody Azzouni, David Corfield, James Cranch, Ernest Davis, Luke Fenton-Glynn, Elvijs Sarkans, and José Zalabardo. Lastly, I would like to thank Susi Fritscher for her continued love and support.

1 In the 1970s, Stanislaw Ulam [1992, p. 288] estimated that around 100,000 theorems were proved each year, a figure later refined to nearer 200,000.

2 Taken from the United Kingdom Mathematical Trust Senior Mentoring Scheme, 2013–2014, Sheet 1, Problem 8.

3 This section is indebted to a discussion by Alan Baker [2009], though we differ on some points. By coincidence, the author of that article — a philosopher — has the same name as the mathematician quoted below.

4 Adapted from [Rotman, 1998, p. 20]. See also [De Villiers, 2004, p. 412].

5 I.e., $$\mu(k)=0$$ if $$k$$ is a multiple of a square number other than 1, $$\mu(k)=1$$ if $$k$$ is square-free with an even number of prime factors, and $$\mu(k)=-1$$ if $$k$$ is square-free with an odd number of prime factors.

6 Conversely, research programmes may go into decline for reasons that are not related to them: for instance, the collapse of investigation into foliations in geometric topology following William Thurston’s [1995, p. 35] single-handedly proving all the interesting results in that area.

© The Author [2016]. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

### Journal

Philosophia Mathematica, Oxford University Press

Published: Feb 1, 2018
