Causation, Probability, and the Continuity Bind

Analyses of singular (token-level) causation often make use of the idea that a cause increases the probability of its effect. Of particular salience in such accounts are the values of the probability function of the effect, conditional on the presence and absence of the putative cause, analysed around the times of the events in question: causes are characterized by the effect’s probability function being greater when conditionalized upon them. Put this way, it becomes clearer that the ‘behaviour’ (continuity) of probability functions in small intervals about the times in question ought to be of concern. In this article, I make an extended case that causal theorists employing the ‘probability raising’ idea should pay attention to the continuity question. Specifically, if the probability functions are ‘jumping about’ in ways typical of discontinuous functions, then the stability of the relevant probability increase is called into question. The rub, however, is that sweeping requirements for either continuity or discontinuity are problematic and, as I argue, this constitutes a ‘continuity bind’. Hence more subtle considerations and constraints are needed, two of which I consider: (1) utilizing discontinuous first derivatives of continuous probability functions, and (2) abandoning point probability for imprecise (interval) probability.

1 Introduction
2 Probability Trajectories and Continuity
2.1 Probability trajectories
2.2 Causation as discontinuous jumps
2.3 Against systematic discontinuity
3 Broader Discontinuity Concerns
4 The Continuity Bind
4.1 Retaining continuity with discontinuous first derivatives
4.2 Imprecise (interval) probability trajectories
5 Concluding Remarks
Appendix

1 Introduction

Analyses of singular (token-level) causation often make use of the idea that a cause increases the probability of its effect.
Of particular salience in such accounts are the values of the probability function of the effect, conditional on the presence and absence of the putative cause, analysed around the times of the events in question: causes are characterized by the effect’s probability function being greater when conditionalized upon them. Put this way, it becomes clearer that the ‘behaviour’ (continuity) of probability functions in small temporal intervals about the times in question ought to be of concern. One prominent (but under-examined) account of token-level causation, that of Ellery Eells ([1991]), actually requires point ‘jumps’ (discontinuities) in the relevant probability functions for positive and negative token-level causes. In this article, I make an extended case that causal theorists employing the ‘probability raising’ idea should pay attention to the continuity question, as it has serious implications for the viability of their accounts. Specifically, if the probability functions are ‘jumping about’ in ways typical of discontinuous functions, then the stability of the relevant probability increase is called into question. The rub, however, is that sweeping requirements for either continuity or discontinuity are problematic and, as I argue, this constitutes a ‘continuity bind’. Hence more subtle considerations and constraints are needed. I begin by introducing the question of continuity in the context of causation and probability functions (trajectories) using the work of Eells—one of the few theorists to explicitly consider continuity.1 I then show how discontinuity requirements like Eells’s are untenable, and in a surprisingly decisive way. Next, I argue that the discontinuity issue also has problematic implications for causal accounts without explicit discontinuity requirements (Menzies [1989]; Noordhof [1999]; Hitchcock [2004]; Kvart [2004]; Northcott [2010]; Glynn [2011]). 
After this, I consider a blanket continuity requirement and show that while it is a tempting response, it is unworkable because of the need to allow for the possibility of (empirically motivated) discontinuity in probability trajectories. And therein lies the continuity bind. Finally, I consider two potential ways out of the bind: (1) utilizing discontinuous first derivatives of continuous probability trajectories for theoretically motivated discontinuity needs, and (2) abandoning point probability trajectories altogether in favour of imprecise (interval) probability trajectories.

2 Probability Trajectories and Continuity

I present the continuity question here in a causal setting, roughly following Eells ([1991]); I then expand it to other prominent accounts in Section 3. Let x and y denote token events, where x takes place at time and place $(t_x, s_x)$ and y takes place at $(t_y, s_y)$. Assume that x’s being X caused (in some plausible way) y’s being Y, where x is of type X and y is of type Y. Of interest is how the probability of token event y’s being Y evolves between $t_x$ and $t_y$, that is, how the probability of y’s being Y changes as a function of time. (I will abbreviate the token events of ‘x being X’ and ‘y being Y’ by just writing the properties exemplified, X and Y.) In the ensuing discussion, probability will be understood as objective and physical, that is, as ‘physical probability’ or ‘chance’. A single-case time-dependent probability function, P, will be assumed as part of a probability space triple $\langle \Omega, F, P \rangle$, where $\Omega$ is a set, F is a $\sigma$-field over $\Omega$, and P is a probability function on F that obeys the standard (Kolmogorov) axioms of the probability calculus. Physical probabilities apply to particular events, ones that occur or fail to occur at a particular time and place, and hence have values defined relative to a time of evaluation.
To make explicit the temporal index, t, involved in evaluating the probability of event $Y \in F$ at time t, I use the notation $P_Y(t)$. In general, if an event Y occurs at a time $t_y$, $P_Y(t)$ is strictly between zero and one prior to $t_y$, and one at time $t_y$ and all later times. Following Eells ([1991]) I use the term ‘probability trajectory’ to refer to a probability function understood in this way as a function of time.2 Again, while there are challenges to any interpretation of probability, in this causal setting, an objective physical understanding akin to ‘chance’ is a reasonable way to proceed; see (Eells [1991], pp. 34–55; Eells [2010]) for detailed discussions. I (loosely) follow Jenann Ismael ([2011], pp. 419–20) with my pre-theoretic understanding of physical probability or chance, taking it to be ‘the link between the fundamental level of physical description in quantum mechanics and the measurement results that mark the points of empirical contact between theory and world’. I follow her in that my understanding is that physical probability is objective and non-trivial (not everywhere zero or one). I remain agnostic, however, with respect to her ultimate analysis of it, especially whether its grounding is at the quantum level or some higher level as in (Glynn [2010]; Fenton-Glynn [forthcoming]; Sober [2010]).

2.1 Probability trajectories

As an illustration of a probability trajectory in action, consider the following example from (Rosen [1978]), modified here from (Eells [1991]):

Example 1: A poorly putted golf ball is rolling roughly in the direction of the cup when a squirrel runs by and bumps it in such a way that its resulting trajectory is directly toward the cup and it continues right into the cup.

Following the standard assumptions of such causal discussions, I take the probability values of $P_Y(t)$ to reflect the objective probability of the event (ball going in the hole) and assume that it is strictly less than one until Y occurs.
Suppose that the probability of the ball going into the cup, given its initial trajectory, velocity, and so on, is 0.25. Suppose further that, in general, the (type) probability of balls going in when squirrels bump them is very low (say 0.05); however, in this (token) case, the particular trajectory of the ball immediately following the bump was such that the probability of the ball falling in the cup was rather high, say 0.8. Let the event of the squirrel bumping the ball be x being X, and the event of the ball going into the cup be y being Y. The probability trajectory of Y can be depicted following Eells ([1991], p. 293), as in Figure 1.

Figure 1. Probability trajectory with discontinuous jump at occurring event.

The standard analysis of this example is that the squirrel’s kick, X, caused the ball to drop into the cup, Y, despite the fact that, in general, squirrel kicks in such situations almost never result in the ball going into the hole. For causal considerations, the salient features of the graph are that the probability of Y takes an immediate point drop in probability at $t_x$, corresponding to the type-level fact that X-type events generally decrease the probability of Y-type events, and that the probability of Y recovers immediately after the ball is bumped at $t_x$ to a higher value than it had before, because of the favourable trajectory and velocity actually imparted by token event X. Hopefully, this causal story is plausible enough, though its causal details are not the primary concern here. For present purposes, the crucial features of the graph are the discontinuities at $t_x$ and $t_y$, that is, the fact that the probability of Y ‘jumps’ up just after x happens and then ‘jumps’ again to one at the moment the ball falls into the cup.
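The shape of the trajectory just described can be mimicked with a small piecewise function. What follows is only a sketch under stated assumptions: the times `T_X` and `T_Y` are hypothetical, the point value at the bump is assumed to equal the type-level figure of 0.05, and `one_sided_limits` is a crude numerical stand-in for left- and right-hand limits rather than anything drawn from Eells's account.

```python
# Toy version of the Example 1 trajectory (Figure 1 style).
# Assumptions for illustration only: T_X, T_Y, and the point value at T_X.
T_X = 1.0  # hypothetical time of the squirrel's bump
T_Y = 3.0  # hypothetical time the ball drops into the cup


def p_y(t: float) -> float:
    """Probability of Y (ball in the cup) evaluated at time t."""
    if t >= T_Y:
        return 1.0   # Y has occurred
    if t > T_X:
        return 0.8   # favourable token trajectory after the bump
    if t == T_X:
        return 0.05  # assumed point drop reflecting the type-level value
    return 0.25      # probability given the initial putt


def one_sided_limits(f, t: float, h: float = 1e-9):
    """Approximate the left and right limits of f at t by nearby samples."""
    return f(t - h), f(t + h)


left_x, right_x = one_sided_limits(p_y, T_X)
left_y, right_y = one_sided_limits(p_y, T_Y)
print(left_x, right_x)  # 0.25 0.8  (jump at the bump)
print(left_y, right_y)  # 0.8 1.0   (jump to one when Y occurs)
```

The two printed pairs differ within each pair, which is what it means for the toy trajectory to have jump discontinuities at both times.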
As can be seen from the graph, Eells employs jump discontinuities in the probability trajectory to (a) indicate that a token (positive) ‘cause’ has taken place, and (b) emphasize that the world is chancy or indeterministic at the macro-level in that the probability of the event in question is bounded away from one until it happens.3 Use (a) is central to his account (as I detail below); I will call this general idea the causal discontinuous jump principle (CDJP):

CDJP: The probability trajectory of an event, e, that occurs at $t_e$ jumps discontinuously at times when events causally relevant to e occur.

The status of the second use of discontinuity, (b), is less clear in Eells’s work. This ‘occurring event discontinuity’ assumption is made and discussed explicitly (Eells [1991], p. 294). I will refer to this assumption as the discontinuous jump principle (DJP):

DJP: The probability trajectory of an event, e, that occurs at $t_e$ jumps discontinuously to one at time $t_e$.

Notice though that the assumption that the probability of an event is not one until the event occurs is also consistent with the graph continuously approaching one from below. Eells recognizes that the indeterminism could also be represented in a continuous fashion, with the probability continuously approaching one from below. But he writes that his analysis does not ‘pay attention’ to whether the trajectory is continuous at the time that the event occurs (Eells [1991], p. 294, Footnote 6), and thus is not explicitly committed to DJP. He does, however, consistently draw all his graphs with DJP discontinuities. Consider the alternative graph depicted in Figure 2, in which the probability trajectory continuously approaches one at $t_y$. It is equally true in this graph that the probability of Y is strictly less than one until it actually occurs at $t_y$.
The difference between this graph and the graph in Figure 1 is that in Figure 1, the value of $P_Y(t)$ is bounded away from one prior to $t_y$, while in Figure 2, the value of $P_Y(t)$ becomes arbitrarily close to, but is always less than, one as t approaches $t_y$. I argue below that Eells was mistaken about nothing turning on DJP and that thinkers concerned with probability should indeed ‘pay attention’ to this continuity issue. But first I sketch his causal account.

Figure 2. Probability trajectory with continuous probability at occurring event.

2.2 Causation as discontinuous jumps

In his treatment of this example, Eells depicts the probability trajectory of $P_Y(t)$ as in the graph in Figure 1. The crucial feature of this graph for the causal question is that the probability of the ball falling in the cup is higher immediately after the ball is bumped at $t_x$ than it was prior to the bump, despite the fact that, in general, such types of events lower the probability of the ball going in, as indicated in the graph by the immediate point drop in probability at $t_x$.4 According to Eells, this structural property of the example is what leads us to say that this token squirrel bump caused the ball to go in, despite the fact that, in general, such bumps tend to prevent balls going in rather than cause them. When the probability trajectory of an event, y, has this structure, Eells defines y to have occurred ‘because’ x occurred. Explicitly, what is required for event y’s being Y ‘because’ event x was X are the following three conditions: (i) the probability of Y changes at the time of x, (ii) immediately after the time of x, the probability of Y is both high and higher than before x, and (iii) this probability remains high until the time of y.
Eells describes three additional causal relations: an event’s occurring ‘despite’ another event, an event’s being ‘independent’ of another event, and an event’s being ‘autonomous’ of another event. Though the details of these three conditions will not be of particular concern here, the basic idea is that in the ‘despite’ case, the probability decreases (and remains low); in the ‘independent’ case, it remains the same; and in the ‘autonomous’ case, the probability increases to a high level, but then drops to a low level. Eells ([1991], p. 355) defines each of these relations in terms of the left and right limits of $P_Y(t)$ at $t_x$. He also specifies the qualifications needed to preserve the ‘causes increase the probability of their effects’ idea, in particular that one must hold fixed the set, K, of the actual, separate, independent causes of Y and also any interactive factors by which X influences the probability of Y (Eells [1991], Section 6.4). In order to build in the temporal evolution of the probability in the proper way, Eells further specifies $W_t$ to be the conjunction of all factors of the world (relevant to y’s being Y) that have fallen into place by time t and whose exemplification (relative to K) can be traced back to the exemplification of X at $t_x$. He then defines $P_Y(t) = P(Y \mid K \,\&\, W_t)$ for all times t. With this understanding of the probability trajectory, the degree to which y is Y because of, despite, independent of, and autonomous of x’s being X is defined as follows:

$$B = M - P^-, \quad D = P^- - P^+, \quad I = 1 - |P^- - P^+|, \quad A = P^+ - M,$$

where $P^-$ and $P^+$ are the left- and right-hand limits of $P_Y(t)$ at $t_x$, respectively, and $M = \min\{P_Y(t) : t \in (t_x, t_y)\}$, which can be thought of, intuitively, as the lowest point in the probability trajectory between $t_x$ and $t_y$.
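To make the four degrees concrete, they can be evaluated on the numbers from Example 1. This is a minimal sketch, not Eells's own procedure: the names `Degrees` and `eells_degrees` are hypothetical, and the value of M assumes, as Figure 1 suggests, that the trajectory stays at 0.8 throughout the interval between the bump and the ball's dropping in.

```python
from dataclasses import dataclass


@dataclass
class Degrees:
    because: float      # B = M - P^-
    despite: float      # D = P^- - P^+
    independent: float  # I = 1 - |P^- - P^+|
    autonomous: float   # A = P^+ - M


def eells_degrees(p_minus: float, p_plus: float, m: float) -> Degrees:
    """Evaluate the four degrees from the one-sided limits at t_x and the minimum M."""
    return Degrees(
        because=m - p_minus,
        despite=p_minus - p_plus,
        independent=1.0 - abs(p_minus - p_plus),
        autonomous=p_plus - m,
    )


# Example 1: 0.25 just before the bump, 0.8 just after.
# Assumption: the trajectory is flat at 0.8 on (t_x, t_y), so M = 0.8.
d = eells_degrees(p_minus=0.25, p_plus=0.8, m=0.8)
print(d)
```

On these inputs the ‘because’ degree is positive and the ‘despite’ degree is negative, and both are nonzero only because the two one-sided limits differ.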
It should be clear that in the case of ‘despite’, if X is to play a negative (‘despite’) causal role ($D > 0$), then $P^- - P^+ > 0$, from which it follows that $P^+ \neq P^-$ or, in other words, that the limits of $P_Y(t)$ from the left and the right at $t_x$ are not equal, which entails a jump discontinuity at $t_x$. The same follows for the ‘because’ relationship as well.5 Thus if X is a positive or negative token cause of Y, then $P_Y(t)$ has a jump discontinuity at the time of X. That is, Eells’s account requires an event’s probability trajectory to have discontinuous jumps at times at which causally relevant events occur.6 Let me stress again that Eells’s official position is that it is inconsequential as to whether occurring events jump discontinuously to one (DJP), though he favoured and utilized the discontinuous version. But as to the question of whether causes require a jump discontinuity in the probability trajectory of their effects (CDJP), Eells’s answer is an unequivocal ‘yes’, since his entire token-level account depends on such discontinuities. It is surprising that this aspect of his influential account has received virtually no discussion.

2.3 Against systematic discontinuity

In this section, I make a detailed case for why systematic discontinuity requirements like DJP or CDJP are problematic. I first direct my case against DJP. The reasons for beginning with DJP are: (1) DJP is more general and is thus of interest in its own right, outside the setting of probabilistic causality (for example, event ontology, chance, and so on), and (2) the argument is more straightforward and perspicuous in the case of DJP and requires only minor adjustment to apply to CDJP as well. In what follows, I present the main thread of a formal argument against DJP (the details of which can be found in the appendix) and then show how it extends to CDJP. Reconsider Example 1, especially the period from after the time the squirrel bumps the ball to the time it enters the cup.
The instant the ball comes off the bump, it has a certain trajectory and speed, one that will take it directly into the cup; this is why the probability of Y is high after that instant. As time gets closer to $t_y$ and the ball gets closer to the cup, the number of eventualities that could prevent the fall into the cup decreases, and so its probability continues to increase. In other words, as the ball passes by points on the green closer and closer to the cup, and with the same favourable trajectory and speed, the probability of its going in the cup would naturally be expected to continue to get closer and closer to one. While these considerations alone favour a continuous increase of $P_Y(t)$ to one, a stronger case can be made. If the probability trajectories of all occurring events jump discontinuously at the instant they occur (DJP), then the probability trajectory for each of the occurring events ‘leading up’ (causally) to the event under consideration would also have a jump discontinuity at the time of their occurrence. The probability (trajectory) of the original event is not independent of the probability (trajectories) of certain events leading up to it, that is, its probability depends on certain events that need to ‘fall into place’ in order for it to happen, and this gives rise to problems. Returning to Example 1: between the time of the cause, $t_x$, and the time of the effect, $t_y$, the graphs of both versions (Figures 1 and 2) depict the probability trajectory as continuous in the interval just to the left of $t_y$. This, however, does not accord with the ‘jumpy’ nature of the probabilistically relevant prior events falling into place.
If all the events involved in the ball traversing the points on the green after being bumped and before entering the cup have probability trajectories that have a jump discontinuity at the time they occur, then it seems that the probability trajectory of Y (ball falling into the cup), which depends upon these events falling into place, should reflect this discontinuous ‘jumping’ at the times these prior events occur. Such considerations suggest that the discontinuous jump (as mandated by DJP) in the probability trajectory of an occurring event is inconsistent with the probability trajectory $P_Y(t)$ being continuous in the interval just before $t_y$, as it must be in order to have a jump discontinuity. If this is right, then assuming DJP in such settings is inconsistent, since $P_Y(t)$ is required (as depicted) to be continuous in at least some small interval to the left of $t_y$. I now put this objection on a formal footing to show more precisely the source of the problem. The form the argument will take is that of an inconsistent/incoherent dilemma, namely, that DJP in this setting entails either that the probability trajectory, $P_Y(t)$, is discontinuous from the left at $t_y$ (has no left-hand limit), which is inconsistent with there being a jump discontinuity at $t_y$, or the certainty (distance the probability is from one) of antecedent events upon which Y depends becomes arbitrarily larger than the certainty of Y itself, which will be shown to be an incoherent result. For definiteness, the setting will parallel Example 1 and concern the assessment of the causal relevance of a token event x being X for event y being Y, where these events occur at $t_x$ and $t_y$, respectively. I will show that DJP entails the unintended (and unexpected) consequence that $P_Y(t)$ has no limit from the left (is left discontinuous) at $t_y$.
To get the argument off the ground, I make use of a well-known theorem from probability, Bayes's theorem, which states that

$$P_{Y|X}(t) = \frac{P_{X|Y}(t)\,P_Y(t)}{P_X(t)}, \quad P_X(t) > 0,$$

where $P_{Y|X}(t)$ is the probability of Y conditional on X at time t, and similarly for $P_{X|Y}(t)$. A simple variation of this that will be useful here is:

$$P_Y(t) = \frac{P_{Y|X}(t)\,P_X(t)}{P_{X|Y}(t)}, \quad P_{X|Y}(t) > 0. \tag{1}$$

The role of Equation (1) will be to instantiate in a formal way the intuitive idea that the probability (trajectory) of the event under consideration depends somehow on the probability (trajectories) of the events that fall into place leading up to it.7 Consider a sequence of moments, $\{\hat{t}_i\}$, converging to $t_y$ (the moment Y occurs) and a sequence of events, $\{X_i\}$, occurring at these times and upon which Y probabilistically depends. In the context of Example 1, these events and moments will be where the ball was at half of a second before it went in, and a fourth of a second, an eighth of a second, and so on. More formally, we might put this as $\hat{t}_i = t_y - \frac{1}{2^i}$ and $X_i$ = the event of the ball being where it was at $\hat{t}_i$ with the particular favourable trajectory it had.8 In order to formalize the degree to which probability trajectories jump discontinuously to one, I use the left-hand limit of the probability of event X as time approaches the time X occurs (from before) and define the ‘distance’ of the jump to one as

$$D(X, t) = \lim_{s \to t^-} \bigl(1 - P_X(s)\bigr). \tag{2}$$

The value of $D(X, t_x)$, where $t_x$ is the time X occurs, is the distance the probability trajectory jumps to reach one when X occurs. That $D(X, t_x)$ is greater than zero for all events is equivalent to the assumption of DJP. Returning to sequence $\{X_i\}$, we see that $\{D(X_i, \hat{t}_i)\}$ might be thought of as the ‘degree of indeterminism’ or ‘chanciness’ for each of the $X_i$ events.
In terms of Example 1, it would be values representing how far the probability of the ball being where it is at one-half of a second before it goes in, a fourth of a second, an eighth of a second and so on, jumps as each of those events occur. The $\{X_i\}$ sequence of events leading up to Y will be used to show that DJP is problematic. At this point, the argument bifurcates based on whether the sequence $\{D(X_i, \hat{t}_i)\}$ converges to zero, that is, whether

$$\lim_{i \to \infty} D(X_i, \hat{t}_i) = 0. \tag{3}$$

I will show that if it does not converge to zero, then we have the inconsistency horn of the dilemma; and if it does converge to zero, we have the incoherency horn.

2.3.1 Inconsistency

The general strategy for this horn is to show how, using Bayes’s theorem, if the ‘event occurring jumps’ of the events getting close in time to Y do not converge to zero, then this forces probability trajectory $P_Y(t)$ to be discontinuous to the left of $t_y$ (not have a limit as time approaches $t_y$ from the left), which contradicts the requirement that the left-hand limit of $P_Y(t)$ at $t_y$ exists. The full details of the proof, which I sketch below, can be found in the appendix. Let $0 < L < 1$ be the left-hand limit of $P_Y(t)$ at the time Y occurs, $t_y$. To show that $P_Y(t)$ is discontinuous from the left at $t_y$, it is sufficient to find a sequence of times, $\{t_i\}$, converging to $t_y$, such that $P_Y(t)$ evaluated at those points does not converge to L, that is, $\{P_Y(t_i)\} \not\to L$. In terms of the definition of convergence, this means showing that there is an $\varepsilon > 0$ such that

$$|P_Y(t_i) - L| \geq \varepsilon, \quad \text{for some } i \text{ greater than any } N > 0. \tag{4}$$

The next task is to actually construct problematic sequence $\{t_i\}$ and show that it satisfies Equation (4). Since, by hypothesis, $\{D(X_i, \hat{t}_i)\}$ does not converge to zero, there is an $\hat{\varepsilon} > 0$ such that

$$D(X_i, \hat{t}_i) \geq \hat{\varepsilon}, \quad \text{for some } i \text{ greater than any } N > 0. \tag{5}$$

It will simplify the notation to define $\{L_i\}$ to be the sequence of left-hand limits for each of $\{X_i\}$. So $L_i = 1 - D(X_i, \hat{t}_i)$, from which it follows that $|1 - L_i| \geq \hat{\varepsilon}$ because of Equation (5).
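The mechanism behind this horn can be previewed numerically. The sketch below is not the appendix's proof: every number in it (`T_Y`, `L`, `EPS_HAT`, and the rates at which the conditional probabilities approach their limits) is an assumption chosen for illustration, and the function names are hypothetical. It shows only that if the probability of each intermediate event stays a fixed distance below one just before that event occurs, then Equation (1) keeps the probability of Y a fixed distance below L.

```python
# Illustrative sketch of the inconsistency horn; all numbers are assumptions.
T_Y = 3.0      # hypothetical time at which Y occurs
L = 0.8        # assumed left-hand limit of P_Y(t) at T_Y
EPS_HAT = 0.1  # assumed lower bound on the jumps D(X_i, t_hat_i), as in Equation (5)


def t_hat(i: int) -> float:
    """Time of the intermediate event X_i, namely t_y - 1/2^i."""
    return T_Y - 1.0 / 2 ** i


def p_y_from_bayes(p_y_given_x: float, p_x: float, p_x_given_y: float) -> float:
    """Equation (1): P_Y = P_{Y|X} * P_X / P_{X|Y}, which requires P_{X|Y} > 0."""
    if p_x_given_y <= 0:
        raise ValueError("P_{X|Y} must be positive")
    return p_y_given_x * p_x / p_x_given_y


gaps = []
for i in (1, 5, 10, 20):
    # Just before t_hat(i), the probability of X_i is at most 1 - EPS_HAT.
    p_x = 1.0 - EPS_HAT
    # Assumed rates: the conditional probabilities approach L and 1 as i grows.
    p_y_given_x = L - 1.0 / 10 ** i
    p_x_given_y = 1.0 - 1.0 / 10 ** i
    p_y = p_y_from_bayes(p_y_given_x, p_x, p_x_given_y)
    gaps.append(L - p_y)
    print(i, t_hat(i), L - p_y)
```

Under these assumptions the printed gap never falls much below `L * EPS_HAT`, however close `t_hat(i)` gets to `T_Y`, which is the kind of failure of convergence that Equation (4) describes.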
To construct the sequence of times, $\{t_i\}$, that will generate the contradiction, we must find a sequence of moments slightly before each of the $\{\hat{t}_i\}$, since we will be interested in what happens to the probabilities of $\{X_i\}$ right before the time they occur (recall that $P_{X_i}(\hat{t}_i)$ is simply equal to one). A natural choice would be to pick moments such as the following: $\hat{t}_i - \frac{1}{10^i}$. Moments like these always fall just before the moment each $X_i$ occurs, and are such that as $i \to \infty$, they get arbitrarily close to those moments, $\hat{t}_i$. But we need the new sequence to be more ‘tightly’ tied to $P_{X_i}(t)$. The factor required will depend on each of the $P_{X_i}(t)$ and how quickly each approaches its limit near $\hat{t}_i$. The limit in question is $L_i$, so by the definition of limit, we can find a minimal distance, $\delta > 0$, such that the distance between $P_{X_i}(t)$ and $L_i$ can be made less than an arbitrary $\varepsilon > 0$ for all t within the minimal distance $\delta$ of $\hat{t}_i$. The ‘$\varepsilon$’ we use here comes from the assumption that $D(X_i, \hat{t}_i)$ does not converge to zero, namely, $\hat{\varepsilon}/2$, which is half of the $\hat{\varepsilon} > 0$ we have from Equation (5). Now consider the sequence of $\{\delta_i\}$, each greater than zero, with the property that if $|\hat{t}_i - t| < \delta_i$, then $|L_i - P_{X_i}(t)| < \hat{\varepsilon}/2$. Again, that $\delta_i > 0$ exist follows from the definition of the (left) limit of $P_{X_i}(t)$ at $\hat{t}_i$ being $L_i$. Finally, we define $\{t_i\}$ as $t_i = \hat{t}_i - \frac{\delta_i}{10^i}$. Thus the sequence of $\{t_i\}$ is such that as $i \to \infty$, the $t_i$ get arbitrarily close to the $\hat{t}_i$ (the times that the $X_i$ occur), and further, each $t_i$ is within $\delta_i$ of $\hat{t}_i$, so $|L_i - P_{X_i}(t_i)| < \hat{\varepsilon}/2$, for each $t_i$. From this it follows that

$$1 - P_{X_i}(t_i) \geq \varepsilon, \quad \text{for some } i \text{ greater than any } N > 0, \tag{6}$$

where $\varepsilon = \hat{\varepsilon}/2$. (See the appendix, near Equation (A.3) for details.) In making use of Bayes’s theorem, the conditional probabilities $P_{Y|X_i}(t_i)$ and $P_{X_i|Y}(t_i)$ will be needed. It will be assumed that $\lim_{i \to \infty} P_{Y|X_i}(t_i) = L$ and $\lim_{i \to \infty} P_{X_i|Y}(t_i) = 1$.
The reasoning for this is as follows: given that the limit of $P_Y(t_i)$ is L, conditionalizing on the $X_i$, which are the particular events leading up to Y, should not affect convergence; similarly since (by definition) $\{X_i\}$ are events leading up to Y at times, $\{t_i\}$, converging to the time they occur, $\hat{t}_i$, the probability of these events at these times conditionalized on Y will naturally converge to one as time converges to the time of Y.9 The convergence of the conditional sequence $P_{Y|X_i}(t_i)$ to L entails that for any $\varepsilon_1 > 0$, there is an $N_1 > 0$ such that for all $i > N_1$,

$$|P_{Y|X_i}(t_i) - L| < \varepsilon_1. \tag{7}$$

Similarly, the convergence of the conditional sequence $P_{X_i|Y}(t_i)$ to one entails that for any $\varepsilon_2 > 0$, there is an $N_2 > 0$ such that for all $i > N_2$,

$$|1 - P_{X_i|Y}(t_i)| < \varepsilon_2. \tag{8}$$

From this, appropriate values can be chosen for $\varepsilon_1$ and $\varepsilon_2$ in terms of the $\varepsilon$ from Equation (6) and then Equation (1) (namely, Bayes's theorem) can be utilized to show that

$$|P_Y(t_i) - L| \geq \varepsilon' > 0, \tag{9}$$

for some i greater than any $N > \max\{N_1, N_2\} > 0$, where $\varepsilon' = \frac{\varepsilon L(\varepsilon + 4)}{4(\varepsilon + 2)}$. Thus $P_Y(t)$ is discontinuous from the left at $t_y$. (See Equations (A.7) through (A.11) in the appendix for details.)

2.3.2 Incoherence

Having shown that if the sequence of ‘indeterministic jumps’, $\{D(X_i, \hat{t}_i)\}$, does not converge to zero, then we have the contradictory result that $P_Y(t)$ is discontinuous from the left at $t_y$, consider now the case in which $\{D(X_i, \hat{t}_i)\} \to 0$. Again, this means that the difference between the left-hand limit of $P_{X_i}(t)$ at $\hat{t}_i$ and one can be made arbitrarily small for large enough i. That is, we can find events, $X_i$, probabilistically relevant to Y and arbitrarily close to the time of Y by picking an i large enough such that the probability of $X_i$ just before $\hat{t}_i$ is arbitrarily close to one, and this despite the fact that the probability of Y at $\hat{t}_i$ is bounded away from one.
For example, this means that the probability of the ball being at a point arbitrarily close to falling into the cup at an arbitrarily small instant before (say $\varepsilon$) it actually does is a million, or a billion, or a trillion, and so on, times closer to one than the probability of the ball’s falling into the cup the same arbitrarily small instant, $\varepsilon$, before it actually does. More concretely, pick any location and time that the ball was very close to falling in, say within $1/10^{256}$ of a second and $1/10^{512}$ of an inch from the edge. At an even smaller instant before the ball was at this location, its probability of being there was any huge number you like, say $10^{1024}$, times closer to one than the ball’s probability was when it was a trillion (or any huge number you like) times closer to falling in. In short, this result says that while the probabilistically relevant antecedent events of the ball being closer and closer to the hole with the favourable trajectory it had are such that their probabilities right before they happen are getting as close to one as you like, the probability of the ball falling in the hole, as close to the time it did as you like, is as many times farther away from one as you want to make it. That an event’s probability right before it happens is arbitrarily farther away from one than is each of an infinitesimally close series of (probabilistically) relevant events leading up to it is unintelligible. This I offer as the incoherency horn of the dilemma.10

2.3.3 Extending to CDJP

Having made the case that DJP is untenable, it is relatively straightforward to extend this to CDJP.
The same construction technique from Section 2.3.1 (just before Equation (6)) can be employed to generate a similarly problematic sequence of events, $\{X_i\}$ and $\{t_i\}$, where $\{t_i\}$ converges to $t_x$ (the moment x occurs) instead of $t_y$, and in this case $X_i$ = the event of the ball and the squirrel being where they were and in the particular states (velocity, direction, internal states, and so on) at $t_i$, which is converging to $t_x$. In the case of DJP, we exploited the fact that $P_Y(t)$ has to ‘jump’ because of its dependence (via Bayes’s theorem) on the $P_{X_i}(t)$, which jump as $X_i$ occurs at $t_i$; but in the case of CDJP, the problem is more immediate. It seems incontrovertible that the $\{X_i\}$ events are positive (‘because’) token causes, especially for large i when $t_i$ gets very (arbitrarily) close to $t_x$.11 But according to CDJP, this would require that $P_Y(t)$ must at the very least have jump discontinuities at each of $\{t_i\}$, corresponding to each of the $\{X_i\}$ being a ‘because’ token causal factor, and this raises numerous problems. First, Eells explicitly states that it is a requirement that $P_Y(t)$ be constant in some small interval to the left (and right) of $t_x$, but this cannot be the case if each of the $X_i$ is a ‘because’ token causal factor. Second, things would seem to get even worse in that there are an uncountable number of possible such ‘because’ token causal events just prior to $t_x$, which might well be in tension with core parts of real analysis.12 But there is a potential way out, which is related to how Eells deals with suggestions that his account cannot distinguish between simultaneous events as far as causal relevance goes.
Recalling that the causal background context, K, and temporal indexical set, $W_t$, are built into the definition of the probability trajectory, that is, $P_Y(t) = P(Y \mid K \,\&\, W_t)$, it might be open to him to deny that the probability trajectories are the same trajectory (function) for each of the $\{X_i\}$ events and the X event.13 In other words, when considering X’s causal significance for Y, we have

$$P_Y(t) = P(Y \mid K \,\&\, W_t),$$

but for each $X_i$’s probability trajectory for Y, we have

$$P_{Y_i}(t) = P(Y \mid K_i \,\&\, (W_i)_t).$$

Thus the $P_{Y_i}(t)$ could be such that they were (point-wise) converging to $P_Y(t)$ (from below) and still jumping (as required) at the critical time, $t_i$. In particular, the $P_{Y_i}(t)$ functions would presumably be increasing such that $P_{Y_i}(t) \leq P_{Y_{i+1}}(t) \leq \dots \leq P_Y(t)$, for $t < t_x$ (Figure 3). A virtue of this response would be that the increasing nature of the probability (trajectories) near the critical point, $t_i$, would fit neatly with the idea that as each of the events in question (namely, the ball and squirrel being where they were ($X_i$) at instants closer and closer to when they collide ($t_i$) in the favourable way they did with respect to causing the ball to fall in the hole (Y)) actually occur, the probability of Y increases.

Figure 3. Probability trajectories $P_{Y_i}(t)$ for Y with respect to $X_i$ converging to $P_Y(t)$ from below.

Unfortunately, however, this response is not available on Eells’s account, as it stands. The first, not insurmountable, problem is that as mentioned above, Eells requires that probability trajectories be constant in some open interval to the left (and right) of the jump discontinuities.
Eells was (rightly) concerned about the limits upon which his account so crucially turns, and in personal correspondence, he indicated that the reason for requiring the trajectories to be constant in this way was indeed in part to ensure that the limits exist, which of course it would, but at the steep cost of coherency. To simply require that the left- and right-hand limits exist, and not necessarily be constant in any open intervals to the left (or right), would seem to be a less restrictive alternative, and it would allow the ‘series of increasing trajectories response’ I just sketched. But there is a second problem—and that is that the (extensional) way that Eells has defined the causal background context, K, and temporal indexical set, Wt, does not allow him to assert that the trajectories associated with the Xi’s causal relevance for Y are distinct from X’s trajectory, PY(t). This is because K is defined to be the set of factors causally independent of X or Xi that are causally relevant to y’s being Y in the actual situation, and thus are identical for X and Xi. And as for the Wt, the (actual) factors relevant to y’s being Y that are not in K but have fallen into place by time t, these too will coincide in the cases of the Xi and X.14 It seems then that utilizing systematic discontinuities for understanding occurring events (DJP) or causation (CDJP) is untenable, but it turns out that even the threat of discontinuities in probability trajectories gives rise to problems for causal theorists, as I take up next.15 3 Broader Discontinuity Concerns The attention Eells paid to continuity was well placed: how the probabilities evolve through time is both relevant and non-trivial. Such continuity issues ought to be of more concern to all causal theorists exploring probabilistic accounts. Most other prominent accounts have the opposite concern from Eells: they depend—at least implicitly—on the continuity of probability trajectories. 
Accounts most obviously embroiled with continuity are those that make explicit use of temporal conditions, for example, Peter Menzies's ([1989]), which utilizes ‘temporally dense’ chains of (counterfactual) probability increases, and Igal Kvart ([2004]), which looks for ‘stable screeners’ and ‘causal relevance neutralizers’ in temporally intermediate events between cause and effect. If the temporal evolution of the probability values in question cannot be assumed to be continuous, this strains such accounts by rendering the probability in the interval potentially unstable in that it may jump between values that may or may not preserve the presence of the relevant probability increases or the absence of stable screeners (probability decreasers). Menzies’s account does not explicitly address continuity, but it does, however, implicitly constrain discontinuities. Building on David Lewis’s ([1986]) counterfactual analysis in terms of unconditional probabilities, Menzies requires that causally related events c and e be ‘probabilistically dependent’. He understands this to mean that there must be intermediate events corresponding to any finite set of intervening times between the times of c and e such that the actual probability of each of the intervening events is significantly higher than it would have been had the immediately preceding event in the set not happened. Put in terms of Eells’s probability trajectories, this effectively requires the probability function to be monotonically increasing between the times of c and e, and thereby limits the number and kind of possible discontinuities (recall Footnote 12).16 Kvart seems to be sensitive to the possibility of the inequality flipping at intermediate temporal points between c and e, and offers a condition that may be intended to prevent it. The condition is developed in an example in a section entitled ‘Illustrating Causal Relevance through Infinite Regress’ (Kvart [2004], pp. 369–70), but is difficult to assess. 
The example involves considering a series of events at intervening moments between putative cause and effect, and verifying that the inequality does not reverse at any of the points—that they are not ‘neutralizers’. Causal relevance is defeated if one of the moments reverses the inequality. But if one can continue indefinitely without a reversal, one thereby constructs an infinite series of moments that are not neutralizers. His full condition is that ‘there is only infinite regress of this sort (i.e. there not being a suitable terminating chain)’, and this establishes that c is causally relevant to e, since it guarantees that ‘there is no neutralizer for c and e’ (Kvart [2004], p. 370). The logical form of the condition seems to be that all such sequences of intervening events converging to an intervening time must not terminate with (contain) a neutralizer. If so, then it might be understood as employing (part of) an alternative specification of the standard epsilon-delta definition of continuity—constraining all convergent sequences in the domain—but in such a way that it entails not full continuity, but rather something close to Menzies’s monotonically increasing condition. While other prominent probabilistic accounts of causation may eschew explicit temporal conditions, not surprisingly, they are not able to avoid temporal (and hence continuity) issues altogether. For example, in (Noordhof [1999]; Hitchcock [2004]; Northcott [2010]; Glynn [2011]), one finds reference to probability inequalities assessed ‘shortly before’ the time of the cause and/or effect.17 With some variation, these accounts all consider the probability of event e just before it occurs, conditional on the presence and absence of putative cause c. The critical inequalities involve conditional probabilities at moments just before the time of the cause, tc – ε, or ‘shortly before’ the time of effect, te – ε, and perhaps at times in between. 
This comparison is typically assumed to be stable, that is, that one can ignore the precise ε>0 magnitude expressed by ‘shortly before’, safely assuming that if ε is sufficiently small, the values of the probabilities will retain the property of interest, an inequality in this case. The inequality must be assumed to hold for all values closer than ε, because otherwise its holding would be completely arbitrary: it could be made to hold or not depending on the particular ε one chose, which would render the inequality meaningless for the purpose at hand. There is a very important difference between it holding for some ε>0 and it holding for some ε>0 and all smaller values. This kind of stability can be generally assumed only if the probability trajectories are continuous with respect to time to the left of tc and/or te. I focus below on time te, but the same reasoning applies to time tc or any other time between them. The relevant probability inequality is:
\[ P_{t_e-\varepsilon}(e \mid c) > P_{t_e-\varepsilon}(e \mid {\sim}c). \tag{10} \]
The critical probabilities are shortly before the time of putative effect e because at the precise time of e, the (conditional) probabilities are trivial. If the probability trajectories involved are not continuous to the left of te, then the mere fact that the inequality holds at a given time shortly before te fails to ensure that it will hold (to the left) in any interval about te. If the inequality could be reversing in the neighbourhood of (te−ε,te), then it holding at te−ε is not going to be decisive for the causal efficacy of c, since such accounts clearly require a non-arbitrary sense of Equation (10) for their warrant and plausibility. Luke Glynn’s ([2011]) admirably complete account shows that even when utilizing variables instead of events for the relevant probability assessments, there remains a dependence on time, and hence continuity.
Glynn ([unpublished]) originally employed a just before ε-inequality; however, the version published as (Glynn [2011]) eliminates such explicit reference to time, expressing causal conditions instead in terms of the conditional probabilities of variables attaining a value. Nonetheless, a temporal index plays a role in the definition of Glynn’s ‘revealer of positive relevance’ set, which is to ‘include only variables representing events occurring no later than tE’ (Glynn [2011], p. 358). Glynn also employs a condition reminiscent of Menzies and Kvart in requiring that there be the right combinations of ‘increasers’ (supporters of the inequality) and ‘decreasers’ (under-cutters of the inequality) in interval (tc, te). While this might be thought to stabilize the inequality in the same way as continuity, the notion of the ‘right combination’ is only coherent if the set of potential increaser–decreaser points is a finite set, which is certainly not the case for interval (tc, te). Finally, in his discussion of the ‘hiker ducking boulder’ example, Glynn ([2011], p. 382) proceeds by ‘interpolating a variable’ along the route of the boulder by which time it is too late for the hiker to duck. So, even when utilizing ‘variables attaining values’ instead of ‘events’, temporal indices and their attendant continuity issues still loom. Continuity assumptions may also come into play in related discussions, for example, generalized causal relevance (Hitchcock [1993]), deterministic chance (Glynn [2010]), or rational belief revision (van Fraassen [1984]). In the first, Hitchcock’s generalized account of causal relevance utilizes probability spaces and measures to define conditional probability functions on variables (for example, amount of medicine, and blood pressure). He analyses (certain) causal claims as captured by conditions like: ‘“there exists an m such that x > m implies that f(x)>1−ε”, where the value of ε is typically left vague’ (Hitchcock [1993], p. 350). 
Glynn uses the inequality Ch(tc−ε)w(pe|pc)>Ch(tc−ε)w(pe|∼pc), translating it as, ‘just before c occurred, the chance of e conditional upon the occurrence of c was greater than the chance of e conditional upon the non-occurrence of c’ (Glynn [2010], p. 74). Finally, van Fraassen employs credence functions indexed by times towards the future, where Pt is the agent’s credence function at time t, and Pt + x is her function at a later time, t + x (van Fraassen [1984], p. 244). To summarize, if probability trajectories cannot be assumed to be continuous, then probabilistic accounts of causation are undermined because the probabilities around the times of interest are rendered potentially ‘unstable’, in the sense of jumping between values that may or may not preserve the relevant features of the probabilistic analysis—typically, probability increases in the presence of the putative cause expressed in the form of an inequality. A glaring question at this point might be: why not simply require that probability trajectories be continuous? There are two reasons to worry about this move. The first and immediate reason is because it seems that some kinds of events must be understood as having discontinuous probability trajectories, for example, quantum events. I take this up at length in the next section. A second, perhaps less obvious, reason is that such a continuity assumption would ‘definitionally’ legislate a priori against a particular qualitative feature of probability trajectories (a discontinuity) that may well turn out to be relevant to causation and other empirical and metaphysical questions. Whatever the remedy, it ought not be so restrictive as to decide such substantive empirical or philosophical questions by definition. 4 The Continuity Bind So far, if one is ‘keeping score’, the tally would seem to favour continuity. One important and influential analysis of causation (Eells [1991]) has faltered because of a dependence on discontinuity. 
Add to this the very real (if neglected) continuity needs of many other probabilistic causal analyses and the balance would seem to tip towards continuity. Enter quantum theory. In what follows, I consider the case for discontinuous probability trajectories based on quantum phenomena, which effectively ties the score and creates a real ‘continuity bind’. I then sketch two possibilities for mediating between the demands for continuity assumptions and the need not to rule out the possibility of discontinuity. Quantum events like the decay of an atom may well have a non-trivial probability of occurring that does not change through time. Accordingly, at the instant they occur, their probability trajectory will jump from a constant value to one as in Figure 4. Quantum events seem to be of a singular kind that does not depend on any ‘ordinary’ causal factors, and hence have a probability trajectory that does not evolve through time until it jumps to one. One might suppose that these discontinuities could be limited to the quantum level, but this is not obviously possible. A straightforward example suggesting otherwise involves nothing more than a Geiger counter that emits a clicking sound (macro-level event) when a micro-level decay event is detected.18 Another response might be to maintain that while at the quantum level such quantum events have discontinuous probability trajectories, macro-level events that involve quantum events will ‘dampen out’ the discontinuity. On such a view, events at the macro-level always have duration; they consist of intervals of time (and space). Thus the discontinuity is avoided at the macro-level because the detection event and the ensuing clicking event have temporal duration during which the probability of the click (detection event) can increase sharply but continuously to one. 
Let such a quantum-level decay event have a probability of r, and then the graph in Figure 5 represents the probability trajectory of Y = the click of the counter, with interval [ty,ty+δ] being the duration of the detection event, that is, the time from the decay event through the detection and ensuing click. The probability of Y prior to ty is r−ε, for some ε>0, which incorporates the possibility of a mechanical or other macro-level failure to detect the particle or to bring about the click.
Figure 4. The discontinuous probability trajectory of a quantum-level event.
Figure 5. The continuous probability trajectory of the detection of a quantum event.
At best this response, which understands macro- and quantum-level (physical) probability and events as distinct kinds, with macro-level events having a duration that ‘dampens out’ the discontinuous probability at the quantum level, saves only macro-level continuity, and with the cost of assuming a bifurcated view of physical probability and/or events. And more problematically, it involves making significant assumptions about how empirical theory will ultimately unfold. Retaining at least the possibility of discontinuous probability trajectories at all levels would seem to be the preferable way to proceed. Thus we have the continuity bind. There are pressing needs for both continuity and the possibility of discontinuity. In particular: requiring systematic discontinuities like DJP or CDJP is problematic, presupposing everywhere continuous probability trajectories is too restrictive, and yet probabilistic causal analyses require some of the stability provided by continuity assumptions.
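The contrast between the trajectories of Figures 4 and 5 can be made concrete with a small numerical sketch. Everything specific in it is an assumption introduced only for illustration: the function names, the linear shape of the rise over the detection interval, and the particular values chosen for r, ε, ty, and δ. The point it is meant to suggest is simply that the change in the quantum-level trajectory across a window around ty does not shrink with the window, whereas the change in the ‘dampened’ macro-level trajectory does.

```python
def quantum_trajectory(t, t_y=1.0, r=0.3):
    """Figure 4 style: constant at r, then a jump to one at t_y (values assumed)."""
    return r if t < t_y else 1.0


def click_trajectory(t, t_y=1.0, delta=0.1, r=0.3, eps=0.05):
    """Figure 5 style: constant at r - eps before t_y, then a continuous rise
    to one over the detection interval [t_y, t_y + delta].
    The linear ramp is an assumption; any continuous rise would serve."""
    base = r - eps
    if t < t_y:
        return base
    if t >= t_y + delta:
        return 1.0
    return base + (1.0 - base) * (t - t_y) / delta


def two_sided_gap(f, t0, h):
    """Change in f across the window [t0 - h, t0 + h]."""
    return abs(f(t0 + h) - f(t0 - h))


# Shrink the window around t_y and compare how the two trajectories behave.
for h in (1e-2, 1e-4, 1e-6):
    print(h, two_sided_gap(quantum_trajectory, 1.0, h),
          two_sided_gap(click_trajectory, 1.0, h))
```

On these assumed values the first gap stays fixed at 1 − r however small the window, while the second is proportional to the window's half-width h.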
Next, I sketch two possibilities for mediating between the demands for continuity assumptions and the possibility of (empirically or metaphysically motivated) discontinuity, namely, by (1) utilizing discontinuous first derivatives of everywhere continuous probability trajectories, or (2) abandoning point probability trajectories altogether in favour of imprecise (interval) probability trajectories. 4.1 Retaining continuity with discontinuous first derivatives If it turns out that some events falling into place—in particular, probabilistically significant ones—need to involve some kind of qualitative ‘shift’ in the relevant probability trajectory, then, as we have seen in Eells’s work, this shift can be understood as a jump discontinuity. But there is another way. An alternative is to capture such a shift by a jump in the rate of change of the probability trajectory. Formally, this could be put in terms of the first derivative (with respect to time) of the probability trajectory function, PY′(t), which is interpretable as the rate of change of the original function PY(t). The role played by jump discontinuities in PY(t) in causal analyses could be handled in a parallel way by jump discontinuities in PY′(t), and the jumps in the first derivative would still correspond to, and be grounded in, the idea that ‘causes’ constitute shifts in the probability trajectory—in this case the shift would not be a discontinuous jump in the probability trajectory itself, but rather a discontinuous jump in the rate of change of the probability trajectory.19 The probability trajectory, PY(t), could then be assumed or required to be continuous, thereby avoiding the discontinuity issue altogether. 
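A minimal sketch may help fix the idea of a shift located in the rate of change rather than in the value. It assumes, purely for illustration, a piecewise-linear trajectory with a kink at tx; the function names, the slopes, and the value at tx are hypothetical and are not drawn from Eells's account. The one-sided derivatives are estimated by finite differences.

```python
def kinked_trajectory(t, t_x=1.0, p_x=0.4, slope_before=0.1, slope_after=0.5):
    """Continuous, piecewise-linear trajectory whose slope changes at t_x.
    All parameter values are illustrative assumptions; they keep the
    trajectory within [0, 1] for t near t_x."""
    slope = slope_before if t < t_x else slope_after
    return p_x + slope * (t - t_x)


def one_sided_slopes(f, t0, h=1e-6):
    """Finite-difference estimates of the left and right derivatives at t0."""
    left = (f(t0) - f(t0 - h)) / h
    right = (f(t0 + h) - f(t0)) / h
    return left, right


left, right = one_sided_slopes(kinked_trajectory, 1.0)
# The value changes only negligibly across t_x, while the slope jumps.
value_gap = abs(kinked_trajectory(1.0 + 1e-9) - kinked_trajectory(1.0 - 1e-9))
print(left, right, value_gap)
```

In this toy case the left and right derivatives at tx both exist but differ, so the derivative itself is undefined there, while the trajectory remains continuous.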
There would still be significant work to be done developing the precise conditions, since such conditions would be based on both (1) the discontinuities of the derivative of the probability trajectory, and (2) the shape of the now continuous probability trajectory itself, and these may or may not be easy to put together. In particular, for an event to be a causal factor, it would have to result in the appropriate jump in PY′(t), but it would also have to be the case that PY(t) itself remained high, as per (Eells [1991], Chapter 6, especially Section 6.2). Another issue to work out would be precisely how differentiable PY(t) would have to be. It would not be necessary to require PY(t) to be everywhere differentiable; as long as the left and right derivatives existed at the point in question, the derivative itself could fail to exist.20 While this approach may have promise for causal accounts, and especially for Eells’s discontinuity problems, there is little reason to think that it is viable or even coherent for quantum mechanics. Further, as already discussed, a continuous probability trajectory would at best alleviate the need for macro-level discontinuities, and while the idea of entirely distinct kinds of (physical) probability at the quantum and macro-levels is not incoherent, it is not an attractive philosophical commitment, as it legislates too much from a pre-empirical framework. 4.2 Imprecise (interval) probability trajectories Within the setting of point probabilities, the pull towards and push away from continuity does indeed constitute a bind, but this is not necessarily so in settings in which probabilities are not points, but rather intervals. Imprecise (non-point-valued) probability has been studied for some time in applied and subjective probability settings, for example, (Walley [1991]; Kyburg [1999]; Weichselberger [2000]), and has garnered renewed interest of late; see (Augustin et al. [2014]).
And recently imprecise probabilities have been further extended to objective understandings of chance (physical probabilities); see (Fenton-Glynn [forthcoming]; Peressini [2016]). The advantage of interval-valued probability is that the notion of a continuous function opens up when the function in question is an interval-valued function. It turns out that there are multiple ways to generalize the standard point function definition of continuous, and thus one could seek to develop and employ a kind of continuity that is not so restrictive as to decide substantive philosophical questions by definition, stabilizes causally salient probability inequality claims between trajectories, and retains the possibility of jumpiness to capture quantum or other theoretically motivated discontinuity. The area in which to seek more general notions of continuity is generalized set-valued analysis. In set-valued analysis, one finds weaker notions of continuity, for example, upper and lower semi-continuity, that could be employed in a way that minimizes the restrictions on discontinuities, actually allowing some kinds, and yet offering sufficient stability for causal considerations.21 Another approach would be to utilize classes of interval functions generated by distinct generalizations of continuity. For example, Anguelov et al. ([2006]) develop three distinct notions of continuity (S-continuity, D-continuity, and H-continuity) that apply to interval functions. The class of S-continuous interval functions seems especially well suited. And Anguelov ([unpublished]) explores how pairing a lower semi-continuous function, f̲, with an upper semi-continuous function, f¯, such that f̲≤f¯, produces an interval function F=[f̲,f¯], which is a completely novel entity from both algebraic and topological points of view.
Such functions can be quite jumpy (discontinuous in the ordinary sense) with the caveat being that the upper endpoint function can only jump up and the lower endpoint function can only jump down, and hence such functions do not have the problematic ‘gaps’ that discontinuous point-valued functions can have. 5 Concluding Remarks I hope to have shown that the question of the continuity of probability trajectories does indeed have important implications for any account of causality that employs a form of the probability raising condition. If the probability functions can be ‘jumping about’ in ways typical of discontinuous functions, then the stability of the relevant probability increase is called into question. Because of this, coupled with pressure from physical (quantum) theory to allow for the possibility of discontinuities, one is faced with a (dis)continuity bind that appears to be difficult to resolve in the standard framework. While a continuous trajectory with discontinuities in its first derivative doing this work is one possibility, exploration of the imprecise probability framework seems the more promising option for a way out of the bind, that is, a way of retaining the possibility of discrete shifts in a qualitative feature of a probability trajectory, while still enjoying some of the stability of continuity. Appendix This appendix contains the detailed proof sketched in Section 2.3.1. Let $0 < L < 1$ be the left-hand limit of $P_Y(t)$ at the time $Y$ occurs, $t_y$. To show that $P_Y(t)$ is discontinuous from the left at $t_y$, it is sufficient to find a sequence of times, $\{t_i\}$, converging to $t_y$ such that $P_Y(t)$ evaluated at those points does not converge to $L$, that is, $\{P_Y(t_i)\} \not\to L$. In terms of the definition of convergence, this means showing that there is an $\varepsilon > 0$ such that
\[ |P_Y(t_i) - L| \ge \varepsilon, \quad \text{for some } i \text{ greater than any } N > 0. \tag{A.1} \]
The next task is to actually construct the problematic sequence, $\{t_i\}$, and show that it satisfies Equation (A.1).
Since, by hypothesis, $\{D(X_i,\hat{t}_i)\}$ does not converge to zero, there is an $\hat{\varepsilon} > 0$ such that
\[ D(X_i,\hat{t}_i) \ge \hat{\varepsilon}, \quad \text{for some } i \text{ greater than any } N > 0. \tag{A.2} \]
It will simplify notation to define $\{L_i\}$ to be the sequence of left-hand limits for each of the $\{X_i\}$, so $L_i = 1 - D(X_i,\hat{t}_i)$, from which it follows that $|1 - L_i| \ge \hat{\varepsilon}$ because of Equation (A.2). To construct the sequence of times, $\{t_i\}$, that will generate the contradiction, we must find a sequence of moments slightly before each of the $\{\hat{t}_i\}$, since we will be interested in what is happening to the probabilities of the $\{X_i\}$ right before the time they occur; recall that $P_{X_i}(\hat{t}_i)$ is simply equal to one. A natural choice would be to pick moments like $\hat{t}_i - \frac{1}{10^i}$. Each such moment is just before the moment at which $X_i$ occurs and, as $i \to \infty$, the moments get arbitrarily close to the $\hat{t}_i$. But for our purposes here, we need the new sequence to be more ‘tightly’ tied to $P_{X_i}(t)$. The factor required will depend on each of the $P_{X_i}(t)$ and how quickly each approaches its limit near $\hat{t}_i$. The limit in question is $L_i$, so by the definition of limit, we can find a minimal distance, $\delta > 0$, such that the distance between $P_{X_i}(t)$ and $L_i$ can be made less than an arbitrary $\varepsilon > 0$ for all $t$ within the minimal distance, $\delta$, of $\hat{t}_i$. The $\varepsilon$ we will use here comes from the assumption that $D(X_i,\hat{t}_i) \not\to 0$, namely, $\frac{\hat{\varepsilon}}{2}$, which is half of the $\hat{\varepsilon} > 0$ we have from Equation (6). Now consider the sequence of $\{\delta_i\}$, each greater than zero, with the property that if $|\hat{t}_i - t| < \delta_i$, then $|L_i - P_{X_i}(t)| < \frac{\hat{\varepsilon}}{2}$. Again, that the $\delta_i > 0$ exist follows from the definition of the (left) limit of $P_{X_i}(t)$ at $\hat{t}_i$ being $L_i$. Finally, we define the $\{t_i\}$ as $t_i = \hat{t}_i - \frac{\delta_i}{10^i}$. Thus the sequence of $\{t_i\}$ is such that as $i \to \infty$, the $t_i$ get arbitrarily close to the $\hat{t}_i$ (the times that the $X_i$ occur), and further, each $t_i$ is within $\delta_i$ of $\hat{t}_i$, so $|L_i - P_{X_i}(t_i)| < \frac{\hat{\varepsilon}}{2}$, for each $t_i$.
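This entails that
\[ |1 - P_{X_i}(t_i)| = |1 - L_i + L_i - P_{X_i}(t_i)| \ge \big|\, |1 - L_i| - |L_i - P_{X_i}(t_i)| \,\big|, \]
after adding and subtracting $L_i$, regrouping, and making use of the identity $|a+b| \ge \big||b| - |a|\big|$. Because, from above, $|1 - L_i| \ge \hat{\varepsilon}$ and $|L_i - P_{X_i}(t_i)| < \frac{\hat{\varepsilon}}{2}$, we have that
\[ |1 - P_{X_i}(t_i)| \ge \left|\hat{\varepsilon} - \frac{\hat{\varepsilon}}{2}\right| = \frac{\hat{\varepsilon}}{2}, \tag{A.3} \]
for some $i$ greater than any $N > 0$. This step also depends on the fact that if $|a| \ge |c|$ and $|b| \le |d|$, then $|a - b| \ge |c - d|$. Finally, using Equation (A.3), removing unnecessary absolute value signs, and relabelling our crucial value $\frac{\hat{\varepsilon}}{2}$ as $\varepsilon$ for notational simplicity, we get:
\[ 1 - P_{X_i}(t_i) \ge \varepsilon, \quad \text{for some } i \text{ greater than any } N > 0. \tag{A.4} \]
In making use of Bayes’s theorem below, the conditional probabilities $P_{Y|X_i}(t_i)$ and $P_{X_i|Y}(t_i)$ will be needed. It will be assumed that $\lim_{i\to\infty} P_{Y|X_i}(t_i) = L$ and $\lim_{i\to\infty} P_{X_i|Y}(t_i) = 1$. The reasoning for this is as follows: given that the limit of $P_Y(t_i)$ is $L$, conditionalizing on the $X_i$, which are the particular events leading up to $Y$, should not affect convergence; similarly, since (by definition) the $\{X_i\}$ are events leading up to $Y$ at times $\{t_i\}$ converging to the time they occur, $\hat{t}_i$, the probability of these events at these times conditionalized on $Y$ will naturally converge to one as time converges to the time of $Y$.22 The convergence of the conditional sequence $P_{Y|X_i}(t_i)$ to $L$ entails that for any $\varepsilon_1 > 0$, there is an $N_1 > 0$ such that, for all $i > N_1$,
\[ |P_{Y|X_i}(t_i) - L| < \varepsilon_1. \tag{A.5} \]
Similarly, the convergence of the conditional sequence $P_{X_i|Y}(t_i)$ to one entails that for any $\varepsilon_2 > 0$, there is an $N_2 > 0$ such that, for all $i > N_2$,
\[ |1 - P_{X_i|Y}(t_i)| < \varepsilon_2. \tag{A.6} \]
Now the particular values to be used for $\varepsilon_1$ and $\varepsilon_2$ are
\[ \varepsilon_1 = \frac{\varepsilon L}{2(\varepsilon+2)} > 0 \quad \text{and} \quad \varepsilon_2 = \frac{\varepsilon}{4} > 0. \]
Thus from Equation (A.6) and the value for $\varepsilon_2$ and multiplying through by $L > 0$,
\[ |L - L P_{X_i|Y}(t_i)| < \varepsilon_2 L = \frac{\varepsilon L}{4}. \tag{A.7} \]
Adding and subtracting $L$, using the triangle inequality, and Equations (A.5) and (A.7), we may derive that
\[
\begin{aligned}
|P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)| &= |(P_{Y|X_i}(t_i) - L) + (L - L P_{X_i|Y}(t_i))| \\
&\le |P_{Y|X_i}(t_i) - L| + |L - L P_{X_i|Y}(t_i)| \\
&< \frac{\varepsilon L}{2(\varepsilon+2)} + \frac{\varepsilon L}{4} = \frac{2\varepsilon L + \varepsilon L(\varepsilon+2)}{4(\varepsilon+2)} = \frac{\varepsilon L(\varepsilon+4)}{4(\varepsilon+2)}.
\end{aligned}
\tag{A.8}
\]
Next, using Equation (A.5) again, with the value for $\varepsilon_1$, we have that
\[ P_{Y|X_i}(t_i) > L - \frac{\varepsilon L}{2(\varepsilon+2)}. \tag{A.9} \]
From Equation (A.4), multiplying by $P_{Y|X_i}(t_i)$, and using Equation (A.9), we can derive that
\[
\begin{aligned}
P_{Y|X_i}(t_i)\,(1 - P_{X_i}(t_i)) &\ge \left(L - \frac{\varepsilon L}{2(\varepsilon+2)}\right)\varepsilon = \left(\frac{2L(\varepsilon+2) - \varepsilon L}{2(\varepsilon+2)}\right)\varepsilon \\
&= \left(\frac{4L + 2\varepsilon L - \varepsilon L}{2(\varepsilon+2)}\right)\varepsilon = \varepsilon L\left(\frac{4 + 2\varepsilon - \varepsilon}{2(\varepsilon+2)}\right) = \frac{\varepsilon L(\varepsilon+4)}{2(\varepsilon+2)}.
\end{aligned}
\tag{A.10}
\]
Finally, returning to Equation (A.1), which expresses the continuity condition
\[
\begin{aligned}
|P_Y(t_i) - L| &= \left|\frac{P_{X_i}(t_i)\,P_{Y|X_i}(t_i)}{P_{X_i|Y}(t_i)} - L\right| && \text{(Bayes’s theorem (2.1))} \\
&= \frac{1}{P_{X_i|Y}(t_i)}\,\big|P_{X_i}(t_i)\,P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)\big| \\
&\ge \big|P_{X_i}(t_i)\,P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)\big| && \text{(since } P_{X_i|Y}(t_i) \le 1\text{)} \\
&= \big|P_{X_i}(t_i)\,P_{Y|X_i}(t_i) - P_{Y|X_i}(t_i) + P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)\big| \\
&= \big|P_{Y|X_i}(t_i)\,(P_{X_i}(t_i) - 1) + (P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i))\big| \\
&\ge \Big|\, \big|P_{Y|X_i}(t_i)\,(1 - P_{X_i}(t_i))\big| - \big|P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)\big| \,\Big| && (|a+b| \ge \big||b| - |a|\big|) \\
&\ge \left|\frac{\varepsilon L(\varepsilon+4)}{2(\varepsilon+2)} - \frac{\varepsilon L(\varepsilon+4)}{4(\varepsilon+2)}\right| && \text{(from Equations (A.8) and (A.10))}^{23} \\
&= \frac{\varepsilon L(\varepsilon+4)}{4(\varepsilon+2)} > 0.
\end{aligned}
\tag{A.11}
\]
This holds for some $i$ greater than any $N > \max\{N_1, N_2\} > 0$. Thus $P_Y(t)$ is discontinuous from the left at $t_y$.
1 Peter Menzies ([1989]) and Igal Kvart ([2004]) in their respective accounts are also sensitive to how the probabilities evolve through time, but without explicitly addressing continuity one way or the other; I take this up below. 2 For this article, I will understand the basic form of these probabilities as unconditional. This is distinct from general probability, which applies to classes of events and whose basic forms are conditional. This is for clarity and convenience only; the continuity issues I deal with here are not sensitive to whether the probabilities are analysed in the standard Kolmogorovian way or some other way, with a different conditionalization rule and/or with conditional probabilities as the basic form; see (Hájek [2003]). 3 A ‘jump’ discontinuity is one in which the left- and right-hand limits exist, but are not equal.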
It is important to note that in order for there to be a discontinuous jump (jump discontinuity) as Y occurs (or at any other significant time, for example, tx), it is necessary that the trajectory be continuous in some (perhaps very small) interval to the left of the jump discontinuity—this will become significant below. The other two possibilities, that the left- and right-hand limits exist and are equal, or that one (or both) fail to exist are called ‘removable’ and ‘essential’ discontinuities, respectively. The essential discontinuity case will come up again below. 4 Another way of putting this, in terms of type-level probabilistic causality, is that the bump is a token-level positive causal factor for the ball going in, but that bumps of this type are type-level negative causal factors. 5 This is because, by definition, M≤P+, so if B=M−P−>0, then P+−P−>0, so P+≠P−, that is, the limits of PY(t) from the left and the right at tx are not equal, which again entails a jump discontinuity at tx. 6 Eells ([1991], p. 354) makes it explicit that it is a presupposition of his account that the left and right limits of PY(t) at tx exist. The value at the point tx itself is not constrained. At another point, Eells ([1991], p. 355) also requires that the probability trajectory actually be constant in some open interval both to the left and right of tx. He may have sensed the possibility of pathological behaviour around such discontinuities. In any event, I argue below that his required ‘jump’ discontinuity will entail not only that PY(t) cannot be constant to the left of tx, but that it cannot even have a limit from the left. 7 Recall that the causal background context, K, is built into the definition in that PY(t)=P(Y|K & Wt), so for example, PY|X(t) will be P(Y|X & K &Wt). 
In the case of the example under consideration here, where the Xs are not arbitrary events, but rather ones causally relevant to Y and converging to Y’s time, all the probability functions in what follows will have the same causal background context; I will generally suppress the additional notation in what follows, noting it only where it makes a difference. 8 As will become clear below, a key feature of the Xis is that they be at least probabilistically and/or causally relevant to Y. Thus the construction is unproblematic when there is a space-time process leading up to or constituting event Y, as in the case here. The argument does not, however, necessarily apply to quantum events or certain kinds of macro-level events that embed quantum events, which may not have such probabilistically relevant antecedent events under certain interpretations. (I will not get into the distinction between causally and probabilistically relevant here; there is much debate about when and how these notions coincide. In most settings, probabilistic relevance is weaker and I will employ it here.) While there is debate about whether all macro-level examples of causation need to have such an intermediate process, even accepting a pluralistic view (Hall [2004]), for my argument here it is sufficient that it work for the large class of macro-level cases like Example 1, in which there is such a mediating process. See Footnote 10 for more on the kinds of quantum-embedding macro events that are excluded. 9 Nothing turns on these particular values for the limits of the conditional probabilities. As long as they converge to some 0<L̂≤1, as they must, then the proof can proceed with simple scalar adjustments. 10 I stress again that the argument of this section is against the view embodied in DJP, namely, that all occurring events must jump discontinuously to one; it does not militate against the (very plausible) view that some classes of events might so jump.
In particular, it is compatible with the view that the probability trajectories of quantum events or certain kinds of macro-level events that embed quantum events in particular ways do in fact so jump. As a reviewer for this journal points out, without such a limitation a quantum variation of my Example 1 might be construed as a counterexample to the incoherence horn of the dilemma. Consider Example 1 modified so that the golf ball’s fall into the cup triggers a quantum triggering device that has an irreducible chance of 0.9 of triggering an explosion nearby. One must assume (per impossible) that the triggering process and the detonation, if they happen, will both be instantaneous, so if the explosion happens, it will happen at the same moment, ty, when the ball falls into the hole. Then if the ball being where it was with its favourable trajectory at moments converging to the time it falls into the cup are the events {(Xi,t̂i)}, and the explosion is Y, the {(Xi,t̂i)} events are getting arbitrarily close to Y and their probabilities are getting arbitrarily close to one, but the probability of Y is bound away from one (at 0.9). Whether it is legitimate in this context to collapse such triggering events on to Y in the probability trajectory is not clear, but fortunately it need not be taken up here. 11 See (Kvart [2004], p. 369–70) for a similar use of such an example and discussion. 12 There are provable restrictions on the size and nature of the set of points of discontinuity for real-valued functions. In particular, this set must be an Fσ set, that is, one that can be written as a countable union of closed sets of real numbers (Royden [1988], p. 53). And if the function in question is monotonically increasing or decreasing on an open interval, as a probability trajectory might well be eventually, then there can be no essential discontinuities and at most countably many ‘jump’ discontinuities; see (Rudin [1976], p. 95–7) for details. 
13 This is how Eells’s account avoids the putative defect that, for example, an event located far away, say another squirrel kicking a tree, but taking place at the same time as x, would also be deemed a cause of Y because the trajectory would jump at the time of both events, tx. But in fact, there would be two trajectories, one for X and one for the faraway event, and the latter trajectory would not have a ‘because’ jump because its (distinct) Wt would not include the factors (collision, change in the ball’s trajectory, and so on) traceable back only to x being X. 14 Eells ([1991], pp. 344–5) does allow that one might conceive of K as a ‘kind of population’, and this could open the possibility of a non-extensional understanding that might allow one to individuate the trajectories in a more fine-grained way. Of course, such a move brings with it problems of its own. The account could also potentially be modified in a way that allows for (appropriately bounded) essential discontinuities as well as jump discontinuities. As long as a proper bound was in place, one could employ the ‘limit superior’ (least upper bound of the cluster points) from the left, rather than a limit from the left. This would potentially alleviate issues with isolated essential discontinuities, though the problematic proliferation of them due to DJP/CDJP would remain, as would the problems to be taken up in the next section. I owe thanks to an anonymous reviewer for helping me see this possibility. 15 As a reviewer points out, another possibility for handling unruly kinds or numbers of discontinuities in probability trajectories would be to move to an account along the lines of Woodward's ([2003]) or Pearl's ([2009], which distinguishes between the actual situation of interest and causal models of the situation. Such an approach, in utilizing a model to analyse causation, considers only a fixed and generally discrete set of causal variables. 
Thus even if in the actual situation the probability trajectory had (lots of) essential discontinuities, within the confines of the idealized model such ‘bad behaviour’ might well be modelled sufficiently by a simple jump discontinuity. Indeed, if the continuity bind proves insurmountable in the end, then this could be seen as a case for such model-based accounts. 16 This turns out to be an implausibly strong condition and Menzies ([1996]) himself disavows even an amended version of this theory. This tension between stability in the probability trajectory and cripplingly strong constraints on it is endemic to the point probability framework, as I take up below. 17 Christopher Hitchcock ([2004], p. 414) develops a proposal growing out of Ned Hall’s suggestion that one should evaluate the probability of an effect shortly before the time at which the effect occurs. 18 Another now classic (if rather mean) example used in this context by Dretske and Snyder ([1972]), with a debt to Schrödinger, involves a quantum-mechanical process and detector hooked to a device that fires a revolver at a cat, and is calibrated to do so with a (quantum) probability of 0.01. 19 There is no concern that such discontinuities in the derivative of PY(t) might result in the same or similar problems because there is an important difference: the arguments above cannot be applied to the derivative of PY(t) because the derivative of a probability function is not itself a probability function. 20 Since the number of possible configurations of a probability function and its derivative afford many more complexities than are available when simply classifying discontinuities, it could well be that, as in (Hitchcock [1993], p. 359), there is ‘no natural division of causal relevance into a few simple species, such as positive and negative; rather, causal relevance is infinite in variety’. 
On this account, causal relevance is infinite in variety because it is represented as an array of conditional probability functions that themselves have infinitely many possible configurations. 21 See (Peressini [2016]) for further details. 22 Again, nothing turns on these particular values for the limits of the conditional probabilities. As long as they converge to some 0<L̂≤1, as they must, then the proof can proceed with simple scalar adjustments. 23 This step also depends on the fact that if |a|≥|c| and |b|≤|d|, then |a−b|≥|c−d|.

Acknowledgements

I owe most of what I know and much of what I have figured out about probabilistic causality to my friend and teacher, the late Ellery Eells. This project started as a result of a graduate seminar taught by Ellery in 1989 in which we were reading a draft of (Eells [1991]). I was concerned about the continuity of the probability functions and, despite the fact that I could not (yet) substantiate my worries, the ever generous Ellery gave me a footnote in his book (Eells [1991], p. 294). I pursued the issue as one of my dissertation qualifying papers in 1991, but shelved the project because I was not able to finish the proof that appears in the appendix of this article. I returned to it while on sabbatical in Berlin in 2012–13, and realized that the proof required considering two cases (Equation (3)), which allowed me to finally finish it. I thank Dörte and Frieder Middelhauve and Marina Malidzanovic for helping make the sabbatical possible, along with Michael Pauen and the Berlin School of Mind and Brain, where I was a visiting scholar. I owe much to Jodi Melamed for comments and support, and also to my research assistants, Shaila Wadhwani and Clark Wolf. Finally, I thank the journal’s anonymous reviewers for their very careful readings and helpful comments.

References

Anguelov, R. [unpublished]: ‘An Introduction to Some Spaces of Interval Functions’, available at <arxiv.org/abs/math/0408013>.
Anguelov, R., Markov, S. and Sendov, B. [2006]: ‘The Set of Hausdorff Continuous Functions: The Largest Linear Space of Interval Functions’, Reliable Computing, 12, pp. 337–63.
Augustin, T., Coolen, F., de Cooman, G. and Troffaes, M. [2014]: Introduction to Imprecise Probabilities, Chichester: Wiley.
Dretske, F. I. and Snyder, A. [1972]: ‘Causal Irregularity’, Philosophy of Science, 39, pp. 69–71.
Eells, E. [1991]: Probabilistic Causality, New York: Cambridge University Press.
Eells, E. [2010]: ‘Objective Probability Theory Theory’, in E. Eells and J. Fetzer (eds), The Place of Probability in Science, Dordrecht: Springer, pp. 3–44.
Fenton-Glynn, L. [forthcoming]: ‘Imprecise Best System Chances’, Proceedings of the European Philosophy of Science Association Conference.
Glynn, L. [2010]: ‘Deterministic Chance’, British Journal for the Philosophy of Science, 61, pp. 51–80.
Glynn, L. [2011]: ‘A Probabilistic Analysis of Causation’, British Journal for the Philosophy of Science, 62, pp. 343–92.
Glynn, L. [unpublished]: ‘A Probabilistic Analysis of Causation’, available at <web.mit.edu/gradphilconf/2008/A%20Probabilistic%20Analysis%20of%20Causation.pdf>.
Hájek, A. [2003]: ‘What Conditional Probability Could Not Be’, Synthese, 137, pp. 273–323.
Hall, N. [2004]: ‘Two Concepts of Causation’, in J. Collins, N. Hall and L. A. Paul (eds), Causation and Counterfactuals, Cambridge, MA: MIT Press, pp. 181–276.
Hitchcock, C. [1993]: ‘A Generalized Probabilistic Theory of Causal Relevance’, Synthese, 97, pp. 335–64.
Hitchcock, C. [2004]: ‘Do All and Only Causes Raise the Probabilities of Effects?’, in J. Collins, N. Hall and L. A. Paul (eds), Causation and Counterfactuals, Cambridge, MA: MIT Press, pp. 403–17.
Ismael, J. [2011]: ‘A Modest Proposal about Chance’, Journal of Philosophy, 108, pp. 416–42.
Kvart, I. [2004]: ‘Causation: Probabilistic and Counterfactual Analyses’, in J. Collins, N. Hall and L. A. Paul (eds), Causation and Counterfactuals, Cambridge, MA: MIT Press, pp. 359–86.
Kyburg, H. [1999]: ‘Interval-Valued Probabilities’, available at <www.sipta.org/documentation/interval_prob/kyburg.pdf>.
Lewis, D. [1986]: ‘Postscripts to “Causation”’, in his Philosophical Papers: Volume II, Oxford: Oxford University Press, pp. 172–213.
Menzies, P. [1989]: ‘Probabilistic Causation and Causal Processes: A Critique of Lewis’, Philosophy of Science, 56, pp. 642–63.
Menzies, P. [1996]: ‘Probabilistic Causation and the Pre-emption Problem’, Mind, 105, pp. 85–117.
Noordhof, P. [1999]: ‘Probabilistic Causation, Preemption, and Counterfactuals’, Mind, 108, pp. 95–125.
Northcott, R. [2010]: ‘Natural-Born Determinists: A New Defense of Causation as Probability-Raising’, Philosophical Studies, 150, pp. 1–20.
Pearl, J. [2009]: Causality, Cambridge: Cambridge University Press.
Peressini, A. F. [2016]: ‘Imprecise Probability and Chance’, Erkenntnis, 81, pp. 604–13.
Rosen, D. A. [1978]: ‘In Defense of a Probabilistic Theory of Causality’, Philosophy of Science, 45, pp. 561–86.
Royden, H. L. [1988]: Real Analysis, New York: Macmillan.
Rudin, W. [1976]: Principles of Mathematical Analysis, New York: McGraw-Hill.
Sober, E. [2010]: ‘Evolutionary Theory and the Reality of Macro Probabilities’, in E. Eells and J. Fetzer (eds), The Place of Probability in Science, Dordrecht: Springer, pp. 133–62.
van Fraassen, C. [1984]: ‘Belief and the Will’, Journal of Philosophy, 81, pp. 235–56.
Walley, P. [1991]: Statistical Reasoning with Imprecise Probabilities, London: Chapman and Hall.
Weichselberger, K. [2000]: ‘The Theory of Interval-Probability as a Unifying Concept for Uncertainty’, International Journal of Approximate Reasoning, 24, pp. 149–70.
Woodward, J. [2003]: Making Things Happen: A Theory of Causal Explanation, Oxford: Oxford University Press.

© The Author 2017. Published by Oxford University Press on behalf of British Society for the Philosophy of Science. All rights reserved. For Permissions, please email: journals.permissions@oup.com
The British Journal for the Philosophy of Science, Oxford University Press

Causation, Probability, and the Continuity Bind

Publisher
Oxford University Press
Copyright
© The Author 2017. Published by Oxford University Press on behalf of British Society for the Philosophy of Science. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ISSN
0007-0882
eISSN
1464-3537
D.O.I.
10.1093/bjps/axw030

Abstract

Analyses of singular (token-level) causation often make use of the idea that a cause increases the probability of its effect. Of particular salience in such accounts are the values of the probability function of the effect, conditional on the presence and absence of the putative cause, analysed around the times of the events in question: causes are characterized by the effect’s probability function being greater when conditionalized upon them. Put this way, it becomes clearer that the ‘behaviour’ (continuity) of probability functions in small intervals about the times in question ought to be of concern. In this article, I make an extended case that causal theorists employing the ‘probability raising’ idea should pay attention to the continuity question. Specifically, if the probability functions are ‘jumping about’ in ways typical of discontinuous functions, then the stability of the relevant probability increase is called into question. The rub, however, is that sweeping requirements for either continuity or discontinuity are problematic and, as I argue, this constitutes a ‘continuity bind’. Hence more subtle considerations and constraints are needed, two of which I consider: (1) utilizing discontinuous first derivatives of continuous probability functions, and (2) abandoning point probability for imprecise (interval) probability.

1 Introduction
2 Probability Trajectories and Continuity
2.1 Probability trajectories
2.2 Causation as discontinuous jumps
2.3 Against systematic discontinuity
3 Broader Discontinuity Concerns
4 The Continuity Bind
4.1 Retaining continuity with discontinuous first derivatives
4.2 Imprecise (interval) probability trajectories
5 Concluding Remarks
Appendix

1 Introduction

Analyses of singular (token-level) causation often make use of the idea that a cause increases the probability of its effect. Of particular salience in such accounts are the values of the probability function of the effect, conditional on the presence and absence of the putative cause, analysed around the times of the events in question: causes are characterized by the effect’s probability function being greater when conditionalized upon them. Put this way, it becomes clearer that the ‘behaviour’ (continuity) of probability functions in small temporal intervals about the times in question ought to be of concern. One prominent (but under-examined) account of token-level causation, that of Ellery Eells ([1991]), actually requires point ‘jumps’ (discontinuities) in the relevant probability functions for positive and negative token-level causes. In this article, I make an extended case that causal theorists employing the ‘probability raising’ idea should pay attention to the continuity question, as it has serious implications for the viability of their accounts. Specifically, if the probability functions are ‘jumping about’ in ways typical of discontinuous functions, then the stability of the relevant probability increase is called into question. The rub, however, is that sweeping requirements for either continuity or discontinuity are problematic and, as I argue, this constitutes a ‘continuity bind’. Hence more subtle considerations and constraints are needed. I begin by introducing the question of continuity in the context of causation and probability functions (trajectories) using the work of Eells, one of the few theorists to explicitly consider continuity.¹ I then show how discontinuity requirements like Eells’s are untenable, and in a surprisingly decisive way. Next, I argue that the discontinuity issue also has problematic implications for causal accounts without explicit discontinuity requirements (Menzies [1989]; Noordhof [1999]; Hitchcock [2004]; Kvart [2004]; Northcott [2010]; Glynn [2011]).
After this, I consider a blanket continuity requirement and show that while it is a tempting response, it is unworkable because of the need to allow for the possibility of (empirically motivated) discontinuity in probability trajectories. And therein lies the continuity bind. Finally, I consider two potential ways out of the bind: (1) utilizing discontinuous first derivatives of continuous probability trajectories for theoretically motivated discontinuity needs, and (2) abandoning point probability trajectories altogether in favour of imprecise (interval) probability trajectories.

2 Probability Trajectories and Continuity

I present the continuity question here in a causal setting, roughly following Eells ([1991]); I then expand it to other prominent accounts in Section 3. Let x and y denote token events, where x takes place at time and place (tx, sx) and y takes place at (ty, sy). Assume that x’s being X caused (in some plausible way) y’s being Y, where x is of type X and y is of type Y. Of interest is how the probability of token event y’s being Y evolves between tx and ty, that is, how the probability of y’s being Y changes as a function of time. (I will abbreviate the token events of ‘x being X’ and ‘y being Y’ by just writing the properties exemplified, X and Y.)

In the ensuing discussion, probability will be understood as objective and physical, that is, as ‘physical probability’ or ‘chance’. A single-case time-dependent probability function, P, will be assumed as part of a probability space triple <Ω,F,P>, where Ω is a set, F is a σ-field over Ω, and P is a probability function on F that obeys the standard (Kolmogorov) axioms of the probability calculus. Physical probabilities apply to particular events, ones that occur or fail to occur at a particular time and place, and hence have values defined relative to a time of evaluation.
To make explicit the temporal index, t, involved in evaluating the probability of event Y∈F at time t, I use the notation PY(t). In general, if an event Y occurs at a time ty, PY(t) is strictly between zero and one prior to ty, and one at time ty and all later times. Following Eells ([1991]) I use the term ‘probability trajectory’ to refer to a probability function understood in this way as a function of time.² Again, while there are challenges to any interpretation of probability, in this causal setting, an objective physical understanding akin to ‘chance’ is a reasonable way to proceed; see (Eells [1991], pp. 34–55; Eells [2010]) for detailed discussions. I (loosely) follow Jenann Ismael ([2011], pp. 419–20) with my pre-theoretic understanding of physical probability or chance, taking it to be ‘the link between the fundamental level of physical description in quantum mechanics and the measurement results that mark the points of empirical contact between theory and world’. I follow her in that my understanding is that physical probability is objective and non-trivial (not everywhere zero or one). I remain agnostic, however, with respect to her ultimate analysis of it, especially whether its grounding is at the quantum level or some higher level as in (Glynn [2010]; Fenton-Glynn [forthcoming]; Sober [2010]).

2.1 Probability trajectories

As an illustration of a probability trajectory in action, consider the following example from (Rosen [1978]), modified here from (Eells [1991]):

Example 1: A poorly putted golf ball is rolling roughly in the direction of the cup when a squirrel runs by and bumps it in such a way that its resulting trajectory is directly toward the cup and it continues right into the cup.

Following the standard assumptions of such causal discussions, I take the probability values of PY(t) to reflect the objective probability of the event (ball going in the hole) and assume that it is strictly less than one until Y occurs.
Suppose that the probability of the ball going into the cup, given its initial trajectory, velocity, and so on, is 0.25. Suppose further that, in general, the (type) probability of balls going in when squirrels bump them is very low (say 0.05); however, in this (token) case, the particular trajectory of the ball immediately following the bump was such that the probability of the ball falling in the cup was rather high, say 0.8. Let the event of the squirrel bumping the ball be x being X, and the event of the ball going into the cup be y being Y. The probability trajectory of Y can be depicted following Eells ([1991], p. 293), as in Figure 1.

Figure 1. Probability trajectory with discontinuous jump at occurring event.

The standard analysis of this example is that the squirrel’s kick, X, caused the ball to drop into the cup, Y, despite the fact that, in general, squirrel kicks in such situations almost never result in the ball going into the hole. For causal considerations, the salient features of the graph are that the probability of Y takes an immediate point drop in probability at tx, corresponding to the type-level fact that X-type events generally decrease the probability of Y-type events, and that the probability of Y recovers immediately after the ball is bumped at tx to a higher value than it had before, because of the favourable trajectory and velocity actually imparted by token event X. Hopefully, this causal story is plausible enough, though its causal details are not the primary concern here. For present purposes, the crucial features of the graph are the discontinuities at tx and ty, that is, the fact that the probability of Y ‘jumps’ up just after x happens and then ‘jumps’ again to one at the moment the ball falls into the cup.
As can be seen from the graph, Eells employs jump discontinuities in the probability trajectory to (a) indicate that a token (positive) ‘cause’ has taken place, and (b) emphasize that the world is chancy or indeterministic at the macro-level in that the probability of the event in question is bound away from one until it happens.³ Use (a) is central to his account (as I detail below); I will call this general idea the causal discontinuous jump principle (CDJP):

CDJP: The probability trajectory of an event, e, that occurs at te jumps discontinuously at times when events causally relevant to e occur.

The status of the second use of discontinuity, (b), is less clear in Eells’s work. This ‘occurring event discontinuity’ assumption is made and discussed explicitly (Eells [1991], p. 294). I will refer to this assumption as the discontinuous jump principle (DJP):

DJP: The probability trajectory of an event, e, that occurs at te jumps discontinuously to one at time te.

Notice though that the assumption that the probability of an event is not one until the event occurs is also consistent with the graph continuously approaching one from below. Eells recognizes that the indeterminism could also be represented in a continuous fashion, with the probability continuously approaching one from below. But he writes that his analysis does not ‘pay attention’ to whether the trajectory is continuous at the time that the event occurs (Eells [1991], p. 294, Footnote 6), and thus is not explicitly committed to DJP. He does, however, consistently draw all his graphs with DJP discontinuities. Consider the alternative graph depicted in Figure 2, in which the probability trajectory continuously approaches one at ty. It is equally true in this graph that the probability of Y is strictly less than one until it actually occurs at ty.
The difference between this graph and the graph in Figure 1 is that in Figure 1, the value of PY(t) is bound away from one prior to ty, while in Figure 2, the value of PY(t) becomes arbitrarily close to, but is always less than, one as t approaches ty. I argue below that Eells was mistaken about nothing turning on DJP and that thinkers concerned with probability should indeed ‘pay attention’ to this continuity issue. But first I sketch his causal account.

Figure 2. Probability trajectory with continuous probability at occurring event.

2.2 Causation as discontinuous jumps

In his treatment of this example, Eells depicts the probability trajectory of PY(t) as in the graph in Figure 1. The crucial feature of this graph for the causal question is that the probability of the ball falling in the cup is higher immediately after the ball is bumped at tx than it was prior to the bump, despite the fact that, in general, such types of events lower the probability of the ball going in, as indicated in the graph by the immediate point drop in probability at tx.⁴ According to Eells, this structural property of the example is what leads us to say that this token squirrel bump caused the ball to go in, despite the fact that, in general, such bumps tend to prevent balls going in rather than cause them. When the probability trajectory of an event, y, has this structure, Eells defines y to have occurred ‘because’ x occurred. Explicitly, what is required for event y’s being Y ‘because’ event x was X are the following three conditions: (i) the probability of Y changes at the time of x, (ii) immediately after the time of x, the probability of Y is both high and higher than before x, and (iii) this probability remains high until the time of y.
Eells describes three additional causal relations: an event’s occurring ‘despite’ another event, an event’s being ‘independent’ of another event, and an event’s being ‘autonomous’ of another event. Though the details of these three relations will not be of particular concern here, the basic idea is that in the ‘despite’ case, the probability decreases (and remains low); in the ‘independent’ case, it remains the same; and in the ‘autonomous’ case, the probability increases to a high level, but then drops to a low level. Eells ([1991], p. 355) defines each of these relations in terms of the left and right limits of PY(t) at tx. He also specifies the qualifications needed to preserve the ‘causes increase the probability of their effects’ idea, in particular that one must hold fixed the set, K, of the actual, separate, independent causes of Y and also any interactive factors by which X influences the probability of Y (Eells [1991], Section 6.4). In order to build in the temporal evolution of the probability in the proper way, Eells further specifies that Wt be the conjunction of all factors of the world (relevant to y’s being Y) that have fallen into place by time t and whose exemplification (relative to K) can be traced back to the exemplification of X at tx. He then defines PY(t)=P(Y|K & Wt) for all times t. With this understanding of the probability trajectory, the degree to which y is Y because of, despite, independent of, and autonomous of x’s being X is defined as follows:

\[
B = M - P^{-}, \qquad D = P^{-} - P^{+}, \qquad I = 1 - |P^{-} - P^{+}|, \qquad A = P^{+} - M,
\]

where P− and P+ are the left- and right-hand limits of PY(t) at tx, respectively, and M=min {PY(t):t in (tx,ty)}, which can be thought of, intuitively, as the lowest point in the probability trajectory between tx and ty.
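The four degrees just defined can be illustrated with a small numerical sketch. The code below is only an illustration under stated assumptions, not Eells’s own procedure: the times `T_X` and `T_Y`, the value 0.05 assigned at the point tx itself, the helper names, and the crude approximation of one-sided limits by evaluating slightly to either side of tx are all hypothetical choices. The values 0.25 and 0.8 are those of Example 1.

```python
import math

# Hypothetical times for the bump (t_x) and the ball dropping in (t_y).
T_X, T_Y = 1.0, 2.0


def p_y(t: float) -> float:
    """Toy Figure 1 style trajectory for Example 1.

    0.25 and 0.8 come from the example; the point value 0.05 at t_x
    is an assumption standing in for the 'point drop' in the graph.
    """
    if t < T_X:
        return 0.25
    if t == T_X:
        return 0.05
    if t < T_Y:
        return 0.8
    return 1.0


def one_sided_limit(f, t: float, side: str, h: float = 1e-9) -> float:
    """Crude stand-in for a one-sided limit: evaluate f just beside t."""
    return f(t - h) if side == "left" else f(t + h)


def eells_degrees(f, t_x: float, t_y: float, samples: int = 1000) -> dict:
    """Approximate B, D, I and A from a trajectory f."""
    p_minus = one_sided_limit(f, t_x, "left")
    p_plus = one_sided_limit(f, t_x, "right")
    # M: lowest sampled value strictly inside the open interval (t_x, t_y).
    step = (t_y - t_x) / (samples + 1)
    m = min(f(t_x + k * step) for k in range(1, samples + 1))
    return {
        "because": m - p_minus,                    # B = M - P^-
        "despite": p_minus - p_plus,               # D = P^- - P^+
        "independent": 1 - abs(p_minus - p_plus),  # I = 1 - |P^- - P^+|
        "autonomous": p_plus - m,                  # A = P^+ - M
    }


if __name__ == "__main__":
    for name, value in eells_degrees(p_y, T_X, T_Y).items():
        print(f"{name}: {value:.2f}")
```

On this toy trajectory B is roughly 0.55 and D roughly −0.55, matching the verdict that Y occurred ‘because’ of X. Note that the point value at tx plays no role in any of the four degrees, since they depend only on the one-sided limits and on M.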
It should be clear that if X is to play a negative (‘despite’) causal role (D > 0), then P−−P+>0, from which it follows that P+≠P− or, in other words, that the limits of PY(t) from the left and the right at tx are not equal, which entails a jump discontinuity at tx. The same follows for the ‘because’ relationship as well.⁵ Thus if X is a positive or negative token cause of Y, then PY(t) has a jump discontinuity at the time of X. That is, Eells’s account requires an event’s probability trajectory to have discontinuous jumps at times at which causally relevant events occur.⁶

Let me stress again that Eells’s official position is that it is inconsequential as to whether occurring events jump discontinuously to one (DJP), though he favoured and utilized the discontinuous version. But as to the question of whether causes require a jump discontinuity in the probability trajectory of their effects (CDJP), Eells’s answer is an unequivocal ‘yes’, since his entire token-level account depends on such discontinuities. It is surprising that this aspect of his influential account has received virtually no discussion.

2.3 Against systematic discontinuity

In this section, I make a detailed case for why systematic discontinuity requirements like DJP or CDJP are problematic. I first direct my case against DJP. The reasons for beginning with DJP are: (1) DJP is more general and is thus of interest in its own right, outside the setting of probabilistic causality (for example, event ontology, chance, and so on), and (2) the argument is more straightforward and perspicuous in the case of DJP and requires only minor adjustment to apply to CDJP as well. In what follows, I present the main thread of a formal argument against DJP (the details of which can be found in the appendix) and then show how it extends to CDJP. Reconsider Example 1, especially the period from after the time the squirrel bumps the ball to the time it enters the cup.
The instant the ball comes off the bump, it has a certain trajectory and speed, one that will take it directly into the cup; this is why the probability of Y is high after that instant. As time gets closer to ty and the ball gets closer to the cup, the number of eventualities that could prevent the fall into the cup decreases, and so its probability continues to increase. In other words, as the ball passes by points on the green closer and closer to the cup, and with the same favourable trajectory and speed, the probability of its going in the cup would naturally be expected to continue to get closer and closer to one. While these considerations alone favour a continuous increase of PY(t) to one, a stronger case can be made. If the probability trajectories of all occurring events jump discontinuously at the instant they occur (DJP), then the probability trajectory for each of the occurring events ‘leading up’ (causally) to the event under consideration would also have a jump discontinuity at the time of their occurrence. The probability (trajectory) of the original event is not independent of the probability (trajectories) of certain events leading up to it, that is, its probability depends on certain events that need to ‘fall into place’ in order for it to happen—and this gives rise to problems. Returning to Example 1: between the time of the cause, tx, and the time of the effect, ty, the graphs of both versions (Figures 1 and 2) depict the probability trajectory as continuous in the interval just to the left of ty. This, however, does not accord with the ‘jumpy’ nature of the probabilistically relevant prior events falling into place. 
If all the events involved in the ball traversing the points on the green after being bumped and before entering the cup have probability trajectories that have a jump discontinuity at the time they occur, then it seems that the probability trajectory of Y (ball falling into the cup), which depends upon these events falling into place, should reflect this discontinuous ‘jumping’ at the times these prior events occur. Such considerations suggest that the discontinuous jump (as mandated by DJP) in the probability trajectory of an occurring event is inconsistent with the probability trajectory PY(t) being continuous in the interval just before ty, as it must be in order to have a jump discontinuity. If this is right, then assuming DJP in such settings is inconsistent, since PY(t) is required (as depicted) to be continuous in at least some small interval to the left of ty. I now put this objection on a formal footing to show more precisely the source of the problem. The form the argument will take is that of an inconsistent/incoherent dilemma, namely, that DJP in this setting entails either that the probability trajectory, PY(t), is discontinuous from the left at ty (has no left-hand limit), which is inconsistent with there being a jump discontinuity at ty, or the certainty (distance the probability is from one) of antecedent events upon which Y depends becomes arbitrarily larger than the certainty of Y itself, which will be shown to be an incoherent result. For definiteness, the setting will parallel Example 1 and concern the assessment of the causal relevance of a token event x being X for event y being Y, where these events occur at tx and ty, respectively. I will show that DJP entails the unintended (and unexpected) consequence that PY(t) has no limit from the left (is left discontinuous) at ty. 
To get the argument off the ground, I make use of a well-known theorem from probability, Bayes's theorem, which states that

$$P_{Y|X}(t)=\frac{P_{X|Y}(t)\,P_Y(t)}{P_X(t)},\quad P_X(t)>0,$$

where PY|X(t) is the probability of Y conditional on X at time t, and similarly for PX|Y(t). A simple variation of this that will be useful here is:

$$P_Y(t)=\frac{P_{Y|X}(t)\,P_X(t)}{P_{X|Y}(t)},\quad P_{X|Y}(t)>0. \tag{1}$$

The role of Equation (1) will be to instantiate in a formal way the intuitive idea that the probability (trajectory) of the event under consideration depends somehow on the probability (trajectories) of the events that fall into place leading up to it.7 Consider a sequence of moments, {t̂i}, converging to ty (the moment Y occurs) and a sequence of events, {Xi}, occurring at these times and upon which Y probabilistically depends. In the context of Example 1, these events and moments will be where the ball was at half of a second before it went in, and a fourth of a second, an eighth of a second, and so on. More formally, we might put this as $\hat{t}_i=t_y-\frac{1}{2^i}$ and Xi = the event of the ball being where it was at t̂i with the particular favourable trajectory it had.8 In order to formalize the degree to which probability trajectories jump discontinuously to one, I use the left-hand limit of the probability of event X as time approaches the time X occurs (from before) and define the ‘distance’ of the jump to one as

$$D(X,t)=\lim_{s\to t^-}\bigl(1-P_X(s)\bigr). \tag{2}$$

The value of D(X,tx), where tx is the time X occurs, is the distance the probability trajectory jumps to reach one when X occurs. That D(X,tx) is greater than zero for all events is equivalent to the assumption of DJP. Returning to sequence {Xi}, we see that {D(Xi,t̂i)} might be thought of as the ‘degree of indeterminism’ or ‘chanciness’ for each of the Xi events.
In terms of Example 1, it would be values representing how far the probability of the ball being where it is at one-half of a second before it goes in, a fourth of a second, an eighth of a second and so on, jumps as each of those events occurs. The {Xi} sequence of events leading up to Y will be used to show that DJP is problematic. At this point, the argument bifurcates based on whether the sequence {D(Xi,t̂i)} converges to zero, that is, whether

$$\lim_{i\to\infty}D(X_i,\hat{t}_i)=0. \tag{3}$$

I will show that if it does not converge to zero, then we have the inconsistency horn of the dilemma; and if it does converge to zero, we have the incoherency horn.

2.3.1 Inconsistency

The general strategy for this horn is to show how, using Bayes’s theorem, if the ‘event occurring jumps’ of the events getting close in time to Y do not converge to zero, then this forces probability trajectory PY(t) to be discontinuous to the left of ty (not have a limit as time approaches ty from the left), which contradicts the requirement that the left-hand limit of PY(t) at ty exists. I sketch the proof below; the full details can be found in the appendix. Let 0 < L < 1 be the left-hand limit of PY(t) at the time Y occurs, ty. To show that PY(t) is discontinuous from the left at ty, it is sufficient to find a sequence of times, {ti}, converging to ty, such that PY(t) evaluated at those points does not converge to L, that is, $\{P_Y(t_i)\}\not\to L$. In terms of the definition of convergence, this means showing that there is an ε>0 such that

$$|P_Y(t_i)-L|\ge\varepsilon,\quad\text{for some } i \text{ greater than any } N>0. \tag{4}$$

The next task is to actually construct the problematic sequence {ti} and show that it satisfies Equation (4). Since, by hypothesis, {D(Xi,t̂i)} does not converge to zero, there is an ε̂>0 such that

$$D(X_i,\hat{t}_i)\ge\hat{\varepsilon},\quad\text{for some } i \text{ greater than any } N>0. \tag{5}$$

It will simplify the notation to define {Li} to be the sequence of left-hand limits for each of {Xi}. So Li=1−D(Xi,t̂i), from which it follows that |1−Li|≥ε̂ because of Equation (5).
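The two cases distinguished by Equation (3) can be made concrete with a small numerical sketch. Everything specific here is an assumption made for illustration: the time of Y is set to 1, each Xi is given a toy trajectory that is constant before t̂i and one afterwards, and the helper names (`t_hat`, `make_trajectory`, `jump_distance`, `jump_sequence`) are hypothetical. The sketch only shows what it means for the sequence of jump distances to converge, or fail to converge, to zero; it is not the proof itself.

```python
# Toy illustration only: T_Y, the constant pre-event values, and all helper
# names are assumptions made for this sketch, not part of the original argument.
T_Y = 1.0  # assumed time at which Y occurs


def t_hat(i):
    """Moment at which X_i occurs: t_y - 1/2^i."""
    return T_Y - 0.5 ** i


def make_trajectory(occurs_at, pre_value):
    """Trajectory that is constant at pre_value before the event and one afterwards."""
    def p(s):
        return 1.0 if s >= occurs_at else pre_value
    return p


def jump_distance(p, t, gap=1e-9):
    """Approximate D(X, t) from Equation (2) by evaluating 1 - P_X(s) just before t."""
    return 1.0 - p(t - gap)


def jump_sequence(pre_value_for, n):
    """Jump distances D(X_i, t_hat(i)) for i = 1..n."""
    return [
        jump_distance(make_trajectory(t_hat(i), pre_value_for(i)), t_hat(i))
        for i in range(1, n + 1)
    ]


# Case 1: every jump is about 0.2, so the sequence does not converge to zero.
non_vanishing = jump_sequence(lambda i: 0.8, 10)
# Case 2: the jumps are 1/2^i, so the sequence converges to zero.
vanishing = jump_sequence(lambda i: 1.0 - 0.5 ** i, 10)
```

In the first case the jump distances stay bounded away from zero, which is the hypothesis of the inconsistency horn; in the second they shrink towards zero, which is the hypothesis of the incoherency horn.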
To construct the sequence of times, {ti}, that will generate the contradiction, we must find a sequence of moments slightly before each of the {t̂i}, since we will be interested in what happens to the probabilities of {Xi} right before the time they occur (recall that PXi(t̂i) is simply equal to one). A natural choice would be to pick moments such as the following: $\hat{t}_i-\frac{1}{10^i}$. Moments like these always fall just before the moment each Xi occurs, and are such that as i→∞, they get arbitrarily close to those moments, t̂i. But we need the new sequence to be more ‘tightly’ tied to PXi(t). The factor required will depend on each of the PXi(t) and how quickly each approaches its limit near t̂i. The limit in question is Li, so by the definition of limit, we can find a minimal distance, δ > 0, such that the distance between PXi(t) and Li can be made less than an arbitrary ε>0 for all t within the minimal distance δ of t̂i. The ‘ε’ we use here comes from the assumption that $D(X_i,\hat{t}_i)\not\to 0$, namely, ε̂/2, which is half of the ε̂>0 we have from Equation (5). Now consider the sequence of {δi}, each greater than zero, with the property that if |t̂i−t|<δi, then |Li−PXi(t)|<ε̂/2. Again, that the δi > 0 exist follows from the definition of the (left) limit of PXi(t) at t̂i being Li. Finally, we define {ti} as $t_i=\hat{t}_i-\frac{\delta_i}{10^i}$. Thus the sequence of {ti} is such that as i→∞, the tis get arbitrarily close to the t̂is (the times that the Xis occur), and further, each ti is within δi of t̂i, so |Li−PXi(ti)|<ε̂/2, for each ti. From this it follows that

$$1-P_{X_i}(t_i)\ge\varepsilon,\quad\text{for some } i \text{ greater than any } N>0, \tag{6}$$

where ε=ε̂/2. (See the appendix, near Equation (A.3) for details.) In making use of Bayes’s theorem, the conditional probabilities $P_{Y|X_i}(t_i)$ and $P_{X_i|Y}(t_i)$ will be needed. It will be assumed that $\lim_{i\to\infty}P_{Y|X_i}(t_i)=L$ and $\lim_{i\to\infty}P_{X_i|Y}(t_i)=1$.
The reasoning for this is as follows: given that the limit of PY(ti) is L, conditionalizing on the Xis, which are the particular events leading up to Y, should not affect convergence; similarly since (by definition) {Xi} are events leading up to Y at times, {ti}, converging to the time they occur, t̂i, the probability of these events at these times conditionalized on Y will naturally converge to one as time converges to the time of Y.9 The convergence of the conditional sequence $P_{Y|X_i}(t_i)$ to L entails that for any ε1>0, there is an N1>0 such that for all i>N1,

$$|P_{Y|X_i}(t_i)-L|<\varepsilon_1. \tag{7}$$

Similarly, the convergence of the conditional sequence $P_{X_i|Y}(t_i)$ to one entails that for any ε2>0, there is an N2 > 0 such that for all i > N2,

$$|1-P_{X_i|Y}(t_i)|<\varepsilon_2. \tag{8}$$

From this, appropriate values can be chosen for ε1 and ε2 in terms of the ε from Equation (6), and then Equation (1), namely Bayes's theorem, can be utilized to show that

$$|P_Y(t_i)-L|\ge\varepsilon'>0, \tag{9}$$

for some i greater than any N>max{N1,N2}>0, where $\varepsilon'=\frac{\varepsilon L(\varepsilon+4)}{4(\varepsilon+2)}$. Thus PY(t) is discontinuous from the left at ty. (See Equations (A.7) through (A.11) in the appendix for details.)

2.3.2 Incoherence

Having shown that if the sequence of ‘indeterministic jumps’, {D(Xi,t̂i)}, does not converge to zero, then we have the contradictory result that PY(t) is discontinuous from the left at ty, consider now the case in which {D(Xi,t̂i)}→0. Again, this means that the difference between the left-hand limit of PXi(t) at t̂i and one can be made arbitrarily small for large enough i. That is, we can find events, Xi, probabilistically relevant to Y and arbitrarily close to the time of Y by picking an i large enough such that the probability of Xi just before t̂i is arbitrarily close to one, and this despite the fact that the probability of Y at t̂i is bounded away from one.
For example, this means that the probability of the ball being at a point arbitrarily close to falling into the cup at an arbitrarily small instant before (say ε) it actually does is a million, or a billion, or a trillion, and so on, times closer to one than the probability of the ball’s falling into the cup the same arbitrarily small instant, ε, before it actually does. More concretely, pick any location and time that the ball was very close to falling in, say within $1/10^{256}$ of a second and $1/10^{512}$ of an inch from the edge. At an even smaller instant before the ball was at this location, its probability of being there was any huge number you like, say $10^{1024}$, times closer to one than the ball’s probability was when it was a trillion (or any huge number you like) times closer to falling in. In short, this result says that while the probabilistically relevant antecedent events of the ball being closer and closer to the hole with the favourable trajectory it had are such that their probabilities right before they happen are getting as close to one as you like, the probability of the ball falling in the hole, as close to the time it did as you like, is as many times farther away from one as you want to make it. That an event’s probability right before it happens is arbitrarily farther away from one than is each of an infinitesimally close series of (probabilistically) relevant events leading up to it is unintelligible. This I offer as the incoherency horn of the dilemma.10

2.3.3 Extending to CDJP

Having made the case that DJP is untenable, it is relatively straightforward to extend this to CDJP.
The same construction technique from Section 2.3.1 (just before Equation (6)) can be employed to generate a similarly problematic sequence of events, {Xi} and {ti}, where {ti} converges to tx (the moment x occurs) instead of ty, and in this case Xi = the event of the ball and the squirrel being where they were and in the particular states (velocity, direction, internal states, and so on) at ti, which is converging to tx. In the case of DJP, we exploited the fact that PY(t) has to ‘jump’ because of its dependence (via Bayes’s theorem) on the PXi(t), which jump as Xi occurs at ti; but in the case of CDJP, the problem is more immediate. It seems incontrovertible that the {Xi} events are positive (‘because’) token causes, especially for large i when ti gets very (arbitrarily) close to tx.11 But according to CDJP, this would require that PY(t) must at the very least have jump discontinuities at each of {ti}, corresponding to each of the {Xi} being a ‘because’ token causal factor, and this raises numerous problems. First, Eells explicitly states that it is a requirement that PY(t) be constant in some small interval to the left (and right) of tx, but this cannot be the case if each of the {Xi} is a ‘because’ token causal factor. Second, things would seem to get even worse in that there are an uncountable number of possible such ‘because’ token causal events just prior to tx, which might well be in tension with core parts of real analysis.12 But there is a potential way out, which is related to how Eells deals with suggestions that his account cannot distinguish between simultaneous events as far as causal relevance goes.
Recalling that the causal background context, K, and temporal indexical set, Wt, are built into the definition of the probability trajectory, that is, PY(t)=P(Y|K & Wt), it might be open to him to deny that the probability trajectories are the same trajectory (function) for each of the {Xi} events and the X event.13 In other words, when considering X’s causal significance for Y, we have PY(t)=P(Y|K & Wt), but for each Xi’s probability trajectory for Y, we have PYi(t)=P(Y|Ki &(Wi)t). Thus the PYi(t) could be such that they were (point-wise) converging to PY(t) (from below) and still jumping (as required) at the critical time, ti. In particular, the PYi(t) functions would presumably be increasing such that PYi(t)≤PYi+1(t)≤…≤PY(t), for t<tx (Figure 3). A virtue of this response would be that the increasing nature of the probability (trajectories) near the critical point, ti, would fit neatly with the idea that as each of the events in question (namely, the ball and squirrel being where they were, Xi, at instants closer and closer to when they collide, ti, in the favourable way they did with respect to causing the ball to fall in the hole, Y) actually occurs, the probability of Y increases.

Figure 3. Probability trajectories PYi(t) for Y with respect to Xi converging to PY(t) from below.

Unfortunately, however, this response is not available on Eells’s account, as it stands. The first, not insurmountable, problem is that as mentioned above, Eells requires that probability trajectories be constant in some open interval to the left (and right) of the jump discontinuities.
Eells was (rightly) concerned about the limits upon which his account so crucially turns, and in personal correspondence, he indicated that the reason for requiring the trajectories to be constant in this way was indeed in part to ensure that the limits exist, which of course it would, but at the steep cost of coherency. To simply require that the left- and right-hand limits exist, and not necessarily be constant in any open intervals to the left (or right), would seem to be a less restrictive alternative, and it would allow the ‘series of increasing trajectories response’ I just sketched. But there is a second problem—and that is that the (extensional) way that Eells has defined the causal background context, K, and temporal indexical set, Wt, does not allow him to assert that the trajectories associated with the Xi’s causal relevance for Y are distinct from X’s trajectory, PY(t). This is because K is defined to be the set of factors causally independent of X or Xi that are causally relevant to y’s being Y in the actual situation, and thus are identical for X and Xi. And as for the Wt, the (actual) factors relevant to y’s being Y that are not in K but have fallen into place by time t, these too will coincide in the cases of the Xi and X.14 It seems then that utilizing systematic discontinuities for understanding occurring events (DJP) or causation (CDJP) is untenable, but it turns out that even the threat of discontinuities in probability trajectories gives rise to problems for causal theorists, as I take up next.15 3 Broader Discontinuity Concerns The attention Eells paid to continuity was well placed: how the probabilities evolve through time is both relevant and non-trivial. Such continuity issues ought to be of more concern to all causal theorists exploring probabilistic accounts. Most other prominent accounts have the opposite concern from Eells: they depend—at least implicitly—on the continuity of probability trajectories. 
Accounts most obviously embroiled with continuity are those that make explicit use of temporal conditions, for example, Peter Menzies's ([1989]), which utilizes ‘temporally dense’ chains of (counterfactual) probability increases, and Igal Kvart ([2004]), which looks for ‘stable screeners’ and ‘causal relevance neutralizers’ in temporally intermediate events between cause and effect. If the temporal evolution of the probability values in question cannot be assumed to be continuous, this strains such accounts by rendering the probability in the interval potentially unstable in that it may jump between values that may or may not preserve the presence of the relevant probability increases or the absence of stable screeners (probability decreasers). Menzies’s account does not explicitly address continuity, but it does, however, implicitly constrain discontinuities. Building on David Lewis’s ([1986]) counterfactual analysis in terms of unconditional probabilities, Menzies requires that causally related events c and e be ‘probabilistically dependent’. He understands this to mean that there must be intermediate events corresponding to any finite set of intervening times between the times of c and e such that the actual probability of each of the intervening events is significantly higher than it would have been had the immediately preceding event in the set not happened. Put in terms of Eells’s probability trajectories, this effectively requires the probability function to be monotonically increasing between the times of c and e, and thereby limits the number and kind of possible discontinuities (recall Footnote 12).16 Kvart seems to be sensitive to the possibility of the inequality flipping at intermediate temporal points between c and e, and offers a condition that may be intended to prevent it. The condition is developed in an example in a section entitled ‘Illustrating Causal Relevance through Infinite Regress’ (Kvart [2004], pp. 369–70), but is difficult to assess. 
The example involves considering a series of events at intervening moments between putative cause and effect, and verifying that the inequality does not reverse at any of the points—that they are not ‘neutralizers’. Causal relevance is defeated if one of the moments reverses the inequality. But if one can continue indefinitely without a reversal, one thereby constructs an infinite series of moments that are not neutralizers. His full condition is that ‘there is only infinite regress of this sort (i.e. there not being a suitable terminating chain)’, and this establishes that c is causally relevant to e, since it guarantees that ‘there is no neutralizer for c and e’ (Kvart [2004], p. 370). The logical form of the condition seems to be that all such sequences of intervening events converging to an intervening time must not terminate with (contain) a neutralizer. If so, then it might be understood as employing (part of) an alternative specification of the standard epsilon-delta definition of continuity—constraining all convergent sequences in the domain—but in such a way that it entails not full continuity, but rather something close to Menzies’s monotonically increasing condition. While other prominent probabilistic accounts of causation may eschew explicit temporal conditions, not surprisingly, they are not able to avoid temporal (and hence continuity) issues altogether. For example, in (Noordhof [1999]; Hitchcock [2004]; Northcott [2010]; Glynn [2011]), one finds reference to probability inequalities assessed ‘shortly before’ the time of the cause and/or effect.17 With some variation, these accounts all consider the probability of event e just before it occurs, conditional on the presence and absence of putative cause c. The critical inequalities involve conditional probabilities at moments just before the time of the cause, tc – ε, or ‘shortly before’ the time of effect, te – ε, and perhaps at times in between. 
This comparison is typically assumed to be stable, that is, that one can ignore the precise ε>0 magnitude expressed by ‘shortly before’, safely assuming that if ε is sufficiently small, the values of the probabilities will retain the property of interest, an inequality in this case. The inequality must be assumed to hold for all values closer than ε, because otherwise its holding would be completely arbitrary: it could be made to hold or not depending on the particular ε one chose, which would render the inequality meaningless for the purpose at hand. There is a very important difference between it holding for some ε>0 and it holding for some ε>0 and all smaller values. This kind of stability can be generally assumed only if the probability trajectories are continuous with respect to time to the left of tc and/or te. I focus below on time te, but the same reasoning applies to time tc or any other time between them. The relevant probability inequality is:

$$P_{t_e-\varepsilon}(e\mid c)>P_{t_e-\varepsilon}(e\mid{\sim}c). \tag{10}$$

The critical probabilities are shortly before the time of putative effect e because at the precise time of e, the (conditional) probabilities are trivial. If the probability trajectories involved are not continuous to the left of te, then the mere fact that the inequality holds at a given time shortly before te fails to ensure that it will hold (to the left) in any interval about te. If the inequality could be reversing in the neighbourhood of (te−ε,te), then it holding at te – ε is not going to be decisive for the causal efficacy of c, since such accounts clearly require a non-arbitrary sense of Equation (10) for their warrant and plausibility. Luke Glynn’s ([2011]) admirably complete account shows that even when utilizing variables instead of events for the relevant probability assessments, there remains a dependence on time, and hence continuity.
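The difference between Equation (10) holding for some ε and holding for that ε and all smaller values can be illustrated with a toy sketch. The trajectories below are invented for the purpose: the time of the effect is set to 1, the probability conditional on the absence of c is held at 0.4, and the ‘jumpy’ trajectory alternates between 0.6 and 0.2 on dyadic intervals shrinking towards te. All function names are hypothetical, and `stable` only samples a finite number of smaller ε values, so it can refute stability but cannot establish it in general.

```python
import math

# Toy illustration only: T_E, the probability values, and all function names
# are assumptions made for this sketch.
T_E = 1.0  # assumed time of the effect e


def p_given_not_c(t):
    """Assumed constant trajectory for the probability of e given not-c."""
    return 0.4


def p_given_c_continuous(t):
    """Assumed constant (hence continuous) trajectory for e given c."""
    return 0.6


def p_given_c_jumpy(t):
    """Discontinuous trajectory for e given c, defined only for t < T_E.

    Alternates between 0.6 and 0.2 on dyadic intervals shrinking towards T_E.
    """
    k = math.floor(-math.log2(T_E - t))
    return 0.6 if k % 2 == 0 else 0.2


def holds(p_c, eps):
    """Does the inequality of Equation (10) hold at time T_E - eps?"""
    t = T_E - eps
    return p_c(t) > p_given_not_c(t)


def stable(p_c, eps, samples=50):
    """Check the inequality at eps and at a finite sample of smaller values."""
    return all(holds(p_c, eps * 0.9 ** n) for n in range(samples))
```

With these assumed trajectories, `holds(p_given_c_jumpy, 0.75)` is true while `holds(p_given_c_jumpy, 0.375)` is false, so whether the inequality ‘holds shortly before te’ depends entirely on which ε is chosen; the continuous trajectory passes the sampled stability check at every ε tried.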
Glynn ([unpublished]) originally employed a just before ε-inequality; however, the version published as (Glynn [2011]) eliminates such explicit reference to time, expressing causal conditions instead in terms of the conditional probabilities of variables attaining a value. Nonetheless, a temporal index plays a role in the definition of Glynn’s ‘revealer of positive relevance’ set, which is to ‘include only variables representing events occurring no later than tE’ (Glynn [2011], p. 358). Glynn also employs a condition reminiscent of Menzies and Kvart in requiring that there be the right combinations of ‘increasers’ (supporters of the inequality) and ‘decreasers’ (under-cutters of the inequality) in interval (tc, te). While this might be thought to stabilize the inequality in the same way as continuity, the notion of the ‘right combination’ is only coherent if the set of potential increaser–decreaser points is a finite set, which is certainly not the case for interval (tc, te). Finally, in his discussion of the ‘hiker ducking boulder’ example, Glynn ([2011], p. 382) proceeds by ‘interpolating a variable’ along the route of the boulder by which time it is too late for the hiker to duck. So, even when utilizing ‘variables attaining values’ instead of ‘events’, temporal indices and their attendant continuity issues still loom. Continuity assumptions may also come into play in related discussions, for example, generalized causal relevance (Hitchcock [1993]), deterministic chance (Glynn [2010]), or rational belief revision (van Fraassen [1984]). In the first, Hitchcock’s generalized account of causal relevance utilizes probability spaces and measures to define conditional probability functions on variables (for example, amount of medicine, and blood pressure). He analyses (certain) causal claims as captured by conditions like: ‘“there exists an m such that x > m implies that f(x)>1−ε”, where the value of ε is typically left vague’ (Hitchcock [1993], p. 350). 
Glynn uses the inequality $Ch_{(t_c-\varepsilon)w}(p_e\mid p_c)>Ch_{(t_c-\varepsilon)w}(p_e\mid{\sim}p_c)$, translating it as, ‘just before c occurred, the chance of e conditional upon the occurrence of c was greater than the chance of e conditional upon the non-occurrence of c’ (Glynn [2010], p. 74). Finally, van Fraassen employs credence functions indexed by times towards the future, where $P_t$ is the agent’s credence function at time t, and $P_{t+x}$ is her function at a later time, t + x (van Fraassen [1984], p. 244). To summarize, if probability trajectories cannot be assumed to be continuous, then probabilistic accounts of causation are undermined because the probabilities around the times of interest are rendered potentially ‘unstable’, in the sense of jumping between values that may or may not preserve the relevant features of the probabilistic analysis: typically, probability increases in the presence of the putative cause expressed in the form of an inequality. A glaring question at this point might be: why not simply require that probability trajectories be continuous? There are two reasons to worry about this move. The first and immediate reason is that it seems that some kinds of events must be understood as having discontinuous probability trajectories, for example, quantum events. I take this up at length in the next section. A second, perhaps less obvious, reason is that such a continuity assumption would ‘definitionally’ legislate a priori against a particular qualitative feature of probability trajectories (a discontinuity) that may well turn out to be relevant to causation and other empirical and metaphysical questions. Whatever the remedy, it ought not be so restrictive as to decide such substantive empirical or philosophical questions by definition.

4 The Continuity Bind

So far, if one is ‘keeping score’, the tally would seem to favour continuity. One important and influential analysis of causation (Eells [1991]) has faltered because of a dependence on discontinuity.
Add to this the very real (if neglected) continuity needs of many other probabilistic causal analyses and the balance would seem to tip towards continuity. Enter quantum theory. In what follows, I consider the case for discontinuous probability trajectories based on quantum phenomena, which effectively ties the score and creates a real ‘continuity bind’. I then sketch two possibilities for mediating between the demands for continuity assumptions and the need not to rule out the possibility of discontinuity. Quantum events like the decay of an atom may well have a non-trivial probability of occurring that does not change through time. Accordingly, at the instant they occur, their probability trajectory will jump from a constant value to one as in Figure 4. Quantum events seem to be of a singular kind that does not depend on any ‘ordinary’ causal factors, and hence have a probability trajectory that does not evolve through time until it jumps to one. One might suppose that these discontinuities could be limited to the quantum level, but this is not obviously possible. A straightforward example suggesting otherwise involves nothing more than a Geiger counter that emits a clicking sound (macro-level event) when a micro-level decay event is detected.18 Another response might be to maintain that while at the quantum level such quantum events have discontinuous probability trajectories, macro-level events that involve quantum events will ‘dampen out’ the discontinuity. On such a view, events at the macro-level always have duration; they consist of intervals of time (and space). Thus the discontinuity is avoided at the macro-level because the detection event and the ensuing clicking event have temporal duration during which the probability of the click (detection event) can increase sharply but continuously to one. 
Let such a quantum-level decay event have a probability of r; then the graph in Figure 5 represents the probability trajectory of Y = the click of the counter, with interval [ty,ty+δ] being the duration of the detection event, that is, the time from the decay event through the detection and ensuing click. The probability of Y prior to ty is r−ε, for some ε>0, which incorporates the possibility of a mechanical or other macro-level failure to detect the particle or to bring about the click.

Figure 4. The discontinuous probability trajectory of a quantum-level event.

Figure 5. The continuous probability trajectory of the detection of a quantum event.

At best this response, which understands macro- and quantum-level (physical) probability and events as distinct kinds, with macro-level events having a duration that ‘dampens out’ the discontinuous probability at the quantum level, saves only macro-level continuity, and at the cost of assuming a bifurcated view of physical probability and/or events. And more problematically, it involves making significant assumptions about how empirical theory will ultimately unfold. Retaining at least the possibility of discontinuous probability trajectories at all levels would seem to be the preferable way to proceed. Thus we have the continuity bind. There are pressing needs for both continuity and the possibility of discontinuity. In particular: requiring systematic discontinuities like DJP or CDJP is problematic, presupposing everywhere continuous probability trajectories is too restrictive, and yet probabilistic causal analyses require some of the stability provided by continuity assumptions.
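The contrast between the trajectories of Figures 4 and 5 can be written down as a short sketch. The function names and the linear shape of the rise over [ty, ty+δ] are assumptions of this illustration; the text only requires that the probability of the click increase sharply but continuously to one over that interval.

```python
# Toy illustration only: function names and the linear ramp are assumptions.


def quantum_trajectory(t, t_y, r):
    """Figure 4 style trajectory: constant at r, jumping to one at t_y."""
    return 1.0 if t >= t_y else r


def detection_trajectory(t, t_y, delta, r, eps):
    """Figure 5 style trajectory: r - eps up to t_y, then a continuous rise
    (assumed linear here) to one over the interval [t_y, t_y + delta]."""
    base = r - eps
    if t <= t_y:
        return base
    if t >= t_y + delta:
        return 1.0
    return base + (1.0 - base) * (t - t_y) / delta
```

Evaluating both functions just before and just after ty shows the difference: the first changes by 1 − r across an arbitrarily small interval, while the change in the second shrinks with the width of the interval.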
Next, I sketch two possibilities for mediating between the demands for continuity assumptions and the possibility of (empirically or metaphysically motivated) discontinuity, namely, by (1) utilizing discontinuous first derivatives of everywhere continuous probability trajectories, or (2) abandoning point probability trajectories altogether in favour of imprecise (interval) probability trajectories. 4.1 Retaining continuity with discontinuous first derivatives If it turns out that some events falling into place—in particular, probabilistically significant ones—need to involve some kind of qualitative ‘shift’ in the relevant probability trajectory, then, as we have seen in Eells’s work, this shift can be understood as a jump discontinuity. But there is another way. An alternative is to capture such a shift by a jump in the rate of change of the probability trajectory. Formally, this could be put in terms of the first derivative (with respect to time) of the probability trajectory function, PY′(t), which is interpretable as the rate of change of the original function PY(t). The role played by jump discontinuities in PY(t) in causal analyses could be handled in a parallel way by jump discontinuities in PY′(t), and the jumps in the first derivative would still correspond to, and be grounded in, the idea that ‘causes’ constitute shifts in the probability trajectory—in this case the shift would not be a discontinuous jump in the probability trajectory itself, but rather a discontinuous jump in the rate of change of the probability trajectory.19 The probability trajectory, PY(t), could then be assumed or required to be continuous, thereby avoiding the discontinuity issue altogether. 
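A continuous trajectory whose rate of change jumps can be sketched numerically. The piecewise-linear form, the parameter values, and the names `kinked_trajectory` and `one_sided_derivatives` are assumptions made for illustration; the slopes are chosen so that the values stay within [0, 1] for t between 0 and 2.

```python
# Toy illustration only: the piecewise-linear shape, the default parameters,
# and the function names are assumptions made for this sketch.


def kinked_trajectory(t, t_x=1.0, p0=0.2, slope_before=0.05, slope_after=0.6):
    """Continuous trajectory with value p0 at t_x whose slope changes at t_x.

    Intended for t in [0, 2], where the values stay within [0, 1].
    """
    slope = slope_before if t <= t_x else slope_after
    return p0 + slope * (t - t_x)


def one_sided_derivatives(p, t, h=1e-6):
    """Finite-difference estimates of the left and right derivatives of p at t."""
    left = (p(t) - p(t - h)) / h
    right = (p(t + h) - p(t)) / h
    return left, right


left_rate, right_rate = one_sided_derivatives(kinked_trajectory, 1.0)
```

Here the trajectory itself has no jump at t = 1, but the estimated left and right rates of change differ (roughly 0.05 against 0.6), which is the kind of qualitative shift the derivative-based proposal relies on.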
There would still be significant work to be done developing the precise conditions, since such conditions would be based in both (1) the discontinuities of the derivative of the probability trajectory, and (2) the shape of the now continuous probability trajectory itself, and these may or may not be easy to put together. In particular, for an event to be a causal factor, it would have to result in the appropriate jump in PY′(t), but it would also have to be the case that PY(t) itself remained high, as per (Eells [1991], Chapter 6, especially Section 6.2). Another issue to work out would be precisely how differentiable PY(t) would have to be. It would not be necessary to require PY(t) to be everywhere differentiable; as long as the left and right derivatives existed at the point in question, the derivative itself could fail to exist.20 While this approach may have promise for causal accounts, and especially for Eells’s discontinuity problems, there is little reason to think that it is viable or even coherent for quantum mechanics. Further, as already discussed, a continuous probability trajectory would at best alleviate the need for macro-level discontinuities, and while the idea of entirely distinct kinds of (physical) probability at the quantum and macro-levels is not incoherent, it is not an attractive philosophical commitment, as it legislates too much from a pre-empirical framework.

4.2 Imprecise (interval) probability trajectories

Within the setting of point probabilities, the pull towards and push away from continuity does indeed constitute a bind, but this is not necessarily so in settings in which probabilities are not points, but rather intervals. Imprecise (non-point-valued) probability has been studied for some time in applied and subjective probability settings, for example (Walley [1991]; Kyburg [1999]; Weichselberger [2000]), and has garnered renewed interest of late; see (Augustin et al. [2014]).
And recently imprecise probabilities have been further extended to objective understandings of chance (physical probabilities); see (Fenton-Glynn [forthcoming]; Peressini [2016]). The advantage of interval-valued probability is that the notion of a continuous function opens up when the function in question is an interval-valued function. It turns out that there are multiple ways to generalize the standard point function definition of continuous, and thus one could seek to develop and employ a kind of continuity that is not so restrictive as to decide substantive philosophical questions by definition, stabilizes causally salient probability inequality claims between trajectories, and retains the possibility of jumpiness to capture quantum or other theoretically motivated discontinuity. The area in which to seek more general notions of continuity is generalized set-valued analysis. In set-valued analysis, one finds weaker notions of continuity, for example, upper and lower semi-continuity, that could be employed in a way that minimizes the restrictions on discontinuities, actually allowing some kinds, and yet offering sufficient stability for causal considerations.21 Another approach would be to utilize classes of interval functions generated by distinct generalizations of continuity. For example, Anguelov et al. ([2006]) develop three distinct notions of continuity (S-continuity, D-continuity, and H-continuity) that apply to interval functions. The class of S-continuous interval functions seems especially well suited. And Anguelov ([unpublished]) explores how pairing a lower semi-continuous function, $\underline{f}$, with an upper semi-continuous function, $\overline{f}$, such that $\underline{f}\le\overline{f}$, produces an interval function $F=[\underline{f},\overline{f}]$, which is a completely novel entity from both algebraic and topological points of view.
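The pairing of a lower and an upper endpoint function can be illustrated with a toy interval-valued version of the quantum decay trajectory. The values of `R` and `T_Y` and the function names are assumptions of this sketch, and it is not offered as an instance of any of the specific continuity notions cited above; it only shows how an interval value at the jump point can span what would otherwise be a gap.

```python
# Toy illustration only: R, T_Y, and the function names are assumptions.
R = 0.3    # assumed constant probability before the decay
T_Y = 1.0  # assumed time of the decay


def lower(t):
    """Lower endpoint: takes the low value at the jump point itself."""
    return R if t <= T_Y else 1.0


def upper(t):
    """Upper endpoint: takes the high value at the jump point itself."""
    return R if t < T_Y else 1.0


def interval(t):
    """Interval-valued trajectory F(t) = [lower(t), upper(t)]."""
    return (lower(t), upper(t))
```

Away from the decay time the interval collapses to a single point (`interval(0.5)` is `(0.3, 0.3)` and `interval(1.5)` is `(1.0, 1.0)`), while at the decay time `interval(1.0)` is `(0.3, 1.0)`, covering the whole range across which the point-valued trajectory would jump.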
Such functions can be quite jumpy (discontinuous in the ordinary sense), with the caveat that the upper endpoint function can only jump up and the lower endpoint function can only jump down, and hence such functions do not have the problematic ‘gaps’ that discontinuous point-valued functions can have.

5 Concluding Remarks

I hope to have shown that the question of the continuity of probability trajectories does indeed have important implications for any account of causality that employs a form of the probability raising condition. If the probability functions can be ‘jumping about’ in ways typical of discontinuous functions, then the stability of the relevant probability increase is called into question. Because of this, coupled with pressure from physical (quantum) theory to allow for the possibility of discontinuities, one is faced with a (dis)continuity bind that appears to be difficult to resolve in the standard framework. While a continuous trajectory with discontinuities in its first derivative doing this work is one possibility, exploration of the imprecise probability framework seems the more promising option for a way out of the bind, that is, a way of retaining the possibility of discrete shifts in a qualitative feature of a probability trajectory, while still enjoying some of the stability of continuity.

Appendix

This appendix contains the detailed proof sketched in Section 2.3.1. Let $0 < L < 1$ be the left-hand limit of $P_Y(t)$ at the time $Y$ occurs, $t_y$. To show that $P_Y(t)$ is discontinuous from the left at $t_y$, it is sufficient to find a sequence of times, $\{t_i\}$, converging to $t_y$ such that $P_Y(t)$ evaluated at those points does not converge to $L$, that is, $\{P_Y(t_i)\} \not\to L$. In terms of the definition of convergence, this means showing that there is an $\varepsilon > 0$ such that

\[
|P_Y(t_i) - L| \ge \varepsilon, \quad \text{for some } i \text{ greater than any } N > 0. \tag{A.1}
\]

The next task is to actually construct the problematic sequence, $\{t_i\}$, and show that it satisfies Equation (A.1).
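Since, by hypothesis, $\{D(X_i, \hat{t}_i)\}$ does not converge to zero, there is an $\hat{\varepsilon} > 0$ such that

\[
D(X_i, \hat{t}_i) \ge \hat{\varepsilon}, \quad \text{for some } i \text{ greater than any } N > 0. \tag{A.2}
\]

It will simplify notation to define $\{L_i\}$ to be the sequence of left-hand limits of the $P_{X_i}(t)$ at the $\hat{t}_i$, so $L_i = 1 - D(X_i, \hat{t}_i)$, from which it follows that $|1 - L_i| \ge \hat{\varepsilon}$ because of Equation (A.2).

To construct the sequence of times, $\{t_i\}$, that will generate the contradiction, we must find a sequence of moments slightly before each of the $\{\hat{t}_i\}$, since we will be interested in what is happening to the probabilities of the $X_i$s right before the time they occur (recall that $P_{X_i}(\hat{t}_i)$ is simply equal to one). A natural choice would be to pick moments like $\hat{t}_i - \frac{1}{10^i}$. Such moments are always just before the moment $X_i$ occurs and are such that as $i \to \infty$, they get arbitrarily close to those moments, $\hat{t}_i$. But for our purposes here, we need the new sequence to be more ‘tightly’ tied to $P_{X_i}(t)$. The factor required will depend on each of the $P_{X_i}(t)$ and how quickly each approaches its limit near $\hat{t}_i$. The limit in question is $L_i$, so by the definition of a limit, for an arbitrary $\varepsilon > 0$ we can find a distance, $\delta > 0$, such that the distance between $P_{X_i}(t)$ and $L_i$ is less than $\varepsilon$ for all $t$ within $\delta$ of $\hat{t}_i$. The $\varepsilon$ we will use here comes from the assumption that $D(X_i, \hat{t}_i) \not\to 0$, namely, $\hat{\varepsilon}/2$, which is half of the $\hat{\varepsilon} > 0$ we have from Equation (6). Now consider the sequence of $\{\delta_i\}$, each greater than zero, with the property that if $|\hat{t}_i - t| < \delta_i$, then $|L_i - P_{X_i}(t)| < \hat{\varepsilon}/2$. Again, that the $\delta_i > 0$ exist follows from the definition of the (left) limit of $P_{X_i}(t)$ at $\hat{t}_i$ being $L_i$. Finally, we define the $\{t_i\}$ as

\[
t_i = \hat{t}_i - \frac{\delta_i}{10^i}.
\]

Thus the sequence of $\{t_i\}$ is such that as $i \to \infty$, the $t_i$s get arbitrarily close to the $\hat{t}_i$s (the times that the $X_i$s occur), and further, each $t_i$ is within $\delta_i$ of $\hat{t}_i$, so $|L_i - P_{X_i}(t_i)| < \hat{\varepsilon}/2$ for each $t_i$. This entails that

\[
|1 - P_{X_i}(t_i)| = |1 - L_i + L_i - P_{X_i}(t_i)| \ge \bigl|\, |1 - L_i| - |L_i - P_{X_i}(t_i)| \,\bigr|,
\]

after adding and subtracting $L_i$, regrouping, and making use of the inequality $|a + b| \ge \bigl|\, |b| - |a| \,\bigr|$. Because, from above, $|1 - L_i| \ge \hat{\varepsilon}$ and $|L_i - P_{X_i}(t_i)| < \hat{\varepsilon}/2$, we have that

\[
|1 - P_{X_i}(t_i)| \ge \left| \hat{\varepsilon} - \frac{\hat{\varepsilon}}{2} \right| = \frac{\hat{\varepsilon}}{2}, \tag{A.3}
\]

for some $i$ greater than any $N > 0$. This step also depends on the fact that if $|a| \ge |c|$ and $|b| \le |d|$, then $|a - b| \ge |c - d|$. Finally, using Equation (A.3), removing unnecessary absolute value signs, and relabelling our crucial value $\hat{\varepsilon}/2$ as $\varepsilon$ for notational simplicity, we get:

\[
1 - P_{X_i}(t_i) \ge \varepsilon, \quad \text{for some } i \text{ greater than any } N > 0. \tag{A.4}
\]

In making use of Bayes’s theorem below, the conditional probabilities $P_{Y|X_i}(t_i)$ and $P_{X_i|Y}(t_i)$ will be needed. It will be assumed that $\lim_{i \to \infty} P_{Y|X_i}(t_i) = L$ and $\lim_{i \to \infty} P_{X_i|Y}(t_i) = 1$. The reasoning for this is as follows: given that the limit of $P_Y(t_i)$ is $L$, conditionalizing on the $X_i$s, which are the particular events leading up to $Y$, should not affect convergence; similarly, since (by definition) the $\{X_i\}$ are events leading up to $Y$ at times $\{t_i\}$ converging to the time they occur, $\hat{t}_i$, the probability of these events at these times conditionalized on $Y$ will naturally converge to one as time converges to the time of $Y$.22

The convergence of the conditional sequence $P_{Y|X_i}(t_i)$ to $L$ entails that for any $\varepsilon_1 > 0$, there is an $N_1 > 0$ such that, for all $i > N_1$,

\[
|P_{Y|X_i}(t_i) - L| < \varepsilon_1. \tag{A.5}
\]

Similarly, the convergence of the conditional sequence $P_{X_i|Y}(t_i)$ to one entails that for any $\varepsilon_2 > 0$, there is an $N_2 > 0$ such that, for all $i > N_2$,

\[
|1 - P_{X_i|Y}(t_i)| < \varepsilon_2. \tag{A.6}
\]

Now the particular values to be used for $\varepsilon_1$ and $\varepsilon_2$ are

\[
\varepsilon_1 = \frac{\varepsilon L}{2(\varepsilon + 2)} > 0 \quad \text{and} \quad \varepsilon_2 = \frac{\varepsilon}{4} > 0.
\]

Thus from Equation (A.6) and the value for $\varepsilon_2$, and multiplying through by $L > 0$,

\[
|L - L P_{X_i|Y}(t_i)| < \varepsilon_2 L = \frac{\varepsilon L}{4}. \tag{A.7}
\]

Adding and subtracting $L$, using the triangle inequality, and Equations (A.5) and (A.7), we may derive that

\[
\begin{aligned}
|P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)| &= \bigl| (P_{Y|X_i}(t_i) - L) + (L - L P_{X_i|Y}(t_i)) \bigr| \\
&\le |P_{Y|X_i}(t_i) - L| + |L - L P_{X_i|Y}(t_i)| \\
&< \frac{\varepsilon L}{2(\varepsilon + 2)} + \frac{\varepsilon L}{4} \\
&= \frac{2\varepsilon L + \varepsilon L(\varepsilon + 2)}{4(\varepsilon + 2)} \\
&= \frac{\varepsilon L(\varepsilon + 4)}{4(\varepsilon + 2)}.
\end{aligned} \tag{A.8}
\]

Next, using Equation (A.5) again, with the value for $\varepsilon_1$, we have that

\[
P_{Y|X_i}(t_i) > L - \frac{\varepsilon L}{2(\varepsilon + 2)}. \tag{A.9}
\]

From Equation (A.4), multiplying by $P_{Y|X_i}(t_i)$, and using Equation (A.9), we can derive that

\[
\begin{aligned}
P_{Y|X_i}(t_i)\bigl(1 - P_{X_i}(t_i)\bigr) &\ge \left( L - \frac{\varepsilon L}{2(\varepsilon + 2)} \right) \varepsilon \\
&= \left( \frac{2L(\varepsilon + 2) - \varepsilon L}{2(\varepsilon + 2)} \right) \varepsilon \\
&= \left( \frac{4L + 2\varepsilon L - \varepsilon L}{2(\varepsilon + 2)} \right) \varepsilon \\
&= \varepsilon L \left( \frac{4 + 2\varepsilon - \varepsilon}{2(\varepsilon + 2)} \right) \\
&= \frac{\varepsilon L(\varepsilon + 4)}{2(\varepsilon + 2)}.
\end{aligned} \tag{A.10}
\]

Finally, returning to Equation (A.1), which expresses the continuity condition,

\[
\begin{aligned}
|P_Y(t_i) - L| &= \left| \frac{P_{X_i}(t_i)\, P_{Y|X_i}(t_i)}{P_{X_i|Y}(t_i)} - L \right| && \text{(Bayes’s theorem (2.1))} \\
&= \frac{1}{P_{X_i|Y}(t_i)} \bigl| P_{X_i}(t_i)\, P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i) \bigr| \\
&\ge \bigl| P_{X_i}(t_i)\, P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i) \bigr| && \text{(since } P_{X_i|Y}(t_i) \le 1\text{)} \\
&= \bigl| P_{X_i}(t_i)\, P_{Y|X_i}(t_i) - P_{Y|X_i}(t_i) + P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i) \bigr| \\
&= \bigl| P_{Y|X_i}(t_i)\bigl(P_{X_i}(t_i) - 1\bigr) + \bigl(P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i)\bigr) \bigr| \\
&\ge \Bigl|\, \bigl| P_{Y|X_i}(t_i)\bigl(1 - P_{X_i}(t_i)\bigr) \bigr| - \bigl| P_{Y|X_i}(t_i) - L P_{X_i|Y}(t_i) \bigr| \,\Bigr| && \text{(}|a + b| \ge \bigl|\,|b| - |a|\,\bigr|\text{)} \\
&\ge \left| \frac{\varepsilon L(\varepsilon + 4)}{2(\varepsilon + 2)} - \frac{\varepsilon L(\varepsilon + 4)}{4(\varepsilon + 2)} \right| && \text{(from Equations (A.8) and (A.10))}^{23} \\
&= \frac{\varepsilon L(\varepsilon + 4)}{4(\varepsilon + 2)} > 0.
\end{aligned} \tag{A.11}
\]

This holds for some $i$ greater than any $N > \max\{N_1, N_2\} > 0$. Thus $P_Y(t)$ is discontinuous from the left at $t_y$.

1 Peter Menzies ([1989]) and Igal Kvart ([2004]) in their respective accounts are also sensitive to how the probabilities evolve through time, but without explicitly addressing continuity one way or the other; I take this up below.

2 For this article, I will understand the basic form of these probabilities as unconditional. This is distinct from general probability, which applies to classes of events and whose basic forms are conditional. This is for clarity and convenience only; the continuity issues I deal with here are not sensitive to whether the probabilities are analysed in the standard Kolmogorovian way or some other way, with a different conditionalization rule and/or with conditional probabilities as the basic form; see (Hájek [2003]).

3 A ‘jump’ discontinuity is one in which the left- and right-hand limits exist, but are not equal.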
It is important to note that in order for there to be a discontinuous jump (jump discontinuity) as $Y$ occurs (or at any other significant time, for example, $t_x$), it is necessary that the trajectory be continuous in some (perhaps very small) interval to the left of the jump discontinuity; this will become significant below. The other two possibilities, that the left- and right-hand limits exist and are equal, or that one (or both) fail to exist, are called ‘removable’ and ‘essential’ discontinuities, respectively. The essential discontinuity case will come up again below.

4 Another way of putting this, in terms of type-level probabilistic causality, is that the bump is a token-level positive causal factor for the ball going in, but that bumps of this type are type-level negative causal factors.

5 This is because, by definition, $M \le P^+$, so if $B = M - P^- > 0$, then $P^+ - P^- > 0$, so $P^+ \ne P^-$, that is, the limits of $P_Y(t)$ from the left and the right at $t_x$ are not equal, which again entails a jump discontinuity at $t_x$.

6 Eells ([1991], p. 354) makes it explicit that it is a presupposition of his account that the left and right limits of $P_Y(t)$ at $t_x$ exist. The value at the point $t_x$ itself is not constrained. At another point, Eells ([1991], p. 355) also requires that the probability trajectory actually be constant in some open interval both to the left and right of $t_x$. He may have sensed the possibility of pathological behaviour around such discontinuities. In any event, I argue below that his required ‘jump’ discontinuity will entail not only that $P_Y(t)$ cannot be constant to the left of $t_x$, but that it cannot even have a limit from the left.

7 Recall that the causal background context, $K$, is built into the definition in that $P_Y(t) = P(Y \mid K \,\&\, W_t)$, so for example, $P_{Y|X}(t)$ will be $P(Y \mid X \,\&\, K \,\&\, W_t)$.
In the case of the example under consideration here, where the $X$s are not arbitrary events, but rather ones causally relevant to $Y$ and converging to $Y$’s time, all the probability functions in what follows will have the same causal background context; I will generally suppress the additional notation in what follows, noting it only where it makes a difference.

8 As will become clear below, a key feature of the $X_i$s is that they be at least probabilistically and/or causally relevant to $Y$. Thus the construction is unproblematic when there is a space-time process leading up to or constituting event $Y$, as in the case here. The argument does not, however, necessarily apply to quantum events or certain kinds of macro-level events that embed quantum events, which may not have such probabilistically relevant antecedent events under certain interpretations. (I will not get into the distinction between causally and probabilistically relevant here; there is much debate about when and how these notions coincide. In most settings, probabilistic relevance is weaker and I will employ it here.) While there is debate about whether all macro-level examples of causation need to have such an intermediate process, even accepting a pluralistic view (Hall [2004]), for my argument here it is sufficient that it work for the large class of macro-level cases like Example 1, in which there is such a mediating process. See Footnote 10 for more on the kinds of quantum-embedding macro events that are excluded.

9 Nothing turns on these particular values for the limits of the conditional probabilities. As long as they converge to some $0 < \hat{L} \le 1$, as they must, then the proof can proceed with simple scalar adjustments.

10 I stress again that the argument of this section is against the view embodied in DJP, namely, that all occurring events must jump discontinuously to one; it does not militate against the (very plausible) view that some classes of events might so jump.
In particular, it is compatible with the view that the probability trajectories of quantum events or certain kinds of macro-level events that embed quantum events in particular ways do in fact so jump. As a reviewer for this journal points out, without such a limitation a quantum variation of my Example 1 might be construed as a counterexample to the incoherence horn of the dilemma. Consider Example 1 modified so that the golf ball’s fall into the cup triggers a quantum triggering device that has an irreducible chance of 0.9 of triggering an explosion nearby. One must assume (per impossibile) that the triggering process and the detonation, if they happen, will both be instantaneous, so if the explosion happens, it will happen at the same moment, $t_y$, when the ball falls into the hole. Then if the ball being where it was with its favourable trajectory at moments converging to the time it falls into the cup are the events $\{(X_i, \hat{t}_i)\}$, and the explosion is $Y$, the $\{(X_i, \hat{t}_i)\}$ events are getting arbitrarily close to $Y$ and their probabilities are getting arbitrarily close to one, but the probability of $Y$ is bounded away from one (at 0.9). Whether it is legitimate in this context to collapse such triggering events onto $Y$ in the probability trajectory is not clear, but fortunately it need not be taken up here.

11 See (Kvart [2004], pp. 369–70) for a similar use of such an example and discussion.

12 There are provable restrictions on the size and nature of the set of points of discontinuity for real-valued functions. In particular, this set must be an $F_\sigma$ set, that is, one that can be written as a countable union of closed sets of real numbers (Royden [1988], p. 53). And if the function in question is monotonically increasing or decreasing on an open interval, as a probability trajectory might well be eventually, then there can be no essential discontinuities and at most countably many ‘jump’ discontinuities; see (Rudin [1976], pp. 95–7) for details.
13 This is how Eells’s account avoids the putative defect that, for example, an event located far away, say another squirrel kicking a tree, but taking place at the same time as x, would also be deemed a cause of $Y$ because the trajectory would jump at the time of both events, $t_x$. But in fact, there would be two trajectories, one for X and one for the faraway event, and the latter trajectory would not have a ‘because’ jump because its (distinct) $W_t$ would not include the factors (collision, change in the ball’s trajectory, and so on) traceable back only to x being X.

14 Eells ([1991], pp. 344–5) does allow that one might conceive of $K$ as a ‘kind of population’, and this could open the possibility of a non-extensional understanding that might allow one to individuate the trajectories in a more fine-grained way. Of course, such a move brings with it problems of its own. The account could also potentially be modified in a way that allows for (appropriately bounded) essential discontinuities as well as jump discontinuities. As long as a proper bound was in place, one could employ the ‘limit superior’ (least upper bound of the cluster points) from the left, rather than a limit from the left. This would potentially alleviate issues with isolated essential discontinuities, though the problematic proliferation of them due to DJP/CDJP would remain, as would the problems to be taken up in the next section. I owe thanks to an anonymous reviewer for helping me see this possibility.

15 As a reviewer points out, another possibility for handling unruly kinds or numbers of discontinuities in probability trajectories would be to move to an account along the lines of Woodward’s ([2003]) or Pearl’s ([2009]), which distinguish between the actual situation of interest and causal models of the situation. Such an approach, in utilizing a model to analyse causation, considers only a fixed and generally discrete set of causal variables.
Thus even if in the actual situation the probability trajectory had (lots of) essential discontinuities, within the confines of the idealized model such ‘bad behaviour’ might well be modelled sufficiently by a simple jump discontinuity. Indeed, if the continuity bind proves insurmountable in the end, then this could be seen as a case for such model-based accounts.

16 This turns out to be an implausibly strong condition and Menzies ([1996]) himself disavows even an amended version of this theory. This tension between stability in the probability trajectory and cripplingly strong constraints on it is endemic to the point probability framework, as I take up below.

17 Christopher Hitchcock ([2004], p. 414) develops a proposal growing out of Ned Hall’s suggestion that one should evaluate the probability of an effect shortly before the time at which the effect occurs.

18 Another now classic (if rather mean) example used in this context by Dretske and Snyder ([1972]), with a debt to Schrödinger, involves a quantum-mechanical process and detector hooked to a device that fires a revolver at a cat, and is calibrated to do so with a (quantum) probability of 0.01.

19 There is no concern that such discontinuities in the derivative of $P_Y(t)$ might result in the same or similar problems because there is an important difference: the arguments above cannot be applied to the derivative of $P_Y(t)$ because the derivative of a probability function is not itself a probability function.

20 Since the number of possible configurations of a probability function and its derivative afford many more complexities than are available when simply classifying discontinuities, it could well be that, as in (Hitchcock [1993], p. 359), there is ‘no natural division of causal relevance into a few simple species, such as positive and negative; rather, causal relevance is infinite in variety’.
On this account, causal relevance is infinite in variety because it is represented as an array of conditional probability functions that themselves have infinitely many possible configurations.

21 See (Peressini [2016]) for further details.

22 Again, nothing turns on these particular values for the limits of the conditional probabilities. As long as they converge to some $0 < \hat{L} \le 1$, as they must, then the proof can proceed with simple scalar adjustments.

23 This step also depends on the fact that if $|a| \ge |c|$ and $|b| \le |d|$, then $|a - b| \ge |c - d|$.

Acknowledgements

I owe most of what I know and much of what I have figured out about probabilistic causality to my friend and teacher, the late Ellery Eells. This project started as a result of a graduate seminar taught by Ellery in 1989 in which we were reading a draft of (Eells [1991]). I was concerned about the continuity of the probability functions and, despite the fact that I could not (yet) substantiate my worries, the ever-generous Ellery gave me a footnote in his book (Eells [1991], p. 294). I pursued the issue as one of my dissertation qualifying papers in 1991, but shelved the project because I was not able to finish the proof that appears in the appendix of this article. I returned to it while on sabbatical in Berlin in 2012–13, and realized that the proof required considering two cases (Equation (3)), which allowed me to finally finish it. I thank Dörte and Frieder Middelhauve and Marina Malidzanovic for helping make the sabbatical possible, along with Michael Pauen and the Berlin School of Mind and Brain, where I was a visiting scholar. I owe much to Jodi Melamed for comments and support, and also to my research assistants, Shaila Wadhwani and Clark Wolf. Finally, I thank the journal’s anonymous reviewers for their very careful readings and helpful comments.

References

Anguelov, R. [unpublished]: ‘An Introduction to Some Spaces of Interval Functions’, available at <arxiv.org/abs/math/0408013>.

Anguelov, R., Markov, S. and Sendov, B. [2006]: ‘The Set of Hausdorff Continuous Functions: The Largest Linear Space of Interval Functions’, Reliable Computing, 12, pp. 337–63.

Augustin, T., Coolen, F., de Cooman, G. and Troffaes, M. [2014]: Introduction to Imprecise Probabilities, Chichester: Wiley.

Dretske, F. I. and Snyder, A. [1972]: ‘Causal Irregularity’, Philosophy of Science, 39, pp. 69–71.

Eells, E. [1991]: Probabilistic Causality, New York: Cambridge University Press.

Eells, E. [2010]: ‘Objective Probability Theory Theory’, in Eells, E. and Fetzer, J. (eds), The Place of Probability in Science, Dordrecht: Springer, pp. 3–44.

Fenton-Glynn, L. [forthcoming]: ‘Imprecise Best System Chances’, Proceedings of the European Philosophy of Science Association Conference.

Glynn, L. [2010]: ‘Deterministic Chance’, British Journal for the Philosophy of Science, 61, pp. 51–80.

Glynn, L. [2011]: ‘A Probabilistic Analysis of Causation’, British Journal for the Philosophy of Science, 62, pp. 343–92.

Glynn, L. [unpublished]: ‘A Probabilistic Analysis of Causation’, available at <web.mit.edu/gradphilconf/2008/A%20Probabilistic%20Analysis%20of%20Causation.pdf>.

Hájek, A. [2003]: ‘What Conditional Probability Could Not Be’, Synthese, 137, pp. 273–323.

Hall, N. [2004]: ‘Two Concepts of Causation’, in Collins, J., Hall, N. and Paul, L. A. (eds), Causation and Counterfactuals, Cambridge, MA: MIT Press, pp. 181–276.

Hitchcock, C. [1993]: ‘A Generalized Probabilistic Theory of Causal Relevance’, Synthese, 97, pp. 335–64.

Hitchcock, C. [2004]: ‘Do All and Only Causes Raise the Probabilities of Effects?’, in Collins, J., Hall, N. and Paul, L. A. (eds), Causation and Counterfactuals, Cambridge, MA: MIT Press, pp. 403–17.

Ismael, J. [2011]: ‘A Modest Proposal about Chance’, Journal of Philosophy, 108, pp. 416–42.
Kvart, I. [2004]: ‘Causation: Probabilistic and Counterfactual Analyses’, in Collins, J., Hall, N. and Paul, L. A. (eds), Causation and Counterfactuals, Cambridge, MA: MIT Press, pp. 359–86.

Kyburg, H. [1999]: ‘Interval-Valued Probabilities’, available at <www.sipta.org/documentation/interval_prob/kyburg.pdf>.

Lewis, D. [1986]: ‘Postscripts to “Causation”’, in his Philosophical Papers: Volume II, Oxford: Oxford University Press, pp. 172–213.

Menzies, P. [1989]: ‘Probabilistic Causation and Causal Processes: A Critique of Lewis’, Philosophy of Science, 56, pp. 642–63.

Menzies, P. [1996]: ‘Probabilistic Causation and the Pre-emption Problem’, Mind, 105, pp. 85–117.

Noordhof, P. [1999]: ‘Probabilistic Causation, Preemption, and Counterfactuals’, Mind, 108, pp. 95–125.

Northcott, R. [2010]: ‘Natural-Born Determinists: A New Defense of Causation as Probability-Raising’, Philosophical Studies, 150, pp. 1–20.

Pearl, J. [2009]: Causality, Cambridge: Cambridge University Press.

Peressini, A. F. [2016]: ‘Imprecise Probability and Chance’, Erkenntnis, 81, pp. 604–13.

Rosen, D. A. [1978]: ‘In Defense of a Probabilistic Theory of Causality’, Philosophy of Science, 45, pp. 561–86.

Royden, H. L. [1988]: Real Analysis, New York: Macmillan.

Rudin, W. [1976]: Principles of Mathematical Analysis, New York: McGraw-Hill.

Sober, E. [2010]: ‘Evolutionary Theory and the Reality of Macro Probabilities’, in Eells, E. and Fetzer, J. (eds), The Place of Probability in Science, Dordrecht: Springer, pp. 133–62.

van Fraassen, C. [1984]: ‘Belief and the Will’, Journal of Philosophy, 81, pp. 235–56.

Walley, P. [1991]: Statistical Reasoning with Imprecise Probabilities, London: Chapman and Hall.
Weichselberger, K. [2000]: ‘The Theory of Interval-Probability as a Unifying Concept for Uncertainty’, International Journal of Approximate Reasoning, 24, pp. 149–70.

Woodward, J. [2003]: Making Things Happen: A Theory of Causal Explanation, Oxford: Oxford University Press.

© The Author 2017. Published by Oxford University Press on behalf of British Society for the Philosophy of Science. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Journal

The British Journal for the Philosophy of Science, Oxford University Press

Published: Jan 18, 2017
