Quantum Common Causes and Quantum Causal Models

Selected for a Viewpoint in Physics
PHYSICAL REVIEW X 7, 031021 (2017)

John-Mark A. Allen,$^1$ Jonathan Barrett,$^1$ Dominic C. Horsman,$^2$ Ciarán M. Lee,$^3$ and Robert W. Spekkens$^4$

$^1$ Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford OX1 3QD, United Kingdom
$^2$ Department of Physics, University of Durham, South Road, Durham DH1 3LE, United Kingdom
$^3$ Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, United Kingdom
$^4$ Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada

(Received 5 April 2017; revised manuscript received 5 June 2017; published 31 July 2017)

Reichenbach’s principle asserts that if two observed variables are found to be correlated, then there should be a causal explanation of these correlations. Furthermore, if the explanation is in terms of a common cause, then the conditional probability distribution over the variables given the complete common cause should factorize. The principle is generalized by the formalism of causal models, in which the causal relationships among variables constrain the form of their joint probability distribution. In the quantum case, however, the observed correlations in Bell experiments cannot be explained in the manner Reichenbach’s principle would seem to demand. Motivated by this, we introduce a quantum counterpart to the principle. We demonstrate that under the assumption that quantum dynamics is fundamentally unitary, if a quantum channel with input $A$ and outputs $B$ and $C$ is compatible with $A$ being a complete common cause of $B$ and $C$, then it must factorize in a particular way. Finally, we show how to generalize our quantum version of Reichenbach’s principle to a formalism for quantum causal models and provide examples of how the formalism works.

DOI: 10.1103/PhysRevX.7.031021
Subject Areas: Quantum Physics, Quantum Information

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI. 2160-3308/17/7(3)/031021(22)

I. INTRODUCTION

It is a general principle of scientific thought (and indeed of everyday common sense) that if physical variables are found to be statistically correlated, then there ought to be a causal explanation of this fact. If the dog barks every time the telephone rings, we do not ascribe this to coincidence. A likely explanation is that the sound of the telephone ringing is causing the dog to bark. This is a case where one of the variables is a cause of the other. If sales of ice cream are high on the same days of the year that many people get sunburned, a likely explanation is that the sun was shining on these days and that the hot sun causes both sunburns and the desire to have an ice cream. Here, the explanation is not that buying ice cream causes people to get sunburned, nor vice versa, but instead that there is a common cause of both: the hot sun.

That the principle is highly natural is most apparent when it is expressed in its contrapositive form: if there is no causal relationship between two variables (i.e., neither is a cause of the other and there is no common cause), then the variables will not be correlated. In particular, without a general commitment to this latter statement, it would be impossible ever to regard two different experiments as independent from one another, or for the results of one scientific team to be regarded as an independent confirmation of the results of another.

This principle of causal explanation was first made explicit by Reichenbach [1]. It is key in scientific investigations that aim to find causal accounts of phenomena from observed statistical correlations.

Despite the central role of causal explanations in science, there are significant challenges to providing them for the correlations that are observed in quantum experiments [2]. In a Bell experiment, a pair of systems are prepared together, then removed to distant locations where a measurement is implemented on each. The choice of the measurement made at one wing of the experiment is presumed to be made at spacelike separation from that at the other wing. The natural causal explanation of the correlations that one observes in such experiments is that each measurement outcome is influenced by the local measurement setting as well as by a common cause located in the joint past of the two measurement events. But Bell’s theorem [3] famously rules out this possibility: within the standard framework of causal models, if the correlations violate a Bell inequality [4], as is predicted by quantum theory and verified experimentally [5–7], then a common-cause explanation of the correlations is ruled out. Furthermore, Ref. [2] proves that it is not possible to explain Bell correlations with classical causal models without unwelcome fine-tuning of the parameters. This includes any attempt to explain Bell correlations with exotic causal influences, such as retrocausality and superluminal signaling. In the study of classical causation, it is typically assumed that causal explanations should not be fine-tuned [8].

However, the verdict of fine-tuning applies only to classical models of causation. It was suggested in Ref. [2] that it might be possible to provide a satisfactory causal explanation of Bell inequality violations, in particular one that preserves the spirit of Reichenbach’s principle and does not require fine-tuning, using a quantum generalization of the notion of a causal model. This article seeks to develop such a generalization by first suggesting an intrinsically quantum version of Reichenbach’s principle.

Specifically, we consider the case of a quantum system $A$ in the causal past of a bipartite quantum system $BC$ and ask what constraints on the channel from $A$ to $BC$ follow from the assumption that $A$ is the complete common cause of $B$ and $C$. In this scenario we are able to find a natural quantum analogue to Reichenbach’s principle. This analogue can be expressed in several equivalent forms, each of which naturally generalizes a corresponding classical expression. In particular, one of these conditions states that $A$ is a complete common cause of $BC$ if one can dilate the channel from $A$ to $BC$ to a unitary by introducing two ancillary systems, contained in the causal past of $BC$, such that each ancillary system can influence only one of $B$ and $C$. This unitary dilation codifies the causal relationship between $A$ and $BC$ and illustrates the fact that no other system can influence both $B$ and $C$. Moreover, our quantum Reichenbach’s principle contains the classical version as a special case in the appropriate limit. This suggests that our quantum version is the correct way to generalize Reichenbach’s principle.

The mathematical framework of causal models [8,9] can be seen as a direct generalization of Reichenbach’s principle to arbitrary causal structures. By following this classical example, we are able to generalize our quantum Reichenbach’s principle to a framework for quantum causal models. In each case, the original Reichenbach’s principle becomes a special case of the framework. Just as with classical causal models, the framework of quantum causal models allows us to analyze the causal structure of arbitrary quantum experiments. It also does so while preserving an appropriate form of Reichenbach’s principle (by construction) and avoiding fine-tuning.

Although our main motivation for developing quantum causal models is the possibility of finding a satisfactory (i.e., non-fine-tuned) causal explanation of Bell inequality violations [2,10], they are also likely to have practical applications. For instance, finding quantum-classical separations in the correlations achievable in novel causal scenarios might lead to new device-independent protocols [11], such as randomness extraction and secure key distribution. Quantum causal models may also provide novel schemes for simulating many-body systems in condensed matter physics [12] and novel means for inferring the underlying causal structure from quantum correlations [13,14].

The structure of the paper is as follows. Section II provides a formal statement of Reichenbach’s principle and shows how it can be rigorously justified under certain philosophical assumptions. The main body of results is in Sec. III. Here, our quantum generalization of Reichenbach’s principle is presented and justified by reasoning parallel to that of the classical case. This is then fleshed out with alternative characterizations of our quantum version of conditional independence and some specific examples. We return to the classical world in Sec. IV, discussing classical causal models and providing a rigorous justification of the Markov condition, which plays the role of Reichenbach’s principle for general causal structures. Section V then generalizes these ideas to the quantum sphere and presents our proposal for quantum causal models. Finally, in Sec. VI, we describe the relationship of our proposal to prior work on quantum causal models, and in Sec. VII, we summarize and describe some directions for future work.

II. REICHENBACH’S PRINCIPLE

A. Statement

Reichenbach gave his principle a formal statement in Ref. [1]. Following Ref. [15], we here distinguish two parts of the formalized principle. First is the qualitative part, which expresses the intuitions described at the beginning of the Introduction. The other is the quantitative part, which constrains the sorts of probability distributions one should assign in the case of a common-cause explanation.

The qualitative part of Reichenbach’s principle may be stated as follows: if two physical variables $Y$ and $Z$ are found to be statistically dependent, then there should be a causal explanation of this fact, either (1) $Y$ is a cause of $Z$, (2) $Z$ is a cause of $Y$, (3) there is no causal link between $Y$ and $Z$, but there is a common cause $X$ influencing $Y$ and $Z$, (4) $Y$ is a cause of $Z$ and there is a common cause $X$ influencing $Y$ and $Z$, or (5) $Z$ is a cause of $Y$ and there is a common cause $X$ influencing $Y$ and $Z$.

Note that the causal influences we consider here may be indirect (mediated by other variables). If none of these causal relations hold between $Y$ and $Z$, then we refer to them as ancestrally independent (because their respective causal ancestries constitute disjoint sets). Using this terminology, the qualitative part of Reichenbach’s principle can be expressed particularly succinctly in its contrapositive form as follows: ancestral independence implies statistical independence, i.e., $P(YZ) = P(Y)P(Z)$ [16].

The quantitative part of Reichenbach’s principle applies only to the case where the correlation between $Y$ and $Z$ is due purely to a common cause [case (3) above]. It states that, in that case, if $X$ is a complete common cause for $Y$
and $Z$, meaning that $X$ is the collection of all variables acting as common causes, then $Y$ and $Z$ must be conditionally independent given $X$, so the joint probability distribution $P(XYZ)$ satisfies

$$P(YZ|X) = P(Y|X)\,P(Z|X). \tag{1}$$

FIG. 1. A causal structure represented as a directed acyclic graph depicting that $X$ is the complete common cause of $Y$ and $Z$.

B. Justifying the quantitative part of Reichenbach’s principle

Within the philosophy of causality, providing an adequate justification of Reichenbach’s principle is a delicate issue. It rests on controversy over basic questions, such as what it means for one variable to have a causal influence on another and what is the correct interpretation of probabilistic statements. In this section, we discuss one way of justifying the principle, using an assumption of determinism, which provides a clean motivational story with a natural quantum analogue. Other justifications may be possible.

Suppose we adopt a Bayesian point of view on probabilities: they are the degrees of belief of a rational agent. Dutch book arguments, based on the principle that a rational agent will never accept a set of bets on which they are certain to lose money, can then be given as to why probabilities should be non-negative, sum to 1, and so forth. But why should an agent who takes $X$ to be a complete common cause for $Y$ and $Z$ arrange their beliefs such that $P(YZ|X) = P(Y|X)P(Z|X)$? If the agent does not do this, are they irrational?

One way to justify a positive answer to this question is to assume that in a classical world there is always an underlying deterministic dynamics. In this case, one variable is causally influenced by another if it has a nontrivial functional dependence upon it in the dynamics. Probabilities can be understood as arising merely due to ignorance of the values of unobserved variables. Under these assumptions, one can show that the qualitative part of Reichenbach’s principle implies the quantitative part.

In general, a classical channel describing the effective influence of random variable $X$ on $Y$ is given by a conditional probability distribution $P(Y|X)$. If we assume underlying deterministic dynamics, then although the value of the variable $Y$ might not be completely determined by the value of $X$, it must be determined by the value of $X$ along with the values of some extra, unobserved, variables in the past of $Y$ which can collectively be denoted $\lambda$. Any variation in the value of $Y$ for a given value of $X$ is then explained by variation in the value of $\lambda$. This can be formalized as follows.

Definition 1 (Classical dilation). For a classical channel $P(Y|X)$, a classical deterministic dilation is given by some random variable $\lambda$ with probability distribution $P(\lambda)$ and some deterministic function $Y = f(X, \lambda)$, such that

$$P(Y|X) = \sum_{\lambda} \delta\bigl(Y, f(X,\lambda)\bigr)\,P(\lambda), \tag{2}$$

where $\delta(X, Y) = 1$ if $X = Y$ and 0 otherwise [17].

We now apply this to the situation depicted in Fig. 1, where $X$ is the complete common cause of $Y$ and $Z$. The conditional distribution $P(YZ|X)$ admits of a dilation in terms of an ancillary unobserved variable $\lambda$, for some distribution $P(\lambda)$ and a function $f' = (f'_Y, f'_Z)$ from $(\lambda, X)$ to $(Y, Z)$, such that $Y = f'_Y(\lambda, X)$ and $Z = f'_Z(\lambda, X)$. The assumption that $X$ is the complete common cause of $Y$ and $Z$ implies that the ancillary variable $\lambda$ can be split into a pair of ancestrally independent variables, $\lambda_Y$ and $\lambda_Z$, where $\lambda_Y$ influences only $Y$ and $\lambda_Z$ influences only $Z$ [18]. It follows that there must exist $\lambda_Y$ and $\lambda_Z$ that are causally related to $X$, $Y$, and $Z$, as depicted in Fig. 2, where the causal dependences are deterministic and given by a pair of functions $f_Y$ and $f_Z$ such that $Y = f_Y(\lambda_Y, X)$ and $Z = f_Z(\lambda_Z, X)$. In this case, we have

$$P(YZ|X) = \sum_{\lambda_Y, \lambda_Z} \delta\bigl(Y, f_Y(\lambda_Y, X)\bigr)\,\delta\bigl(Z, f_Z(\lambda_Z, X)\bigr)\,P(\lambda_Y, \lambda_Z). \tag{3}$$

FIG. 2. The causal structure of Fig. 1, expanded so that $Y$ and $Z$ each have a latent variable as a causal parent in addition to $X$, so that both $Y$ and $Z$ can be made to depend functionally on their parents.

Finally, given the qualitative part of Reichenbach’s principle, the ancestral independence of $\lambda_Y$ and $\lambda_Z$ in the causal structure implies that $P(\lambda_Y, \lambda_Z) = P(\lambda_Y)P(\lambda_Z)$. It then follows that $P(YZ|X) = P(Y|X)P(Z|X)$, which establishes the quantitative part of Reichenbach’s principle.

A well-known converse statement is also worth noting: any classical channel $P(YZ|X)$ satisfying $P(YZ|X) = P(Y|X)P(Z|X)$ admits of a dilation where $X$ is the complete common cause of $Y$ and $Z$ [8].

Summarizing, we can identify what it means for $P(YZ|X)$ to be explainable in terms of $X$ being a complete common cause of $Y$ and $Z$ by appealing to the qualitative part of Reichenbach’s principle and fundamental determinism. The definition can be formalized into a mathematical condition as follows.

Definition 2 (Classical compatibility). $P(YZ|X)$ is said to be compatible with $X$ being the complete common cause of $Y$ and $Z$ if one can find variables $\lambda_Y$ and $\lambda_Z$, distributions $P(\lambda_Y)$ and $P(\lambda_Z)$, a function $f_Y$ from $(\lambda_Y, X)$ to $Y$, and a function $f_Z$ from $(\lambda_Z, X)$ to $Z$, such that these constitute a dilation of $P(YZ|X)$, that is, such that

$$P(YZ|X) = \sum_{\lambda_Y, \lambda_Z} \delta\bigl(Y, f_Y(\lambda_Y, X)\bigr)\,\delta\bigl(Z, f_Z(\lambda_Z, X)\bigr)\,P(\lambda_Y)\,P(\lambda_Z). \tag{4}$$

With this definition, we can summarize the result described above as follows.

Theorem 1. Given a conditional probability distribution $P(YZ|X)$, the following are equivalent.
(1) $P(YZ|X)$ is compatible with $X$ being the complete common cause of $Y$ and $Z$.
(2) $P(YZ|X) = P(Y|X)P(Z|X)$.

The (1) → (2) implication is what establishes that a rational agent should espouse the quantitative part of Reichenbach’s principle if they espouse the qualitative part and fundamental determinism.

The (2) → (1) implication allows one to deduce a possible causal explanation of an observed distribution from a feature of that distribution. However, it is important to stress that it only establishes a possible causal explanation. It does not state that this is the only causal explanation. Indeed, it may be possible to satisfy this conditional independence relation within alternative causal structures by fine-tuning the strengths of the causal dependences. However, as noted above, fine-tuned causal explanations are typically rejected as bad explanations in the field of causal inference. Therefore, the best explanation of the conditional independence of $Y$ and $Z$ given $X$ is that $X$ is the complete common cause of $Y$ and $Z$.

III. QUANTUM VERSION OF REICHENBACH’S PRINCIPLE

In this section, we introduce our quantum version of Reichenbach’s principle. The definition of a quantum causal model that we provide in Sec. V can be seen as generalizing these ideas in much the same way that classical causal models generalize the classical version of Reichenbach’s principle.

A. Quantum preliminaries

For simplicity, we assume throughout that all quantum systems are finite dimensional. Given a quantum system $A$, we write $\mathcal{H}_A$ for the corresponding Hilbert space, $d_A$ for the dimension of $\mathcal{H}_A$, and $I_A$ for the identity on $\mathcal{H}_A$. We also write $\mathcal{H}_A^*$ for the dual space to $\mathcal{H}_A$ and $I_{A^*}$ for the identity on the dual space. If a quantum system is initially uncorrelated with any other system, then the most general time evolution of the system corresponds to a quantum channel, i.e., a completely positive trace-preserving (CPTP) map. If the system at the initial time is labeled $A$, with Hilbert space $\mathcal{H}_A$, and the system at the later time is labeled $B$, with Hilbert space $\mathcal{H}_B$, then the CPTP map is

$$\mathcal{E}_{B|A} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B), \tag{5}$$

where $\mathcal{L}(\mathcal{H})$ is the set of linear operators on $\mathcal{H}$.

An alternative way to express the channel $\mathcal{E}_{B|A}$ is as an operator, using a variant of the Choi-Jamiołkowski isomorphism [19,20]:

$$\rho_{B|A} := \sum_{ij} \mathcal{E}_{B|A}\bigl(|i\rangle_A\langle j|\bigr) \otimes |i\rangle_{A^*}\langle j|. \tag{6}$$

Here, the vectors $\{|i\rangle_A\}$ form an orthonormal basis of the Hilbert space $\mathcal{H}_A$. The vectors $\{|i\rangle_{A^*}\}$ form the dual basis, belonging to $\mathcal{H}_A^*$. The operator $\rho_{B|A}$ therefore acts on the Hilbert space $\mathcal{H}_B \otimes \mathcal{H}_A^*$. Although the expression above involves an arbitrary choice of orthonormal basis, the operator $\rho_{B|A}$ thus defined is independent of the choice of basis. This version of the Choi-Jamiołkowski isomorphism is chosen because it is both basis independent and a positive operator. Following Ref. [21], we choose the operator $\rho_{B|A}$ to be normalized in such a way that $\mathrm{Tr}_B(\rho_{B|A}) = I_{A^*}$ [in analogy with the normalization condition $\sum_Y P(Y|X) = 1$ for a classical channel $P(Y|X)$].

Suppose that $\rho_B = \mathcal{E}_{B|A}(\rho_A)$. Given that the operator $\rho_{B|A}$ contains all of the information about the channel $\mathcal{E}_{B|A}$, the question arises of how one can express $\rho_B$ in terms of $\rho_{B|A}$ and $\rho_A$. Recall that $\rho_{B|A}$ is defined on $\mathcal{H}_B \otimes \mathcal{H}_A^*$, while $\rho_A$ is defined on $\mathcal{H}_A$. As we discuss further in Sec. V, by defining an appropriate “linking operator” on $\mathcal{H}_A^* \otimes \mathcal{H}_A$,

$$\tau^{\mathrm{id}}_A := \sum_{lm} |l\rangle_{A^*}\langle m| \otimes |l\rangle_A\langle m|, \tag{7}$$

where $\{|l\rangle_A\}_l$ and $\{|l\rangle_{A^*}\}_l$ are orthonormal bases on $\mathcal{H}_A$ and $\mathcal{H}_A^*$, respectively, one can write $\rho_B = \mathrm{Tr}_{A^*A}(\rho_{B|A}\,\tau^{\mathrm{id}}_A\,\rho_A)$. This expression is meant to be reminiscent of the classical formula $P(Y) = \sum_X P(Y|X)P(X)$.

Given an operator $\rho_{AB|CD}$, acting on $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C^* \otimes \mathcal{H}_D^*$, we use the same expression with missing indices to denote the result of taking partial traces on the corresponding factor spaces. For example, given a channel $\rho_{AB|CD}$, we write $\rho_{A|CD} := \mathrm{Tr}_B(\rho_{AB|CD})$.

When writing products of operators, we sometimes suppress tensor products with identities. For example, $(\rho_{B|A} \otimes I_C)(\rho_{C|A} \otimes I_B)$ is written simply as $\rho_{B|A}\,\rho_{C|A}$.

B. Main result

The qualitative part of Reichenbach’s principle can be applied to quantum theory with almost no change: if quantum systems $B$ and $C$ are correlated, then this must have a causal explanation in one of the five forms listed in Sec. II A (except with classical variables $X$, $Y$, and $Z$ replaced by quantum systems $A$, $B$, and $C$). Here, for two quantum systems to be correlated means that their joint quantum state does not factorize.

Finding a quantum version of the quantitative part of Reichenbach’s principle is more subtle. If a quantum system $A$ is a complete common cause of $B$ and $C$ (as depicted in Fig. 3), then one expects there to be some constraint analogous to the classical constraint that $P(YZ|X) = P(Y|X)P(Z|X)$. If one tries to do this by generalizing the joint distribution $P(XYZ)$, then one immediately faces the problem that textbook quantum theory has no analogue of a joint distribution for a collection of quantum systems in which some are causal descendants of others. The situation is improved if one focuses on finding an analogue of $P(YZ|X)$ instead. In standard quantum theory, as long as a system $A$ is initially uncorrelated with its environment, then the evolution from $A$ to $BC$ is described by a channel $\mathcal{E}_{BC|A}$. The operator that is isomorphic to this channel by Eq. (6), denoted $\rho_{BC|A}$, seems to be a natural analogue of $P(YZ|X)$. However, even in this case, it is not obvious what constraint on $\rho_{BC|A}$ should serve as the analogue of the classical constraint $P(YZ|X) = P(Y|X)P(Z|X)$.

FIG. 3. A causal structure relating three quantum systems with $A$ the complete common cause of $B$ and $C$.

The treatment of generic causal networks of quantum systems is deferred to the full definition of quantum causal models in Sec. V. This section focuses on the case of a channel $\rho_{BC|A}$.

In Sec. II B, we demonstrated how to justify the quantitative part of Reichenbach’s principle from the qualitative part in the classical case under the assumption that all dynamics are fundamentally deterministic. We shall now make an analogous argument in the quantum case by assuming that quantum dynamics are fundamentally unitary. Just as in the classical case, this assumption simply provides a clean way to motivate our result, and alternative justifications may be possible.

In general, a quantum channel from $A$ to $B$ is given by a CPTP map $\mathcal{E}_{B|A}$. Assuming underlying unitary dynamics, the output state at $B$ must depend unitarily on $A$ along with some extra ancillary system $\lambda$ in the past of $B$. This can be formalized as follows.

Definition 3 (Unitary dilation). For a quantum channel $\mathcal{E}_{B|A}$, a quantum unitary dilation is given by some ancillary quantum system $\lambda$ with state $\rho_\lambda$ and some unitary $U$ from $\mathcal{H}_A \otimes \mathcal{H}_\lambda$ to $\mathcal{H}_B \otimes \mathcal{H}_{\bar{B}}$, such that

$$\mathcal{E}_{B|A}(\cdot) = \mathrm{Tr}_{\bar{B}}\bigl(U(\cdot \otimes \rho_\lambda)U^\dagger\bigr),$$

where the dimension of $\bar{B}$ is fixed by the requirement for unitarity that $d_A d_\lambda = d_B d_{\bar{B}}$.

If we represent the channels by our variant of the Choi-Jamiołkowski isomorphism, Eq. (6), with $\rho_{B|A}$ representing $\mathcal{E}_{B|A}$ and $\rho^U_{B\bar{B}|A\lambda}$ representing $U(\cdot)U^\dagger$, then the dilation equation has the form

$$\rho_{B|A} = \mathrm{Tr}_{\bar{B}\lambda^*\lambda}\bigl(\rho^U_{B\bar{B}|A\lambda}\,\tau^{\mathrm{id}}_\lambda\,\rho_\lambda\bigr),$$

where $\tau^{\mathrm{id}}_\lambda$ is the linking operator defined in Eq. (7).

Just as in the classical case, we would like to apply this to the situation depicted in Fig. 3, where $A$ is the complete common cause for $B$ and $C$. This is easy classically, as it is clear what it means for a classical variable $X$ to have no causal influence on another, $Y$, in a deterministic system. Specifically, if the collection of inputs other than $X$ is denoted $\bar{X}$ so there is a deterministic function $f$ such that $Y = f(X, \bar{X})$, then the assumption that $X$ has no causal influence on $Y$ is formalized as $f(X, \bar{X}) = f'(\bar{X})$ for some function $f'$. In unitary quantum theory the corresponding condition is less obvious, so we spell it out explicitly with a definition.

Definition 4 (No influence). Consider a unitary channel $\rho^U_{B\bar{B}|A\bar{A}}$ from $A\bar{A}$ to $B\bar{B}$. $A$ has no causal influence on $B$ if and only if for $\rho_{B|A\bar{A}} := \mathrm{Tr}_{\bar{B}}\,\rho^U_{B\bar{B}|A\bar{A}}$, we have $\rho_{B|A\bar{A}} = I_{A^*} \otimes \rho_{B|\bar{A}}$.

An equivalent definition states that $A$ has no causal influence on $B$ in some unitary channel if and only if the following holds: for every initial state $\rho_{A\bar{A}}$, if an operation is performed on the $A$ system alone, followed by the action of the unitary channel, then the marginal output state at $B$ is independent of the choice of operation on $A$. The equivalence is shown in Ref. [22], where other equivalent definitions are also presented (under the terminology “nonsignalling” rather than “no causal influence”). There is in fact a rich literature concerning related properties of unitary operators from various perspectives: see, for example, Refs. [23–25].

We can now apply this to the complete common-cause situation of Fig. 3. The channel $\mathcal{E}_{BC|A}$ admits a unitary dilation in terms of an ancillary system $\lambda$, for some state $\rho_\lambda$ and unitary $U$ from $\lambda A$ to $BDC$. Here, an ancillary output $D$ is generally required so that dimensions of inputs and outputs match, but is not important and will always be
X 7, 031021 (2017) independent given the input if and only if the classical channel defined by the diagonal elements of the matrix has the property that the outputs are conditionally independent given the input. With this terminological convention in hand, we can express our quantum version of the quantitative part of Reichenbach’s principle as follows: if a channel ρ is BCjA FIG. 4. The causal structure of Fig. 3, expanded so that B and C compatible with A being a complete common cause of B each have a latent system as a causal parent in addition to A.By and C, then for this channel, B and C are quantum analogy to the classical case, we take B and C to depend unitarily conditionally independent given A. on λ , A, and λ . B C The ð1Þ → ð2Þ implication in the theorem is what traced out. This dilation is such that E ð·Þ¼ establishes the quantum version of the quantitative part BCjA of Reichenbach’s principle. Tr (Uð· ⊗ ρ ÞU ). D λ The ð2Þ → ð1Þ implication is pertinent to causal infer- Just as in Sec. II B, the assumption that A is a complete ence: analogously to the classical case, if one grants the common cause for B and C implies that the ancilla λ can be implausibility of fine-tuning, then one must grant that the factorized into ancestrally independent λ and λ , where λ B C B most plausible explanation of the quantum conditional has no causal influence on C and λ has no causal influence independence of outputs B and C given input A is that A is a on B. It follows that systems λ and λ are causally related B C complete common cause of B and C. to A, B, and C as depicted in Fig. 4. Theorem 2, and the surrounding discussion, motivates The ancestral independence of λ and λ implies that the B C the definition of quantum causal models given in Sec. V. 
quantum state on λ factorizes across the λ , λ partition, B C For the rest of this section, we make some further remarks ρ ¼ ρ ρ , suggesting the following quantum analogue to λ λ λ B C about the quantum version of Reichenbach’s principle. our classical compatibility condition of Definition 2. Definition 5 (Quantum compatibility).—ρ is said to BCjA C. Alternative expressions for quantum conditional be compatible with A being a complete common cause of B independence of outputs given input and C, if it is possible to find ancillary quantum systems λ and λ , states ρ and ρ , and a unitary channel where λ Classically, conditional independence of Y and Z given C λ λ B B C has no causal influence on C and λ has no causal influence X is standardly expressed as PðYZjXÞ¼ PðYjXÞPðZjXÞ. on B, such that these constitute a dilation of ρ . However, there are alternative ways of expressing this BCjA constraint. All that remains is to show that this, together with the For instance, if one defines the joint distribution overX,Y,Z qualitative part of the quantum Reichenbach’s principle, that one obtains by feeding the uniform distribution on X into implies an appropriate quantitative part (generalizing Theorem 1). the channel PðYZjXÞ—that is, PðXYZÞ≔PðYZjXÞð1=d Þ, Theorem 2.—The following are equivalent. where d is the cardinality of X—then Y and Z being (1) ρ is compatible with A being the complete conditionally independent given X in PðYZjXÞ can be BCjA common cause of B and C. expressed as the vanishing of the conditional mutual infor- (2) ρ ¼ ρ ρ . ˆ BCjA BjA CjA mation of Y and Z given X in the distribution PðXYZÞ [8]. The proof is in Appendix A. Note that there is no This conditional mutual information is defined as ordering ambiguity on the right-hand side of the second IðY∶ZjXÞ ≔ HðY; XÞþ HðZ; XÞ − HðX; Y; ZÞ − HðXÞ, condition, because the two terms must commute. 
This is withHð·Þ denoting the Shannon entropy of the marginal on the seen by taking the Hermitian conjugate of both sides of the subset of variables indicated in its argument. Therefore, the equation and recalling that ρ is Hermitian. BCjA condition is simply IðY∶ZjXÞ¼ 0 [26]. The strong analogy that exists between Theorems 1 and Similarly, if Y and Z are conditionally independent given 2 suggests the following definition. X in PðYZjXÞ, then it is possible to mathematically Definition 6 (Quantum conditional independence of represent the channel PðYZjXÞ as the following sequence outputs given input).—Given a quantum channel ρ , BCjA of operations: copy X, then process one copy into Y via the the outputs are said to be quantum conditionally indepen- channel PðYjXÞ and process the other into Z via the dent given the input if and only if ρ ¼ ρ ρ . BCjA BjA CjA channel PðZjXÞ. It is easily seen that the quantum definition reduces to the We present here the quantum analogues of these alter- classical definition in the case that the channel ρ is native expressions. They are found to be useful for devel- BCjA invariant under the operation of completely dephasing the oping intuitions about quantum conditional independence systems A, B, and C in some basis. More precisely, if fixed and in proving Theorem 2. Recall that the quantum condi- bases are chosen for H , H , H , and the operator ρ is tional mutual information of B and C given A is defined as A B C BCjA diagonal when written with respect to the tensor product of IðB∶CjAÞ≔ SðB;AÞþSðC;AÞ−SðA;B;CÞ−SðAÞ, where these bases, then the outputs are quantum conditionally Sð·Þ denotes the von Neumann entropy of the reduced state 031021-6 QUANTUM COMMON CAUSES AND QUANTUM CAUSAL MODELS PHYS. REV. X 7, 031021 (2017) on the subsystem that is specified by its argument. Analogously to the classical case, we use a hat to denote an operator renormalized such that the trace is 1. 
For example, if $\rho_{B|A}$ is the operator representing a channel from A to B, then $\hat{\rho}_{B|A} := (1/d_A)\,\rho_{B|A}$.

Theorem 3. Given a channel $\rho_{BC|A}$, the following conditions are also equivalent to the quantum conditional independence of the outputs given the input [condition (2) of Theorem 2].
(3) $I(B{:}C|A) = 0$, where $I(B{:}C|A)$ is the quantum conditional mutual information of B and C given A evaluated on the (positive, trace-one) operator $\hat{\rho}_{BC|A} := (1/d_A)\,\rho_{BC|A}$.
(4) The Hilbert space for the A system can be decomposed as $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$ and $\rho_{BC|A} = \sum_i \left(\rho_{B|A_i^L} \otimes \rho_{C|A_i^R}\right)$, where for each i, $\rho_{B|A_i^L}$ represents a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^L}) \to \mathcal{B}(\mathcal{H}_B)$, and $\rho_{C|A_i^R}$ a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^R}) \to \mathcal{B}(\mathcal{H}_C)$.

The proof is in Appendix A. That conditions (3) and (4) are equivalent follows as a corollary of Theorem 6 of Ref. [27]. Our main contribution is showing that these are also equivalent to condition (2) of Theorem 2.

The final condition can be described as follows. First, one imagines decomposing the system A into a direct sum of subspaces, each of which is denoted $A_i$. For each i, the subspace $A_i$ is split into two factors, denoted $A_i^L$ and $A_i^R$, with one factor evolving via a channel $\rho_{B|A_i^L}$ into system B, and the other factor evolving via $\rho_{C|A_i^R}$ into system C. In the special case where there is only a single value of i, this is simply a factorization of the A system into two parts. In the special case where all of the $A_i^L$ and $A_i^R$ are one-dimensional Hilbert spaces, the channel $\rho_{BC|A}$ may be thought of as an incoherent copy operation applied to the A system with respect to the i basis, followed by the processing of one copy into B and one copy into C. It is noteworthy that in the general case, B only gets the information carried by the $A_i^L$ and C only gets the information carried by the $A_i^R$: hence, the only information about A that both B and C receive is the classical information carried by the index i.

D. Circuit representations

It is instructive to summarize the contents of Theorems 1 and 2 using circuit diagrams. The classical case is shown in Fig. 5, where four equivalent circuits represent the action of a channel $P(YZ|X)$, for which the outputs YZ are conditionally independent given the input X. The dot in the lower two circuits represents a classical copy operation. Equality (1) simply asserts that the conditional probability distribution $P(YZ|X)$ admits a classical dilation, as in Definition 1. Equality (4) asserts that the channel is equivalent to a sequence of operations in which X is copied, with one copy the input to a channel $P(Y|X)$ and one copy the input to a channel $P(Z|X)$. As we discuss at the beginning of Sec. III C, this is one way of expressing the fact that Y and Z are conditionally independent given X. Equality (3) asserts that $P(Y|X)$ and $P(Z|X)$ separately admit classical dilations. Finally, equality (2) asserts that $P(YZ|X)$ is compatible with X being a complete common cause of Y and Z by depicting conditions under which $\lambda_Y$ has no influence on Z and $\lambda_Z$ has no influence on Y.

FIG. 5. Diagrammatic representation of Theorem 1 and of alternative expressions for conditional independence of outputs given input (the classical analogue of Theorem 3).

FIG. 6. Diagrammatic representation of Theorem 2 and of alternative expressions for quantum conditional independence of outputs given input (Theorem 3). Following Ref. [28], a dedicated symbol is used in the diagrams to denote partial trace (here, slightly generalized to include the partial trace of a wire carrying an i index, defined in an obvious way).

Analogous circuit diagrams can be provided in the quantum case, as depicted in Fig. 6, with analogous interpretations of the various equalities. Since quantum systems cannot be copied, however, something must replace the dot that appears in the lower two circuits of Fig. 5. For the lower two circuits of Fig.
6, we introduce a new symbol that indicates the decomposition of the Hilbert space $\mathcal{H}_A$ into a direct sum of tensor products, as per condition (4) of Theorem 3. The symbol is a circle decorated with the set $\{i\}$, where the value i indexes the terms in the direct sum. For each value of i, the left-hand wire carries the factor $\mathcal{H}_{A_i^L}$ and the right-hand wire the factor $\mathcal{H}_{A_i^R}$.

In the lower right circuit of Fig. 6, the gates represent unitary channels, and are labeled with the corresponding unitary operators V and W (as opposed to the Choi-Jamiołkowski channel operators). The unitary operator V, for example, labels a gate whose action is confined to the left-hand factors in this decomposition, along with the system $\lambda_B$. The interpretation, roughly, is that the form of V must respect the decomposition of $\mathcal{H}_A$. More precisely, the unitary operator can be written as a matrix that is block diagonal with respect to the subspace decomposition, with the ith block being of the form $V_i \otimes I_{A_i^R}$ for a unitary matrix $V_i$ acting on $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_i^L}$. Similarly, W can be written as a block diagonal matrix, with the ith block of the form $I_{A_i^L} \otimes W_i$ for a unitary matrix $W_i$ acting on $\mathcal{H}_{A_i^R} \otimes \mathcal{H}_{\lambda_C}$.

In the lower left circuit of Fig. 6, in a slight mixing of notation, gates are labeled with the channel operators $\rho_{B|A}$ and $\rho_{C|A}$ [29]. Suppose that, as in the figure, a channel operator $\rho_{B|A}$ labels a gate whose action is confined to the left-hand factors in the decomposition, along with another system $\lambda_B$. This indicates that the channel corresponds to a set of Kraus operators $\{K^j\}$, where for each j, the Kraus operator $K^j$ is block diagonal, with the ith block being of the form $K^j_i \otimes I_{A_i^R}$, with $K^j_i$ acting on $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_i^L}$. The channel operator $\rho_{C|A}$ has a similar form, with a nontrivial action on the right-hand factors and the system $\lambda_C$ [30,31].

The equivalences of Fig. 6 can now be summarized as follows. Equality (1) simply asserts the fact that $\rho_{BC|A}$ admits a unitary dilation. Equality (4) asserts that the channel $\rho_{BC|A}$ is such that B and C are quantum conditionally independent given A, according to the definition we propose (Definition 6). This equality follows from the expression for quantum conditional independence described in condition (4) of Theorem 3. Equality (3) asserts that the channels $\rho_{B|A}$ and $\rho_{C|A}$ separately admit unitary dilations. Equality (2) asserts that $\rho_{BC|A}$ is compatible with A being a complete common cause of B and C by depicting conditions under which $\lambda_B$ has no influence on C and $\lambda_C$ has no influence on B. Here, the unitary matrix U is decomposed as $U = (I_{\lambda_B} \otimes W)(V \otimes I_{\lambda_C})$, as per the proof of Theorem 2.

E. Examples

1. Unitary transformation

Consider the case in which inputs A and D evolve, via a generic unitary transformation U, into outputs B and C. In Fig. 7, we illustrate the circuit and the corresponding causal diagram.

FIG. 7. For a generic unitary transformation from AD to BC, the complete common cause of B and C is the composite system AD.

The channel $\rho_{BC|AD}$ which one obtains in this case is compatible with the complete common cause of B and C being the composite system AD. This follows from the fact that $\rho_{BC|AD}$ has a trivial dilation, which is to say that the ancillary system is not required, and therefore trivially satisfies the condition for compatibility laid out in Definition 5. It follows from Theorem 2 that for such a $\rho_{BC|AD}$, the outputs B and C are quantum conditionally independent given the input AD, which means that $\rho_{BC|AD} = \rho_{B|AD}\,\rho_{C|AD}$, as can also be verified by direct calculation. Similarly, the alternative expressions for this sort of quantum conditional independence, namely, conditions (3) and (4) of Theorem 3, can be verified to hold.

2. Coherent copy versus incoherent copy

Consider the simple example of a classical channel, taking X to Y, Z, where X, Y, Z are bit-valued and the mapping between input strings and output strings is

$$0_X \to 0_Y 0_Z, \qquad 1_X \to 1_Y 1_Z. \tag{8}$$

The outputs of the channel are conditionally independent given the input; variation in X fully explains any correlation between Y and Z. Indeed, this example may be seen as the paradigmatic case of the explanation of classical correlations via a complete common cause.

One quantum analogue of this channel is the incoherent copy of a qubit: a qubit A is measured in the computational basis; if 0 is obtained, then prepare the state $|00\rangle_{BC}$, and if 1 is obtained, prepare $|11\rangle_{BC}$. The operator representing this channel is

$$\rho_{BC|A} = |000\rangle\langle 000|_{BCA^*} + |111\rangle\langle 111|_{BCA^*}.$$

It is easily verified that this operator satisfies each of the conditions of Theorem 2, so that B and C are quantum conditionally independent given A for this channel. The decomposition of the A Hilbert space implied by condition (4) is

$$\mathcal{H}_A = (\mathbb{C} \otimes \mathbb{C}) \oplus (\mathbb{C} \otimes \mathbb{C}),$$

where $\mathbb{C}$ is the one-dimensional complex Hilbert space, i.e., the complex numbers.

The other direct quantum analogue of the classical copy above is the channel that makes a coherent copy of a qubit, where the mapping from input states to output states is

$$\alpha|0\rangle_A + \beta|1\rangle_A \to \alpha|0\rangle_B|0\rangle_C + \beta|1\rangle_B|1\rangle_C. \tag{9}$$

This channel is represented by the operator

$$\rho_{BC|A} = \left(|000\rangle_{BCA^*} + |111\rangle_{BCA^*}\right)\left(\langle 000|_{BCA^*} + \langle 111|_{BCA^*}\right),$$

which corresponds to an unnormalized Greenberger-Horne-Zeilinger state. It can easily be verified that $I(B{:}C|A) = 1$ for a trace-one version of this state; hence, it is not the case that outputs B and C are quantum conditionally independent given the input A. There is, then, no way in which this channel can arise as a marginal channel in a situation in which A is the complete common cause of B and C.

At first blush, this conclusion may seem surprising. Given the mapping described by Eq. (9), where would correlations between outputs B and C come from, other than being completely explained by the input A?

The puzzle is resolved by considering the dilation of the coherent copy to a unitary transformation, and the interpretation of quantum pure states. Consider Figs. 8 and 9, which, respectively, show a classical copy operation via the classical CNOT gate and a quantum coherent copy operation via the quantum CNOT gate [32].

FIG. 8. Classical realization of a copy operation using an ancilla and classical CNOT gate.

FIG. 9. Quantum realization of the coherent copy using an ancilla and quantum CNOT gate.

In the classical case, there are two reasons why any correlation between Y and Z must be entirely explained by statistical variation in the value of X. First, the ancillary variable λ is prepared deterministically with value 0, so there is no possibility that statistical variation in the value of λ underwrites the correlations between Y and Z. Second, the mapping between input strings and output strings for the classical CNOT gate,

$$0_X 0_\lambda \to 0_Y 0_Z, \quad 0_X 1_\lambda \to 0_Y 1_Z, \quad 1_X 0_\lambda \to 1_Y 1_Z, \quad 1_X 1_\lambda \to 1_Y 0_Z \tag{10}$$

[which one easily verifies to reduce to the classical copy of Eq. (8) when one sets λ to 0], has the causal structure depicted in Fig. 8, so that λ does not act as a common cause of Y and Z but only a local cause of Z.

In the quantum case, neither reason applies. Concerning the second reason, the quantum CNOT has the causal structure depicted in Fig. 9: the quantum CNOT is such that not only does A have a causal influence on C, but λ has a causal influence on B as well. In other words, unlike the classical CNOT, there is a backaction of the target on the control. It follows that in the quantum case, λ can act as a common cause of B and C. Furthermore, the ancilla is prepared in a quantum pure state $|0\rangle$. This is disanalogous to a point distribution on the value 0 for the classical variable λ if one takes the view that a quantum pure state represents maximal but incomplete information about a quantum system [34–38]. In this case, one must allow for the possibility that some correlation between B and C is due to the ancilla, in which case A is not the complete common cause of B and C [39].

F. Generalization to one input, k outputs

Theorems 2 and 3, which apply to quantum channels with one input and two outputs, can be generalized to the case of one input and k outputs.

Consider a channel $\rho_{B_1 \ldots B_k|A}$, and let $\bar{B}_i$ denote the collection of all outputs apart from $B_i$. The notion of quantum compatibility from Definition 5 generalizes in the obvious way: $\rho_{B_1 \ldots B_k|A}$ is said to be compatible with A being a complete common cause of $B_1 \ldots B_k$, if it is possible to find ancillary quantum systems $\lambda_1, \ldots, \lambda_k$, states $\rho_{\lambda_1}, \ldots, \rho_{\lambda_k}$, and a unitary channel where, for each i, $\lambda_i$ has no causal influence on $\bar{B}_i$, such that these constitute a dilation of $\rho_{B_1 \ldots B_k|A}$.

The generalization of Theorems 2 and 3, consolidated into a single theorem, is as follows.

Theorem 4. The following are equivalent.
(1) $\rho_{B_1 \ldots B_k|A}$ is compatible with A being a complete common cause of $B_1 \ldots B_k$.
(2) $\rho_{B_1 \ldots B_k|A} = \rho_{B_1|A} \cdots \rho_{B_k|A}$, where for all i, j, $[\rho_{B_i|A}, \rho_{B_j|A}] = 0$.
(3) For each i, $I(B_i{:}\bar{B}_i|A) = 0$, where $I(B_i{:}\bar{B}_i|A)$ is the quantum conditional mutual information evaluated on the (positive, trace-one) operator $\hat{\rho}_{B_1 \ldots B_k|A}$.
(4) The Hilbert space for the A system can be decomposed as $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^1} \otimes \cdots \otimes \mathcal{H}_{A_i^k}$, such that $\rho_{B_1 \ldots B_k|A} = \sum_i \left(\rho_{B_1|A_i^1} \otimes \cdots \otimes \rho_{B_k|A_i^k}\right)$, where for each i, and each $l \in \{1, \ldots, k\}$, $\rho_{B_l|A_i^l}$ represents a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^l}) \to \mathcal{B}(\mathcal{H}_{B_l})$.

The proof is in Appendix B. By analogy to the classical case, if conditions (2)–(4) of Theorem 4 hold, we say that $B_1 \ldots B_k$ are quantum conditionally independent given A for the channel $\rho_{B_1 \ldots B_k|A}$.

IV. CLASSICAL CAUSAL MODELS

A. Definitions

Reichenbach’s principle is important because it generalizes to the modern formalism of causal models [8,9]. A causal model consists of two entities: (i) a causal structure, represented by a directed acyclic graph (DAG) where the nodes represent random variables and the directed edges represent the directed causal influences among these (several examples have already been presented in this article), and (ii) some parameters, which specify the strength of the causal dependences and the probability distributions for the variables associated to root nodes in the DAG (i.e., those with no incoming arrows). Some terminology is required to present the formal definitions.

Given a DAG with nodes $X_1, \ldots, X_n$, let Parents(i) denote the parents of node $X_i$, that is, the set of nodes that have an arrow into $X_i$, and let Children(i) denote the children of node $X_i$, that is, the set of nodes $X_j$ such that there is an arrow from $X_i$ to $X_j$. The descendants of $X_i$ are those nodes $X_j$, $j \neq i$, such that there is a directed path from $X_i$ to $X_j$. The ancestors of $X_i$ are those nodes $X_j$ such that $X_i$ is a descendant of $X_j$.

Definition 7. A causal model specifies a DAG, with nodes corresponding to random variables $X_1, \ldots, X_n$, and a family of conditional probability distributions $\{P[X_i|\mathrm{Parents}(i)]\}$, one for each i.

Definition 8. Given a DAG, with random variables $X_1, \ldots, X_n$ for nodes, and given an arbitrary joint distribution $P(X_1 \ldots X_n)$, the distribution is said to be Markov for the graph if and only if it can be written in the form of

$$P(X_1 \ldots X_n) = \prod_{i=1}^{n} P[X_i|\mathrm{Parents}(i)]. \tag{11}$$

[Recall that each conditional $P[X_i|\mathrm{Parents}(i)]$ can be computed from the joint $P(X_1 \ldots X_n)$.]

The generalization of Reichenbach’s principle that is afforded by the formalism of causal models is as follows: if there are statistical dependences among variables $X_1, \ldots, X_n$, expressed in the particular form of the joint distribution $P(X_1 \ldots X_n)$, then there should be a causal explanation of these dependences in terms of a DAG relative to which the distribution $P(X_1 \ldots X_n)$ is Markov.

Note that an alternative way of formalizing the Markov property is that $P(X_1 \ldots X_n)$ is Markov for the graph if and only if, for each i, $P[X_i|\mathrm{Parents}(i)] = P[X_i|\mathrm{Nondesc}(i)]$, where Nondesc(i) is the set of nondescendants of node $X_i$. The intuitive idea is that the parents of a node screen off that node from the other nondescendants: once the values of the parents are fixed, the values of other nondescendant nodes are irrelevant to the value of $X_i$.

Note also that Reichenbach’s principle is easily seen to be a special case of the requirement that for a joint distribution to be explainable by the causal structure of some DAG, it must be Markov for that DAG: if two variables, Y and Z, are ancestrally independent in the graph, then any distribution that is Markov for this graph must factorize on these, $P(YZ) = P(Y)P(Z)$, which is the qualitative part of Reichenbach’s principle in its contrapositive form; if two variables, Y and Z, have a variable X as a complete common cause, as in the DAG of Fig. 1, then any distribution that is Markov for the graph must satisfy $P(YZ|X) = P(Y|X)P(Z|X)$, which is the quantitative part of Reichenbach’s principle.

B. Justifying the Markov condition

Just as we previously asked whether there was some principle that forced a rational agent to assign probability distributions in accordance with the quantitative part of Reichenbach’s principle, we can similarly ask why a rational agent who takes causal relationships to be given by a particular DAG should arrange their beliefs so that the joint distribution is Markov for the DAG.

The justification of the Markov condition parallels the justification of the quantitative part of Reichenbach’s principle that we presented in Sec. II B. We begin by outlining what the qualitative part of Reichenbach’s principle and the assumption of fundamental determinism imply for any arbitrary causal structure.

Definition 9 (Classical compatibility with a DAG). $P(X_1 \ldots X_n)$ is said to be compatible with a DAG G with nodes $X_1, \ldots, X_n$ if one can find a DAG $G'$ that is obtained from G by adding extra root nodes $\lambda_1, \ldots, \lambda_n$, such that for each i, the node $\lambda_i$ has a single outgoing arrow, to $X_i$, and one can find, for each i, a distribution $P(\lambda_i)$ and a function $f_i$ from $[\lambda_i, \mathrm{Parents}(i)]$ to $X_i$, such that

$$P(X_1 \ldots X_n) = \sum_{\lambda_1 \ldots \lambda_n} \prod_{i=1}^{n} \delta\big(X_i, f_i[\lambda_i, \mathrm{Parents}(i)]\big)\, P(\lambda_i).$$

Theorem 5 (Ref. [8]). Given a joint distribution $P(X_1 \ldots X_n)$ and a DAG G with nodes $X_1, \ldots, X_n$, the following are equivalent.
(1) $P(X_1 \ldots X_n)$ is compatible with the causal structure described by the DAG G.
(2) $P(X_1 \ldots X_n)$ is Markov for G; that is,

$$P(X_1 \ldots X_n) = \prod_{i=1}^{n} P[X_i|\mathrm{Parents}(i)].$$

The (1) → (2) implication in Theorem 5 can be read as follows: if it is granted that causal relationships are indicative of underlying deterministic dynamics, and that the qualitative part of Reichenbach’s principle is valid, then, on pain of irrationality, an agent’s assignment $P(X_1 \ldots X_n)$ must be Markov for the original graph.

The (2) → (1) implication in Theorem 5, like that of Theorem 1, is pertinent for causal inference. It asserts that if one observes a distribution $P(X_1 \ldots X_n)$ that is Markov for a graph G, then the causal structure described by G provides a possible causal explanation of the observed distribution. Note that, given $P(X_1 \ldots X_n)$, there is not in general a unique graph G such that $P(X_1 \ldots X_n)$ is Markov for G; hence there are in general competing causal explanations. Those causal models that require fine-tuning of parameters are typically rejected.

V. QUANTUM CAUSAL MODELS

A. Proposed definition

In our treatment of the simple causal scenario where A is a complete common cause of B and C (the DAG of Fig. 3), we focused on what form is implied for the quantum channel $\rho_{BC|A}$. But there has not been any attempt to define a quantity analogous to the classical joint distribution, that is, a quantity analogous to $P(XYZ)$ in the case of the DAG of Fig. 1, nor indeed other classical Bayesian conditionals such as $P(X|YZ)$. For works that aim to achieve such analogues, see Refs. [21,37]. See also Ref. [40], however, where it is shown that if one associates a single Hilbert space to a system at a given time, then there are significant obstacles to establishing an analogue of a classical joint distribution when the set of quantum systems includes some that are causal descendants of others.

This work takes a different approach. The interpretation of a quantum causal model will be that each node represents a local region of time and space, with channels such as $\rho_{BC|A}$ describing the evolution of quantum systems in between these regions. At each node, there is the possibility that an agent is present with the ability to intervene inside that local region. Each node $A_i$ will then be associated with two Hilbert spaces, one corresponding to the incoming system (before the agent’s intervention) and the dual space, which corresponds to the outgoing system (after the agent’s intervention). A quantum causal model will consist of a specification, for every node, of the quantum channel from its parents to the node, with the operational significance of a network being that it is used to calculate joint probabilities for the agents to obtain the various possible joint outcomes for their interventions. This way of treating quantum systems over time has appeared in various different approaches in the literature, including the multitime formalism [41–44], the quantum combs formalism [45–47], the process matrices formalism [48–50], and a number of other works as well [14,51–53].

The discussion of classical causal models in Sec. IV, and the results of Sec. III for the special case where A is a complete common cause of B and C, suggest the following generalization.

Definition 10. A quantum causal model specifies a DAG, with nodes $A_1, \ldots, A_n$, supplemented with the following. For each node $A_i$, there is associated a finite-dimensional Hilbert space $\mathcal{H}_{A_i}$ (the “input” Hilbert space) and the dual space $\mathcal{H}^*_{A_i}$ (the “output” Hilbert space). For each node $A_i$, there is associated a quantum channel, described by an operator $\rho_{A_i|\mathrm{Parents}(i)} \in \mathcal{B}(\mathcal{H}_{A_i} \otimes \mathcal{H}^*_{\mathrm{Parents}(i)})$, where $\mathcal{H}^*_{\mathrm{Parents}(i)}$ is the tensor product of the output Hilbert spaces associated with the parents of $A_i$. These channels commute pairwise; i.e., for any i, j, $[\rho_{A_i|\mathrm{Parents}(i)}, \rho_{A_j|\mathrm{Parents}(j)}] = 0$ [which is a nontrivial constraint whenever $\mathrm{Parents}(i) \cap \mathrm{Parents}(j)$ is nonempty].

Recall from Sec. III that, given a quantum channel $\rho_{BC|A}$, it is compatible with A being the complete common cause of B and C if and only if $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$, and if this holds, then $[\rho_{B|A}, \rho_{C|A}] = 0$. The definition of a quantum causal model, in particular, the stipulation that the channels commute pairwise, generalizes this idea.

Definition 11. Given a quantum causal model, specifying a DAG with nodes $A_1, \ldots, A_n$, and commuting channels $\rho_{A_i|\mathrm{Parents}(i)}$, the state is an operator $\sigma_{A_1 \ldots A_n}$ on $\bigotimes_{i=1}^{n} \left(\mathcal{H}_{A_i} \otimes \mathcal{H}^*_{A_i}\right)$, given by

$$\sigma_{A_1 \ldots A_n} = \prod_{i=1}^{n} \rho_{A_i|\mathrm{Parents}(i)}. \tag{12}$$

The operator $\sigma_{A_1 \ldots A_n}$ is referred to as the state of the quantum causal model since, as we discuss in the next subsection, $\sigma_{A_1 \ldots A_n}$ is used to calculate the probabilities for outcomes when measurements are performed on the systems that the model describes [54].

B. Making predictions

In order to see how a quantum causal model is used to calculate probabilities for the outcomes of agents’ interventions, consider a quantum causal model with nodes $A_1, \ldots, A_n$ and state $\sigma_{A_1 \ldots A_n}$. Let the intervention at node $A_i$ have classical outcomes labeled by $k_i$. The intervention is defined by a quantum instrument (that is, by a set of completely positive trace-nonincreasing maps, one for each outcome, which sum to a trace-preserving map). In order to write the probabilities for the outcomes in a simple form, it is useful to define the instrument in such a way that the map associated to each outcome takes operators on $\mathcal{H}_{A_i}$ into operators on $\mathcal{H}^*_{A_i}$. Hence, suppose that the outcome $k_i$ corresponds to the map $\mathcal{E}^{k_i}_{A_i} : \mathcal{B}(\mathcal{H}_{A_i}) \to \mathcal{B}(\mathcal{H}^*_{A_i})$ and let

$$\tau^{k_i}_{A_i} = \sum_{lm} \mathcal{E}^{k_i}_{A_i}\big(|l\rangle_{A_i}\langle m|\big) \otimes |l\rangle_{A_i}\langle m|.$$

The outcome $k_i$ of the agent’s intervention can then be represented by the (positive, basis-independent) operator $\tau^{k_i}_{A_i}$ isomorphic to $\mathcal{E}^{k_i}_{A_i}$.

If an agent does not intervene at the node $A_i$, this corresponds to the linking operator itself:

$$\tau^{\mathrm{id}}_{A_i} = \sum_{lm} |l\rangle_{A_i^*}\langle m| \otimes |l\rangle_{A_i}\langle m|.$$

The joint probability for the agents to obtain outcomes $k_1, \ldots, k_n$ is given by

$$P(k_1 \ldots k_n) = \mathrm{Tr}\big[\sigma_{A_1 \ldots A_n}\big(\tau^{k_1}_{A_1} \otimes \cdots \otimes \tau^{k_n}_{A_n}\big)\big]. \tag{13}$$

We can also define operations on the state $\sigma_{A_1 \ldots A_n}$ corresponding to marginalization over the outcome $k_i$ of an intervention on node $A_i$, by summing over $k_i$ and tracing over $A_i$. In this case, the joint state on the rest of the nodes after such marginalization is

$$\sigma_{A_1 \ldots A_{i-1} A_{i+1} \ldots A_n} = \sum_{k_i} \mathrm{Tr}_{A_i}\big(\sigma_{A_1 \ldots A_n}\, \tau^{k_i}_{A_i}\big).$$

If the intervention at node $A_i$ is trivial, then

$$\sigma_{A_1 \ldots A_{i-1} A_{i+1} \ldots A_n} = \mathrm{Tr}_{A_i}\big(\sigma_{A_1 \ldots A_n}\, \tau^{\mathrm{id}}_{A_i}\big).$$

C. Classical interventional models

Given the proposed definition of a quantum causal model, and the interpretation in terms of agents intervening at nodes, there is a stronger analogy to be made with a classical formalism that similarly involves interventions than there is to the standard classical causal models introduced in Sec. IV.

In order to make this explicit, consider a classical interventional causal model constructed as follows. For a given DAG, split every node $X_i$ into a pair of disconnected nodes, denoted $X_i^O$ and $X_i^I$, such that in the DAG that results, $X_i^I$ has as parents the set of nodes $\mathrm{Parents}^O(i) := \{X_j^O : X_j \in \mathrm{Parents}(i)\}$, and $X_i^O$ has as children $\{X_j^I : X_j \in \mathrm{Children}(i)\}$. In other words, the “I” version of each node X has as parents the “O” version of each node that was a parent of X in the original graph, and the “O” version of each node X has as children the “I” version of each node that was a child of X in the original graph. In this case, one can represent the resulting DAG by a conditional probability distribution:

$$P(X_1^I \ldots X_n^I | X_1^O \ldots X_n^O) = \prod_{i=1}^{n} P[X_i^I|\mathrm{Parents}^O(i)]. \tag{14}$$

Our association of each node $A_i$ of the DAG with a pair of Hilbert spaces, $\mathcal{H}_{A_i}$ and $\mathcal{H}^*_{A_i}$, is simply a quantum version of the splitting of a classical variable $X_i$ into $X_i^O$ and $X_i^I$, and our joint state $\sigma_{A_1 \ldots A_n}$ is the quantum analogue of the conditional probability $P(X_1^I \ldots X_n^I | X_1^O \ldots X_n^O)$.

In a classical interventional causal model, one can imagine an intervention at node $X_i$ as a causal process acting between $X_i^I$ and $X_i^O$ and possibly outputting an additional classical variable $k_i$ which acts as a record of some aspect of the intervention. The most general such intervention is described by a conditional probability distribution $P(k_i, X_i^O | X_i^I)$ [55]. After specifying the nature of the intervention at each node, $\{P(k_i, X_i^O | X_i^I)\}_i$, one can compute the joint probability distribution over the record variables to be

$$P(k_1 \ldots k_n) = \sum_{X_1^I, X_1^O, \ldots, X_n^I, X_n^O} P(X_1^I \ldots X_n^I | X_1^O \ldots X_n^O) \prod_{i=1}^{n} P(k_i, X_i^O | X_i^I). \tag{15}$$

Clearly, our intervention operators $\tau^{k_i}_{A_i}$ are the quantum analogue of the intervention conditionals $P(k_i, X_i^O | X_i^I)$, and our Eq. (13) is the quantum analogue of Eq. (15).

D. Examples

1. Confounding common cause

Consider a quantum causal model with the DAG depicted in Fig. 10. The DAG is supplemented with the quantum channels $\rho_{C|AB}$, $\rho_{B|A}$, and $\rho_A$, where the latter is simply a quantum state on $\mathcal{H}_A$ (which can also be thought of as a channel from the trivial, or one-dimensional, system into A).

FIG. 10. A causal network with A a common cause for B and C and with B a parent of C.

The corresponding state is $\sigma_{ABC} = \rho_{C|BA}\,\rho_{B|A}\,\rho_A$, where $\sigma_{ABC}$ acts on the Hilbert space $\mathcal{H}_C \otimes \mathcal{H}^*_C \otimes \mathcal{H}_B \otimes \mathcal{H}^*_B \otimes \mathcal{H}_A \otimes \mathcal{H}^*_A$. By stipulation, the channels commute pairwise. This is immediate in the case of, say, $\rho_{B|A}$ and $\rho_A$, since these operators are nontrivial on distinct Hilbert spaces. But it is significant in the case of $\rho_{C|BA}$ and $\rho_{B|A}$, both of which act nontrivially on $\mathcal{H}^*_A$. From Theorem 2, this implies that $\mathcal{H}^*_A$ decomposes as $\mathcal{H}^*_A = \bigoplus_i \mathcal{H}^*_{A_i^L} \otimes \mathcal{H}^*_{A_i^R}$, with $\rho_{C|BA}$ acting trivially on (say) the right-hand factors and $\rho_{B|A}$ acting trivially on the left-hand factors.

The fact that the output Hilbert space of the A system decomposes in this manner is a significant constraint on the kinds of quantum evolution that can be compatible with the DAG of Fig. 10. In words, the evolution undergone by the system emerging from A is as follows: a (possibly degenerate) von Neumann measurement is performed and, controlled on the outcome, the A system is split into two pieces. One piece evolves to become the input at B. The output at B is then recombined with the other piece and evolves to become the input at C.

By way of contrast, it is also instructive to consider quantum causal models with the causal structure shown in Fig. 11. Such a quantum causal model may represent, for example, the non-Markovian evolution of a qubit over three time steps, with A, B, and C representing the qubit at each time step, and where the qubit interacts with an environment whose initial state is $\rho_\lambda$. The qubit is initially uncorrelated with the environment. Suppose that the state of the environment at the second and third time steps is not of interest; hence, corresponding nodes do not appear in the DAG. Given that over the course of this evolution information can flow from the qubit to the environment, and back again, it is necessary to include an arrow from A to C, as well as from λ to B and λ to C.

A quantum causal model with this DAG defines commuting channels $\rho_{C|BA\lambda}$, $\rho_{B|A\lambda}$, $\rho_A$, $\rho_\lambda$. From the fact that $\rho_{C|BA\lambda}$ and $\rho_{B|A\lambda}$ commute, we conclude that the Hilbert space $\mathcal{H}^*_A \otimes \mathcal{H}^*_\lambda$ decomposes as a direct sum over direct products. However, a decomposition of $\mathcal{H}^*_A \otimes \mathcal{H}^*_\lambda$ as a direct sum over direct products does not imply a decomposition of the Hilbert space $\mathcal{H}^*_A$ alone as a direct sum over direct products. Hence, the evolution of the qubit is not strongly constrained as it was in the previous example. Physically, this is important: if the qubit, for example, is interacting only weakly with the environment, then its evolution certainly could not be paraphrased in terms of a strong von Neumann measurement, as it was for evolutions compatible with Fig. 10.

One further remark concerning this example will help to illustrate a distinction between quantum and classical causal models. Suppose that $\rho_\lambda$ is the pure state $|0\rangle\langle 0|$ and that we marginalize over λ under the assumption that an agent at the λ node does not intervene. In classical causal models, if a root node has a point distribution, then marginalizing over that node yields a distribution over the remaining variables that is compatible with the DAG obtained by removing that node and its outgoing arrows. This does not hold in the quantum case: even for $\rho_\lambda$ a pure state, marginalizing over the λ node (assuming no intervention there) in general yields an operator $\sigma_{ABC}$ that is not compatible with the DAG obtained by removing λ (Fig. 10). As with the example of the coherent copy in Sec. III E 2, this makes intuitive sense if one takes the view that a quantum pure state represents maximal but incomplete information. Incomplete information about the λ system may underwrite correlations between B and C, so that such correlations cannot be attributed entirely to system A as Fig. 10 requires. Hence, even for the environment initially in a pure state, the non-Markovian evolution of a qubit need not obey the strong constraint implied by the causal structure of Fig. 10.

2. Simple case of Bayesian updating

This section discusses another sense in which the quantum notion of conditional independence of the outputs of a channel given the input mirrors qualitatively an important aspect of the classical case.

Consider a classical causal model with the DAG of Fig. 1 and distribution $P(XYZ)$ such that $P(YZ|X) = P(Y|X)P(Z|X)$.
A particular feature of this causal scenario is that if new information is obtained about the variable Y, for example, if an agent learns that the value of Y is y, then the process of Bayesian updating proceeds as follows. First, update the distribution over X by applying the rule

$$\tilde{P}(X) := P(X|Y=y) = \frac{P(Y=y|X)\,P(X)}{P(Y=y)}.$$

Then use the new probability distribution on X, $\tilde{P}(X)$, to get an updated distribution for Z:

$$P(Z|Y=y) = \sum_{X} P(Z|X)\,\tilde{P}(X), \tag{16}$$

where the sum ranges over the values that X may take. Roughly speaking, the process of Bayesian updating “follows the arrows” of the graph. For this it is crucial that the joint distribution $P(XYZ)$ satisfies $P(YZ|X) = P(Y|X)P(Z|X)$, otherwise the term $P(Z|X)$ in Eq. (16) would have to be replaced with $P(Z|X, Y=y)$.

FIG. 11. The causal structure of Fig. 10 with an extra node λ, which is a common cause for B and C. A causal model with this DAG may describe a qubit interacting with an environment: A, B, C represent the qubit system at three different times and λ the environment at the initial time.

Consider now a quantum causal model, with the DAG of Fig. 3 and with state $\sigma_{ABC} = \rho_{B|A}\,\rho_{C|A}\,\rho_A$. Suppose that an agent at B intervenes, obtaining outcome $k_B$, corresponding to the operator $\tau^{k_B}_B$. The agent wishes to calculate the probability that an intervention at C yields outcome $k_C$ corresponding to $\tau^{k_C}_C$, conditioned on having obtained the outcome $k_B$, and assuming that there is no intervention at A. This can be done as follows. First, update the state assigned to A given the knowledge of $k_B$ to

$$\tilde{\sigma}_A := \sigma_{A|k_B} = \frac{\mathrm{Tr}_B\big(\sigma_{AB}\,\tau^{k_B}_B\big)}{\mathrm{Tr}\big[\sigma_{AB}\big(\tau^{\mathrm{id}}_A \otimes \tau^{k_B}_B\big)\big]}.$$

Then apply the channel $\rho_{C|A}$ to $\tilde{\sigma}_A$ to get the state assigned to C given the knowledge of $k_B$:

$$\sigma_{C|k_B} = \mathrm{Tr}_A\big(\rho_{C|A}\,\tilde{\sigma}_A\,\tau^{\mathrm{id}}_A\big).$$

Finally, calculate the probability of $k_C$:

$$P(k_C|k_B) = \mathrm{Tr}\big(\sigma_{C|k_B}\,\tau^{k_C}_C\big).$$

Again, the process of Bayesian updating follows the arrows of the graph. Note that for this to work, it is crucial that the channel $\rho_{BC|A}$ satisfies $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$.

VI. RELATION TO PRIOR WORK

We now present a short review of prior works on quantum causal models and describe how our own proposal relates to these.

Generalizations of Reichenbach’s principle of common cause were discussed in Refs. [56–59], although the approach is quite different from ours. In these works, the focus of attention is a Boolean algebra of events (in the classical case), or a nondistributive algebra of projectors (in the quantum case), with probabilities induced in each case by a state on the algebra. Given a pair of events, or projectors, a common cause is a third event, or projector, such that the probabilities satisfy certain constraints. For a critical analysis of Refs. [58,59], see Ref. [15].

Preliminary work more directly pertaining to quantum causal models took the form of explorations of Bell-type inequalities (and whether they admit quantum violations) for novel causal scenarios [60,61]. Several researchers recognized that the formalism of classical causal models could provide a unifying framework in which to pose the problem of deriving Bell-type constraints, and that this framework might be extended to address the problem of deriving constraints on the correlations that can be obtained with quantum resources [2,10,11,62]. Note that such constraints are expressed entirely in terms of classical settings and classical outcomes of measurements. Henson, Lal, and Pusey [63] and Fritz [64] independently proposed definitions of quantum causal models with the purpose of expressing such constraints. In these approaches, each node of the DAG represents a process (which may have a classical outcome), while each directed edge is associated with a system that is passed between processes. However, despite the fact that their frameworks incorporate the possibility of post-classical resources, they do not have sufficient structure to define conditional independences between quantum systems.

Operational reformulations of quantum theory such as Refs. [65–70] helped to set the stage for the development of quantum causal models. Although they were conceived independently of the framework of classical causal models, they were quite similar to that framework insofar as they made heavy use of DAGs, in the form of circuit diagrams, to depict structural features of a set of processes. When the authors of these formulations turned their attention to relativistic causal structure, the frameworks they devised drew even closer in spirit to that of causal models. Prominent examples include the causaloid framework of Ref. [71], the multitime formalism [41–44], quantum combs [45–47], the causal categories of Ref. [28], and the process matrix formalism [48,49]. A common aim of these approaches is to be able to compute the consequences of an intervention upon a particular quantum system within the circuit, and this is precisely one of the tasks that a quantum analogue of a causal model should be able to handle.

Many of these frameworks represent a quantum system at a given region of space-time by two copies of its Hilbert space, one corresponding to the system that is input into the region and one corresponding to the system that is output from it. In this way, the region becomes a “locus of intervention” for the system. By inserting a particular quantum process into the “slot,” one determines the nature of the intervention. This is the approach taken, for instance, in the multitime formalism of Ref. [42], the quantum combs of Ref. [45], and the process matrices of Ref. [48]. This representation of interventions has a counterpart in classical causal models, for instance, in the work of Ref. [72], as was noted in Refs. [14,50].

Costa and Shrapnel [50] in particular have sought to explicitly cast this sort of framework as a quantum generalization of a causal model. In their approach, the nodes of the DAG are associated with a quantum system localized in a region (understood as a potential locus of intervention) and the collection of edges from one set of nodes to another represent causal processes.

An approach of this sort is required if one seeks to find intrinsically quantum versions of important theorems of classical causal models. For instance, while Henson, Lal, and Pusey [63] derive a generalization of the d-separation theorem of classical causal models, it only infers conditional independence relations from d-separation relations for the classical variables in the graph. An intrinsically quantum version of the d-separation theorem, by contrast, would be one which concerns the causal relations among quantum systems (see, for instance, Ref. [73]). If a set of nodes representing quantum systems can be described by a joint or conditional state, then one can seek to determine whether factorization conditions on this state are implied by d-separation relations among the quantum systems on the

Finally, we mention a third purpose to which quantum causal models can be put. Theorems about classical causal models often concern the sorts of inferences one can make about one variable given information about another. As an example, if Z is a common effect of X and Y, then learning Z can induce correlations between X and Y. As such, one might expect quantum causal models to also constrain the sort of inferences one can make among quantum variables. Early work by Leifer and Spekkens [21] had this purpose in mind.
The authors noted various scenarios in which their graph. Similarly, while the approaches both of Henson, Lal, proposal could not be applied, and subsequent work [40] and Pusey [63] and of Fritz [64] allow one to derive, from has narrowed down the scope of possibilities for any such the structure of the DAG, constraints on the joint distri- proposal. Our own proposal provides the means of making bution over classical variables embedded therein, they do many of the Bayesian inferences considered in Ref. [21]. not address an intrinsically quantum version of this The case we discuss in Sec. VD 2 is one such example. problem. If a set of nodes representing quantum systems There is also prior work on quantum causal models that can be described by a joint or conditional state, then one takes a significantly different approach to the ones can seek to derive constraints on this state directly from the described above and for which the relation to our work structure of the DAG. is less clear. The work of Tucci [74,75], which is in fact the Our own approach aims at an intrinsically quantum earliest attempt at a quantum generalization of a causal generalization of the notion of a causal model. We therefore model, represents causal dependences by complex transi- associate to each node of the DAG a quantum system tion amplitudes rather than quantum channels. localized to a space-time region, and we represent it by a pair of Hilbert spaces, corresponding to the input and VII. CONCLUSIONS output of an intervention upon the system. Consequently, The field of classical statistics has benefited greatly from our approach is very similar to that of Costa and Shrapnel analysis provided by the formalism of causal models [8,9]. [50]. Nonetheless, there are significant differences in how In particular, this formalism allows one to infer facts about we represent common causes. 
the underlying causal structure purely from uncontrolled First, in the work of Costa and Shrapnel, any node with statistical data, a tool with significant applications in all multiple outgoing edges is represented as a locus of branches of the physical and social sciences. Given that intervention where the output Hilbert space is a tensor some seemingly paradoxical features of classical correla- product of Hilbert spaces, one for each outgoing edge. As tions have found satisfying resolutions when viewed such, any node acting as a common cause must be through a causal lens, one might wonder to what extent associated with a composite quantum system. It cannot, the same is true of quantum correlations. for instance, be associated with a single qubit. By contrast, Starting with the idea that whatever innovation quantum our approach does not constrain the representation of theory might hold for causal models, the intuition contained common causes in this fashion. Any quantum system, in Reichenbach’s principle ought to be preserved, we including a single qubit, may constitute a complete motivate the problem of finding a quantum version of common cause of a collection of other quantum systems. the principle. This requires us to determine what constraint This extra generality is required since, as our examples a channel from A to BC must satisfy if A is the complete show, the complete common cause of a set of systems can common cause of B and C. We solve the problem by be a single qubit. Second, and more importantly, our work considering a unitary dilation of the channel and by noting shows that for a quantum channel whose input is the that there is no ambiguity in how to represent the absence of complete common cause of its n outputs, it is not the case causal influences between certain inputs and certain outputs that the channel must split the input into n components, of a unitary. 
From this, we derive a notion of quantum each of which exerts a causal influence on a different conditional independence for the outputs of the channel output. This is merely one special case of the most general given its input. This inference from a common-cause form that such a channel can take. Third, if the complete structure to quantum conditional independence is then common causes consist of multiple nodes in the DAG, then generalized to obtain our quantum version of causal it is only the joint Hilbert space of the collection of these models. that must satisfy the condition of factorizing in subspaces, Given a state on a quantum causal model, we describe while each individual Hilbert space need not. how to construct a marginal state for a subset of nodes. We These differences are likely to have a significant impact discuss a number of simple examples of quantum channels on the form of any intrinsically quantum d-separation and causal structures. A theme of the examples is that when theorem. there is a difference between the quantum and classical 031021-15 ALLEN, BARRETT, HORSMAN, LEE, and SPEKKENS PHYS. REV. X 7, 031021 (2017) combination of different causal structures [46,48,83,84]. cases, this can often be understood if one takes the view that a quantum pure state represents maximal but incomplete It has been argued that the possibility of such indefinite information about a system, and hence may underwrite causal structure may be significant for the project of unifying quantum theory with general relativity [71]. correlations between other systems in a way that a classical pure state cannot. There are many directions for future work. In the case of ACKNOWLEDGMENTS classical causal networks, an important theorem states that R. W. S. thanks Elie Wolfe for helpful discussions. 
This the d-separation relation among nodes of a DAG is sound work was supported by EPSRC grants, the EPSRC and complete for a conditional independence relation to hold National Quantum Technology Hub in Networked among the associated variables in the joint probability Quantum Information Technologies, an FQXi Large distribution [8]. Here, for arbitrary subsets of nodes S, T, Grant, University College Oxford, the Wiener-Anspach and U, subsets S and T are said to be d-separated by U if a Foundation, and by the Perimeter Institute for Theoretical certain criterion holds, where this is determined purely by the Physics. Research at Perimeter Institute is supported by the structure of the DAG. An important question is, therefore, Government of Canada through the Department of whether d-separation is sound and complete for some natural Innovation, Science and Economic Development Canada, property of the state σ on a quantum causal network. and by the Province of Ontario through the Ministry of It is also desirable to relate properties of a quantum Research, Innovation and Science. This project/publication causal network to operational statements involving the were made possible through the support of a grant from the outcomes of agents’ interventions: under what circum- John Templeton Foundation. The opinions expressed in this stances, for example, does it follow that there is an publication are those of the author(s) and do not necessarily intervention by the agents at nodes in a subset U, such reflect the views of the John Templeton Foundation. that, conditioned on its outcome, the outcomes of any interventions by agents at S and T must be independent? Such a result would have an application to quantum APPENDIX A: PROOF OF THEOREMS 2 AND 3 protocols. Imagine, for example, a cryptographic scenario in which agents at S and T desire shared correlations that We here provide the proof of Theorems 2 and 3. This are not screened off by the information held by agents at U. 
amounts to proving that for a channel ρ , the following BCjA In the classical case, there has been a great deal of work four conditions are equivalent. on the problem of causal inference [8,9,76–78]: given only (1) ρ admits of a unitary dilation where A is a BCjA certain facts about the joint probabilities, for instance, a set complete common cause of B and C. of conditional independences, what can be inferred about (2) ρ ¼ ρ ρ . BCjA BjA CjA the underlying causal structure? For an initial approach to (3) IðB∶CjAÞ¼ 0, where IðB∶CjAÞ is the quantum quantum causal inference, with a quantum-over-classical conditional mutual information evaluated on the advantage in a simple scenario, see Ref. [14]. The formal- (positive, trace-one) operator ρ ˆ . BCjA ism of quantum causal networks described here is the (4) The Hilbert space for the A system can be appropriate framework for inferring facts about underlying, decomposed as H ¼⨁ H L ⊗ H R and ρ ¼ A BCjA i A A i i intrinsically quantum, causal structure, given observed ðρ L ⊗ρ RÞ, where for each i, ρ L represents i BjA CjA BjA i i i facts about the outcomes of interventions by agents. a completely positive map BðH LÞ → BðH Þ, and A B Recently, there has been much interest in deriving ρ R a completely positive map BðH RÞ → BðH Þ. bounds on the correlations achievable in classical causal CjA A C i i models [76,77,79,80] using insights from the literature on We show various implications that collectively give Bell’s theorem. Such bounds constitute Bell-like inequal- Theorem 2. ities for arbitrary causal structures. The main technical Proof that ð3Þ ↔ ð4Þ.—This follows easily from the results of Ref. [27], where a characterization is given of challenge in deriving such inequalities is that the set of correlations is generally not convex if the DAG has more tripartite quantum states over systems A, B, C that than one latent variable, so that standard techniques for satisfy IðB∶CjAÞ¼ 0. 
deriving Bell inequalities are not applicable. By adapting Lemma 1 [Ref. [27], Theorem 6).—For any tripartite these new techniques to the formalism we present here, one quantum state ρ , the quantum conditional mutual ABC information IðB∶CjAÞ¼ 0 if and only if the Hilbert space could perhaps systematically derive bounds on the quantum of the A system decomposes as H ¼⨁ H L ⊗ H R, correlations achievable in certain quantum causal models, A A A i i thereby providing a general method of deriving Tsirelson- such that like bounds [81,82] for arbitrary causal structures. Finally, it would be interesting to extend the formalism X X to explore the possibility that certain quantum scenarios ρ ¼ p ðρ L ⊗ ρ RÞ;p ≥ 0; p ¼ 1; ðA1Þ ABC i i i BA CA i i i i are best understood as involving a quantum coherent 031021-16 QUANTUM COMMON CAUSES AND QUANTUM CAUSAL MODELS PHYS. REV. X 7, 031021 (2017) L L where for each i, ρ is a quantum state on H ⊗ H and To see this, write BA B A i i ρ R is a quantum state on H ⊗ H R. CA C A i i U U IðB∶λ jλ AÞ¼ Sðρ ˆ Þþ Sðρ ˆ Þ C B Bjλ A ·jλ Aλ Theorem 2 concerns the channel operator ρ , which B B C BCjA U U satisfies Tr ðρ Þ¼ I . Applying Lemma 1 to the BC BCjA A − Sðρ ˆ Þ − Sðρ ˆ Þ: ðA6Þ Bjλ Aλ ·jλ A B C B operator ρ ˆ ¼ð1=d Þρ yields the decomposition BCjA A BCjA The second and fourth terms are entropies of maximally ρ ˆ ¼ p ðρ ˆ L ⊗ ρ ˆ RÞ: mixed states on their respective systems, and hence, sum to BCjA i BjA CjA i i log d . For the first and third terms, it follows from the assumption that there is no causal influence from λ to B in Using Tr ðρ ˆ Þ¼ð1=d ÞI , it follows that for each i, U U BC BCjA A A U that ρ ˆ ¼ ρ ˆ ⊗ ð1=d ÞI . Hence, the third λ ðλ Þ Bjλ Aλ Bjλ A C C B C B the components satisfy Tr ðρ ˆ LÞ¼ð1=d LÞI L , and B BjA A ðA Þ i i i term is equal to Sðρ ˆ Þþ logðd Þ, which gives Eq. (A5). Bjλ A C Tr ðρ ˆ RÞ¼ð1=d RÞI R , with p ¼ðd Ld RÞ=d . The C i A CjA A ðA Þ A A i i i i i Fourth, result follows. 
Proof that ð1Þ → ð4Þ.—Let ρ be the Choi- BFCjAλ λ IðC∶λ jAλ Þ¼ 0: ðA7Þ B C B C Jamiołkowski operator for the unitary U, defined according to the conventions set out in the main text. Let missing This follows from a similar argument as Eq. (A5), using the indices indicate that a partial trace is taken, as also in assumption that there is no influence from λ to C in U. the main text. Note that, in general, ρ ≠ ρ , since BCjA BCjA The aim is now to use Eqs. (A2), (A4), (A5), and (A7) to the latter is obtained via a particular choice of input states show that ρ ˆ satisfies IðB∶CjAÞ¼ 0. This follows using BCjA for λ and λ . The proof proceeds by proving relations a result from Ref. [12], which states that quantum condi- B C between quantum conditional mutual information tional mutual information on partial traces of a multipartite evaluated on the renormalized operator ρ ˆ ¼ quantum state satisfies the semigraphoid axioms familiar BFCjAλ λ B C from the classical formalism of causal networks [8]. The ð1=d d d Þρ and its partial traces. λ A λ B C BFCjAλ λ B C semigraphoid axioms are as follows: First, ½IðX∶YjZÞ¼ 0 ⇒ ½IðY∶XjZÞ¼ 0; ðA8Þ IðB∶FCjλ Aλ Þ¼ 0: ðA2Þ B C ½IðX∶YWjZÞ¼ 0 ⇒ ½IðX∶YjZÞ¼ 0; ðA9Þ This follows by expanding in terms of von Neumann entropies: ½IðX∶YWjZÞ¼ 0 ⇒ ½IðX∶YjZWÞ¼ 0; ðA10Þ U U IðB∶FCjλ Aλ Þ¼ Sðρ ˆ Þþ Sðρ ˆ Þ B C Bjλ Aλ FCjλ Aλ B C B C ½IðX∶YjZÞ¼ 0∧ ½IðX∶WjYZÞ¼ 0⇒ ½IðX∶YWjZÞ¼ 0: U U − Sðρ ˆ Þ − Sðρ ˆ Þ: ðA3Þ BFCjλ Aλ ·jλ Aλ B C B C ðA11Þ Applying Eqs. (A8)–(A11) to Eqs. (A2), (A4), (A5), and The third term is zero, since the unitarity of U implies that (A7) gives ρ ˆ is a pure state. The final term is logðd d d Þ, λ A λ BFCjλ Aλ B C B C since ρ ˆ ¼ð1=d d d ÞI . 
Noting also that λ A λ ðλ Aλ Þ ·jλ Aλ B C B C B C ½IðB∶FCjλ Aλ Þ¼ 0 ⇒ ½IðB∶Cjλ Aλ Þ¼ 0; ðA12Þ B C B C Tr ðρ ˆ Þ¼ð1=d d d ÞI , and using λ Aλ λ A λ ðλ Aλ Þ B C BFCjλ Aλ B C B C B C the fact that the von Neumann entropy of the partial trace ½IðC∶λ jAλ Þ¼ 0 ∧ ½IðB∶Cjλ Aλ Þ¼ 0 B C B C of a pure state is equal to the von Neumann entropy of the ⇒ ½IðC∶Bλ jAλ Þ¼ 0; ðA13Þ B C complementary partial trace, yields that the first two terms equal logðd d Þ and logðd Þ, respectively, hence, their F C B ½Iðλ ∶λ jAÞ¼ 0 ∧ ½Iðλ ∶Bjλ AÞ¼ 0 B C C B sum is equal to logðd d d Þ, and Eq. (A2) follows. λ A λ B C Second, ⇒ ½Iðλ ∶Bλ jAÞ¼ 0; ðA14Þ C B Iðλ ∶λ jAÞ¼ 0: ðA4Þ ½IðBλ ∶λ jAÞ¼ 0 ∧ ½IðBλ ∶CjAλ Þ¼ 0 B C B C B C ⇒ ½IðBλ ∶Cλ jAÞ¼ 0: ðA15Þ B C This follows immediately from ρ ˆ ¼ð1=d d d Þ× λ A λ ·jλ Aλ B C B C I . ðλ Aλ Þ B C Hence, condition (1) of the theorem implies that Third, IðBλ ∶Cλ jAÞ¼ 0, where this quantity is calculated on B C the trace-one Choi-Jamiołkowski operator representing the IðB∶λ jλ AÞ¼ 0: ðA5Þ C B dilation unitary U. Using Lemma 1 gives 031021-17 ALLEN, BARRETT, HORSMAN, LEE, and SPEKKENS PHYS. REV. X 7, 031021 (2017) U U U W, it follows immediately that there is no influence from λ ρ ˆ ¼ p ðρ ˆ ⊗ ρ ˆ Þ; ðA16Þ L R BCjλ Aλ Bjλ A CjA λ B C B C i i to C in U. Proof that ð2Þ → ð3Þ.—As remarked in the main text, for some appropriate decomposition of ðH Þ and proba- taking the Hermitian conjugate of ρ ¼ ρ ρ BCjA BjA CjA bility distribution fp g . The form of the decomposition, immediately gives ½ρ ; ρ ¼ 0. Hence, i i BjA CjA and the fact that Tr ðρ ˆ Þ¼ð1=d d d ÞI , BC λ A λ ðλ Aλ Þ BCjλ Aλ B C B C B C ρ ¼ ρ ρ ; ðA20Þ gives BCjA BjA CjA U U U ρ ¼ exp½log ρ þ log ρ ; ðA21Þ ρ ¼ ðρ ⊗ ρ Þ; ðA17Þ BCjA BjA CjA L R BCjλ Aλ Bjλ A CjA λ B C B C i i log ρ ¼ log ρ þ log ρ ; ðA22Þ BCjA BjA CjA where for each i, the components satisfy Tr ðρ Þ¼ Bjλ A log ρ þ log ρ ¼ log ρ þ log ρ ; ðA23Þ BCjA ·jA BjA CjA I L  and Tr ðρ Þ¼ I R . 
The operator ρ is ðA Þ C R ðA Þ BCjA Cjλ A i i obtained by acting with this channel on the input states −1 −1 logðd ρ Þþ logðd ρ Þ BCjA ·jA A A j0i for λ and j0i for λ . This gives B C λ λ B C −1 −1 ¼ logðd ρ Þþ logðd ρ Þ: ðA24Þ BjA CjA A A ρ ¼ ðρ L ⊗ ρ RÞ; BCjA BjA CjA i i The second line follows as ½ρ ; ρ ¼ 0; the fourth BjA CjA because ρ ¼ I , and therefore has the zero matrix as its ·jA A −1 where Tr ðρ LÞ¼ I L  and Tr ðρ RÞ¼ I R ,as B BjA ðA Þ C CjA ðA Þ logarithm; and the final line by adding 2 log d to both i i i i required. sides. It is proved in Ref. [85] that for any trace-one density Proof that ð4Þ → ð1Þ.—Let H ¼⨁ H , with operator ρ , log ρ þ log ρ ¼ log ρ þ log ρ is A A XYZ XYZ Z XZ YZ i i H ¼ H L ⊗ H R, and ρ ¼ ðρ L ⊗ ρ RÞ. equivalent to the condition IðX∶YjZÞ¼ 0. A A A BCjA BjA CjA i i i i i i Proof that ð4Þ → ð2Þ.—Condition (4) is that H ¼ Each term ρ L corresponds to a valid quantum channel, A BjA ⨁ H L ⊗ H R, with ρ ¼ ðρ L ⊗ ρ RÞ. It fol- BCjA i A A i BjA CjA i.e., a CPTP map BðH Þ → BðH Þ. Similarly, each term i i i i A B lows that ρ R corresponds to a CPTP map BðH RÞ → BðH Þ. CjA A C i i The channel ρ L can be dilated to a unitary trans- BjA L R ρ ¼ ðρ ⊗ I Þ; ðA25Þ BjA BjA ðA Þ i i formation V , with ancilla input λ in a fixed state j0i , i B λ such that V acts on the Hilbert space H ⊗ H L. i λ A i X Similarly, ρ R can be dilated to a unitary transformation ρ ¼ ðI L  ⊗ ρ RÞ: ðA26Þ CjA CjA ðA Þ CjA i i W , with ancilla λ in a fixed state j0i , acting on i C λ H R ⊗ H . By choosing the dimension of λ large λ B i The product is enough, we can identify the system λ and the state j0i that are used for each value of i, and similarly λ . λ C ρ ρ ¼ ðρ L ⊗ I R ÞðI L  ⊗ ρ RÞ: ðA27Þ BjA CjA BjA ðA Þ ðA Þ CjA i i j j Let V be the operator that acts as V ⊗ I R on the i;j i A subspace H ⊗ H , and as zero on the subspace λ A B i The only nonzero terms correspond to i ¼ j; hence, H ⊗ H , for j ≠ i. 
Similarly, let W be the operator λ A i B j L X that acts as I ⊗ W on the subspace H ⊗ H , and as A i A λ i C L R ρ ρ ¼ ρ ⊗ ρ ¼ ρ : ðA28Þ BjA CjA BjA CjA BCjA i i zero on the subspace H ⊗ H for j ≠ i. Let A λ j C V ¼ V ; ðA18Þ i APPENDIX B: PROOF OF THEOREM 4 Proof that ð3Þ → ð2Þ.—The proof proceeds via an W ¼ W ; ðA19Þ inductive argument. Consider where W and V are unitary and ½V ⊗ I ;I ⊗ W¼ 0. ρ ¼ Tr ðρ Þ; λ λ B …B jA B …B B …B jA C B 1 n nþ1 k 1 k The channel represented by ρ can be dilated to the BCjA with 2 ≤ n<k, and assume that the claim holds for this unitary transformation U ¼ðI ⊗ WÞðV ⊗ I Þ, with λ λ B C channel; hence, ancillas λ and λ . From the form of V and W, it follows B C immediately that there is no causal influence from λ to B ρ ¼ ρ  ρ ; ðB1Þ B …B jA B jA B jA 1 n 1 n in U. From ½V ⊗ I ;I ⊗ W¼ 0 and the form of V and λ λ C B 031021-18 QUANTUM COMMON CAUSES AND QUANTUM CAUSAL MODELS PHYS. REV. X 7, 031021 (2017) X X with ½ρ ; ρ ¼ 0, for i; j ¼ 1; …;n. It is shown Sðρ ˆ Þ¼HðpÞþ p logd L þ p Sðρ ˆ RÞ; B jA B jA i j B B …B jA i A i B …B jA 2 3 k 2 k i i i i that the claim remains true if one fewer system is traced X X out. To see this, recall that condition (3) gives Sðρ ˆ Þ¼HðpÞþ p logd L þ p Sðρ ˆ RÞ; B jA i A i B jA i i ¯ ¯ IðB ∶B jAÞ¼ 0, where IðB ∶B jAÞ is evaluated nþ1 nþ1 nþ1 nþ1 i i X X on ρ ˆ . Using Theorem 2 gives B …B jA 1 k Sðρ ˆ Þ¼HðpÞþ p logd L þ p Sðρ ˆ RÞ; B …B jA i i A B …B jA 3 k 3 k i i i i ρ ¼ ρ ¯ ρ ; X X B …B jA B jA B jA 1 k nþ1 nþ1 L R Sðρ ˆ Þ¼HðpÞþ p logd þ p Sðρ ˆ Þ: ·jA i A i ·jA i i with ½ρ ¯ ; ρ ¼ 0. Tracing out systems B …B i i B jA B jA nþ2 k kþ1 nþ1 results in Substituting into ρ ¼ ρ ρ ; B …B jA B …B jA B jA 1 nþ1 1 n nþ1 IðB ∶B ; …;B jAÞ¼ Sðρ ˆ Þþ Sðρ ˆ Þ 2 3 k B jA B …B jA 2 3 k with ½ρ ; ρ ¼ 0. Since ρ satisfies B …B jA B jA B …B jA 1 n nþ1 1 n − Sðρ ˆ Þ − Sðρ ˆ Þ; B B …B jA ·jA 2 3 k Eq. 
(B1), it follows that the HðpÞ terms and the p log d L terms cancel, and one i A ρ ¼ ρ  ρ : B …B jA B jA B jA 1 nþ1 1 nþ1 is left with For any i ¼ 1; …;n, trace out all systems but B , B and i nþ1 R IðB ∶B ; …;B jAÞ¼ p IðB ∶B ; …;B jA Þ¼ 0: 2 3 k i 2 3 k A to see that ½ρ ; ρ ¼ 0. B jA B jA i nþ1 Hence, if ρ satisfies the claim, so too does B …B jA 1 n ρ .As ρ ¼ ρ ρ ,with ½ρ ;ρ ¼ 0, B …B jA B B jA B jA B jA B jA B jA Non-negativity of both the conditional mutual information 1 nþ1 1 2 1 2 1 2 follows from IðB ∶B jAÞ¼ 0, and tracing out all but and the p implies 1 1 i systems B , B , and A, the proof is complete. 1 2 Proof that ð2Þ → ð3Þ.—This is immediate from Theorem IðB ∶B …B jA Þ¼ 0: 2 3 k 2, by grouping outputs into B and B for each i. i i Proof that ð3Þ ↔ ð4Þ.—The proof that ð4Þ → ð3Þ is Hence, each H R in the above decomposition further immediate from Theorem 2, by grouping outputs into B decomposes into a direct sum of tensor products. Iterating and B for each i. i this procedure results in the required decomposition. It remains to show that if IðB ∶B jAÞ¼ 0 for all i, then i i Proof that ð1Þ ↔ ð4Þ.—The proof that ð4Þ → ð1Þ is a there exists a decomposition straightforward extension of the proof in Appendix A that condition ð4Þ → ð1Þ in Theorem 2. To show that ð1Þ → ð4Þ, first use Definition 4 to show that H ¼⨁ ⊗ H j ; ðB2Þ i ¯ j¼1 i if, for each i, there is no causal influence from λ to B ,it i i follows that, for each i, there is no causal influence from λ with ρ ¼ ðρ 1 ⊗  ⊗ ρ kÞ. B …B jA ¯ i B jA B jA 1 k 1 k to B . Partitioning the output systems into B and B , and i i i i i Given IðB ∶B jAÞ¼ 0, Theorem 2 implies that H ¯ 1 1 A the ancilla systems λ ; …; λ into λ and λ , it follows from 1 k i i decomposes as Theorem 2 that IðB ∶B jAÞ¼ 0. Hence, condition i i ð1Þ → ð3Þ, and since condition ð3Þ → ð4Þ, the result follows. H ¼⨁H L ⊗ H R; A A A i i with ρ ¼ ρ L ⊗ ρ R. 
By assumption, B …B jA i B jA B …B jA 1 k 1 2 k i i IðB ∶B jAÞ¼ IðB ∶B ;B ; …;B jAÞ¼ 0.Asthe 2 2 2 1 3 k [1] H. Reichenbach, The Direction of Time, edited by M. conditional mutual information never increases if systems Reichenbach (University of California Press, Berkeley, are discarded, we have 0 ¼ IðB ∶B ;B ;…;B jAÞ≥ CA, 1991). 2 1 3 k IðB ∶B ;…;B jAÞ. Non-negativity of the conditional mutual [2] C. J. Wood and R. W. Spekkens, The Lesson of 2 3 k Causal Discovery Algorithms for Quantum Correlations: information then yields IðB ∶B ; …;B jAÞ¼ 0. 2 3 k Causal Explanations of Bell-Inequality Violations Require The above decomposition ensures Fine-Tuning, New J. Phys. 17, 033002 (2015). [3] J. S. Bell, Speakable and Unspeakable in Quantum I L ρ ˆ ¼ p ⊗ ρ ˆ R; Mechanics (Cambridge University Press, Cambridge, B …B jA i B …B jA 2 k 2 k d L i England, 1964). [4] John F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, with p ¼ d Ld R=d . As the terms in the sum on the rhs Proposed Experiment to Test Local Hidden-Variable i A A A i i have support on orthogonal subspaces, Theories, Phys. Rev. Lett. 23, 880 (1969). 031021-19 ALLEN, BARRETT, HORSMAN, LEE, and SPEKKENS PHYS. REV. X 7, 031021 (2017) [5] B. Hensen et al., Loophole-Free Bell Inequality Violation [20] M.-D. Choi, Completely Positive Linear Maps on Complex Using Electron Spins Separated by 1.3 Kilometres, Nature Matrices, Linear Algebra Appl. 10, 285 (1975). (London) 526, 682 (2015). [21] M. S. Leifer and R. W. Spekkens, Towards a Formulation of [6] L. K. Shalm, E. Meyer-Scott, B. G. Christensen, P. Bierhorst, Quantum Theory as a Causally Neutral Theory of Bayesian M. A. Wayne, M. J. Stevens, T. Gerrits, S. Glancy, D. R. Inference, Phys. Rev. A 88, 052130 (2013). Hamel, M. S. Allman et al., Strong Loophole-Free Test of [22] B. Schumacher and M. D. Westmoreland, Locality and Local Realism, Phys. Rev. Lett. 115, 250402 (2015). Information Transfer in Quantum Operations, Quantum [7] M. Giustina, M. A. M. 
Versteegh, S. Wengerowsky, Inf. Process. 4, 13 (2005). J. Handsteiner, A. Hochrainer, K. Phelan, F. Steinlechner, [23] D. Beckman, D. Gottesman, M. A. Nielsen, and J. Preskill, J. Kofler, J.-Å. Larsson, C. Abellán et al., Significant- Causal and Localizable Quantum Operations, Phys. Rev. A Loophole-Free Test of Bell’s Theorem with Entangled 64, 052309 (2001). Photons, Phys. Rev. Lett. 115, 250401 (2015). [24] T. Eggeling, D. Schlingemann, and R. F. Werner, [8] J. Pearl, Causality: Models, Reasoning, and Inference, 2nd Semicausal Operations are Semilocalizable, Europhys. ed. (Cambridge University Press, Cambridge, England, Lett. 57, 782 (2002). 2009). [25] G. M. D’Ariano, S. Facchini, and P. Perinotti, No-Signaling, Entanglement-Breaking, and Localizability in Bipartite [9] P. Spirtes, C. Glymour, and R. Scheines, Causation, Channels, Phys. Rev. Lett. 106, 010501 (2011). Prediction, and Search, 2nd ed. (MIT Press, Cambridge, [26] One could equally well evaluate the conditional mutual MA, 2001). information on any distribution PðXYZÞ obtained from [10] R. Chaves, R. Kueng, J. B. Brask, and D. Gross, Unifying PðYZjXÞ and an input distribution PðXÞ that has full Framework for Relaxations of the Causal Assumptions in support; again, the statement IðY∶ZjXÞ¼ 0 is equivalent Bell’s Theorem, Phys. Rev. Lett. 114, 140403 (2015). to the conditional independence of Y and Z given X.We [11] R. Chaves, C. Majenz, and D. Gross, Information-Theoretic consider the particular case of a uniform distribution over X Implications of Quantum Causal Structures, Nat. Commun. in order to maintain the strongest possible analogy with the 6, 5766 (2015). quantum case. [12] M. S. Leifer and D. Poulin, Quantum Graphical Models and [27] P. Hayden, R. Jozsa, D. Petz, and A. Winter, Structure of Belief Propagation, Ann. Phys. (Amsterdam) 323, 1899 States which Satisfy Strong Subadditivity of Quantum (2008). Entropy with Equality, Commun. Math. Phys. 246, 359 [13] J. F. Fitzsimons, J. A. 
Jones, and V. Vedral, Quantum Correlations which Imply Causation, Sci. Rep. 5, 18281 (2015).
[14] K. Ried, M. Agnew, L. Vermeyden, D. Janzing, R. W. Spekkens, and K. J. Resch, A Quantum Advantage for Inferring Causal Structure, Nat. Phys. 11, 414 (2015).
[15] E. G. Cavalcanti and R. Lal, On Modifications of Reichenbach’s Principle of Common Cause in Light of Bell’s Theorem, J. Phys. A 47, 424018 (2014).
[16] Reichenbach’s principle assumes that it cannot happen both that Y is a cause of Z and that Z is a cause of Y. This is natural if Y and Z are physical variables pertaining to systems that are localized in space and time. But it is an assumption that may not hold for the generic case in which causal explanations are sought for statistical data, since there can be causal feedback loops: partaking of an addiction, say, may cause a low mood, which in turn may worsen the addiction. Ultimately, it is the specific application that will determine whether adoption of the qualitative part of Reichenbach’s principle (more generally, the formalism of directed acyclic graphs introduced below) is appropriate.
[17] In the community studying classical causal inference, a deterministic causal dependence such as $Y = f(X, \lambda)$ is termed a structural equation, and a causal model wherein all causal dependences are deterministic is termed a functional causal model (see, e.g., Ref. [8]).
[18] This is because any other λ would necessarily introduce new common causes for Y and Z that are not screened through X, which would violate the assumption that X is a complete common cause.
[19] A. Jamiołkowski, Linear Transformations which Preserve Trace and Positive Semidefiniteness of Operators, Rep. Math. Phys. 3, 275 (1972).
(2004).
[28] B. Coecke and R. Lal, Causal Categories: Relativistically Interacting Processes, Found. Phys. 43, 458 (2013).
[29] The analogous mixed notation also appears in Fig. 5 for the classical case.
[30] In introducing the circle notation, we define the action of gates on the left or right factors in such a way that coherence between different subspaces in the direct sum can be maintained. This corresponds to the fact that the condition of decomposing into the appropriate form is applied to the unitary or Kraus operators, rather than to the channel operator itself. We have done this on the grounds that with this definition, the circle notation is most likely to be useful in future applications. In the lower two circuits of Fig. 6, however, note that coherence between the different subspaces is lost. In the lower right circuit, coherence is lost when the partial traces are performed on the extra outgoing wires. In the lower left circuit, the final output admits a global factorization of the form B ⊗ C, and output wires carrying an i index do not even appear, indicating that this degree of freedom has been traced out. Each Kraus operator, in this case, must act nontrivially only on the ith subspace, for some i, and one may deduce that $\rho_{B|A}$ is of the form $\rho_{B|A_i^L} \otimes I_{A_i^R}$, and similarly $\rho_{C|A}$, consistently with condition (4) of Theorem 3.
[31] Clearly, the notation can be extended in various ways to include circles with multiple output wires, circles indicating a further decomposition following another circle, and so on. A fully general interpretation and calculus for these extended circuit diagrams is left for future work.
[32] The quantum version in Fig. 9 was studied for similar reasons in Ref. [33], though from a different perspective.
[33] B. Schumacher and M. D. Westmoreland, Isolation and Information Flow in Quantum Dynamics, Found. Phys. 42, 926 (2012).
[34] C. M. Caves, C. A. Fuchs, and R. Schack, Quantum Probabilities as Bayesian Probabilities, Phys. Rev. A 65, 022305 (2002).
[35] C. A. Fuchs, Quantum Mechanics as Quantum Information (and Only a Little More), arXiv:quant-ph/0205039.
[36] R. W. Spekkens, Evidence for the Epistemic View of Quantum States: A Toy Theory, Phys. Rev. A 75, 032110 (2007).
[37] M. S. Leifer, Quantum Dynamics as an Analog of Conditional Probability, Phys. Rev. A 74, 042310 (2006).
[38] R. W. Spekkens, in Quantum Theory: Informational Foundations and Foils, edited by G. Chiribella and R. W. Spekkens (Springer, New York, 2016), pp. 83–135.
[39] It is interesting to consider an exactly analogous scenario, as it arises in the toy theory of Ref. [36]. Here, a system analogous to a qubit can exist in one of four distinct classical states (the ontic states of the system). But an agent who prepares systems and measures them can only ever have partial information about which of the four ontic states a system is in. The toy equivalent of a CNOT gate corresponds to a reversible deterministic map, i.e., a permutation of the ontic states. By considering the probability distribution over ontic states of the various systems, one may verify directly that the ontic states of toy systems B and C are not determined by the ontic state of toy system A. Rather, the ontic states of B and C depend also on the ontic state of λ. Furthermore, the analogue of a pure quantum state for λ is a probability distribution on λ that is not a point distribution. In this way, statistical correlations between B and C can be underwritten by statistical variation in the ontic state of λ.
[40] D. Horsman, C. Heunen, M. F. Pusey, J. Barrett, and R. W. Spekkens, Can a Quantum State Over Time Resemble a Quantum State at a Single Time?, arXiv:1607.03637.
[41] Y. Aharonov and L. Vaidman, in Time in Quantum Mechanics, edited by G. Muga, R. S. Mayato, and I. Egusquiza (Springer, New York, 2007), pp. 399–447.
[42] Y. Aharonov, S. Popescu, J. Tollaksen, and L. Vaidman, Multiple-Time States and Multiple-Time Measurements in Quantum Mechanics, Phys. Rev. A 79, 052110 (2009).
[43] Y. Aharonov, S. Popescu, and J. Tollaksen, Each Instant of Time a New Universe, in Quantum Theory: A Two-Time Success Story, Yakir Aharonov Festschrift, edited by D. C. Struppa and J. M. Tollaksen (Springer, New York, 2013), pp. 21–36.
[44] R. Silva, Y. Guryanova, N. Brunner, N. Linden, A. J. Short, and S. Popescu, Pre- and Postselected Quantum States: Density Matrices, Tomography, and Kraus Operators, Phys. Rev. A 89, 012121 (2014).
[45] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Theoretical Framework for Quantum Networks, Phys. Rev. A 80, 022339 (2009).
[46] G. Chiribella, Perfect Discrimination of No-Signalling Channels via Quantum Superposition of Causal Structures, Phys. Rev. A 86, 040301 (2012).
[47] G. Chiribella, G. M. D’Ariano, P. Perinotti, and B. Valiron, Quantum Computations without Definite Causal Structure, Phys. Rev. A 88, 022318 (2013).
[48] O. Oreshkov, F. Costa, and Č. Brukner, Quantum Correlations with No Causal Order, Nat. Commun. 3, 1092 (2012).
[49] M. Araújo, F. Costa, and Č. Brukner, Computational Advantage from Quantum-Controlled Ordering of Gates, Phys. Rev. Lett. 113, 250402 (2014).
[50] F. Costa and S. Shrapnel, Quantum Causal Modelling, New J. Phys. 18, 063032 (2016).
[51] R. Oeckl, A General Boundary Formulation for Quantum Mechanics and Quantum Gravity, Phys. Lett. B 575, 318 (2003).
[52] O. Oreshkov and N. J. Cerf, Operational Quantum Theory without Predefined Time, New J. Phys. 18, 073037 (2016).
[53] O. Oreshkov and N. J. Cerf, Operational Formulation of Time Reversal in Quantum Theory, Nat. Phys. 11, 853 (2015).
[54] This might be seen as a generalization of standard usage, since in most treatments of quantum theory, the state describes a collection of systems at a single time, i.e., with none being causal descendants of others.
[55] Classical interventional models, as we describe them, seem rarely to be studied in full generality in the classical literature. However, among the possible intervention schemes are included the following special cases: ignoring $X_i^I$ and repreparing $X_i^O$ with a fixed value x of which one keeps a record, corresponding to $P(k_i, X_i^O | X_i^I) = \delta_{k_i,x}\,\delta_{X_i^O,x}$ (the standard notion of intervention, as set out, e.g., in Ref. [8]); ignoring $X_i^I$ and repreparing $X_i^O$ with a value that is sampled randomly and independently of $X_i^I$ and of which one keeps a record, corresponding to $P(k_i, X_i^O | X_i^I) = \delta_{k_i,X_i^O}\,P(X_i^O)$ (a randomized trial); observing $X_i^I$, keeping a record of this value, and preparing $X_i^O$ to have this value, corresponding to $P(k_i, X_i^O | X_i^I) = \delta_{k_i,X_i^I}\,\delta_{X_i^O,X_i^I}$ (passive observation of $X_i$); observing $X_i^I$, keeping a record of its value, and repreparing $X_i^O$ to have a fixed value x, corresponding to $P(k_i, X_i^O | X_i^I) = \delta_{k_i,X_i^I}\,\delta_{X_i^O,x}$ (the sort of intervention considered in single-world intervention graphs [72]); simply letting the value of $X_i^O$ track the value of $X_i^I$, corresponding to $k_i$ being trivial, and $P(X_i^O | X_i^I) = \delta_{X_i^O,X_i^I}$ (no observation being made); and many others besides.
[56] M. Rédei, in Non-Locality and Modality: Proceedings of the NATO Advanced Research Workshop on Modality, Probability, and Bell’s Theorems, NATO Science Series, Vol. 64, edited by T. Placek and J. Butterfield (Springer, Dordrecht, 2002), pp. 259–270.
[57] G. Hofer-Szabó, M. Rédei, and L. E. Szabó, On Reichenbach’s Common Cause Principle and Reichenbach’s Notion of Common Cause, Br. J. Philos. Sci. 50, 377 (1999).
[58] G. Hofer-Szabó and P. Vecsernyés, Noncommuting Local Common Causes for Correlations Violating the Clauser-Horne Inequality, J. Math. Phys. (N.Y.) 53, 122301 (2012).
[59] G. Hofer-Szabó and P. Vecsernyés, Bell Inequality and Common Causal Explanation in Algebraic Quantum Field Theory, Stud. Hist. Phil. Mod. Phys. 44, 404 (2013).
[60] C. Branciard, N. Gisin, and S. Pironio, Characterizing the Nonlocal Correlations Created via Entanglement Swapping, Phys. Rev. Lett. 104, 170401 (2010).
[61] T. Fritz, Beyond Bell’s Theorem: Correlation Scenarios, New J. Phys. 14, 103001 (2012).
[62] R. Chaves, L. Luft, and D. Gross, Causal Structures from Entropic Information: Geometry and Novel Scenarios, New J. Phys. 16, 043001 (2014).
[63] J. Henson, R. Lal, and M. F. Pusey, Theory-Independent Limits on Correlations from Generalized Bayesian Networks, New J. Phys. 16, 113043 (2014).
[64] T. Fritz, Beyond Bell’s Theorem II: Scenarios with Arbitrary Causal Structure, Commun. Math. Phys. 341, 391 (2016).
[65] L. Hardy, Quantum Theory From Five Reasonable Axioms, arXiv:quant-ph/0101012.
[66] J. Barrett, Information Processing in Generalized Probabilistic Theories, Phys. Rev. A 75, 032304 (2007).
[67] S. Abramsky and B. Coecke, in Handbook of Quantum Logic and Quantum Structures: Quantum Logic, edited by K. Engesser, D. Gabbay, and D. Lehmann (Elsevier, New York, 2009), pp. 261–324.
[68] B. Coecke, Quantum Picturalism, Contemp. Phys. 51, 59 (2010).
[69] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Probabilistic Theories with Purification, Phys. Rev. A 81, 062348 (2010).
[70] L. Hardy, The Operator Tensor Formulation of Quantum Theory, Phil. Trans. R. Soc. A 370, 3385 (2012).
[71] L. Hardy, Quantum Gravity Computers: On the Theory of Computation with Indefinite Causal Structure, arXiv:quant-ph/0701019.
[72] T. S. Richardson and J. M. Robins, Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality, Center for Statistics and the Social Sciences Working Papers Series No. 128 (University of Washington, Seattle, 2013).
[73] J. Pienaar and Č. Brukner, A Graph-Separation Theorem for Quantum Causal Models, New J. Phys. 17, 073020 (2015).
[74] R. R. Tucci, Quantum Bayesian Nets, Int. J. Mod. Phys. B 09, 295 (1995).
[75] R. R. Tucci, An Introduction to Quantum Bayesian Networks for Mixed States, arXiv:1204.1550.
[76] C. M. Lee and R. W. Spekkens, Causal Inference via Algebraic Geometry: Necessary and Sufficient Conditions for the Feasibility of Discrete Causal Models, arXiv:1506.03880.
[77] E. Wolfe, R. W. Spekkens, and T. Fritz, The Inflation Technique for Causal Inference with Latent Variables, arXiv:1609.00672.
[78] R. Chaves, L. Luft, T. O. Maciel, D. Gross, D. Janzing, and B. Schölkopf, in Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014) (AUAI Press, Corvallis, 2014), pp. 112–121.
[79] R. Chaves, Polynomial Bell Inequalities, Phys. Rev. Lett. 116, 010402 (2016).
[80] D. Rosset, C. Branciard, T. J. Barnea, G. Pütz, N. Brunner, and N. Gisin, Nonlinear Bell Inequalities Tailored for Quantum Networks, Phys. Rev. Lett. 116, 010403 (2016).
[81] B. S. Tsirelson, Quantum Generalizations of Bell’s Inequality, Lett. Math. Phys. 4, 93 (1980).
[82] S. Wehner, Tsirelson Bounds for Generalized Clauser-Horne-Shimony-Holt Inequalities, Phys. Rev. A 73, 022110 (2006).
[83] J.-P. W. MacLean, K. Ried, R. W. Spekkens, and K. J. Resch, Quantum-Coherent Mixtures of Causal Relations, arXiv:1606.04523.
[84] A. Feix and Č. Brukner, Quantum Superpositions of “Common-Cause” and “Direct-Cause” Causal Structures, arXiv:1606.09241.
[85] M. B. Ruskai, Inequalities for Quantum Entropy: A Review with Conditions for Equality, J. Math. Phys. (N.Y.) 43, 4358 (2002).

Quantum Common Causes and Quantum Causal Models

Publisher: The American Physical Society
Copyright: Copyright © Published by the American Physical Society
eISSN: 2160-3308
DOI: 10.1103/PhysRevX.7.031021

Abstract

PHYSICAL REVIEW X 7, 031021 (2017)
Selected for a Viewpoint in Physics

John-Mark A. Allen (1), Jonathan Barrett (1), Dominic C. Horsman (2), Ciarán M. Lee (3), and Robert W. Spekkens (4)

(1) Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford OX1 3QD, United Kingdom
(2) Department of Physics, University of Durham, South Road, Durham DH1 3LE, United Kingdom
(3) Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, United Kingdom
(4) Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada

(Received 5 April 2017; revised manuscript received 5 June 2017; published 31 July 2017)

Reichenbach’s principle asserts that if two observed variables are found to be correlated, then there should be a causal explanation of these correlations. Furthermore, if the explanation is in terms of a common cause, then the conditional probability distribution over the variables given the complete common cause should factorize. The principle is generalized by the formalism of causal models, in which the causal relationships among variables constrain the form of their joint probability distribution. In the quantum case, however, the observed correlations in Bell experiments cannot be explained in the manner Reichenbach’s principle would seem to demand. Motivated by this, we introduce a quantum counterpart to the principle. We demonstrate that under the assumption that quantum dynamics is fundamentally unitary, if a quantum channel with input A and outputs B and C is compatible with A being a complete common cause of B and C, then it must factorize in a particular way. Finally, we show how to generalize our quantum version of Reichenbach’s principle to a formalism for quantum causal models and provide examples of how the formalism works.

DOI: 10.1103/PhysRevX.7.031021
Subject Areas: Quantum Physics, Quantum Information

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

I. INTRODUCTION

It is a general principle of scientific thought, and indeed of everyday common sense, that if physical variables are found to be statistically correlated, then there ought to be a causal explanation of this fact. If the dog barks every time the telephone rings, we do not ascribe this to coincidence. A likely explanation is that the sound of the telephone ringing is causing the dog to bark. This is a case where one of the variables is a cause of the other. If sales of ice cream are high on the same days of the year that many people get sunburned, a likely explanation is that the sun was shining on these days and that the hot sun causes both sunburns and the desire to have an ice cream. Here, the explanation is not that buying ice cream causes people to get sunburned, nor vice versa, but instead that there is a common cause of both: the hot sun.

That the principle is highly natural is most apparent when it is expressed in its contrapositive form: if there is no causal relationship between two variables (i.e., neither is a cause of the other and there is no common cause), then the variables will not be correlated. In particular, without a general commitment to this latter statement, it would be impossible ever to regard two different experiments as independent from one another, or for the results of one scientific team to be regarded as an independent confirmation of the results of another.

This principle of causal explanation was first made explicit by Reichenbach [1]. It is key in scientific investigations that aim to find causal accounts of phenomena from observed statistical correlations.

Despite the central role of causal explanations in science, there are significant challenges to providing them for the correlations that are observed in quantum experiments [2]. In a Bell experiment, a pair of systems are prepared together, then removed to distant locations where a measurement is implemented on each. The choice of the measurement made at one wing of the experiment is presumed to be made at spacelike separation from that at the other wing. The natural causal explanation of the correlations that one observes in such experiments is that each measurement outcome is influenced by the local measurement setting as well as by a common cause located in the joint past of the two measurement events. But Bell’s theorem [3] famously rules out this possibility: within the standard framework of causal models, if the correlations violate a Bell inequality [4], as is predicted by quantum theory and verified experimentally [5–7], then a common-cause explanation of the correlations is ruled out.

Furthermore, Ref. [2] proves that it is not possible to explain Bell correlations with classical causal models without unwelcome fine-tuning of the parameters. This includes any attempt to explain Bell correlations with exotic causal influences, such as retrocausality and superluminal signaling. In the study of classical causation, it is typically assumed that causal explanations should not be fine-tuned [8].

However, the verdict of fine-tuning applies only to classical models of causation. It was suggested in Ref. [2] that it might be possible to provide a satisfactory causal explanation of Bell inequality violations, in particular one that preserves the spirit of Reichenbach’s principle and does not require fine-tuning, using a quantum generalization of the notion of a causal model. This article seeks to develop such a generalization by first suggesting an intrinsically quantum version of Reichenbach’s principle.

Specifically, we consider the case of a quantum system A in the causal past of a bipartite quantum system BC and ask what constraints on the channel from A to BC follow from the assumption that A is the complete common cause of B and C. In this scenario we are able to find a natural quantum analogue to Reichenbach’s principle. This analogue can be expressed in several equivalent forms, each of which naturally generalizes a corresponding classical expression. In particular, one of these conditions states that A is a complete common cause of BC if one can dilate the channel from A to BC to a unitary by introducing two ancillary systems, contained in the causal past of BC, such that each ancillary system can influence only one of B and C. This unitary dilation codifies the causal relationship between A and BC and illustrates the fact that no other system can influence both B and C. Moreover, our quantum Reichenbach’s principle contains the classical version as a special case in the appropriate limit. This suggests that our quantum version is the correct way to generalize Reichenbach’s principle.

The mathematical framework of causal models [8,9] can be seen as a direct generalization of Reichenbach’s principle to arbitrary causal structures. By following this classical example, we are able to generalize our quantum Reichenbach’s principle to a framework for quantum causal models. In each case, the original Reichenbach’s principle becomes a special case of the framework. Just as with classical causal models, the framework of quantum causal models allows us to analyze the causal structure of arbitrary quantum experiments. It also does so while preserving an appropriate form of Reichenbach’s principle (by construction) and avoiding fine-tuning.

Although our main motivation for developing quantum causal models is the possibility of finding a satisfactory (i.e., non-fine-tuned) causal explanation of Bell inequality violations [2,10], they are also likely to have practical applications. For instance, finding quantum-classical separations in the correlations achievable in novel causal scenarios might lead to new device-independent protocols [11], such as randomness extraction and secure key distribution. Quantum causal models may also provide novel schemes for simulating many-body systems in condensed matter physics [12] and novel means for inferring the underlying causal structure from quantum correlations [13,14].

The structure of the paper is as follows. Section II provides a formal statement of Reichenbach’s principle and shows how it can be rigorously justified under certain philosophical assumptions. The main body of results is in Sec. III. Here, our quantum generalization of Reichenbach’s principle is presented and justified by reasoning parallel to that of the classical case. This is then fleshed out with alternative characterizations of our quantum version of conditional independence and some specific examples. We return to the classical world in Sec. IV, discussing classical causal models and providing a rigorous justification of the Markov condition, which plays the role of Reichenbach’s principle for general causal structures. Section V then generalizes these ideas to the quantum sphere and presents our proposal for quantum causal models. Finally, in Sec. VI, we describe the relationship of our proposal to prior work on quantum causal models, and in Sec. VII, we summarize and describe some directions for future work.

II. REICHENBACH’S PRINCIPLE

A. Statement

Reichenbach gave his principle a formal statement in Ref. [1]. Following Ref. [15], we here distinguish two parts of the formalized principle. First is the qualitative part, which expresses the intuitions described at the beginning of the Introduction. The other is the quantitative part, which constrains the sorts of probability distributions one should assign in the case of a common-cause explanation.

The qualitative part of Reichenbach’s principle may be stated as follows: if two physical variables Y and Z are found to be statistically dependent, then there should be a causal explanation of this fact, either (1) Y is a cause of Z, (2) Z is a cause of Y, (3) there is no causal link between Y and Z, but there is a common cause X influencing Y and Z, (4) Y is a cause of Z and there is a common cause X influencing Y and Z, or (5) Z is a cause of Y and there is a common cause X influencing Y and Z.

Note that the causal influences we consider here may be indirect (mediated by other variables). If none of these causal relations hold between Y and Z, then we refer to them as ancestrally independent (because their respective causal ancestries constitute disjoint sets). Using this terminology, the qualitative part of Reichenbach’s principle can be expressed particularly succinctly in its contrapositive form as follows: ancestral independence implies statistical independence, i.e., $P(YZ) = P(Y)P(Z)$ [16].

The quantitative part of Reichenbach’s principle applies only to the case where the correlation between Y and Z is due purely to a common cause [case (3) above]. It states that, in that case, if X is a complete common cause for Y
and Z, meaning that X is the collection of all variables acting as common causes, then Y and Z must be conditionally independent given X, so the joint probability distribution $P(XYZ)$ satisfies

$$P(YZ|X) = P(Y|X)P(Z|X). \tag{1}$$

FIG. 1. A causal structure represented as a directed acyclic graph depicting that X is the complete common cause of Y and Z.

B. Justifying the quantitative part of Reichenbach’s principle

Within the philosophy of causality, providing an adequate justification of Reichenbach’s principle is a delicate issue. It rests on controversy over basic questions, such as what it means for one variable to have a causal influence on another and what is the correct interpretation of probabilistic statements. In this section, we discuss one way of justifying the principle, using an assumption of determinism, which provides a clean motivational story with a natural quantum analogue. Other justifications may be possible.

Suppose we adopt a Bayesian point of view on probabilities: they are the degrees of belief of a rational agent. Dutch book arguments (based on the principle that a rational agent will never accept a set of bets on which they are certain to lose money) can then be given as to why probabilities should be non-negative, sum to 1, and so forth. But why should an agent who takes X to be a complete common cause for Y and Z arrange their beliefs such that $P(YZ|X) = P(Y|X)P(Z|X)$? If the agent does not do this, are they irrational?

One way to justify a positive answer to this question is to assume that in a classical world there is always an underlying deterministic dynamics. In this case, one variable is causally influenced by another if it has a nontrivial functional dependence upon it in the dynamics. Probabilities can be understood as arising merely due to ignorance of the values of unobserved variables. Under these assumptions, one can show that the qualitative part of Reichenbach’s principle implies the quantitative part.

In general, a classical channel describing the effective influence of random variable X on Y is given by a conditional probability distribution $P(Y|X)$. If we assume underlying deterministic dynamics, then although the value of the variable Y might not be completely determined by the value of X, it must be determined by the value of X along with the values of some extra, unobserved, variables in the past of Y which can collectively be denoted λ. Any variation in the value of Y for a given value of X is then explained by variation in the value of λ. This can be formalized as follows.

Definition 1 (Classical dilation). For a classical channel $P(Y|X)$, a classical deterministic dilation is given by some random variable λ with probability distribution $P(\lambda)$ and some deterministic function $Y = f(X, \lambda)$, such that

$$P(Y|X) = \sum_{\lambda} \delta(Y, f(X, \lambda))\, P(\lambda), \tag{2}$$

where $\delta(X, Y) = 1$ if $X = Y$ and 0 otherwise [17].

We now apply this to the situation depicted in Fig. 1, where X is the complete common cause of Y and Z. The conditional distribution $P(YZ|X)$ admits of a dilation in terms of an ancillary unobserved variable λ, for some distribution $P(\lambda)$ and a function $f = (f'_Y, f'_Z)$ from $(\lambda, X)$ to $(Y, Z)$, such that $Y = f'_Y(\lambda, X)$ and $Z = f'_Z(\lambda, X)$. The assumption that X is the complete common cause of Y and Z implies that the ancillary variable λ can be split into a pair of ancestrally independent variables, $\lambda_Y$ and $\lambda_Z$, where $\lambda_Y$ influences only Y and $\lambda_Z$ influences only Z [18]. It follows that there must exist $\lambda_Y$ and $\lambda_Z$ that are causally related to X, Y, and Z, as depicted in Fig. 2, where the causal dependences are deterministic and given by a pair of functions $f_Y$ and $f_Z$ such that $Y = f_Y(\lambda_Y, X)$ and $Z = f_Z(\lambda_Z, X)$.

FIG. 2. The causal structure of Fig. 1, expanded so that Y and Z each have a latent variable as a causal parent in addition to X, so that both Y and Z can be made to depend functionally on their parents.

In this case, we have

$$P(YZ|X) = \sum_{\lambda_Y, \lambda_Z} \delta(Y, f_Y(\lambda_Y, X))\, \delta(Z, f_Z(\lambda_Z, X))\, P(\lambda_Y, \lambda_Z). \tag{3}$$

Finally, given the qualitative part of Reichenbach’s principle, the ancestral independence of $\lambda_Y$ and $\lambda_Z$ in the causal structure implies that $P(\lambda_Y, \lambda_Z) = P(\lambda_Y)P(\lambda_Z)$. It then follows that $P(YZ|X) = P(Y|X)P(Z|X)$, which establishes the quantitative part of Reichenbach’s principle.

A well-known converse statement is also worth noting: any classical channel $P(YZ|X)$ satisfying $P(YZ|X) = P(Y|X)P(Z|X)$ admits of a dilation where X is the complete common cause of Y and Z [8].

Summarizing, we can identify what it means for $P(YZ|X)$ to be explainable in terms of X being a complete common cause of Y and Z by appealing to the qualitative part of Reichenbach’s principle and fundamental determinism. The definition can be formalized into a mathematical condition as follows.

Definition 2 (Classical compatibility). $P(YZ|X)$ is said to be compatible with X being the complete common cause of Y and Z if one can find variables $\lambda_Y$ and $\lambda_Z$, distributions $P(\lambda_Y)$ and $P(\lambda_Z)$, a function $f_Y$ from $(\lambda_Y, X)$ to Y, and a function $f_Z$ from $(\lambda_Z, X)$ to Z, such that these constitute a dilation of $P(YZ|X)$, that is, such that

$$P(YZ|X) = \sum_{\lambda_Y, \lambda_Z} \delta(Y, f_Y(\lambda_Y, X))\, \delta(Z, f_Z(\lambda_Z, X))\, P(\lambda_Y) P(\lambda_Z). \tag{4}$$

With this definition, we can summarize the result described above as follows.

Theorem 1. Given a conditional probability distribution $P(YZ|X)$, the following are equivalent.
(1) $P(YZ|X)$ is compatible with X being the complete common cause of Y and Z.
(2) $P(YZ|X) = P(Y|X)P(Z|X)$.

The (1) → (2) implication is what establishes that a rational agent should espouse the quantitative part of Reichenbach’s principle if they espouse the qualitative part and fundamental determinism.

The (2) → (1) implication allows one to deduce a possible causal explanation of an observed distribution from a feature of that distribution. However, it is important to stress that it only establishes a possible causal explanation. It does not state that this is the only causal explanation. Indeed, it may be possible to satisfy this conditional independence relation within alternative causal structures by fine-tuning the strengths of the causal dependences. However, as noted above, fine-tuned causal explanations are typically rejected as bad explanations in the field of causal inference. Therefore, the best explanation of the conditional independence of Y and Z given X is that X is the complete common cause of Y and Z.

III. QUANTUM VERSION OF REICHENBACH’S PRINCIPLE

In this section, we introduce our quantum version of Reichenbach’s principle. The definition of a quantum causal model that we provide in Sec. V can be seen as generalizing these ideas in much the same way that classical causal models generalize the classical version of Reichenbach’s principle.

A. Quantum preliminaries

For simplicity, we assume throughout that all quantum systems are finite dimensional. Given a quantum system A, we write $\mathcal{H}_A$ for the corresponding Hilbert space, $d_A$ for the dimension of $\mathcal{H}_A$, and $I_A$ for the identity on $\mathcal{H}_A$. We also write $\mathcal{H}_A^*$ for the dual space to $\mathcal{H}_A$ and $I_{A^*}$ for the identity on the dual space. If a quantum system is initially uncorrelated with any other system, then the most general time evolution of the system corresponds to a quantum channel, i.e., a completely positive trace-preserving (CPTP) map. If the system at the initial time is labeled A, with Hilbert space $\mathcal{H}_A$, and the system at the later time is labeled B, with Hilbert space $\mathcal{H}_B$, then the CPTP map is

$$\mathcal{E}_{B|A} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B), \tag{5}$$

where $\mathcal{L}(\mathcal{H})$ is the set of linear operators on $\mathcal{H}$.

An alternative way to express the channel $\mathcal{E}_{B|A}$ is as an operator, using a variant of the Choi-Jamiołkowski isomorphism [19,20]:

$$\rho_{B|A} := \sum_{ij} \mathcal{E}_{B|A}(|i\rangle_A\langle j|) \otimes |i\rangle_{A^*}\langle j|. \tag{6}$$

Here, the vectors $\{|i\rangle_A\}$ form an orthonormal basis of the Hilbert space $\mathcal{H}_A$. The vectors $\{|i\rangle_{A^*}\}$ form the dual basis, belonging to $\mathcal{H}_A^*$. The operator $\rho_{B|A}$ therefore acts on the Hilbert space $\mathcal{H}_B \otimes \mathcal{H}_A^*$. Although the expression above involves an arbitrary choice of orthonormal basis, the operator $\rho_{B|A}$ thus defined is independent of the choice of basis. This version of the Choi-Jamiołkowski isomorphism is chosen because it is both basis independent and a positive operator. Following Ref. [21], we choose the operator $\rho_{B|A}$ to be normalized in such a way that $\mathrm{Tr}_B(\rho_{B|A}) = I_{A^*}$ [in analogy with the normalization condition $\sum_Y P(Y|X) = 1$ for a classical channel $P(Y|X)$].

Suppose that $\rho_B = \mathcal{E}_{B|A}(\rho_A)$. Given that the operator $\rho_{B|A}$ contains all of the information about the channel $\mathcal{E}_{B|A}$, the question arises of how one can express $\rho_B$ in terms of $\rho_{B|A}$ and $\rho_A$. Recall that $\rho_{B|A}$ is defined on $\mathcal{H}_B \otimes \mathcal{H}_A^*$, while $\rho_A$ is defined on $\mathcal{H}_A$. As we discuss further in Sec. V, by defining an appropriate “linking operator” on $\mathcal{H}_A^* \otimes \mathcal{H}_A$,

$$\tau^{\mathrm{id}}_A := \sum_{lm} |l\rangle_{A^*}\langle m| \otimes |l\rangle_A\langle m|, \tag{7}$$

where $\{|l\rangle_A\}_l$ and $\{|l\rangle_{A^*}\}_l$ are orthonormal bases on $\mathcal{H}_A$ and $\mathcal{H}_A^*$, respectively, one can write $\rho_B = \mathrm{Tr}_{A^*A}(\rho_{B|A}\, \tau^{\mathrm{id}}_A\, \rho_A)$. This expression is meant to be reminiscent of the classical formula $P(Y) = \sum_X P(Y|X)P(X)$.

Given an operator $\rho_{AB|CD}$, acting on $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C^* \otimes \mathcal{H}_D^*$, we use the same expression with missing indices to denote the result of taking partial traces on the corresponding factor spaces. For example, given a channel $\rho_{AB|CD}$, we write $\rho_{A|CD} := \mathrm{Tr}_B(\rho_{AB|CD})$.

When writing products of operators, we sometimes suppress tensor products with identities. For example, $(\rho_{B|A} \otimes I_C)(\rho_{C|A} \otimes I_B)$ is written simply as $\rho_{B|A}\rho_{C|A}$.

B. Main result

The qualitative part of Reichenbach’s principle can be applied to quantum theory with almost no change: if quantum systems B and C are correlated, then this must have a causal explanation in one of the five forms listed in Sec. II A (except with classical variables X, Y, and Z replaced by quantum systems A, B, and C). Here, for two quantum systems to be correlated means that their joint quantum state does not factorize.

Finding a quantum version of the quantitative part of Reichenbach’s principle is more subtle. If a quantum system A is a complete common cause of B and C (as depicted in Fig. 3), then one expects there to be some constraint analogous to the classical constraint that $P(YZ|X) = P(Y|X)P(Z|X)$. If one tries to do this by generalizing the joint distribution $P(XYZ)$, then one immediately faces the problem that textbook quantum theory has no analogue of a joint distribution for a collection of quantum systems in which some are causal descendants of others. The situation is improved if one focuses on finding an analogue of $P(YZ|X)$ instead. In standard quantum theory, as long as a system A is initially uncorrelated with its environment, then the evolution from A to BC is described by a channel $\mathcal{E}_{BC|A}$. The operator that is isomorphic to this channel by Eq. (6), denoted $\rho_{BC|A}$, seems to be a natural analogue of $P(YZ|X)$. However, even in this case, it is not obvious what constraint on $\rho_{BC|A}$ should serve as the analogue of the classical constraint $P(YZ|X) = P(Y|X)P(Z|X)$.

FIG. 3. A causal structure relating three quantum systems with A the complete common cause of B and C.

The treatment of generic causal networks of quantum systems is deferred to the full definition of quantum causal models in Sec. V. This section focuses on the case of a channel $\rho_{BC|A}$.

In Sec. II B, we demonstrated how to justify the quantitative part of Reichenbach’s principle from the qualitative part in the classical case under the assumption that all dynamics are fundamentally deterministic. We shall now make an analogous argument in the quantum case by assuming that quantum dynamics are fundamentally unitary. Just as in the classical case, this assumption simply provides a clean way to motivate our result, and alternative justifications may be possible.

In general, a quantum channel from A to B is given by a CPTP map $\mathcal{E}_{B|A}$. Assuming underlying unitary dynamics, the output state at B must depend unitarily on A along with some extra ancillary system λ in the past of B. This can be formalized as follows.

Definition 3 (Unitary dilation). For a quantum channel $\mathcal{E}_{B|A}$, a quantum unitary dilation is given by some ancillary quantum system λ with state $\rho_\lambda$ and some unitary U from $\mathcal{H}_A \otimes \mathcal{H}_\lambda$ to $\mathcal{H}_B \otimes \mathcal{H}_{\bar{B}}$, such that

$$\mathcal{E}_{B|A}(\cdot) = \mathrm{Tr}_{\bar{B}}\big(U(\cdot \otimes \rho_\lambda)U^\dagger\big),$$

where the dimension of $\bar{B}$ is fixed by the requirement for unitarity that $d_A d_\lambda = d_B d_{\bar{B}}$.

If we represent the channels by our variant of the Choi-Jamiołkowski isomorphism, Eq. (6), with $\rho_{B|A}$ representing $\mathcal{E}_{B|A}$ and $\rho^U_{B\bar{B}|A\lambda}$ representing $U(\cdot)U^\dagger$, then the dilation equation has the form

$$\rho_{B|A} = \mathrm{Tr}_{\bar{B}\lambda\lambda^*}\big(\rho^U_{B\bar{B}|A\lambda}\, \tau^{\mathrm{id}}_\lambda\, \rho_\lambda\big),$$

where $\tau^{\mathrm{id}}_\lambda$ is the linking operator defined in Eq. (7).

Just as in the classical case, we would like to apply this to the situation depicted in Fig. 3, where A is the complete common cause for B and C. This is easy classically, as it is clear what it means for a classical variable X to have no causal influence on another, Y, in a deterministic system. Specifically, if the collection of inputs other than X is denoted $\bar{X}$ so there is a deterministic function f such that $Y = f(X, \bar{X})$, then the assumption that X has no causal influence on Y is formalized as $f(X, \bar{X}) = f'(\bar{X})$ for some function $f'$. In unitary quantum theory the corresponding condition is less obvious, so we spell it out explicitly with a definition.

Definition 4 (No influence). Consider a unitary channel $\rho^U_{B\bar{B}|A\bar{A}}$ from $A\bar{A}$ to $B\bar{B}$. A has no causal influence on B if and only if for $\rho_{B|A\bar{A}} := \mathrm{Tr}_{\bar{B}}\, \rho^U_{B\bar{B}|A\bar{A}}$, we have $\rho_{B|A\bar{A}} = I_{A^*} \otimes \rho_{B|\bar{A}}$.

An equivalent definition states that A has no causal influence on B in some unitary channel if and only if the following holds: for every initial state $\rho_{A\bar{A}}$, if an operation is performed on the A system alone, followed by the action of the unitary channel, then the marginal output state at B is independent of the choice of operation on A. The equivalence is shown in Ref. [22], where other equivalent definitions are also presented (under the terminology “nonsignalling” rather than “no causal influence”). There is in fact a rich literature concerning related properties of unitary operators from various perspectives: see, for example, Refs. [23–25].

We can now apply this to the complete common-cause situation of Fig. 3. The channel $\mathcal{E}_{BC|A}$ admits a unitary dilation in terms of an ancillary system λ, for some state $\rho_\lambda$ and unitary U from λA to BDC. Here, an ancillary output D is generally required so that dimensions of inputs and outputs match, but is not important and will always be
traced out. This dilation is such that $\mathcal{E}_{BC|A}(\cdot) = \mathrm{Tr}_{D}\!\left(U(\cdot \otimes \rho_\lambda)U^{\dagger}\right)$.

FIG. 4. The causal structure of Fig. 3, expanded so that $B$ and $C$ each have a latent system as a causal parent in addition to $A$. By analogy to the classical case, we take $B$ and $C$ to depend unitarily on $\lambda_B$, $A$, and $\lambda_C$.

Just as in Sec. II B, the assumption that $A$ is a complete common cause for $B$ and $C$ implies that the ancilla $\lambda$ can be factorized into ancestrally independent $\lambda_B$ and $\lambda_C$, where $\lambda_B$ has no causal influence on $C$ and $\lambda_C$ has no causal influence on $B$. It follows that systems $\lambda_B$ and $\lambda_C$ are causally related to $A$, $B$, and $C$ as depicted in Fig. 4.

The ancestral independence of $\lambda_B$ and $\lambda_C$ implies that the quantum state on $\lambda$ factorizes across the $\lambda_B$, $\lambda_C$ partition, $\rho_\lambda = \rho_{\lambda_B}\,\rho_{\lambda_C}$, suggesting the following quantum analogue to our classical compatibility condition of Definition 2.

Definition 5 (Quantum compatibility). $\rho_{BC|A}$ is said to be compatible with $A$ being a complete common cause of $B$ and $C$, if it is possible to find ancillary quantum systems $\lambda_B$ and $\lambda_C$, states $\rho_{\lambda_B}$ and $\rho_{\lambda_C}$, and a unitary channel where $\lambda_B$ has no causal influence on $C$ and $\lambda_C$ has no causal influence on $B$, such that these constitute a dilation of $\rho_{BC|A}$.

All that remains is to show that this, together with the qualitative part of the quantum Reichenbach’s principle, implies an appropriate quantitative part (generalizing Theorem 1).

Theorem 2. The following are equivalent.
(1) $\rho_{BC|A}$ is compatible with $A$ being the complete common cause of $B$ and $C$.
(2) $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$.

The proof is in Appendix A. Note that there is no ordering ambiguity on the right-hand side of the second condition, because the two terms must commute. This is seen by taking the Hermitian conjugate of both sides of the equation and recalling that $\rho_{BC|A}$ is Hermitian.

The strong analogy that exists between Theorems 1 and 2 suggests the following definition.

Definition 6 (Quantum conditional independence of outputs given input). Given a quantum channel $\rho_{BC|A}$, the outputs are said to be quantum conditionally independent given the input if and only if $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$.

It is easily seen that the quantum definition reduces to the classical definition in the case that the channel $\rho_{BC|A}$ is invariant under the operation of completely dephasing the systems $A$, $B$, and $C$ in some basis. More precisely, if fixed bases are chosen for $\mathcal{H}_A$, $\mathcal{H}_B$, $\mathcal{H}_C$, and the operator $\rho_{BC|A}$ is diagonal when written with respect to the tensor product of these bases, then the outputs are quantum conditionally independent given the input if and only if the classical channel defined by the diagonal elements of the matrix has the property that the outputs are conditionally independent given the input.

With this terminological convention in hand, we can express our quantum version of the quantitative part of Reichenbach’s principle as follows: if a channel $\rho_{BC|A}$ is compatible with $A$ being a complete common cause of $B$ and $C$, then for this channel, $B$ and $C$ are quantum conditionally independent given $A$.

The $(1) \to (2)$ implication in the theorem is what establishes the quantum version of the quantitative part of Reichenbach’s principle.

The $(2) \to (1)$ implication is pertinent to causal inference: analogously to the classical case, if one grants the implausibility of fine-tuning, then one must grant that the most plausible explanation of the quantum conditional independence of outputs $B$ and $C$ given input $A$ is that $A$ is a complete common cause of $B$ and $C$.

Theorem 2, and the surrounding discussion, motivates the definition of quantum causal models given in Sec. V. For the rest of this section, we make some further remarks about the quantum version of Reichenbach’s principle.

C. Alternative expressions for quantum conditional independence of outputs given input

Classically, conditional independence of $Y$ and $Z$ given $X$ is standardly expressed as $P(YZ|X) = P(Y|X)P(Z|X)$. However, there are alternative ways of expressing this constraint.

For instance, if one defines the joint distribution over $X$, $Y$, $Z$ that one obtains by feeding the uniform distribution on $X$ into the channel $P(YZ|X)$, that is, $\hat{P}(XYZ) := P(YZ|X)(1/d_X)$ with $d_X$ the cardinality of $X$, then $Y$ and $Z$ being conditionally independent given $X$ in $P(YZ|X)$ can be expressed as the vanishing of the conditional mutual information of $Y$ and $Z$ given $X$ in the distribution $\hat{P}(XYZ)$ [8]. This conditional mutual information is defined as $I(Y{:}Z|X) := H(Y,X) + H(Z,X) - H(X,Y,Z) - H(X)$, with $H(\cdot)$ denoting the Shannon entropy of the marginal on the subset of variables indicated in its argument. Therefore, the condition is simply $I(Y{:}Z|X) = 0$ [26].

Similarly, if $Y$ and $Z$ are conditionally independent given $X$ in $P(YZ|X)$, then it is possible to mathematically represent the channel $P(YZ|X)$ as the following sequence of operations: copy $X$, then process one copy into $Y$ via the channel $P(Y|X)$ and process the other into $Z$ via the channel $P(Z|X)$.

We present here the quantum analogues of these alternative expressions. They are found to be useful for developing intuitions about quantum conditional independence and in proving Theorem 2. Recall that the quantum conditional mutual information of $B$ and $C$ given $A$ is defined as $I(B{:}C|A) := S(B,A) + S(C,A) - S(A,B,C) - S(A)$, where $S(\cdot)$ denotes the von Neumann entropy of the reduced state on the subsystem that is specified by its argument. Analogously to the classical case, we use a hat to denote an operator renormalized such that the trace is 1.
For example, if $\rho_{B|A}$ is the operator representing a channel from $A$ to $B$, then $\hat{\rho}_{B|A} := (1/d_A)\,\rho_{B|A}$.

Theorem 3. Given a channel $\rho_{BC|A}$, the following conditions are also equivalent to the quantum conditional independence of the outputs given the input [condition (2) of Theorem 2].
(3) $I(B{:}C|A) = 0$, where $I(B{:}C|A)$ is the quantum conditional mutual information of $B$ and $C$ given $A$ evaluated on the (positive, trace-one) operator $\hat{\rho}_{BC|A} := (1/d_A)\,\rho_{BC|A}$.
(4) The Hilbert space for the $A$ system can be decomposed as $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$ and $\rho_{BC|A} = \sum_i \left(\rho_{B|A_i^L} \otimes \rho_{C|A_i^R}\right)$, where for each $i$, $\rho_{B|A_i^L}$ represents a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^L}) \to \mathcal{B}(\mathcal{H}_B)$, and $\rho_{C|A_i^R}$ a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^R}) \to \mathcal{B}(\mathcal{H}_C)$.

The proof is in Appendix A. That conditions (3) and (4) are equivalent follows as a corollary of Theorem 6 of Ref. [27]. Our main contribution is showing that these are also equivalent to condition (2) of Theorem 2.

The final condition can be described as follows. First, one imagines decomposing the system $A$ into a direct sum of subspaces, each of which is denoted $A_i$. For each $i$, the subspace $A_i$ is split into two factors, denoted $A_i^L$ and $A_i^R$, with one factor evolving via a channel $\rho_{B|A_i^L}$ into system $B$, and the other factor evolving via $\rho_{C|A_i^R}$ into system $C$. In the special case where there is only a single value of $i$, this is simply a factorization of the $A$ system into two parts. In the special case where all of the $A_i^L$ and $A_i^R$ are one-dimensional Hilbert spaces, the channel $\rho_{BC|A}$ may be thought of as an incoherent copy operation applied to the $A$ system with respect to the $i$ basis, followed by the processing of one copy into $B$ and one copy into $C$. It is noteworthy that in the general case, $B$ only gets the information carried by the $A_i^L$ and $C$ only gets the information carried by the $A_i^R$: hence, the only information about $A$ that both $B$ and $C$ receive is the classical information carried by the index $i$.

D. Circuit representations

It is instructive to summarize the contents of Theorems 1 and 2 using circuit diagrams. The classical case is shown in Fig. 5, where four equivalent circuits represent the action of a channel $P(YZ|X)$, for which the outputs $YZ$ are conditionally independent given the input $X$. The dot in the lower two circuits represents a classical copy operation. Equality (1) simply asserts that the conditional probability distribution $P(YZ|X)$ admits a classical dilation, as in Definition 1. Equality (4) asserts that the channel is equivalent to a sequence of operations in which $X$ is copied, with one copy the input to a channel $P(Y|X)$ and one copy the input to a channel $P(Z|X)$. As we discuss at the beginning of Sec. III C, this is one way of expressing the fact that $Y$ and $Z$ are conditionally independent given $X$. Equality (3) asserts that $P(Y|X)$ and $P(Z|X)$ separately admit classical dilations. Finally, equality (2) asserts that $P(YZ|X)$ is compatible with $X$ being a complete common cause of $Y$ and $Z$ by depicting conditions under which $\lambda_Y$ has no influence on $Z$ and $\lambda_Z$ has no influence on $Y$.

FIG. 5. Diagrammatic representation of Theorem 1 and of alternative expressions for conditional independence of outputs given input (the classical analogue of Theorem 3).

FIG. 6. Diagrammatic representation of Theorem 2 and of alternative expressions for quantum conditional independence of outputs given input (Theorem 3). Following Ref. [28], we use a special symbol to denote partial trace (here, slightly generalized to include the partial trace of a wire carrying an $i$ index, defined in an obvious way).

Analogous circuit diagrams can be provided in the quantum case, as depicted in Fig. 6, with analogous interpretations of the various equalities. Since quantum systems cannot be copied, however, something must replace the dot that appears in the lower two circuits of Fig. 5. For the lower two circuits of Fig.
6, we introduce a new symbol that indicates the decomposition of the Hilbert space $\mathcal{H}_A$ into a direct sum of tensor products, as per condition (4) of Theorem 3. The symbol is a circle decorated with the set $\{i\}$, where the value $i$ indexes the terms in the direct sum. For each value of $i$, the left-hand wire carries the factor $\mathcal{H}_{A_i^L}$ and the right-hand wire the factor $\mathcal{H}_{A_i^R}$.

In the lower right circuit of Fig. 6, the gates represent unitary channels, and are labeled with the corresponding unitary operators $V$ and $W$ (as opposed to the Choi-Jamiołkowski channel operators). The unitary operator $V$, for example, labels a gate whose action is confined to the left-hand factors in this decomposition, along with the system $\lambda_B$. The interpretation, roughly, is that the form of $V$ must respect the decomposition of $\mathcal{H}_A$. More precisely, the unitary operator can be written as a matrix that is block diagonal with respect to the subspace decomposition, with the $i$th block being of the form $V_i \otimes I_{A_i^R}$ for a unitary matrix $V_i$ acting on $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_i^L}$. Similarly, $W$ can be written as a block diagonal matrix, with the $i$th block of the form $I_{A_i^L} \otimes W_i$ for a unitary matrix $W_i$ acting on $\mathcal{H}_{A_i^R} \otimes \mathcal{H}_{\lambda_C}$.

In the lower left circuit of Fig. 6, in a slight mixing of notation, gates are labeled with the channel operators $\rho_{B|A}$ and $\rho_{C|A}$ [29]. Suppose that, as in the figure, a channel operator $\rho_{B|A}$ labels a gate whose action is confined to the left-hand factors in the decomposition, along with another system $\lambda_B$. This indicates that the channel corresponds to a set of Kraus operators $\{K_j\}$, where for each $j$, the Kraus operator $K_j$ is block diagonal, with the $i$th block being of the form $K^{j}_{i} \otimes I_{A_i^R}$, with $K^{j}_{i}$ acting on $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_i^L}$. The channel operator $\rho_{C|A}$ has a similar form, with a nontrivial action on the right-hand factors and the system $\lambda_C$ [30,31].

The equivalences of Fig. 6 can now be summarized as follows. Equality (1) simply asserts the fact that $\rho_{BC|A}$ admits a unitary dilation. Equality (4) asserts that the channel $\rho_{BC|A}$ is such that $B$ and $C$ are quantum conditionally independent given $A$, according to the definition we propose (Definition 6). This equality follows from the expression for quantum conditional independence described in condition (4) of Theorem 3. Equality (3) asserts that the channels $\rho_{B|A}$ and $\rho_{C|A}$ separately admit unitary dilations. Equality (2) asserts that $\rho_{BC|A}$ is compatible with $A$ being a complete common cause of $B$ and $C$ by depicting conditions under which $\lambda_B$ has no influence on $C$ and $\lambda_C$ has no influence on $B$. Here, the unitary matrix $U$ is decomposed as $U = (I_{\lambda_B} \otimes W)(V \otimes I_{\lambda_C})$, as per the proof of Theorem 2.

E. Examples

1. Unitary transformation

Consider the case in which inputs $A$ and $D$ evolve, via a generic unitary transformation $U$, into outputs $B$ and $C$. In Fig. 7, we illustrate the circuit and the corresponding causal diagram.

FIG. 7. For a generic unitary transformation from $AD$ to $BC$, the complete common cause of $B$ and $C$ is the composite system $AD$.

The channel $\rho_{BC|AD}$ which one obtains in this case is compatible with the complete common cause of $B$ and $C$ being the composite system $AD$. This follows from the fact that $\rho_{BC|AD}$ has a trivial dilation, which is to say that the ancillary system is not required, and therefore trivially satisfies the condition for compatibility laid out in Definition 5. It follows from Theorem 2 that for such a $\rho_{BC|AD}$, the outputs $B$ and $C$ are quantum conditionally independent given the input $AD$, which means that $\rho_{BC|AD} = \rho_{B|AD}\,\rho_{C|AD}$, as can also be verified by direct calculation. Similarly, the alternative expressions for this sort of quantum conditional independence, namely, conditions (3) and (4) of Theorem 3, can be verified to hold.

2. Coherent copy versus incoherent copy

Consider the simple example of a classical channel, taking $X$ to $Y$, $Z$, where $X$, $Y$, $Z$ are bit-valued and the mapping between input strings and output strings is

$$0_X \to 0_Y 0_Z, \qquad 1_X \to 1_Y 1_Z. \tag{8}$$

The outputs of the channel are conditionally independent given the input; variation in $X$ fully explains any correlation between $Y$ and $Z$. Indeed, this example may be seen as the paradigmatic case of the explanation of classical correlations via a complete common cause.

One quantum analogue of this channel is the incoherent copy of a qubit: a qubit $A$ is measured in the computational basis; if 0 is obtained, then prepare the state $|00\rangle_{BC}$, and if 1 is obtained, prepare $|11\rangle_{BC}$. The operator representing this channel is

$$\rho_{BC|A} = |000\rangle\langle 000|_{BCA^{*}} + |111\rangle\langle 111|_{BCA^{*}}.$$

It is easily verified that this operator satisfies each of the conditions of Theorem 2, so that $B$ and $C$ are quantum conditionally independent given $A$ for this channel. The decomposition of the $A$ Hilbert space implied by condition (4) is

$$\mathcal{H}_A = (\mathbb{C} \otimes \mathbb{C}) \oplus (\mathbb{C} \otimes \mathbb{C}),$$

where $\mathbb{C}$ is the one-dimensional complex Hilbert space, i.e., the complex numbers.

The other direct quantum analogue of the classical copy above is the channel that makes a coherent copy of a qubit, where the mapping from input states to output states is

$$\alpha|0\rangle_A + \beta|1\rangle_A \to \alpha|0\rangle_B|0\rangle_C + \beta|1\rangle_B|1\rangle_C. \tag{9}$$

This channel is represented by the operator

$$\rho_{BC|A} = \left(|000\rangle_{BCA^{*}} + |111\rangle_{BCA^{*}}\right)\left(\langle 000|_{BCA^{*}} + \langle 111|_{BCA^{*}}\right),$$

which corresponds to an unnormalized Greenberger-Horne-Zeilinger state. It can easily be verified that $I(B{:}C|A) = 1$ for a trace-one version of this state; hence, it is not the case that outputs $B$ and $C$ are quantum conditionally independent given the input $A$. There is, then, no way in which this channel can arise as a marginal channel in a situation in which $A$ is the complete common cause of $B$ and $C$.

At first blush, this conclusion may seem surprising. Given the mapping described by Eq. (9), where would the correlations between outputs $B$ and $C$ come from, other than being completely explained by the input $A$?

The puzzle is resolved by considering the dilation of the coherent copy to a unitary transformation, and the interpretation of quantum pure states. Consider Figs. 8 and 9, which, respectively, show a classical copy operation via the classical CNOT gate and a quantum coherent copy operation via the quantum CNOT gate [32].

FIG. 8. Classical realization of a copy operation using an ancilla and classical CNOT gate.

FIG. 9. Quantum realization of the coherent copy using an ancilla and quantum CNOT gate.

In the classical case, there are two reasons why any correlation between $Y$ and $Z$ must be entirely explained by statistical variation in the value of $X$. First, the ancillary variable $\lambda$ is prepared deterministically with value 0, so there is no possibility that statistical variation in the value of $\lambda$ underwrites the correlations between $Y$ and $Z$. Second, the mapping between input strings and output strings for the classical CNOT gate,

$$0_X 0_\lambda \to 0_Y 0_Z, \quad 0_X 1_\lambda \to 0_Y 1_Z, \quad 1_X 0_\lambda \to 1_Y 1_Z, \quad 1_X 1_\lambda \to 1_Y 0_Z \tag{10}$$

[which one easily verifies to reduce to the classical copy of Eq. (8) when one sets $\lambda$ to 0], has the causal structure depicted in Fig. 8, so that $\lambda$ does not act as a common cause of $Y$ and $Z$ but only a local cause of $Z$.

In the quantum case, neither reason applies. Concerning the second reason, the quantum CNOT has the causal structure depicted in Fig. 9: the quantum CNOT is such that not only does $A$ have a causal influence on $C$, but $\lambda$ has a causal influence on $B$ as well. In other words, unlike the classical CNOT, there is a backaction of the target on the control. It follows that in the quantum case, $\lambda$ can act as a common cause of $B$ and $C$. Furthermore, the ancilla is prepared in a quantum pure state $|0\rangle$. This is disanalogous to a point distribution on the value 0 for the classical variable $\lambda$ if one takes the view that a quantum pure state represents maximal but incomplete information about a quantum system [34–38]. In this case, one must allow for the possibility that some correlation between $B$ and $C$ is due to the ancilla, in which case $A$ is not the complete common cause of $B$ and $C$ [39].

F. Generalization to one input, k outputs

Theorems 2 and 3, which apply to quantum channels with one input and two outputs, can be generalized to the case of one input and $k$ outputs.

Consider a channel $\rho_{B_1 \dots B_k|A}$, and let $\bar{B}_i$ denote the collection of all outputs apart from $B_i$. The notion of quantum compatibility from Definition 5 generalizes in the obvious way: $\rho_{B_1 \dots B_k|A}$ is said to be compatible with $A$ being a complete common cause of $B_1 \dots B_k$, if it is possible to find ancillary quantum systems $\lambda_1, \dots, \lambda_k$, states $\rho_{\lambda_1}, \dots, \rho_{\lambda_k}$, and a unitary channel where, for each $i$, $\lambda_i$ has no causal influence on $\bar{B}_i$, such that these constitute a dilation of $\rho_{B_1 \dots B_k|A}$.

The generalization of Theorems 2 and 3, consolidated into a single theorem, is as follows.

Theorem 4. The following are equivalent.
(1) $\rho_{B_1 \dots B_k|A}$ is compatible with $A$ being a complete common cause of $B_1 \dots B_k$.
(2) $\rho_{B_1 \dots B_k|A} = \rho_{B_1|A} \cdots \rho_{B_k|A}$, where for all $i$, $j$, $[\rho_{B_i|A}, \rho_{B_j|A}] = 0$.
(3) For each $i$, $I(B_i{:}\bar{B}_i|A) = 0$, where $I(B_i{:}\bar{B}_i|A)$ is the quantum conditional mutual information evaluated on the (positive, trace-one) operator $\hat{\rho}_{B_1 \dots B_k|A}$.
(4) The Hilbert space for the $A$ system can be decomposed as $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^1} \otimes \cdots \otimes \mathcal{H}_{A_i^k}$, such that $\rho_{B_1 \dots B_k|A} = \sum_i \left(\rho_{B_1|A_i^1} \otimes \cdots \otimes \rho_{B_k|A_i^k}\right)$, where for each $i$, and each $l \in \{1, \dots, k\}$, $\rho_{B_l|A_i^l}$ represents a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^l}) \to \mathcal{B}(\mathcal{H}_{B_l})$.

The proof is in Appendix B. By analogy to the classical case, if conditions (2)–(4) of Theorem 4 hold, we say that $B_1 \dots B_k$ are quantum conditionally independent given $A$ for the channel $\rho_{B_1 \dots B_k|A}$.

IV. CLASSICAL CAUSAL MODELS

A. Definitions

Reichenbach’s principle is important because it generalizes to the modern formalism of causal models [8,9]. A causal model consists of two entities: (i) a causal structure, represented by a directed acyclic graph (DAG) where the nodes represent random variables and the directed edges represent the directed causal influences among these (several examples have already been presented in this article), and (ii) some parameters, which specify the strength of the causal dependences and the probability distributions for the variables associated to root nodes in the DAG (i.e., those with no incoming arrows). Some terminology is required to present the formal definitions.

Given a DAG with nodes $X_1, \dots, X_n$, let $\mathrm{Parents}(i)$ denote the parents of node $X_i$, that is, the set of nodes that have an arrow into $X_i$, and let $\mathrm{Children}(i)$ denote the children of node $X_i$, that is, the set of nodes $X_j$ such that there is an arrow from $X_i$ to $X_j$. The descendants of $X_i$ are those nodes $X_j$, $j \neq i$, such that there is a directed path from $X_i$ to $X_j$. The ancestors of $X_i$ are those nodes $X_j$ such that $X_i$ is a descendant of $X_j$.

Definition 7. A causal model specifies a DAG, with nodes corresponding to random variables $X_1, \dots, X_n$, and a family of conditional probability distributions $\{P[X_i|\mathrm{Parents}(i)]\}$, one for each $i$.

Definition 8. Given a DAG, with random variables $X_1, \dots, X_n$ for nodes, and given an arbitrary joint distribution $P(X_1 \dots X_n)$, the distribution is said to be Markov for the graph if and only if it can be written in the form of

$$P(X_1 \dots X_n) = \prod_{i=1}^{n} P[X_i|\mathrm{Parents}(i)]. \tag{11}$$

[Recall that each conditional $P[X_i|\mathrm{Parents}(i)]$ can be computed from the joint $P(X_1 \dots X_n)$.]

The generalization of Reichenbach’s principle that is afforded by the formalism of causal models is as follows: if there are statistical dependences among variables $X_1, \dots, X_n$, expressed in the particular form of the joint distribution $P(X_1 \dots X_n)$, then there should be a causal explanation of these dependences in terms of a DAG relative to which the distribution $P(X_1 \dots X_n)$ is Markov.

Note that an alternative way of formalizing the Markov property is that $P(X_1 \dots X_n)$ is Markov for the graph if and only if, for each $i$, $P[X_i|\mathrm{Parents}(i)] = P[X_i|\mathrm{Nondesc}(i)]$, where $\mathrm{Nondesc}(i)$ is the set of nondescendants of node $X_i$. The intuitive idea is that the parents of a node screen off that node from the other nondescendants: once the values of the parents are fixed, the values of other nondescendant nodes are irrelevant to the value of $X_i$.

Note also that Reichenbach’s principle is easily seen to be a special case of the requirement that for a joint distribution to be explainable by the causal structure of some DAG, it must be Markov for that DAG: if two variables, $Y$ and $Z$, are ancestrally independent in the graph, then any distribution that is Markov for this graph must factorize on these, $P(YZ) = P(Y)P(Z)$, which is the qualitative part of Reichenbach’s principle in its contrapositive form; if two variables, $Y$ and $Z$, have a variable $X$ as a complete common cause, as in the DAG of Fig. 1, then any distribution that is Markov for the graph must satisfy $P(YZ|X) = P(Y|X)P(Z|X)$, which is the quantitative part of Reichenbach’s principle.

B. Justifying the Markov condition

Just as we previously asked whether there was some principle that forced a rational agent to assign probability distributions in accordance with the quantitative part of Reichenbach’s principle, we can similarly ask why a rational agent who takes causal relationships to be given by a particular DAG should arrange their beliefs so that the joint distribution is Markov for the DAG.

The justification of the Markov condition parallels the justification of the quantitative part of Reichenbach’s principle that we presented in Sec. II B. We begin by outlining what the qualitative part of Reichenbach’s principle and the assumption of fundamental determinism imply for any arbitrary causal structure.

Definition 9 (Classical compatibility with a DAG). $P(X_1 \dots X_n)$ is said to be compatible with a DAG $G$ with nodes $X_1, \dots, X_n$ if one can find a DAG $G'$ that is obtained from $G$ by adding extra root nodes $\lambda_1, \dots, \lambda_n$, such that for each $i$, the node $\lambda_i$ has a single outgoing arrow, to $X_i$, and one can find, for each $i$, a distribution $P(\lambda_i)$ and a function $f_i$ from $[\lambda_i, \mathrm{Parents}(i)]$ to $X_i$, such that

$$P(X_1 \dots X_n) = \sum_{\lambda_1 \dots \lambda_n} \prod_{i=1}^{n} \delta\big(X_i, f_i[\lambda_i, \mathrm{Parents}(i)]\big)\,P(\lambda_i).$$

Theorem 5 (Ref. [8]). Given a joint distribution $P(X_1 \dots X_n)$ and a DAG $G$ with nodes $X_1, \dots, X_n$, the following are equivalent.
(1) $P(X_1 \dots X_n)$ is compatible with the causal structure described by the DAG $G$.
(2) $P(X_1 \dots X_n)$ is Markov for $G$; that is,

$$P(X_1 \dots X_n) = \prod_{i=1}^{n} P[X_i|\mathrm{Parents}(i)].$$

The $(1) \to (2)$ implication in Theorem 5 can be read as follows: if it is granted that causal relationships are indicative of underlying deterministic dynamics, and that the qualitative part of Reichenbach’s principle is valid, then, on pain of irrationality, an agent’s assignment $P(X_1 \dots X_n)$ must be Markov for the original graph.

The $(2) \to (1)$ implication in Theorem 5, like that of Theorem 1, is pertinent for causal inference. It asserts that if one observes a distribution $P(X_1 \dots X_n)$ that is Markov for a graph $G$, then the causal structure described by $G$ provides a possible causal explanation of the observed distribution. Note that, given $P(X_1 \dots X_n)$, there is not in general a unique graph $G$ such that $P(X_1 \dots X_n)$ is Markov for $G$; hence there are in general competing causal explanations. Those causal models that require fine-tuning of parameters are typically rejected.

V. QUANTUM CAUSAL MODELS

A. Proposed definition

In our treatment of the simple causal scenario where $A$ is a complete common cause of $B$ and $C$ (the DAG of Fig. 3), we focused on what form is implied for the quantum channel $\rho_{BC|A}$. But there has not been any attempt to define a quantity analogous to the classical joint distribution, that is, a quantity analogous to $P(XYZ)$ in the case of the DAG of Fig. 1, nor indeed other classical Bayesian conditionals such as $P(X|YZ)$. For works that aim to achieve such analogues, see Refs. [21,37]. See also Ref. [40], however, where it is shown that if one associates a single Hilbert space to a system at a given time, then there are significant obstacles to establishing an analogue of a classical joint distribution when the set of quantum systems includes some that are causal descendants of others.

This work takes a different approach. The interpretation of a quantum causal model will be that each node represents a local region of time and space, with channels such as $\rho_{BC|A}$ describing the evolution of quantum systems in between these regions. At each node, there is the possibility that an agent is present with the ability to intervene inside that local region. Each node $A_i$ will then be associated with two Hilbert spaces, one corresponding to the incoming system (before the agent’s intervention) and the dual space, which corresponds to the outgoing system (after the agent’s intervention). A quantum causal model will consist of a specification, for every node, of the quantum channel from its parents to the node, with the operational significance of a network being that it is used to calculate joint probabilities for the agents to obtain the various possible joint outcomes for their interventions. This way of treating quantum systems over time has appeared in various different approaches in the literature, including the multitime formalism [41–44], the quantum combs formalism [45–47], the process matrices formalism [48–50], and a number of other works as well [14,51–53].

The discussion of classical causal models in Sec. IV, and the results of Sec. III for the special case where $A$ is a complete common cause of $B$ and $C$, suggest the following generalization.

Definition 10. A quantum causal model specifies a DAG, with nodes $A_1, \dots, A_n$, supplemented with the following. For each node $A_i$, there is associated a finite-dimensional Hilbert space $\mathcal{H}_{A_i}$ (the “input” Hilbert space) and the dual space $\mathcal{H}_{A_i}^{*}$ (the “output” Hilbert space). For each node $A_i$, there is associated a quantum channel, described by an operator $\rho_{A_i|\mathrm{Parents}(i)} \in \mathcal{B}(\mathcal{H}_{A_i} \otimes \mathcal{H}^{*}_{\mathrm{Parents}(i)})$, where $\mathcal{H}^{*}_{\mathrm{Parents}(i)}$ is the tensor product of the output Hilbert spaces associated with the parents of $A_i$. These channels commute pairwise; i.e., for any $i$, $j$, $[\rho_{A_i|\mathrm{Parents}(i)}, \rho_{A_j|\mathrm{Parents}(j)}] = 0$ [which is a nontrivial constraint whenever $\mathrm{Parents}(i) \cap \mathrm{Parents}(j)$ is nonempty].

Recall from Sec. III that, given a quantum channel $\rho_{BC|A}$, it is compatible with $A$ being the complete common cause of $B$ and $C$ if and only if $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$, and if this holds, then $[\rho_{B|A}, \rho_{C|A}] = 0$. The definition of a quantum causal model, in particular, the stipulation that the channels commute pairwise, generalizes this idea.

Definition 11. Given a quantum causal model, specifying a DAG with nodes $A_1, \dots, A_n$, and commuting channels $\rho_{A_i|\mathrm{Parents}(i)}$, the state is an operator $\sigma_{A_1 \dots A_n}$ on $\bigotimes_{i=1}^{n} \left(\mathcal{H}_{A_i} \otimes \mathcal{H}_{A_i}^{*}\right)$, given by

$$\sigma_{A_1 \dots A_n} = \prod_{i=1}^{n} \rho_{A_i|\mathrm{Parents}(i)}. \tag{12}$$

The operator $\sigma_{A_1 \dots A_n}$ is referred to as the state of the quantum causal model since, as we discuss in the next subsection, $\sigma_{A_1 \dots A_n}$ is used to calculate the probabilities for outcomes when measurements are performed on the systems that the model describes [54].

B. Making predictions

In order to see how a quantum causal model is used to calculate probabilities for the outcomes of agents’ interventions, consider a quantum causal model with nodes $A_1, \dots, A_n$ and state $\sigma_{A_1 \dots A_n}$. Let the intervention at node $A_i$ have classical outcomes labeled by $k_i$. The intervention is defined by a quantum instrument (that is, by a set of completely positive trace-nonincreasing maps, one for each outcome, which sum to a trace-preserving map). In order to write the probabilities for the outcomes in a simple form, it is useful to define the instrument in such a way that the map associated to each outcome takes operators on $\mathcal{H}_{A_i}$ into operators on $\mathcal{H}_{A_i}^{*}$. Hence, suppose that the outcome $k_i$ corresponds to the map $\mathcal{E}^{k_i}_{A_i} : \mathcal{B}(\mathcal{H}_{A_i}) \to \mathcal{B}(\mathcal{H}_{A_i}^{*})$ and let

$$\tau^{k_i}_{A_i} = \sum_{lm} \mathcal{E}^{k_i}_{A_i}\big(|l\rangle_{A_i}\langle m|\big) \otimes |l\rangle_{A_i}\langle m|.$$

The outcome $k_i$ of the agent’s intervention can then be represented by the (positive, basis-independent) operator $\tau^{k_i}_{A_i}$ isomorphic to $\mathcal{E}^{k_i}_{A_i}$.

If an agent does not intervene at the node $A_i$, this corresponds to the linking operator itself:

$$\tau^{\mathrm{id}}_{A_i} = \sum_{lm} |l\rangle_{A_i^{*}}\langle m| \otimes |l\rangle_{A_i}\langle m|.$$

The joint probability for the agents to obtain outcomes $k_1, \dots, k_n$ is given by

$$P(k_1 \dots k_n) = \mathrm{Tr}\!\left[\sigma_{A_1 \dots A_n}\left(\tau^{k_1}_{A_1} \otimes \cdots \otimes \tau^{k_n}_{A_n}\right)\right]. \tag{13}$$

We can also define operations on the state $\sigma_{A_1 \dots A_n}$ corresponding to marginalization over the outcome $k_i$ of an intervention on node $A_i$ by means of the partial trace $\mathrm{Tr}_{A_i}$. In this case, the joint state on the rest of the nodes after such marginalization is

$$\sigma_{A_1 \dots A_{i-1} A_{i+1} \dots A_n} = \sum_{k_i} \mathrm{Tr}_{A_i}\!\left(\sigma_{A_1 \dots A_n}\,\tau^{k_i}_{A_i}\right).$$

If the intervention at node $A_i$ is trivial, then

$$\sigma_{A_1 \dots A_{i-1} A_{i+1} \dots A_n} = \mathrm{Tr}_{A_i}\!\left(\sigma_{A_1 \dots A_n}\,\tau^{\mathrm{id}}_{A_i}\right).$$

C. Classical interventional models

Given the proposed definition of a quantum causal model, and the interpretation in terms of agents intervening at nodes, there is a stronger analogy to be made with a classical formalism that similarly involves interventions than there is to the standard classical causal models introduced in Sec. IV.

In order to make this explicit, consider a classical interventional causal model constructed as follows. For a given DAG, split every node $X_i$ into a pair of disconnected nodes, denoted $X_i^O$ and $X_i^I$, such that in the DAG that results, $X_i^I$ has as parents the set of nodes $\mathrm{Parents}^O(i) := \{X_j^O : X_j \in \mathrm{Parents}(i)\}$, and $X_i^O$ has as children $\{X_j^I : X_j \in \mathrm{Children}(i)\}$. In other words, the “I” version of each node $X$ has as parents the “O” version of each node that was a parent of $X$ in the original graph, and the “O” version of each node $X$ has as children the “I” version of each node that was a child of $X$ in the original graph. In this case, one can represent the resulting DAG by a conditional probability distribution:

$$P(X_1^I \dots X_n^I \,|\, X_1^O \dots X_n^O) = \prod_{i=1}^{n} P[X_i^I|\mathrm{Parents}^O(i)]. \tag{14}$$

Our association of each node $A_i$ of the DAG with a pair of Hilbert spaces, $\mathcal{H}_{A_i}$ and $\mathcal{H}_{A_i}^{*}$, is simply a quantum version of the splitting of a classical variable $X_i$ into $X_i^O$ and $X_i^I$, and our joint state $\sigma_{A_1 \dots A_n}$ is the quantum analogue of the conditional probability $P(X_1^I \dots X_n^I \,|\, X_1^O \dots X_n^O)$.

In a classical interventional causal model, one can imagine an intervention at node $X_i$ as a causal process acting between $X_i^I$ and $X_i^O$ and possibly outputting an additional classical variable $k_i$ which acts as a record of some aspect of the intervention. The most general such intervention is described by a conditional probability distribution $P(k_i, X_i^O|X_i^I)$ [55]. After specifying the nature of the intervention at each node, $\{P(k_i, X_i^O|X_i^I)\}_i$, one can compute the joint probability distribution over the record variables to be

$$P(k_1 \dots k_n) = \sum_{X_1^I, X_1^O \dots X_n^I, X_n^O} P(X_1^I \dots X_n^I \,|\, X_1^O \dots X_n^O) \times \prod_{i=1}^{n} P(k_i, X_i^O|X_i^I). \tag{15}$$

Clearly, our intervention operators $\tau^{k_i}_{A_i}$ are the quantum analogue of the intervention conditionals $P(k_i, X_i^O|X_i^I)$, and our Eq. (13) is the quantum analogue of Eq. (15).

D. Examples

1. Confounding common cause

Consider a quantum causal model with the DAG depicted in Fig. 10. The DAG is supplemented with the quantum channels $\rho_{C|AB}$, $\rho_{B|A}$, and $\rho_A$, where the latter is simply a quantum state on $\mathcal{H}_A$ (which can also be thought of as a channel from the trivial, or one-dimensional, system into $A$).

FIG. 10. A causal network with $A$ a common cause for $B$ and $C$ and with $B$ a parent of $C$.

The corresponding state is $\sigma_{ABC} = \rho_{C|BA}\,\rho_{B|A}\,\rho_A$, where $\sigma_{ABC}$ acts on the Hilbert space $\mathcal{H}_C \otimes \mathcal{H}_C^{*} \otimes \mathcal{H}_B \otimes \mathcal{H}_B^{*} \otimes \mathcal{H}_A \otimes \mathcal{H}_A^{*}$. By stipulation, the channels commute pairwise. This is immediate in the case of, say, $\rho_{B|A}$ and $\rho_A$, since these operators are nontrivial on distinct Hilbert spaces. But it is significant in the case of $\rho_{C|BA}$ and $\rho_{B|A}$, both of which act nontrivially on $\mathcal{H}_A^{*}$. From Theorem 2, this implies that $\mathcal{H}_A^{*}$ decomposes as $\mathcal{H}_A^{*} = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$, with $\rho_{C|BA}$ acting trivially on (say) the right-hand factors and $\rho_{B|A}$ acting trivially on the left-hand factors.

The fact that the output Hilbert space of the $A$ system decomposes in this manner is a significant constraint on the kinds of quantum evolution that can be compatible with the DAG of Fig. 10. In words, the evolution undergone by the system emerging from $A$ is as follows: a (possibly degenerate) von Neumann measurement is performed and, controlled on the outcome, the $A$ system is split into two pieces. One piece evolves to become the input at $B$. The output at $B$ is then recombined with the other piece and evolves to become the input at $C$.

By way of contrast, it is also instructive to consider quantum causal models with the causal structure shown in Fig. 11. Such a quantum causal model may represent, for example, the non-Markovian evolution of a qubit over three

strongly constrained as it was in the previous example. Physically, this is important: if the qubit, for example, is interacting only weakly with the environment, then its evolution certainly could not be paraphrased in terms of a strong von Neumann measurement, as it was for evolutions compatible with Fig. 10.

One further remark concerning this example will help to illustrate a distinction between quantum and classical causal models. Suppose that $\rho_\lambda$ is the pure state $|0\rangle\langle 0|$ and that we marginalize over $\lambda$ under the assumption that an agent at the $\lambda$ node does not intervene. In classical causal models, if a root node has a point distribution, then marginalizing over that node yields a distribution over the remaining variables that is compatible with the DAG obtained by removing that node and its outgoing arrows. This does not hold in the quantum case: even for $\rho_\lambda$ a pure state, marginalizing over the $\lambda$ node (assuming no intervention there) in general yields an operator $\sigma_{ABC}$ that is not compatible with the DAG obtained by removing $\lambda$ (Fig. 10). As with the example of the coherent copy in Sec.
III E 2, time steps, with A, B, and C representing the qubit at each this makes intuitive sense if one takes the view that a time step, and where the qubit interacts with an environ- quantum pure state represents maximal but incomplete ment whose initial state is ρ . The qubit is initially information. Incomplete information about the λ system uncorrelated with the environment. Suppose that the state may underwrite correlations between B and C, so that such of the environment at the second and third time steps is not correlations cannot be attributed entirely to system A as of interest; hence, corresponding nodes do not appear in the Fig. 10 requires. Hence, even for the environment initially DAG. Given that over the course of this evolution infor- in a pure state, the non-Markovian evolution of a qubit need mation can flow from the qubit to the environment, and not obey the strong constraint implied by the causal back again, it is necessary to include an arrow from A to C, structure of Fig. 10. as well as from λ to B and λ to C. A quantum causal model with this DAG defines com- 2. Simple case of Bayesian updating muting channels ρ , ρ , ρ , ρ . From the fact that CjBAλ BjAλ A λ This section discusses another sense in which the ρ and ρ commute, we conclude that the Hilbert CjBAλ BjAλ quantum notion of conditional independence of the outputs space H ⊗ H decomposes as a direct sum over direct A λ of a channel given the input mirrors qualitatively an products. However, a decomposition of H ⊗ H as a A λ important aspect of the classical case. direct sum over direct products does not imply a decom- Consider a classical causal model with the DAG of position of the Hilbert space H alone as a direct sum over Fig. 1 and distribution PðXYZÞ such that PðYZjXÞ¼ direct products. Hence, the evolution of the qubit is not PðYjXÞPðZjXÞ. 
FIG. 11. The causal structure of Fig. 10 with an extra node λ, which is a common cause for B and C. A causal model with this DAG may describe a qubit interacting with an environment: A, B, C represent the qubit system at three different times and λ the environment at the initial time.

A particular feature of this causal scenario is that if new information is obtained about the variable $Y$, for example, if an agent learns that the value of $Y$ is $y$, then the process of Bayesian updating proceeds as follows. First, update the distribution over $X$ by applying the rule

$$\tilde{P}(X) := P(X|Y=y) = \frac{P(Y=y|X)\,P(X)}{P(Y=y)}.$$

Then use the new probability distribution on $X$, $\tilde{P}(X)$, to get an updated distribution for $Z$:

$$P(Z|Y=y) = \sum_X P(Z|X)\,\tilde{P}(X), \tag{16}$$

where the sum ranges over the values that $X$ may take. Roughly speaking, the process of Bayesian updating "follows the arrows" of the graph. For this it is crucial that the joint distribution $P(XYZ)$ satisfies $P(YZ|X) = P(Y|X)P(Z|X)$, otherwise the term $P(Z|X)$ in Eq. (16) would have to be replaced with $P(Z|X, Y=y)$.
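As a numerical illustration of Eq. (16), the following minimal sketch builds a small common-cause model and compares the two-step update with direct conditioning on the joint distribution. The binary variables, the probability tables, and the function names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

# Hypothetical tables for a common-cause model X -> Y, X -> Z (all variables binary).
p_x = np.array([0.3, 0.7])                 # P(X)
p_y_given_x = np.array([[0.9, 0.1],        # row x, column y: P(Y=y | X=x)
                        [0.2, 0.8]])
p_z_given_x = np.array([[0.6, 0.4],        # row x, column z: P(Z=z | X=x)
                        [0.1, 0.9]])


def update_via_common_cause(y):
    """Two-step update: first P~(X) = P(X | Y=y), then Eq. (16)."""
    unnormalized = p_y_given_x[:, y] * p_x          # P(Y=y | X) P(X)
    p_x_tilde = unnormalized / unnormalized.sum()   # divide by P(Y=y)
    return p_x_tilde @ p_z_given_x                  # sum_X P(Z | X) P~(X)


def update_from_joint(y):
    """Direct conditioning on the joint P(XYZ) = P(X) P(Y|X) P(Z|X)."""
    joint = (p_x[:, None, None]
             * p_y_given_x[:, :, None]
             * p_z_given_x[:, None, :])             # axes: x, y, z
    p_yz = joint.sum(axis=0)                        # marginalize over X
    return p_yz[y] / p_yz[y].sum()                  # P(Z | Y=y)


for y in (0, 1):
    # The two routes agree because Y and Z are independent given X.
    assert np.allclose(update_via_common_cause(y), update_from_joint(y))
    print(y, update_via_common_cause(y))
```

If the table for $Z$ also depended on $Y$, the two routes would in general disagree, which is the situation in which $P(Z|X)$ would have to be replaced with $P(Z|X, Y=y)$.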
Consider now a quantum causal model, with the DAG of Fig. 3 and with state $\sigma_{ABC} = \rho_{B|A}\,\rho_{C|A}\,\rho_A$. Suppose that an agent at $B$ intervenes, obtaining outcome $k_B$, corresponding to the operator $\tau_B^{k_B}$. The agent wishes to calculate the probability that an intervention at $C$ yields outcome $k_C$ corresponding to $\tau_C^{k_C}$, conditioned on having obtained the outcome $k_B$, and assuming that there is no intervention at $A$. This can be done as follows. First, update the state assigned to $A$ given the knowledge of $k_B$ to

$$\tilde{\sigma}_A := \sigma_{A|k_B} = \frac{\mathrm{Tr}_B(\sigma_{AB}\,\tau_B^{k_B})}{\mathrm{Tr}[\sigma_{AB}(\tau_A^{\mathrm{id}} \otimes \tau_B^{k_B})]}.$$

Then apply the channel $\rho_{C|A}$ to $\tilde{\sigma}_A$ to get the state assigned to $C$ given the knowledge of $k_B$:

$$\sigma_{C|k_B} = \mathrm{Tr}_A(\rho_{C|A}\,\tilde{\sigma}_A\,\tau_A^{\mathrm{id}}).$$

Finally, calculate the probability of $k_C$:

$$P(k_C|k_B) = \mathrm{Tr}(\sigma_{C|k_B}\,\tau_C^{k_C}).$$

Again, the process of Bayesian updating follows the arrows of the graph. Note that for this to work, it is crucial that the channel $\rho_{BC|A}$ satisfies $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$.

VI. RELATION TO PRIOR WORK

We now present a short review of prior works on quantum causal models and describe how our own proposal relates to these.

Generalizations of Reichenbach's principle of common cause were discussed in Refs. [56–59], although the approach is quite different from ours. In these works, the focus of attention is a Boolean algebra of events (in the classical case), or a nondistributive algebra of projectors (in the quantum case), with probabilities induced in each case by a state on the algebra. Given a pair of events, or projectors, a common cause is a third event, or projector, such that the probabilities satisfy certain constraints. For a critical analysis of Refs. [58,59], see Ref. [15].

Preliminary work more directly pertaining to quantum causal models took the form of explorations of Bell-type inequalities (and whether they admit quantum violations) for novel causal scenarios [60,61]. Several researchers recognized that the formalism of classical causal models could provide a unifying framework in which to pose the problem of deriving Bell-type constraints, and that this framework might be extended to address the problem of deriving constraints on the correlations that can be obtained with quantum resources [2,10,11,62]. Note that such constraints are expressed entirely in terms of classical settings and classical outcomes of measurements. Henson, Lal, and Pusey [63] and Fritz [64] independently proposed definitions of quantum causal models with the purpose of expressing such constraints. In these approaches, each node of the DAG represents a process (which may have a classical outcome), while each directed edge is associated with a system that is passed between processes. However, despite the fact that their frameworks incorporate the possibility of post-classical resources, they do not have sufficient structure to define conditional independences between quantum systems.

Operational reformulations of quantum theory such as Refs. [65–70] helped to set the stage for the development of quantum causal models. Although they were conceived independently of the framework of classical causal models, they were quite similar to that framework insofar as they made heavy use of DAGs, in the form of circuit diagrams, to depict structural features of a set of processes. When the authors of these formulations turned their attention to relativistic causal structure, the frameworks they devised drew even closer in spirit to that of causal models. Prominent examples include the causaloid framework of Ref. [71], the multitime formalism [41–44], quantum combs [45–47], the causal categories of Ref. [28], and the process matrix formalism [48,49]. A common aim of these approaches is to be able to compute the consequences of an intervention upon a particular quantum system within the circuit, and this is precisely one of the tasks that a quantum analogue of a causal model should be able to handle.

Many of these frameworks represent a quantum system at a given region of space-time by two copies of its Hilbert space, one corresponding to the system that is input into the region and one corresponding to the system that is output from it. In this way, the region becomes a "locus of intervention" for the system. By inserting a particular quantum process into the "slot," one determines the nature of the intervention. This is the approach taken, for instance, in the multitime formalism of Ref. [42], the quantum combs of Ref. [45], and the process matrices of Ref. [48]. This representation of interventions has a counterpart in classical causal models, for instance, in the work of Ref. [72], as was noted in Refs. [14,50].

Costa and Shrapnel [50] in particular have sought to explicitly cast this sort of framework as a quantum generalization of a causal model. In their approach, the nodes of the DAG are associated with a quantum system localized in a region (understood as a potential locus of intervention) and the collection of edges from one set of nodes to another represent causal processes.

An approach of this sort is required if one seeks to find intrinsically quantum versions of important theorems of classical causal models. For instance, while Henson, Lal, and Pusey [63] derive a generalization of the d-separation theorem of classical causal models, it only infers conditional independence relations from d-separation relations for the classical variables in the graph. An intrinsically quantum version of the d-separation theorem, by contrast, would be one which concerns the causal relations among quantum systems (see, for instance, Ref. [73]). If a set of nodes representing quantum systems can be described by a joint or conditional state, then one can seek to determine whether factorization conditions on this state are implied by d-separation relations among the quantum systems on the graph. Similarly, while the approaches both of Henson, Lal, and Pusey [63] and of Fritz [64] allow one to derive, from the structure of the DAG, constraints on the joint distribution over classical variables embedded therein, they do not address an intrinsically quantum version of this problem. If a set of nodes representing quantum systems can be described by a joint or conditional state, then one can seek to derive constraints on this state directly from the structure of the DAG.

Our own approach aims at an intrinsically quantum generalization of the notion of a causal model. We therefore associate to each node of the DAG a quantum system localized to a space-time region, and we represent it by a pair of Hilbert spaces, corresponding to the input and output of an intervention upon the system. Consequently, our approach is very similar to that of Costa and Shrapnel [50]. Nonetheless, there are significant differences in how we represent common causes.

First, in the work of Costa and Shrapnel, any node with multiple outgoing edges is represented as a locus of intervention where the output Hilbert space is a tensor product of Hilbert spaces, one for each outgoing edge. As such, any node acting as a common cause must be associated with a composite quantum system. It cannot, for instance, be associated with a single qubit. By contrast, our approach does not constrain the representation of common causes in this fashion. Any quantum system, including a single qubit, may constitute a complete common cause of a collection of other quantum systems. This extra generality is required since, as our examples show, the complete common cause of a set of systems can be a single qubit. Second, and more importantly, our work shows that for a quantum channel whose input is the complete common cause of its $n$ outputs, it is not the case that the channel must split the input into $n$ components, each of which exerts a causal influence on a different output. This is merely one special case of the most general form that such a channel can take. Third, if the complete common causes consist of multiple nodes in the DAG, then it is only the joint Hilbert space of the collection of these that must satisfy the condition of factorizing in subspaces, while each individual Hilbert space need not.

These differences are likely to have a significant impact on the form of any intrinsically quantum d-separation theorem.

Finally, we mention a third purpose to which quantum causal models can be put. Theorems about classical causal models often concern the sorts of inferences one can make about one variable given information about another. As an example, if $Z$ is a common effect of $X$ and $Y$, then learning $Z$ can induce correlations between $X$ and $Y$. As such, one might expect quantum causal models to also constrain the sort of inferences one can make among quantum variables. Early work by Leifer and Spekkens [21] had this purpose in mind. The authors noted various scenarios in which their proposal could not be applied, and subsequent work [40] has narrowed down the scope of possibilities for any such proposal. Our own proposal provides the means of making many of the Bayesian inferences considered in Ref. [21]. The case we discuss in Sec. V D 2 is one such example.

There is also prior work on quantum causal models that takes a significantly different approach to the ones described above and for which the relation to our work is less clear. The work of Tucci [74,75], which is in fact the earliest attempt at a quantum generalization of a causal model, represents causal dependences by complex transition amplitudes rather than quantum channels.

VII. CONCLUSIONS

The field of classical statistics has benefited greatly from analysis provided by the formalism of causal models [8,9]. In particular, this formalism allows one to infer facts about the underlying causal structure purely from uncontrolled statistical data, a tool with significant applications in all branches of the physical and social sciences. Given that some seemingly paradoxical features of classical correlations have found satisfying resolutions when viewed through a causal lens, one might wonder to what extent the same is true of quantum correlations.

Starting with the idea that whatever innovation quantum theory might hold for causal models, the intuition contained in Reichenbach's principle ought to be preserved, we motivate the problem of finding a quantum version of the principle. This requires us to determine what constraint a channel from $A$ to $BC$ must satisfy if $A$ is the complete common cause of $B$ and $C$. We solve the problem by considering a unitary dilation of the channel and by noting that there is no ambiguity in how to represent the absence of causal influences between certain inputs and certain outputs of a unitary. From this, we derive a notion of quantum conditional independence for the outputs of the channel given its input. This inference from a common-cause structure to quantum conditional independence is then generalized to obtain our quantum version of causal models.

Given a state on a quantum causal model, we describe how to construct a marginal state for a subset of nodes. We discuss a number of simple examples of quantum channels and causal structures. A theme of the examples is that when there is a difference between the quantum and classical cases, this can often be understood if one takes the view that a quantum pure state represents maximal but incomplete information about a system, and hence may underwrite correlations between other systems in a way that a classical pure state cannot.

There are many directions for future work. In the case of classical causal networks, an important theorem states that the d-separation relation among nodes of a DAG is sound and complete for a conditional independence relation to hold among the associated variables in the joint probability distribution [8]. Here, for arbitrary subsets of nodes $S$, $T$, and $U$, subsets $S$ and $T$ are said to be d-separated by $U$ if a certain criterion holds, where this is determined purely by the structure of the DAG. An important question is, therefore, whether d-separation is sound and complete for some natural property of the state $\sigma$ on a quantum causal network.

It is also desirable to relate properties of a quantum causal network to operational statements involving the outcomes of agents' interventions: under what circumstances, for example, does it follow that there is an intervention by the agents at nodes in a subset $U$, such that, conditioned on its outcome, the outcomes of any interventions by agents at $S$ and $T$ must be independent? Such a result would have an application to quantum protocols. Imagine, for example, a cryptographic scenario in which agents at $S$ and $T$ desire shared correlations that are not screened off by the information held by agents at $U$.

In the classical case, there has been a great deal of work on the problem of causal inference [8,9,76–78]: given only certain facts about the joint probabilities, for instance, a set of conditional independences, what can be inferred about the underlying causal structure? For an initial approach to quantum causal inference, with a quantum-over-classical advantage in a simple scenario, see Ref. [14]. The formalism of quantum causal networks described here is the appropriate framework for inferring facts about underlying, intrinsically quantum, causal structure, given observed facts about the outcomes of interventions by agents.

Recently, there has been much interest in deriving bounds on the correlations achievable in classical causal models [76,77,79,80] using insights from the literature on Bell's theorem. Such bounds constitute Bell-like inequalities for arbitrary causal structures. The main technical challenge in deriving such inequalities is that the set of correlations is generally not convex if the DAG has more than one latent variable, so that standard techniques for deriving Bell inequalities are not applicable. By adapting these new techniques to the formalism we present here, one could perhaps systematically derive bounds on the quantum correlations achievable in certain quantum causal models, thereby providing a general method of deriving Tsirelson-like bounds [81,82] for arbitrary causal structures.

Finally, it would be interesting to extend the formalism to explore the possibility that certain quantum scenarios are best understood as involving a quantum coherent combination of different causal structures [46,48,83,84]. It has been argued that the possibility of such indefinite causal structure may be significant for the project of unifying quantum theory with general relativity [71].

ACKNOWLEDGMENTS

R. W. S. thanks Elie Wolfe for helpful discussions. This work was supported by EPSRC grants, the EPSRC National Quantum Technology Hub in Networked Quantum Information Technologies, an FQXi Large Grant, University College Oxford, the Wiener-Anspach Foundation, and by the Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada, and by the Province of Ontario through the Ministry of Research, Innovation and Science. This project/publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the author(s) and do not necessarily reflect the views of the John Templeton Foundation.

APPENDIX A: PROOF OF THEOREMS 2 AND 3

We here provide the proof of Theorems 2 and 3. This amounts to proving that for a channel $\rho_{BC|A}$, the following four conditions are equivalent.

(1) $\rho_{BC|A}$ admits of a unitary dilation where $A$ is a complete common cause of $B$ and $C$.
(2) $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$.
(3) $I(B:C|A) = 0$, where $I(B:C|A)$ is the quantum conditional mutual information evaluated on the (positive, trace-one) operator $\hat{\rho}_{BC|A}$.
(4) The Hilbert space for the $A$ system can be decomposed as $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$ and $\rho_{BC|A} = \sum_i (\rho_{B|A_i^L} \otimes \rho_{C|A_i^R})$, where for each $i$, $\rho_{B|A_i^L}$ represents a completely positive map $\mathcal{B}(\mathcal{H}_{A_i^L}) \to \mathcal{B}(\mathcal{H}_B)$, and $\rho_{C|A_i^R}$ a completely positive map $\mathcal{B}(\mathcal{H}_{A_i^R}) \to \mathcal{B}(\mathcal{H}_C)$.

We show various implications that collectively give Theorem 2.

Proof that (3) ↔ (4). This follows easily from the results of Ref. [27], where a characterization is given of tripartite quantum states over systems $A$, $B$, $C$ that satisfy $I(B:C|A) = 0$.

Lemma 1 (Ref. [27], Theorem 6). For any tripartite quantum state $\rho_{ABC}$, the quantum conditional mutual information $I(B:C|A) = 0$ if and only if the Hilbert space of the $A$ system decomposes as $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$, such that

$$\rho_{ABC} = \sum_i p_i\,(\rho_{BA_i^L} \otimes \rho_{CA_i^R}), \qquad p_i \geq 0, \qquad \sum_i p_i = 1, \tag{A1}$$

where for each $i$, $\rho_{BA_i^L}$ is a quantum state on $\mathcal{H}_B \otimes \mathcal{H}_{A_i^L}$ and $\rho_{CA_i^R}$ is a quantum state on $\mathcal{H}_C \otimes \mathcal{H}_{A_i^R}$.

Theorem 2 concerns the channel operator $\rho_{BC|A}$, which satisfies $\mathrm{Tr}_{BC}(\rho_{BC|A}) = I_{A^*}$. Applying Lemma 1 to the operator $\hat{\rho}_{BC|A} = (1/d_A)\,\rho_{BC|A}$ yields the decomposition

$$\hat{\rho}_{BC|A} = \sum_i p_i\,(\hat{\rho}_{B|A_i^L} \otimes \hat{\rho}_{C|A_i^R}).$$

Using $\mathrm{Tr}_{BC}(\hat{\rho}_{BC|A}) = (1/d_A)\,I_{A^*}$, it follows that for each $i$, the components satisfy $\mathrm{Tr}_B(\hat{\rho}_{B|A_i^L}) = (1/d_{A_i^L})\,I_{(A_i^L)^*}$ and $\mathrm{Tr}_C(\hat{\rho}_{C|A_i^R}) = (1/d_{A_i^R})\,I_{(A_i^R)^*}$, with $p_i = (d_{A_i^L}\,d_{A_i^R})/d_A$. The result follows.
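The "if" direction of Lemma 1 can be checked numerically in the simplest case. The sketch below assumes a single term in the direct sum of Eq. (A1), with $B$, $A^L$, $A^R$, and $C$ all taken to be qubits, and compares it with a state in which $B$ and $C$ share a Bell pair that $A$ knows nothing about. The helper names and the particular states are illustrative assumptions, not part of the paper.

```python
import numpy as np


def partial_trace(rho, dims, keep):
    """Reduced state on the subsystems whose sorted indices are listed in `keep`."""
    rho = rho.reshape(list(dims) + list(dims))
    # Trace out unwanted subsystems from the highest index down, so the
    # positions of the remaining lower-index subsystems are not shifted.
    for i in sorted(set(range(len(dims))) - set(keep), reverse=True):
        rho = np.trace(rho, axis1=i, axis2=i + rho.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return rho.reshape(d, d)


def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))


def cond_mutual_info(rho, dims, B, C, A):
    """I(B:C|A) = S(BA) + S(CA) - S(BCA) - S(A)."""
    S = lambda keep: entropy(partial_trace(rho, dims, sorted(keep)))
    return S(B + A) + S(C + A) - S(B + C + A) - S(A)


bell = np.zeros((4, 4))
bell[np.ix_([0, 3], [0, 3])] = 0.5        # projector onto (|00> + |11>)/sqrt(2)
mixed = np.eye(4) / 4

dims = [2, 2, 2, 2]                        # subsystem order: B, A^L, A^R, C

# One-block instance of Eq. (A1): rho_{B A^L} (x) rho_{A^R C}.
rho_factorized = np.kron(0.8 * bell + 0.2 * mixed, 0.5 * bell + 0.5 * mixed)

# Contrast: B and C share a Bell pair, A = A^L A^R is maximally mixed.
rho_correlated = np.einsum("bcBC,lrLR->blrcBLRC",
                           bell.reshape(2, 2, 2, 2),
                           mixed.reshape(2, 2, 2, 2)).reshape(16, 16)

cmi_factorized = cond_mutual_info(rho_factorized, dims, B=[0], C=[3], A=[1, 2])
cmi_correlated = cond_mutual_info(rho_correlated, dims, B=[0], C=[3], A=[1, 2])
print(cmi_factorized, cmi_correlated)
```

Up to floating-point error, the factorized state gives a vanishing conditional mutual information, while the second state gives 2 bits, so it admits no decomposition of the form (A1).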
Proof that (1) → (4). Let $\rho^U_{BFC|\lambda_B A \lambda_C}$ be the Choi-Jamiołkowski operator for the unitary $U$, defined according to the conventions set out in the main text. Let missing indices indicate that a partial trace is taken, as also in the main text. Note that, in general, $\rho^U_{BC|A} \neq \rho_{BC|A}$, since the latter is obtained via a particular choice of input states for $\lambda_B$ and $\lambda_C$. The proof proceeds by proving relations between quantum conditional mutual information evaluated on the renormalized operator $\hat{\rho}^U_{BFC|\lambda_B A \lambda_C} = (1/d_{\lambda_B} d_A d_{\lambda_C})\,\rho^U_{BFC|\lambda_B A \lambda_C}$ and its partial traces.

First,

$$I(B : FC | \lambda_B A \lambda_C) = 0. \tag{A2}$$

This follows by expanding in terms of von Neumann entropies:

$$I(B : FC | \lambda_B A \lambda_C) = S(\hat{\rho}^U_{B|\lambda_B A \lambda_C}) + S(\hat{\rho}^U_{FC|\lambda_B A \lambda_C}) - S(\hat{\rho}^U_{BFC|\lambda_B A \lambda_C}) - S(\hat{\rho}^U_{\cdot|\lambda_B A \lambda_C}). \tag{A3}$$

The third term is zero, since the unitarity of $U$ implies that $\hat{\rho}^U_{BFC|\lambda_B A \lambda_C}$ is a pure state. The final term is $\log(d_{\lambda_B} d_A d_{\lambda_C})$, since $\hat{\rho}^U_{\cdot|\lambda_B A \lambda_C} = (1/d_{\lambda_B} d_A d_{\lambda_C})\,I_{(\lambda_B A \lambda_C)^*}$. Noting also that $\mathrm{Tr}_{\lambda_B A \lambda_C}(\hat{\rho}^U_{BFC|\lambda_B A \lambda_C}) = (1/d_{\lambda_B} d_A d_{\lambda_C})\,I_{BFC}$, and using the fact that the von Neumann entropy of the partial trace of a pure state is equal to the von Neumann entropy of the complementary partial trace, yields that the first two terms equal $\log(d_F d_C)$ and $\log(d_B)$, respectively; hence, their sum is equal to $\log(d_{\lambda_B} d_A d_{\lambda_C})$, and Eq. (A2) follows.

Second,

$$I(\lambda_B : \lambda_C | A) = 0. \tag{A4}$$

This follows immediately from $\hat{\rho}^U_{\cdot|\lambda_B A \lambda_C} = (1/d_{\lambda_B} d_A d_{\lambda_C})\,I_{(\lambda_B A \lambda_C)^*}$.

Third,

$$I(B : \lambda_C | \lambda_B A) = 0. \tag{A5}$$

To see this, write

$$I(B : \lambda_C | \lambda_B A) = S(\hat{\rho}^U_{B|\lambda_B A}) + S(\hat{\rho}^U_{\cdot|\lambda_B A \lambda_C}) - S(\hat{\rho}^U_{B|\lambda_B A \lambda_C}) - S(\hat{\rho}^U_{\cdot|\lambda_B A}). \tag{A6}$$

The second and fourth terms are entropies of maximally mixed states on their respective systems, and hence, sum to $\log d_{\lambda_C}$. For the first and third terms, it follows from the assumption that there is no causal influence from $\lambda_C$ to $B$ in $U$ that $\hat{\rho}^U_{B|\lambda_B A \lambda_C} = \hat{\rho}^U_{B|\lambda_B A} \otimes (1/d_{\lambda_C})\,I_{(\lambda_C)^*}$. Hence, the third term is equal to $S(\hat{\rho}^U_{B|\lambda_B A}) + \log(d_{\lambda_C})$, which gives Eq. (A5).

Fourth,

$$I(C : \lambda_B | A \lambda_C) = 0. \tag{A7}$$

This follows from a similar argument as Eq. (A5), using the assumption that there is no influence from $\lambda_B$ to $C$ in $U$.

The aim is now to use Eqs. (A2), (A4), (A5), and (A7) to show that $\hat{\rho}_{BC|A}$ satisfies $I(B:C|A) = 0$. This follows using a result from Ref. [12], which states that quantum conditional mutual information on partial traces of a multipartite quantum state satisfies the semigraphoid axioms familiar from the classical formalism of causal networks [8]. The semigraphoid axioms are as follows:

$$[I(X:Y|Z) = 0] \Rightarrow [I(Y:X|Z) = 0], \tag{A8}$$

$$[I(X:YW|Z) = 0] \Rightarrow [I(X:Y|Z) = 0], \tag{A9}$$

$$[I(X:YW|Z) = 0] \Rightarrow [I(X:Y|ZW) = 0], \tag{A10}$$

$$[I(X:Y|Z) = 0] \wedge [I(X:W|YZ) = 0] \Rightarrow [I(X:YW|Z) = 0]. \tag{A11}$$

Applying Eqs. (A8)–(A11) to Eqs. (A2), (A4), (A5), and (A7) gives

$$[I(B:FC|\lambda_B A \lambda_C) = 0] \Rightarrow [I(B:C|\lambda_B A \lambda_C) = 0], \tag{A12}$$

$$[I(C:\lambda_B|A\lambda_C) = 0] \wedge [I(B:C|\lambda_B A \lambda_C) = 0] \Rightarrow [I(C:B\lambda_B|A\lambda_C) = 0], \tag{A13}$$

$$[I(\lambda_B:\lambda_C|A) = 0] \wedge [I(\lambda_C:B|\lambda_B A) = 0] \Rightarrow [I(\lambda_C:B\lambda_B|A) = 0], \tag{A14}$$

$$[I(B\lambda_B:\lambda_C|A) = 0] \wedge [I(B\lambda_B:C|A\lambda_C) = 0] \Rightarrow [I(B\lambda_B:C\lambda_C|A) = 0]. \tag{A15}$$

Hence, condition (1) of the theorem implies that $I(B\lambda_B : C\lambda_C | A) = 0$, where this quantity is calculated on the trace-one Choi-Jamiołkowski operator representing the dilation unitary $U$. Using Lemma 1 gives

$$\hat{\rho}^U_{BC|\lambda_B A \lambda_C} = \sum_i p_i\,(\hat{\rho}^U_{B|\lambda_B A_i^L} \otimes \hat{\rho}^U_{C|A_i^R \lambda_C}), \tag{A16}$$

for some appropriate decomposition of $\mathcal{H}_A^*$ and probability distribution $\{p_i\}_i$. The form of the decomposition, and the fact that $\mathrm{Tr}_{BC}(\hat{\rho}^U_{BC|\lambda_B A \lambda_C}) = (1/d_{\lambda_B} d_A d_{\lambda_C})\,I_{(\lambda_B A \lambda_C)^*}$, gives

$$\rho^U_{BC|\lambda_B A \lambda_C} = \sum_i (\rho^U_{B|\lambda_B A_i^L} \otimes \rho^U_{C|A_i^R \lambda_C}), \tag{A17}$$

where for each $i$, the components satisfy $\mathrm{Tr}_B(\rho^U_{B|\lambda_B A_i^L}) = I_{(\lambda_B A_i^L)^*}$ and $\mathrm{Tr}_C(\rho^U_{C|A_i^R \lambda_C}) = I_{(A_i^R \lambda_C)^*}$. The operator $\rho_{BC|A}$ is obtained by acting with this channel on the input states $|0\rangle_{\lambda_B}$ for $\lambda_B$ and $|0\rangle_{\lambda_C}$ for $\lambda_C$. This gives

$$\rho_{BC|A} = \sum_i (\rho_{B|A_i^L} \otimes \rho_{C|A_i^R}),$$

where $\mathrm{Tr}_B(\rho_{B|A_i^L}) = I_{(A_i^L)^*}$ and $\mathrm{Tr}_C(\rho_{C|A_i^R}) = I_{(A_i^R)^*}$, as required.

Proof that (4) → (1). Let $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i}$, with $\mathcal{H}_{A_i} = \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$, and $\rho_{BC|A} = \sum_i (\rho_{B|A_i^L} \otimes \rho_{C|A_i^R})$. Each term $\rho_{B|A_i^L}$ corresponds to a valid quantum channel, i.e., a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^L}) \to \mathcal{B}(\mathcal{H}_B)$. Similarly, each term $\rho_{C|A_i^R}$ corresponds to a CPTP map $\mathcal{B}(\mathcal{H}_{A_i^R}) \to \mathcal{B}(\mathcal{H}_C)$.

The channel $\rho_{B|A_i^L}$ can be dilated to a unitary transformation $V_i$, with ancilla input $\lambda_B$ in a fixed state $|0\rangle_{\lambda_B}$, such that $V_i$ acts on the Hilbert space $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_i^L}$. Similarly, $\rho_{C|A_i^R}$ can be dilated to a unitary transformation $W_i$, with ancilla $\lambda_C$ in a fixed state $|0\rangle_{\lambda_C}$, acting on $\mathcal{H}_{A_i^R} \otimes \mathcal{H}_{\lambda_C}$. By choosing the dimension of $\lambda_B$ large enough, we can identify the system $\lambda_B$ and the state $|0\rangle_{\lambda_B}$ that are used for each value of $i$, and similarly $\lambda_C$.

Let $\tilde{V}_i$ be the operator that acts as $V_i \otimes I_{A_i^R}$ on the subspace $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_i}$, and as zero on the subspace $\mathcal{H}_{\lambda_B} \otimes \mathcal{H}_{A_j}$, for $j \neq i$. Similarly, let $\tilde{W}_i$ be the operator that acts as $I_{A_i^L} \otimes W_i$ on the subspace $\mathcal{H}_{A_i} \otimes \mathcal{H}_{\lambda_C}$, and as zero on the subspace $\mathcal{H}_{A_j} \otimes \mathcal{H}_{\lambda_C}$ for $j \neq i$. Let

$$V = \sum_i \tilde{V}_i, \tag{A18}$$

$$W = \sum_i \tilde{W}_i, \tag{A19}$$

where $W$ and $V$ are unitary and $[V \otimes I_{\lambda_C}, I_{\lambda_B} \otimes W] = 0$. The channel represented by $\rho_{BC|A}$ can be dilated to the unitary transformation $U = (I_{\lambda_B} \otimes W)(V \otimes I_{\lambda_C})$, with ancillas $\lambda_B$ and $\lambda_C$. From the form of $V$ and $W$, it follows immediately that there is no causal influence from $\lambda_C$ to $B$ in $U$. From $[V \otimes I_{\lambda_C}, I_{\lambda_B} \otimes W] = 0$ and the form of $V$ and $W$, it follows immediately that there is no influence from $\lambda_B$ to $C$ in $U$.

Proof that (2) → (3). As remarked in the main text, taking the Hermitian conjugate of $\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}$ immediately gives $[\rho_{B|A}, \rho_{C|A}] = 0$. Hence,

$$\rho_{BC|A} = \rho_{B|A}\,\rho_{C|A}, \tag{A20}$$

$$\rho_{BC|A} = \exp[\log \rho_{B|A} + \log \rho_{C|A}], \tag{A21}$$

$$\log \rho_{BC|A} = \log \rho_{B|A} + \log \rho_{C|A}, \tag{A22}$$

$$\log \rho_{BC|A} + \log \rho_{\cdot|A} = \log \rho_{B|A} + \log \rho_{C|A}, \tag{A23}$$

$$\log(d_A^{-1} \rho_{BC|A}) + \log(d_A^{-1} \rho_{\cdot|A}) = \log(d_A^{-1} \rho_{B|A}) + \log(d_A^{-1} \rho_{C|A}). \tag{A24}$$

The second line follows as $[\rho_{B|A}, \rho_{C|A}] = 0$; the fourth because $\rho_{\cdot|A} = I_{A^*}$, and therefore has the zero matrix as its logarithm; and the final line by adding $2 \log d_A^{-1}$ to both sides. It is proved in Ref. [85] that for any trace-one density operator $\rho_{XYZ}$, $\log \rho_{XYZ} + \log \rho_Z = \log \rho_{XZ} + \log \rho_{YZ}$ is equivalent to the condition $I(X:Y|Z) = 0$.

Proof that (4) → (2). Condition (4) is that $\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R}$, with $\rho_{BC|A} = \sum_i (\rho_{B|A_i^L} \otimes \rho_{C|A_i^R})$. It follows that

$$\rho_{B|A} = \sum_i (\rho_{B|A_i^L} \otimes I_{(A_i^R)^*}), \tag{A25}$$

$$\rho_{C|A} = \sum_i (I_{(A_i^L)^*} \otimes \rho_{C|A_i^R}). \tag{A26}$$

The product is

$$\rho_{B|A}\,\rho_{C|A} = \sum_{i,j} (\rho_{B|A_i^L} \otimes I_{(A_i^R)^*})(I_{(A_j^L)^*} \otimes \rho_{C|A_j^R}). \tag{A27}$$

The only nonzero terms correspond to $i = j$; hence,

$$\rho_{B|A}\,\rho_{C|A} = \sum_i \rho_{B|A_i^L} \otimes \rho_{C|A_i^R} = \rho_{BC|A}. \tag{A28}$$

APPENDIX B: PROOF OF THEOREM 4

Proof that (3) → (2). The proof proceeds via an inductive argument. Consider

$$\rho_{B_1 \ldots B_n|A} = \mathrm{Tr}_{B_{n+1} \ldots B_k}(\rho_{B_1 \ldots B_k|A}),$$

with $2 \leq n < k$, and assume that the claim holds for this channel; hence,

$$\rho_{B_1 \ldots B_n|A} = \rho_{B_1|A} \cdots \rho_{B_n|A}, \tag{B1}$$

with $[\rho_{B_i|A}, \rho_{B_j|A}] = 0$, for $i, j = 1, \ldots, n$. It is shown that the claim remains true if one fewer system is traced out. To see this, recall that condition (3) gives $I(B_{n+1} : \bar{B}_{n+1} | A) = 0$, where $I(B_{n+1} : \bar{B}_{n+1} | A)$ is evaluated on $\hat{\rho}_{B_1 \ldots B_k|A}$. Using Theorem 2 gives

$$\rho_{B_1 \ldots B_k|A} = \rho_{\bar{B}_{n+1}|A}\,\rho_{B_{n+1}|A},$$

with $[\rho_{\bar{B}_{n+1}|A}, \rho_{B_{n+1}|A}] = 0$. Tracing out systems $B_{n+2} \ldots B_k$ results in

$$\rho_{B_1 \ldots B_{n+1}|A} = \rho_{B_1 \ldots B_n|A}\,\rho_{B_{n+1}|A},$$

with $[\rho_{B_1 \ldots B_n|A}, \rho_{B_{n+1}|A}] = 0$. Since $\rho_{B_1 \ldots B_n|A}$ satisfies Eq. (B1), it follows that

$$\rho_{B_1 \ldots B_{n+1}|A} = \rho_{B_1|A} \cdots \rho_{B_{n+1}|A}.$$

For any $i = 1, \ldots, n$, trace out all systems but $B_i$, $B_{n+1}$, and $A$ to see that $[\rho_{B_i|A}, \rho_{B_{n+1}|A}] = 0$.

Hence, if $\rho_{B_1 \ldots B_n|A}$ satisfies the claim, so too does $\rho_{B_1 \ldots B_{n+1}|A}$. As $\rho_{B_1 B_2|A} = \rho_{B_1|A}\,\rho_{B_2|A}$, with $[\rho_{B_1|A}, \rho_{B_2|A}] = 0$, follows from $I(B_1 : \bar{B}_1 | A) = 0$, and tracing out all but systems $B_1$, $B_2$, and $A$, the proof is complete.

Proof that (2) → (3). This is immediate from Theorem 2, by grouping outputs into $B_i$ and $\bar{B}_i$ for each $i$.

Proof that (3) ↔ (4). The proof that (4) → (3) is immediate from Theorem 2, by grouping outputs into $B_i$ and $\bar{B}_i$ for each $i$.

It remains to show that if $I(B_i : \bar{B}_i | A) = 0$ for all $i$, then there exists a decomposition

$$\mathcal{H}_A = \bigoplus_i \bigotimes_{j=1}^{k} \mathcal{H}_{A_i^j}, \tag{B2}$$

with $\rho_{B_1 \ldots B_k|A} = \sum_i (\rho_{B_1|A_i^1} \otimes \cdots \otimes \rho_{B_k|A_i^k})$.

Given $I(B_1 : \bar{B}_1 | A) = 0$, Theorem 2 implies that $\mathcal{H}_A$ decomposes as

$$\mathcal{H}_A = \bigoplus_i \mathcal{H}_{A_i^L} \otimes \mathcal{H}_{A_i^R},$$

with $\rho_{B_1 \ldots B_k|A} = \sum_i \rho_{B_1|A_i^L} \otimes \rho_{B_2 \ldots B_k|A_i^R}$. By assumption, $I(B_2 : \bar{B}_2 | A) = I(B_2 : B_1, B_3, \ldots, B_k | A) = 0$. As the conditional mutual information never increases if systems are discarded, we have $0 = I(B_2 : B_1, B_3, \ldots, B_k | A) \geq I(B_2 : B_3, \ldots, B_k | A)$. Non-negativity of the conditional mutual information then yields $I(B_2 : B_3, \ldots, B_k | A) = 0$.

The above decomposition ensures

$$\hat{\rho}_{B_2 \ldots B_k|A} = \sum_i p_i\,\frac{I_{(A_i^L)^*}}{d_{A_i^L}} \otimes \hat{\rho}_{B_2 \ldots B_k|A_i^R},$$

with $p_i = d_{A_i^L}\,d_{A_i^R}/d_A$. As the terms in the sum on the rhs have support on orthogonal subspaces,

$$S(\hat{\rho}_{B_2 B_3 \ldots B_k|A}) = H(p) + \sum_i p_i \log d_{A_i^L} + \sum_i p_i\,S(\hat{\rho}_{B_2 \ldots B_k|A_i^R}),$$

$$S(\hat{\rho}_{B_2|A}) = H(p) + \sum_i p_i \log d_{A_i^L} + \sum_i p_i\,S(\hat{\rho}_{B_2|A_i^R}),$$

$$S(\hat{\rho}_{B_3 \ldots B_k|A}) = H(p) + \sum_i p_i \log d_{A_i^L} + \sum_i p_i\,S(\hat{\rho}_{B_3 \ldots B_k|A_i^R}),$$

$$S(\hat{\rho}_{\cdot|A}) = H(p) + \sum_i p_i \log d_{A_i^L} + \sum_i p_i\,S(\hat{\rho}_{\cdot|A_i^R}).$$

Substituting into

$$I(B_2 : B_3, \ldots, B_k | A) = S(\hat{\rho}_{B_2|A}) + S(\hat{\rho}_{B_3 \ldots B_k|A}) - S(\hat{\rho}_{B_2 B_3 \ldots B_k|A}) - S(\hat{\rho}_{\cdot|A}),$$

the $H(p)$ terms and the $p_i \log d_{A_i^L}$ terms cancel, and one is left with

$$I(B_2 : B_3, \ldots, B_k | A) = \sum_i p_i\,I(B_2 : B_3, \ldots, B_k | A_i^R) = 0.$$

Non-negativity of both the conditional mutual information and the $p_i$ implies

$$I(B_2 : B_3 \ldots B_k | A_i^R) = 0.$$

Hence, each $\mathcal{H}_{A_i^R}$ in the above decomposition further decomposes into a direct sum of tensor products. Iterating this procedure results in the required decomposition.

Proof that (1) ↔ (4). The proof that (4) → (1) is a straightforward extension of the proof in Appendix A that condition (4) → (1) in Theorem 2.

To show that (1) → (4), first use Definition 4 to show that if, for each $i$, there is no causal influence from $\lambda_i$ to $\bar{B}_i$, it follows that, for each $i$, there is no causal influence from $\bar{\lambda}_i$ to $B_i$. Partitioning the output systems into $B_i$ and $\bar{B}_i$, and the ancilla systems $\lambda_1, \ldots, \lambda_k$ into $\lambda_i$ and $\bar{\lambda}_i$, it follows from Theorem 2 that $I(B_i : \bar{B}_i | A) = 0$. Hence, condition (1) → (3), and since condition (3) → (4), the result follows.

[1] H. Reichenbach, The Direction of Time, edited by M. Reichenbach (University of California Press, Berkeley, CA, 1991).
[2] C. J. Wood and R. W. Spekkens, The Lesson of Causal Discovery Algorithms for Quantum Correlations: Causal Explanations of Bell-Inequality Violations Require Fine-Tuning, New J. Phys. 17, 033002 (2015).
[3] J. S. Bell, Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press, Cambridge, England, 1964).
[4] John F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, Proposed Experiment to Test Local Hidden-Variable Theories, Phys. Rev. Lett. 23, 880 (1969).
[5] B. Hensen et al., Loophole-Free Bell Inequality Violation Using Electron Spins Separated by 1.3 Kilometres, Nature (London) 526, 682 (2015).
[6] L. K. Shalm, E. Meyer-Scott, B. G. Christensen, P. Bierhorst, M. A. Wayne, M. J. Stevens, T. Gerrits, S. Glancy, D. R. Hamel, M. S. Allman et al., Strong Loophole-Free Test of Local Realism, Phys. Rev. Lett. 115, 250402 (2015).
[20] M.-D. Choi, Completely Positive Linear Maps on Complex Matrices, Linear Algebra Appl. 10, 285 (1975).
[21] M. S. Leifer and R. W. Spekkens, Towards a Formulation of Quantum Theory as a Causally Neutral Theory of Bayesian Inference, Phys. Rev. A 88, 052130 (2013).
[22] B. Schumacher and M. D. Westmoreland, Locality and Information Transfer in Quantum Operations, Quantum
[7] M. Giustina, M. A. M.
Versteegh, S. Wengerowsky, Inf. Process. 4, 13 (2005). J. Handsteiner, A. Hochrainer, K. Phelan, F. Steinlechner, [23] D. Beckman, D. Gottesman, M. A. Nielsen, and J. Preskill, J. Kofler, J.-Å. Larsson, C. Abellán et al., Significant- Causal and Localizable Quantum Operations, Phys. Rev. A Loophole-Free Test of Bell’s Theorem with Entangled 64, 052309 (2001). Photons, Phys. Rev. Lett. 115, 250401 (2015). [24] T. Eggeling, D. Schlingemann, and R. F. Werner, [8] J. Pearl, Causality: Models, Reasoning, and Inference, 2nd Semicausal Operations are Semilocalizable, Europhys. ed. (Cambridge University Press, Cambridge, England, Lett. 57, 782 (2002). 2009). [25] G. M. D’Ariano, S. Facchini, and P. Perinotti, No-Signaling, Entanglement-Breaking, and Localizability in Bipartite [9] P. Spirtes, C. Glymour, and R. Scheines, Causation, Channels, Phys. Rev. Lett. 106, 010501 (2011). Prediction, and Search, 2nd ed. (MIT Press, Cambridge, [26] One could equally well evaluate the conditional mutual MA, 2001). information on any distribution PðXYZÞ obtained from [10] R. Chaves, R. Kueng, J. B. Brask, and D. Gross, Unifying PðYZjXÞ and an input distribution PðXÞ that has full Framework for Relaxations of the Causal Assumptions in support; again, the statement IðY∶ZjXÞ¼ 0 is equivalent Bell’s Theorem, Phys. Rev. Lett. 114, 140403 (2015). to the conditional independence of Y and Z given X.We [11] R. Chaves, C. Majenz, and D. Gross, Information-Theoretic consider the particular case of a uniform distribution over X Implications of Quantum Causal Structures, Nat. Commun. in order to maintain the strongest possible analogy with the 6, 5766 (2015). quantum case. [12] M. S. Leifer and D. Poulin, Quantum Graphical Models and [27] P. Hayden, R. Jozsa, D. Petz, and A. Winter, Structure of Belief Propagation, Ann. Phys. (Amsterdam) 323, 1899 States which Satisfy Strong Subadditivity of Quantum (2008). Entropy with Equality, Commun. Math. Phys. 246, 359 [13] J. F. Fitzsimons, J. A. 
Jones, and V. Vedral, Quantum (2004). Correlations which Imply Causation, Sci. Rep. 5, 18281 [28] B. Coecke and R. Lal, Causal Categories: Relativistically (2015). Interacting Processes, Found. Phys. 43, 458 (2013). [14] K. Ried, M. Agnew, L. Vermeyden, D. Janzing, R. W. [29] The analogous mixed notation also appears in Fig. 5 for the Spekkens, and K. J. Resch, A Quantum Advantage for classical case. Inferring Causal Structure, Nat. Phys. 11, 414 (2015). [30] In introducing the circle notation, we define the action of [15] E. G. Cavalcanti and R. Lal, On Modifications of gates on the left or right factors in such a way that coherence Reichenbach’s Principle of Common Cause in Light of between different subspaces in the direct sum can be Bell’s Theorem, J. Phys. A 47, 424018 (2014). maintained. This corresponds to the fact that the condition [16] Reichenbach’s principle assumes that it cannot happen both of decomposing into the appropriate form is applied to the that Y is a cause of Z and that Z is a cause of Y. This is unitary or Kraus operators, rather than to the channel natural if Y and Z are physical variables pertaining to operator itself. We have done this on the grounds that with systems that are localized in space and time. But it is an this definition, the circle notation is most likely to be useful assumption that may not hold for the generic case in which in future applications. In the lower two circuits of Fig. 6, causal explanations are sought for statistical data, since there however, note that coherence between the different sub- can be causal feedback loops: partaking of an addiction, say, spaces is lost. In the lower right circuit, coherence is lost may cause a low mood, which in turn may worsen the when the partial traces are performed on the extra outgoing addiction. Ultimately, it is the specific application that will wires. 
In the lower left circuit, the final output admits a determine whether adoption of the qualitative part of global factorization of the form B ⊗ C, and output wires Reichenbach’s principle (more generally, the formalism of carrying an i index do not even appear, indicating that this directed acyclic graphs introduced below) is appropriate. degree of freedom has been traced out. Each Kraus operator, [17] In the community studying classical causal inference, a in this case, must act nontrivially only on the ith subspace, deterministic causal dependence such as Y ¼ fðX; λÞ is for some i, and one may deduce that ρ is of the form BjA termed a structural equation, and a causal model wherein all ρ L ⊗ I R, and similarly ρ , consistently with con- causal dependences are deterministic is termed a functional BjA A CjA i i causal model (see, e.g., Ref. [8]). dition (4) of Theorem 3. [18] This is because any other λ would necessarily introduce new [31] Clearly, the notation can be extended in various ways to common causes for Y and Z that are not screened through X, include circles with multiple output wires, circles indicating which would violate the assumption that X is a complete a further decomposition following another circle, and so common cause. on. A fully general interpretation and calculus for these [19] A. Jamiołkowski, Linear Transformations which Preserve extended circuit diagrams is left for future work. Trace and Positive Semidefiniteness of Operators, Rep. [32] The quantum version in Fig. 9 was studied for similar Math. Phys. 3, 275 (1972). reasons in Ref. [33], though from a different perspective. 031021-20 QUANTUM COMMON CAUSES AND QUANTUM CAUSAL MODELS PHYS. REV. X 7, 031021 (2017) [33] B. Schumacher and M. D. Westmoreland, Isolation and [48] O. Oreshkov, F. Costa, and Č. Brukner, Quantum Corre- Information Flow in Quantum Dynamics, Found. Phys. 42, lations with No Causal Order, Nat. Commun. 3, 1092 926 (2012). (2012). [34] C. M. Caves, C. A. Fuchs, and R. 
Schack, Quantum Prob- [49] M. Araújo, F. Costa, and Č. Brukner, Computational abilities as Bayesian Probabilities, Phys. Rev. A 65, 022305 Advantage from Quantum-Controlled Ordering of Gates, (2002). Phys. Rev. Lett. 113, 250402 (2014). [35] C. A. Fuchs, Quantum Mechanics as Quantum Information [50] F. Costa and S. Shrapnel, Quantum Causal Modelling, New (and Only a Little More), arXiv:quant-ph/0205039. J. Phys. 18, 063032 (2016). [36] R. W. Spekkens, Evidence for the Epistemic View of [51] R. Oeckl, A General Boundary, Formulation for Quantum Quantum States: A Toy Theory, Phys. Rev. A 75, 032110 Mechanics and Quantum Gravity, Phys. Lett. B 575, 318 (2007). (2003). [37] M. S. Leifer, Quantum Dynamics as an Analog of Condi- [52] O. Oreshkov and N. J. Cerf, Operational Quantum Theory tional Probability, Phys. Rev. A 74, 042310 (2006). without Predefined Time, New J. Phys. 18, 073037 (2016). [38] R. W. Spekkens, in Quantum Theory: Informational Foun- [53] O. Oreshkov and N. J. Cerf, Operational Formulation of dations and Foils, edited by G. Chiribella and R. W. Time Reversal in Quantum Theory, Nat. Phys. 11, 853 Spekkens (Springer, New York, 2016), pp. 83–135. (2015). [39] It is interesting to consider an exactly analogous scenario, as [54] This might be seen as a generalization of standard usage, it arises in the toy theory of Ref. [36]. Here, a system since in most treatments of quantum theory, the state analogous to a qubit can exist in one of four distinct classical describes a collection of systems at a single time, i.e., with states (the ontic states of the system). But an agent who none being causal descendants of others. prepares systems and measures them can only ever have [55] Classical interventional models, as we describe them, seem partial information about which of the four ontic states a rarely to be studied in full generality in the classical system is in. The toy equivalent of a CNOT gate corresponds literature. 
However, among the possible intervention to a reversible deterministic map, i.e., a permutation of the schemes are included the following special cases: ignoring ontic states. By considering the probability distribution over I O X and repreparing X with a fixed value x of which one ontic states of the various systems, one may verify directly i i O I keeps a record, corresponding to Pðk ;X jX Þ¼ δ δ O that the ontic states of toy systems B and C are not i i i k ;x X ;x determined by the ontic state of toy system A. Rather, (the standard notion of intervention, as set out, e.g., in I O the ontic states of B and C depend also on the ontic state Ref. [8]); ignoring X and repreparing X with a value that i i of λ. Furthermore, the analogue of a pure quantum state for λ is is sampled randomly and independently of X and of which a probability distribution on λ that is not a point distribution. O I one keeps a record, corresponding to Pðk ;X jX Þ¼ i i i In this way, statistical correlations between B and C can be O I δ OPðX Þ (a randomized trial); observing X , keeping k ;X i i underwritten by statistical variation in the ontic state of λ. a record of this value, and preparing X to have this value, [40] D. Horsman, C. Heunen, M. F. Pusey, J. Barrett, and R. W. O I corresponding to Pðk ;X jX Þ¼ δ I δ O I (passive ob- i k ;X X ;X Spekkens, Can a Quantum State Over Time Resemble a i i i i i Quantum State at a Single Time?, arXiv:1607.03637. servation of X ); observing X , keeping a record of its value, i i [41] Y. Aharonov and L. Vaidman, in Time in Quantum and repreparing X to have a fixed value x, corresponding to O I Mechanics, edited by G. Muga, R. S. Mayato, and I. Pðk ;X jX Þ¼ δ I δ O (the sort of intervention consid- i i k ;X X ;x i i Egusquiza (Springer, New York, 2007), pp. 399–447. ered in single-world intervention graphs [72]); simply [42] Y. Aharonov, S. Popescu, J. Tollaksen, and L. 
Vaidman, O I letting the value of X track the value of X , corresponding i i Multiple-Time States and Multiple-Time Measurements in O I to k being trivial, and PðX jX Þ¼ δ O I (no observation i i X ;X i i Quantum Mechanics, Phys. Rev. A 79, 052110 (2009). being made); and many others besides. [43] Y. Aharonov, S. Popescu, and J. Tollaksen, Each Instant of [56] M. Rédei, in Non-Locality and Modality: Proceedings of Time a New Universe,in Quantum Theory: A Two-Time the Nato Advanced Research Workshop on Modality, Success Story, Yakir Aharonov Festschrift, edited by D. C. Probability, and Bell’s Theorems, NATO Science Series, Struppa and J. M. Tollaksen (Springer, New York, 2013), Vol. 64, edited by T. Placek and J. Butterfield (Springer, pp. 21–36. Dordrecht, 2002), pp. 259–270. [44] R. Silva, Y. Guryanova, N. Brunner, N. Linden, A. J. Short, [57] G. Hofer-Szabó, M. Rédei, and L. E. Szabó, On Reich- and S. Popescu, Pre- and Postselected Quantum States: enbach’s Common Cause Principle and Reichenbach’s Density Matrices, Tomography, and Kraus Operators, Notion of Common Cause, Br. J. Philos. Sci. 50, 377 (1999). Phys. Rev. A 89, 012121 (2014). [58] G. Hofer-Szabó and P. Vecsernyés, Noncommuting Local [45] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Theoretical Common Causes for Correlations Violating the Clauser- Framework for Quantum Networks, Phys. Rev. A 80, Horne Inequality, J. Math. Phys. (N.Y.) 53, 122301 (2012). 022339 (2009). [59] G. Hofer-Szabó and P. Vecsernyés, Bell Inequality and [46] G. Chiribella, Perfect Discrimination of No-Signalling Common Causal Explanation in Algebraic Quantum Field Channels via Quantum Superposition of Causal Structures, Theory, Stud. Hist. Phil. Mod. Phys. 44, 404 (2013). Phys. Rev. A 86, 040301 (2012). [47] G. Chiribella, G. M. D’Ariano, P. Perinotti, and B. Valiron, [60] C. Branciard, N. Gisin, and S. 
Pironio, Characterizing Quantum Computations without Definite Causal Structure, the Nonlocal Correlations Created via Entanglement Swap- Phys. Rev. A 88, 022318 (2013). ping, Phys. Rev. Lett. 104, 170401 (2010). 031021-21 ALLEN, BARRETT, HORSMAN, LEE, and SPEKKENS PHYS. REV. X 7, 031021 (2017) [61] T. Fritz, Beyond Bell’s Theorem: Correlation Scenarios, [74] R. R. Tucci, Quantum Bayesian Nets, Int. J. Mod. Phys. B New J. Phys. 14, 103001 (2012). 09, 295 (1995). [62] R. Chaves, L. Luft, and D. Gross, Causal Structures [75] R. R. Tucci, An Introduction to Quantum Bayesian from Entropic Information: Geometry and Novel Scenarios, Networks for Mixed States, arXiv:1204.1550. New J. Phys. 16, 043001 (2014). [76] C. M. Lee and R. W. Spekkens, Causal Inference via [63] J. Henson, R. Lal, and M. F. Pusey, Theory-Independent Algebraic Geometry: Necessary and Sufficient Conditions Limits on Correlations from Generalized Bayesian for the Feasibility of Discrete Causal Models, arXiv: Networks, New J. Phys. 16, 113043 (2014). 1506.03880. [64] T. Fritz, Beyond Bell’s Theorem II: Scenarios with Arbitrary [77] E. Wolfe, R. W. Spekkens, and T. Fritz, The Inflation Causal Structure, Commun. Math. Phys. 341, 391 (2016). Technique for Causal Inference with Latent Variables, [65] L. Hardy, Quantum Theory From Five Reasonable Axioms, arXiv:1609.00672. arXiv:quant-ph/0101012. [78] R. Chaves, L. Luft, T. O. Maciel, D. Gross, D. Janzing, and [66] J. Barrett, Information Processing in Generalized Probabi- B. Schölkopf, in Proceedings of the 30th Conference on listic Theories, Phys. Rev. A 75, 032304 (2007). Uncertainty in Artificial Intelligence (UAI 2014) (AUAI [67] S. Abramsky and B. Coecke, in Handbook of Quantum Press, Corvallis, 2014), pp. 112–121. Logic and Quantum Structures: Quantum Logic, edited [79] R. Chaves, Polynomial Bell Inequalities, Phys. Rev. Lett. by K. Engesser, D. Gabbay, and D. Lehmann (Elsevier, 116, 010402 (2016). New York, 2009), pp. 261–324. [80] D. Rosset, C. 
Branciard, T. J. Barnea, G. Pütz, N. Brunner, [68] B. Coecke, Quantum Picturalism, Contemp. Phys. 51,59 and N. Gisin, Nonlinear Bell Inequalities Tailored for (2010). Quantum Networks, Phys. Rev. Lett. 116, 010403 [69] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Probabi- (2016). listic Theories with Purification, Phys. Rev. A 81, 062348 [81] B. S. Tsirelson, Quantum Generalizations of Bell’s Inequal- (2010). ity, Lett. Math. Phys. 4, 93 (1980). [70] L. Hardy, The Operator Tensor Formulation of Quantum [82] S. Wehner, Tsirelson Bounds for Generalized Clauser- Theory, Phil. Trans. R. Soc. A 370, 3385 (2012). Horne-Shimony-Holt Inequalities, Phys. Rev. A 73, [71] L. Hardy, Quantum Gravity Computers: On the Theory 022110 (2006). of Computation with Indefinite Causal Structure, arXiv: [83] J.-P. W. MacLean, K. Ried, R. W. Spekkens, and K. J. quant-ph/0701019. Resch, Quantum-Coherent Mixtures of Causal Relations, [72] T. S. Richardson and J. M. Robins, Single World Interven- arXiv:1606.04523. tion Graphs (SWIGs): A Unification of the Counterfactual [84] A. Feix and Č. Brukner, Quantum Superpositions of and Graphical Approaches to Causality, Center for “Common-Cause” and “Direct-Cause” Causal Structures, Statistics and the Social Sciences Working Papers Series arXiv:1606.09241. No. 128 (University of Washington, Seattle, 2013). [85] M. B. Ruskai, Inequalities for Quantum Entropy: A Review [73] J. Pienaar and Č. Brukner, A Graph-Separation Theorem for with Conditions for Equality, J. Math. Phys. (N.Y.) 43, 4358 Quantum Causal Models, New J. Phys. 17, 073020 (2015). (2002). 031021-22

Journal

Physical Review X, American Physical Society (APS)

Published: Jul 1, 2017
