Abstract

Background
The majority of health-service users seem unable to properly compute the positive predictive value of medical tests. The present study investigated whether, and to what extent, probabilistic inferences about a positive test result can be improved by changing the traditional way in which probability judgments are elicited and medical information is presented.

Methods
Online survey respondents were presented with a positive test result regarding a pregnant woman, and had to estimate the chances that her unborn baby had an anomaly (standard judgment), to apportion the numbers of chances for and against this hypothesis (distributive judgment), and to indicate whether the hypothesis that the baby had an anomaly was more or less likely than its alternative (relative judgment). Test sensitivity and information framing were also manipulated.

Results
Irrespective of education and to some extent of numeracy, the majority of respondents produced correct distributive assessments of chances, which were in line with relative judgments and more accurate than standard ones. When the information displayed exclusively positive test results, inferences improved further and were unaffected by test sensitivity.

Conclusions
Simple communication strategies that prompt extensional reasoning on the relevant set of numbers of chances can help individuals to overcome probabilistic inference errors.

Keywords: Probabilistic reasoning, Positive predictive value, Doctor–patient communication, Single-event probabilities, Distributive judgments, Prenatal testing

Introduction

Evidence-based medical practice often involves dealing with statistical information. Doctors as well as patients need to rely on statistics in order to make appropriate diagnostic inferences and, ultimately, good decisions.
Both, however, have difficulties in integrating statistics with clinical information, as happens when they have to estimate the probability that a person with a positive test result has a given disease (i.e., the positive predictive value of the test [1–3]). The prevailing view is that probabilistic reasoning accuracy mainly depends on the format in which information is presented. A number of studies [3–5] have shown that people’s inferences are more accurate when relevant information (i.e., the prevalence rate of a disease, the test sensitivity, and the false-positive rate of a test) is conveyed by natural frequencies (e.g., “12 in every 15”) than by percentages (e.g., “80%”). For a detailed comparison of the two formats, consider the examples in the first and second columns from the left in Table 1. These results led to the widespread recommendation to communicate test data using the natural frequency format rather than single-event probabilities [6–8]. But is the frequency format really effective in minimizing laypeople’s errors in interpretation of positive test results? Let us draw attention to two problematic aspects that, in our opinion, have not been given due consideration thus far.

Table 1. Different Numerical Formats for Presenting the Same Statistical Information (and Corresponding Standard Questions) Concerning a Prenatal Test for Detecting the Presence of a Chromosomal Anomaly

Columns (left to right): Percentages format; Natural frequency format; Chances format

Prevalence rate
  Percentages format: The probability that a pregnant woman carries a child with Down syndrome is 0.15%.
  Natural frequency format: 15 out of every 10,000 pregnant women carry a child with Down syndrome.
  Chances format: A pregnant woman has 15 chances out of 10,000 of carrying a child with Down syndrome.

Test sensitivity
  Percentages format: The probability that a pregnant woman carrying a child with Down syndrome has a positive result on the “nuchal translucency scan” test is 80%.
  Natural frequency format: 12 out of every 15 pregnant women carrying a child with Down syndrome have a positive result on the “nuchal translucency scan” test.
  Chances format: 12 of these 15 chances of carrying a child with Down syndrome are associated with a positive result on the “nuchal translucency scan” test.

False positive rate (= 1 – test specificity)
  Percentages format: The probability that a pregnant woman carrying a child without Down syndrome has a positive result on the “nuchal translucency scan” test is 8%.
  Natural frequency format: 799 out of every 9,985 pregnant women carrying a child without Down syndrome have a positive result on the “nuchal translucency scan” test.
  Chances format: 799 of the remaining 9,985 chances of carrying a child without Down syndrome are associated with a positive result on the “nuchal translucency scan” test.

Standard question
  Percentages format: What is the probability that a pregnant woman who got a positive result on the “nuchal translucency scan” carries a child with Down syndrome? ________ %
  Natural frequency format: How many women in a new sample of pregnant women who got a positive result on the “nuchal translucency scan” test carry a child with Down syndrome? ________ out of ________
  Chances format: What are the chances that a pregnant woman who got a positive result on the “nuchal translucency scan” test carries a child with Down syndrome? _____ chance(s) out of ______

First, observed frequencies in a sample enable inferences about similar samples. Indeed, the superiority of natural frequencies over percentages (for a recent review, see [9]) has been demonstrated using scenarios referring to groups of individuals (e.g., see the question for the natural frequency format in Table 1). In medical practice, however, dealing with uncertainty often means handling single-event probabilities, in the sense that health-service users are concerned with the evaluation of their own specific case more than with predictions about samples of individuals. The cognitive implications of using group statistics to make probabilistic inferences concerning single individuals have not been fully investigated and are not trivial because patients may not (sometimes reasonably) consider themselves fully comparable to the reference population.

Second, the extent to which people actually benefit from using the natural frequency format appears limited.
Although accuracy is higher in this format than with percentages, judgments concerning the positive predictive value of a medical test are still far from accurate. In particular, the accuracy rate has been reported to be less than one-half for well-educated participants, such as experienced German physicians (46% [3]) and Austrian (46% [4]) and U.S. (31% [10]) university students. It is even lower for laypeople, such as patients at Spanish hospitals (around 25% [11]), Swiss residents (12% [12]), U.S. adults recruited using Amazon Mechanical Turk (AMT; 2% [13]), and UK midwives recruited at training events or in maternity wards (0% [1]). These results are in line with the finding that the natural frequency format improves probabilistic inferences more for high-numerate than low-numerate individuals [14, 15]. Therefore, the use of a natural frequency format is not a panacea and does not appear to aid those individuals who, in principle, need more assistance, such as the less-educated, low-numerate health-service users. It is also interesting to note that the percentage and natural frequency formats yield somewhat opposite errors: a neglect of the prevalence rate [16] versus a disregard of new evidence [4, 13, 17, 18], respectively. All this casts serious doubt on the conclusion that the natural frequency format is an effective tool for fostering probabilistic reasoning.

In this study, we examined the effectiveness of three different strategies for improving the understanding of positive prenatal test results that avoid the two above-mentioned problems. We chose to focus on (screening and diagnostic) prenatal tests for two reasons. First, they are widespread, since most newly pregnant women are encouraged to undergo them so that possible abnormalities of the embryonal/fetal chromosome set [19–21] can be detected.
Moreover, the interpretation of their results can play a crucial role in important decisions, such as whether to undertake further investigations, to parent a child with a disability, or to terminate the pregnancy [22, 23].

The first strategy consists of expressing numerical information using numbers of chances [24, 25]. The superiority of natural frequencies over percentages has been explained by assuming that the former are easier to handle than the latter because they incorporate more valuable information. Indeed, unlike percentages, natural frequencies express probabilities in terms of positive integers and retain the sample size of the reference class (e.g., “15 out of 10,000”). Moreover, natural frequencies make explicit the nested set structure of the problem. For instance, in the natural frequency example reported in Table 1, the number of true positives is nested within the number of women who carry a child with Down syndrome and, symmetrically, the number of false positives is nested within the number of women who do not carry a child with Down syndrome. This decreases the computational complexity of the problem [4, 16, 26, 27], whose solution can be easily calculated by dividing the number of true positives by the sum of true and false positives. The same computational facilitation can be obtained with numbers of chances (see the example in the third column of Table 1). Indeed, when problem structure and question were equated, the accuracy rates obtained with the natural frequency and numbers of chances formats proved to be comparable [24, 28]. Unlike natural frequencies, however, numbers of chances can refer to single cases [29], and are typically interpreted as single-event probabilities [30, 31]. This makes the chance format suitable for expressing probabilistic information that refers to a single person or event.
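The calculation described above (true positives divided by the sum of true and false positives) can be illustrated with a minimal sketch that uses the numbers of chances from the NT scan example in Table 1. The function name `positive_predictive_value` is a hypothetical helper introduced here for illustration only; it is not part of the study materials.

```python
from fractions import Fraction


def positive_predictive_value(true_positives: int, false_positives: int) -> Fraction:
    """Chances of the condition given a positive result: TP / (TP + FP)."""
    total_positives = true_positives + false_positives
    if total_positives == 0:
        raise ValueError("no positive results: the value is undefined")
    return Fraction(true_positives, total_positives)


# Numbers of chances from Table 1: 12 correctly positive, 799 incorrectly positive.
nt_scan_ppv = positive_predictive_value(true_positives=12, false_positives=799)
print(nt_scan_ppv, f"{float(nt_scan_ppv):.1%}")  # 12/811 1.5%
```

Note that the prevalence rate never enters the computation explicitly: with numbers of chances (or natural frequencies), it is already embedded in the two counts.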
The second and third strategies are aimed at enhancing inference accuracy by prompting health-service users to reason extensionally, without applying the rules of probability calculus. In particular, the second strategy concerns the elicitation of distributive judgments [17, 24]. Indeed, once a test result is given (e.g., “positive”), standard probability judgments can be efficiently replaced with a simple apportionment of the numbers of chances that, in light of the positive test result, favor a hypothesis (i.e., “the presence of the anomaly”) versus its complement (i.e., “the absence of the anomaly”). The higher accuracy of distributive judgments relative to standard ones has been demonstrated with educated samples of respondents [24], laypeople [17], and children aged 9 to 12 years [28].

Finally, the third strategy consists of reframing the information concerning the test. When numbers of chances (or natural frequencies) are used, the prevalence rate of the condition is not needed to compute the positive predictive value of the test. Accordingly, to facilitate health-service users’ comprehension of what a positive test result means, the prevalence rate can be omitted, and the total number of chances of having a positive result (i.e., the sum of the “chances of having a positive test result and the anomaly” and the “chances of having a positive test result but not the anomaly”) can be made explicit. Such a strategy has not yet been implemented with adults, but it has yielded promising results with children’s Bayesian inferences in a different reasoning context [28].

In what follows, we will describe the results of three experiments in which the first two strategies were used to facilitate the interpretation of diagnostic (Experiments 1, 2, and 3) and screening prenatal test results (Experiments 2 and 3).
The three strategies were then combined to determine their joint effectiveness (Experiment 3) across different kinds of judgments (i.e., distributive judgments and respondents’ expectations in terms of endorsed hypothesis). The prenatal tests to which we referred in our experimental scenarios were the nuchal translucency scan (hereafter, “NT scan”) [32], a common screening procedure for detecting Trisomy 21, and the Fetal DNA tests for trisomies 21 and 18 [33]. We consider the latter to be diagnostic procedures because their high (over 99%) sensitivity and specificity match those of more traditional diagnostic tests, such as Chorionic Villus Sampling [34]. However, because Fetal DNA tests are noninvasive, they are also commonly included among screening procedures [35].

Experiment 1

The aim of Experiment 1 was to compare distributive and standard judgments concerning a positive diagnostic test result expressed in a chance format.

Methods

Participants
One hundred and eleven mothers or mothers-to-be who subscribed to an Italian pediatric website accepted our online invitation to take part in this experiment. Of these, 21 (19%) were pregnant (mean age: 31 years; SD = 5.2; mean pregnancy time: 26.7 weeks), 86 (77%) had been pregnant in the past 3 years (mean age: 35 years; SD = 4.3), and 4 (4%) had been pregnant more than 3 years earlier (mean age: 41 years; SD = 2.2). Most respondents had a university degree (59%) or high school diploma (40%), and only a few of them (1%) had a lower educational level. Participation was voluntary and anonymous.

Stimuli and procedure
Respondents were randomly assigned to one of four conditions, in a 2 (Scenario: Trisomy 21 vs. Trisomy 18) × 2 (Question: standard vs. distributive) between-participants design. The scenarios presented realistic information about diagnostic prenatal tests [33] framed as numbers of chances. Table 2 reports in full the scenarios and the questions employed in Experiment 1.
On completion of the tasks, all respondents were asked to answer an 11-item numeracy scale [36].

Table 2. Scenarios and Questions Used in Experiment 1

Scenarios
There is a prenatal test to determine whether a fetus has a chromosomal anomaly. When the test (correctly or incorrectly) indicates the presence of the anomaly, the test result is positive. Here is the information about the test and the anomaly.

(Trisomy 21)
• A pregnant woman who undergoes the test has 5 chances out of 1,000 of having a child with an anomaly.
• In all these 5 chances, the test result is correctly positive.
• In 2 of the remaining 995 chances of not having a child with an anomaly, the test result is incorrectly positive.

(Trisomy 18)
• A pregnant woman who undergoes the test has 3 chances out of 1,000 of having a child with an anomaly.
• In all these 3 chances, the test result is correctly positive.
• In 9 of the remaining 997 chances of not having a child with an anomaly, the test result is incorrectly positive.

Imagine that Anna, a pregnant woman, undergoes the test, and the test result is positive.

Questions
Standard: Therefore, in light of her positive test result, there are ______ chance(s) out of _______ that an anomaly is present.
Distributive: Therefore, there are ______ chance(s) that her positive test result correctly indicates the presence of the anomaly, and there are ____ chance(s) that her positive test result incorrectly indicates the presence of the anomaly.

Results

In line with previous studies [4, 12], we considered as correct only responses that exactly matched the normative solutions, that is, “5 out of 7” (standard judgment) and “5 versus 2” (distributive judgment), for the Trisomy 21 scenario, and “3 out of 12” (standard judgment) and “3 versus 9” (distributive judgment), for the Trisomy 18 scenario.
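As a minimal sketch of how these normative solutions follow from the scenario figures in Table 2, the snippet below derives both answer types from the chances of a correct and an incorrect positive result. The function `normative_answers` is a hypothetical name used only for illustration.

```python
def normative_answers(correct_positive: int, incorrect_positive: int):
    """Return the (standard, distributive) normative answers for a positive result.

    standard:     (chances the anomaly is present, total chances of a positive result)
    distributive: (chances for the anomaly, chances against the anomaly)
    """
    standard = (correct_positive, correct_positive + incorrect_positive)
    distributive = (correct_positive, incorrect_positive)
    return standard, distributive


# (correctly positive, incorrectly positive) chances from Table 2
scenarios = {"Trisomy 21": (5, 2), "Trisomy 18": (3, 9)}

for name, (tp, fp) in scenarios.items():
    (present, total), (chances_for, chances_against) = normative_answers(tp, fp)
    print(f"{name}: standard = {present} out of {total}; "
          f"distributive = {chances_for} versus {chances_against}")
# Trisomy 21: standard = 5 out of 7; distributive = 5 versus 2
# Trisomy 18: standard = 3 out of 12; distributive = 3 versus 9
```

The distributive answer only requires reading two values off the scenario, whereas the standard answer additionally requires summing them to obtain the denominator.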
Education and numeracy did not differ among participants assigned to the four experimental conditions of Experiment 1 (χ²(6) = 6.22, p = .399, and χ²(30) = 31.54, p = .389, respectively). A logistic regression analysis was performed to ascertain the effects of question, scenario, education, and numeracy on the accuracy of probability judgments. Because very few participants (two in Experiment 1, one in Experiment 2, and one in Experiment 3) reported the lowest education level, education was recoded in all three experiments as a dichotomous variable: no more than a high school diploma versus a university degree. Question, scenario, and education were included in the logistic regression analyses as categorical independent variables, and numeracy as a continuous independent variable. The logistic regression model was statistically significant, χ²(4) = 50.02, p < .001. The model explained 49% (Nagelkerke R²) of the variance in response accuracy and correctly classified 81% of cases. Wald χ² statistics, however, showed that question was the only significant predictor (p < .001; Exp(B) = 22.6, 95% CI Exp(B) = 7.7–66.7), while scenario (p = .29), education (p = .446), and numeracy (p = .097) were not.

Consistent with previous findings [17], most respondents given the standard question (88%) were unable to answer it correctly by estimating the absolute probability that a pregnant woman with a positive diagnostic test result carried a child with an anomaly (Table 3). However, regardless of scenario, education, and numeracy, most respondents given the distributive question (72%) correctly apportioned the numbers of chances for and against that hypothesis. Therefore, as expected, the distributive question proved more effective than the standard one in eliciting correct probability judgments.
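The two aggregate percentages quoted above can be recovered by pooling the per-condition counts reported in Table 3 across scenarios. The following is a small illustrative check, not part of the original analysis; the dictionary `TABLE_3` and the function `pooled_accuracy` are hypothetical names.

```python
from fractions import Fraction

# (correct responses, respondents) per condition, copied from Table 3
TABLE_3 = {
    ("Trisomy 21", "standard"): (4, 29),
    ("Trisomy 21", "distributive"): (22, 27),
    ("Trisomy 18", "standard"): (3, 28),
    ("Trisomy 18", "distributive"): (17, 27),
}


def pooled_accuracy(question: str) -> Fraction:
    """Proportion of correct responses for one question type, pooled over scenarios."""
    correct = sum(c for (_, q), (c, _n) in TABLE_3.items() if q == question)
    total = sum(n for (_, q), (_c, n) in TABLE_3.items() if q == question)
    return Fraction(correct, total)


standard_incorrect = 1 - pooled_accuracy("standard")
distributive_correct = pooled_accuracy("distributive")
print(f"standard, incorrect: {float(standard_incorrect):.0%}")       # 88%
print(f"distributive, correct: {float(distributive_correct):.0%}")   # 72%
```

The reported Exp(B) comes from a regression model that also includes scenario, education, and numeracy, so it is not expected to equal an odds ratio computed directly from these raw counts.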
Table 3. Frequencies (and Corresponding Percentages) of Correct Responses in Experiment 1

Scenario      Standard question    Distributive question
Trisomy 21    4/29 (14%)           22/27 (81%)
Trisomy 18    3/28 (11%)           17/27 (63%)

Experiment 2

In the previous experiment, both scenarios concerned a diagnostic test with an ideal 100% sensitivity, which means that false-negative results were not possible (i.e., the presence of the anomaly was always associated with a positive test result). Thus, given the information that the pregnant woman had a positive test result, respondents could easily answer the distributive question, which required only the identification of the values of the two relevant conjunctions (“positive test result & presence of the anomaly” and “positive test result & absence of the anomaly”). However, many tests, especially those for screening, have a sensitivity rate lower than 100% [37]. Consider, for example, the NT scan test scenario reported in Table 4, which has a sensitivity of 80% (i.e., in 12 of the 15 chances of having the anomaly the test is correctly positive). In this case, the conjunction “positive test result & presence of the anomaly” could implicitly evoke the conjunction “negative test result & presence of the anomaly” (whose value, in the example above, is “3 out of 15”). If respondents confused the latter with the “positive test result & absence of the anomaly” conjunction (i.e., conflated false negatives with false positives), they would produce incorrect distributive judgments.
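To make the potential confusion concrete, the sketch below partitions the 10,000 chances of the NT scan scenario (15 chances of the anomaly, 12 of them correctly positive; 799 of the remaining 9,985 chances incorrectly positive) into the four conjunctions and contrasts the correct distributive answer with the one produced by conflating false negatives with false positives. The function `partition_chances` and its dictionary keys are hypothetical names chosen for illustration.

```python
def partition_chances(total: int, with_anomaly: int,
                      correct_positive: int, incorrect_positive: int) -> dict:
    """Split the total number of chances into the four test-by-condition conjunctions."""
    without_anomaly = total - with_anomaly
    return {
        "positive & anomaly": correct_positive,
        "negative & anomaly": with_anomaly - correct_positive,
        "positive & no anomaly": incorrect_positive,
        "negative & no anomaly": without_anomaly - incorrect_positive,
    }


nt_scan = partition_chances(total=10_000, with_anomaly=15,
                            correct_positive=12, incorrect_positive=799)

# Correct distributive judgment: chances for versus against, given a positive result.
correct_judgment = (nt_scan["positive & anomaly"], nt_scan["positive & no anomaly"])
# Hypothesized error: false negatives taken in place of false positives.
conflated_judgment = (nt_scan["positive & anomaly"], nt_scan["negative & anomaly"])

print(correct_judgment)    # (12, 799)
print(conflated_judgment)  # (12, 3)
```

With a 100% sensitive test the "negative & anomaly" cell is zero, so this particular confusion cannot arise.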
Consequently, we expected distributive judgments to be correct less frequently in scenarios concerning tests whose sensitivity rate is lower than 100%. The aim of Experiment 2 was to test this hypothesis and to check whether distributive judgments are more accurate than standard ones even in (potentially) more complex situations.

Table 4. Scenarios and Questions Used in Experiment 2

Scenarios

Fetal DNA test
The “fetal DNA” test is a prenatal test that determines whether an unborn child has Down syndrome. When the test (correctly or incorrectly) indicates the presence of Down syndrome, the test result is positive. Here is the information about Down syndrome and the test.
• A pregnant woman who undergoes the test has 5 chances out of 1,000 of having a child with Down syndrome.
• In all these 5 chances, the test result is correctly positive.
• In 2 of the remaining 995 chances of not having a child with Down syndrome, the test result is incorrectly positive.
Imagine that Anna, a pregnant woman, undergoes the test, and the test result is positive.

NT scan test
The “nuchal translucency scan” test is a prenatal test that determines whether an unborn child has Down syndrome. When the test (correctly or incorrectly) indicates the presence of Down syndrome, the test result is positive. Here is the information about Down syndrome and the test.
• A pregnant woman who undergoes the test has 15 chances out of 10,000 of having a child with Down syndrome.
• In 12 of these 15 chances, the test result is correctly positive.
• In 799 of the remaining 9,985 chances of not having a child with Down syndrome, the test result is incorrectly positive.
Imagine that Anna, a pregnant woman, undergoes the test, and the test result is positive.

Questions
Standard: Therefore, in light of her positive test result, there are ______ chance(s) out of _______ that an anomaly is present.
Distributive: Therefore, there are ______ chance(s) that her positive test result correctly indicates the presence of the anomaly, and there are ____ chance(s) that her positive test result incorrectly indicates the presence of the anomaly.

Note. The prevalence rate in the two scenarios is slightly different because diagnostic and screening tests are typically offered to different populations. NT = nuchal translucency.

Methods

Participants
Respondents were 164 U.S.
residents (mean age: 34 years; SD = 10.8; age range: 18–60 years; 62 women) recruited using the AMT platform. They were paid $1.50 to take part in the study. As in Experiment 1, most participants had a university degree (56%) or high school diploma (43%), and only a few of them (1%) had a lower educational level.

Stimuli and procedure

Participants were randomly assigned to one of four groups, in a 2 (Scenario: Fetal DNA test vs. NT scan test) × 2 (Question: standard vs. distributive) between-participants design. The sensitivity rates were 100% for the Fetal DNA test and 80% for the NT scan test (see Table 4).

Results

In the Fetal DNA test scenario, the correct distributive judgment was “5 versus 2,” and the correct standard judgment was “5 out of 7.” In the NT scan test scenario, the correct distributive judgment was “12 versus 799,” and the correct standard judgment was “12 out of 811.” Table 5 reports the accuracy rates for each of the four experimental conditions.

Table 5 Frequencies (and Corresponding Percentages) of Correct Responses for Each Experimental Condition in Experiment 2

Scenario      Standard question    Distributive question
Fetal DNA     7/42 (17%)           29/41 (71%)
NT scan       3/41 (7%)            11/40 (27%)

NT = nuchal translucency.

As in Experiment 1, education and numeracy did not differ in the four experimental conditions of Experiment 2 (χ²(6) = 8.35, p = .213, and χ²(27) = 27.81, p = .421, respectively).
A logistic regression analysis was performed to predict the accuracy of probability judgments using question, scenario, education, and numeracy as predictors. The first three factors were included as categorical independent variables, and the last as a continuous independent variable. Once again, the logistic regression model was statistically significant, χ²(4) = 55.7, p < .001; it explained 41% (Nagelkerke R²) of the variance in the accuracy of responses and correctly classified 81% of cases. Unlike in Experiment 1, three variables proved to be significant predictors: question (p < .001; Exp(B) = 9, 95% CI Exp(B) = 3.7–21.9), scenario (p < .001; Exp(B) = 5.2, 95% CI Exp(B) = 2.2–12.3), and numeracy (p = .007; Exp(B) = 1.7, 95% CI Exp(B) = 1.1–2.5); education was not (p = .532). To investigate the role of numeracy further, two separate logistic regression analyses (one for each scenario) were performed. They showed that increasing numeracy was associated with an increased likelihood of providing an accurate probability judgment in the Fetal DNA (p = .043; Exp(B) = 1.6, 95% CI Exp(B) = 1–2.6) but not in the NT scan test (p = .093; Exp(B) = 1.8, 95% CI Exp(B) = 0.9–3.5) scenario. To understand how respondents’ correct responses were distributed within the experimental conditions, we ran a series of chi-square analyses. Distributive judgments were more accurate than standard ones, both in the Fetal DNA (χ²(1) = 24.7, p < .001, φ = 0.54) and NT scan test (χ²(1) = 5.77, p = .016, φ = 0.27) scenarios. As expected, distributive judgments were more often correct (χ²(1) = 15.1, p < .001, φ = 0.43) in the Fetal DNA scenario than in the NT scan scenario, which concerned a test with a lower sensitivity (a similar pattern was also observed with standard judgments, see Table 5, although it did not reach statistical significance because of their generally low accuracy rate).
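The chi-square comparisons above can be recomputed from the counts in Table 5. The following is a minimal sketch, assuming the tests were Pearson chi-squares on 2 × 2 tables without continuity correction (the text does not state this). The function name `chi_square_2x2` and the dictionary of comparisons are illustrative and do not come from the study. Under that assumption, the χ² values agree with the reported ones after rounding, and φ comes out close to the reported effect sizes.

```python
import math

def chi_square_2x2(correct_a, n_a, correct_b, n_b):
    """Pearson chi-square (no continuity correction) comparing the
    proportion of correct responses in two independent groups."""
    a, b = correct_a, n_a - correct_a      # group A: correct, incorrect
    c, d = correct_b, n_b - correct_b      # group B: correct, incorrect
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)              # effect size for a 2 x 2 table
    p = math.erfc(math.sqrt(chi2 / 2))     # upper tail of chi-square, 1 df
    return chi2, phi, p

# Counts of correct responses taken from Table 5.
comparisons = {
    "Fetal DNA: distributive vs. standard": (29, 41, 7, 42),
    "NT scan: distributive vs. standard": (11, 40, 3, 41),
    "Distributive: Fetal DNA vs. NT scan": (29, 41, 11, 40),
}

results = {label: chi_square_2x2(*counts) for label, counts in comparisons.items()}

for label, (chi2, phi, p) in results.items():
    print(f"{label}: chi2(1) = {chi2:.2f}, phi = {phi:.2f}, p = {p:.3f}")
```

The p value uses the identity that, for one degree of freedom, the upper-tail probability of χ² equals erfc(√(χ²/2)), so no statistics library is needed.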
For both scenarios, most respondents correctly identified the first term of the distributive judgment (Fetal DNA test scenario: 83%; NT scan test scenario: 70%). However, for the NT scan test scenario, the majority of respondents (65%) did not provide the correct second term of the distributive judgment. As expected, their most frequent error (42%) consisted in conflating false negatives with false positives (i.e., they used the value of the “negative test result & presence of the anomaly” conjunction in place of the “positive test result & absence of the anomaly” one). Therefore, when a test can yield both false positives and false negatives, distributive judgments are less accurate, although still more often correct than standard ones.

Experiment 3

The traditional way to introduce relevant information into probabilistic inference scenarios (which first provides the prevalence rate of a condition and the properties of a test useful to detect it, then assumes that the test is run, and only at the end informs participants of its result) matches the general formalization of probabilistic updating (e.g., the classic form of Bayes’ theorem). However, the understanding of a positive test result reflects a much simpler real-life situation: The test has already been carried out and its result is known. In this case, the scenarios framed in the traditional way have some potentially misleading features (see also [38]). First, since numbers of chances (as well as natural frequencies) already incorporate the prevalence rate, providing respondents with the prior chances of having a given disease is redundant. Second, when the test sensitivity is lower than 100%, the traditional presentation refers, explicitly or implicitly, to the chances of false negatives. As Experiment 2 showed, respondents have difficulty ignoring this piece of information, even though it is irrelevant to making a correct judgment.
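The point that false negatives play no role in interpreting a positive result can be made concrete with the chance counts from Table 4. The sketch below is only an illustration: the function name `judgments_for_positive_result` and the variable names are hypothetical and are not part of the study materials. It derives the correct distributive and standard answers from the true-positive and false-positive chances alone, and it shows the typical error reported for Experiment 2, in which the false-negative count replaces the false-positive count.

```python
from fractions import Fraction

def judgments_for_positive_result(true_positives, false_positives):
    """Return the distributive pair, the standard pair, and the exact
    positive predictive value for a positive test result."""
    total_positives = true_positives + false_positives
    distributive = (true_positives, false_positives)  # chances for vs. against
    standard = (true_positives, total_positives)      # "x chances out of y"
    ppv = Fraction(true_positives, total_positives)
    return distributive, standard, ppv

# Chance counts taken from Table 4.
fetal_dna = judgments_for_positive_result(true_positives=5, false_positives=2)
nt_scan = judgments_for_positive_result(true_positives=12, false_positives=799)

# The NT scan misses 15 - 12 = 3 chances (false negatives). They never enter
# the computation above; using them as the second term of the distributive
# judgment reproduces the typical error.
nt_false_negatives = 15 - 12
typical_error = (12, nt_false_negatives)

for name, (distributive, standard, ppv) in [("Fetal DNA", fetal_dna),
                                            ("NT scan", nt_scan)]:
    print(name, distributive, standard, round(float(ppv), 3))
print("Typical erroneous NT scan answer:", typical_error)
```

Because the function takes only the two conjunctions involving a positive result, the prevalence rate and the false-negative chances cannot affect the answer, which is the sense in which they are redundant for this judgment.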
One possibility to improve the accuracy of probabilistic inferences would thus be to change which test information is presented. Consider, for example, the version of the NT scan test scenario named “Positive test framing” in Table 6. It neither reports the prevalence rate of the anomaly nor contains an implicit reference to the chances of a false-negative result. Rather, it presents only the total number of chances of having a positive test result (i.e., “811”), divided into the two relevant subsets: the chances of having a positive test result and the anomaly (i.e., “12”), and those of having a positive test result but not the anomaly (i.e., “799”). In this case, the “positive test result & presence of the anomaly” conjunction can be easily contrasted with the “positive test result & absence of the anomaly” conjunction, which is the right comparison term for a correct judgment. In sum, the positive test framing avoids referring to potentially misleading information by exclusively presenting what is relevant for dealing with a positive test result. Accordingly, we expected respondents to be more likely to produce correct distributive judgments with this framing, especially when test sensitivity was lower than 100%. Note that with the positive test framing it is possible to identify the correct solution without making any calculation. Moreover, the order in which information is presented corresponds to all those real-life clinical situations in which a test result precedes more detailed information about the test itself or the possible related clinical condition(s).

Table 6 Scenarios and Questions Used in Experiment 3

Fetal DNA test scenario
The “fetal DNA” test is a prenatal test that determines whether an unborn child has Down syndrome. When the test (correctly or incorrectly) indicates the presence of Down syndrome, the test result is positive. Here is the information about Down syndrome and the test.

Traditional framing
• A pregnant woman who undergoes the test has 5 chances out of 1,000 of having a child with Down syndrome.
• In all these 5 chances, the test result is correctly positive.
• In 2 of the remaining 995 chances of not having a child with Down syndrome, the test result is incorrectly positive.

Positive test framing
• A pregnant woman who undergoes the test has 7 chances out of 1,000 of having a positive result.
• In 5 of these 7 chances, the test result is correctly positive.
• In the remaining 2, the test result is incorrectly positive.

Imagine that Anna, a pregnant woman, undergoes the test, and the test result is positive.

NT scan test scenario
The “Nuchal translucency scan” test is a prenatal test that determines whether an unborn child has Down syndrome. When the test (correctly or incorrectly) indicates the presence of Down syndrome, the test result is positive. Here is the information about Down syndrome and the test.

Traditional framing
• A pregnant woman who undergoes the test has 15 chances out of 10,000 of having a child with Down syndrome.
• In 12 of these 15 chances, the test result is correctly positive.
• In 799 of the remaining 9,985 chances of not having a child with Down syndrome, the test result is incorrectly positive.

Positive test framing
• A pregnant woman who undergoes the test has 811 chances out of 10,000 of having a positive result.
• In 12 of these 811 chances, the test result is correctly positive.
• In the remaining 799, the test result is incorrectly positive.

Imagine that Anna, a pregnant woman, undergoes the test, and the test result is positive.

Questions

Distributive
Therefore, there are ______ chance(s) that her positive test result correctly indicates the presence of the anomaly, and there are ____ chance(s) that her positive test result incorrectly indicates the presence of the anomaly.

Relative
Therefore, the more likely conclusion is that:
□ Anna’s child has an anomaly
□ Anna’s child has not an anomaly

The aim of Experiment 3 was twofold: first, to compare the accuracy of distributive judgments with the traditional and corresponding positive test framing; second, to investigate the agreement between distributive and relative judgments (i.e., the endorsement of the more likely hypothesis in light of the positive test result).

Methods

Participants

A new sample of 164 U.S. residents (mean age: 35 years; SD = 11.9; age range: 18–68 years; 79 women) was recruited using AMT. They were paid $1.50 to take part in the study. Again, most participants had a university degree (53%) or high school diploma (46%), and only a few of them (1%) had a lower educational level.

Stimuli and procedure

Participants were randomly assigned to one of four experimental conditions, in a 2 (Scenario: Fetal DNA test vs. NT scan test) × 2 (Framing: traditional vs. positive test) between-participants design. Both traditional and positive test framings are reported in Table 6. In each condition, participants answered first a distributive question and then a relative one. In the latter case, respondents had to select which of two competing hypotheses (i.e., “Anna’s child has the anomaly” vs. “Anna’s child has not the anomaly”) was more likely in light of the available evidence (i.e., “the test result is positive”). As in the previous experiments, after completing the tasks, participants filled in the 11-item numeracy scale [36]. The two Fetal DNA test scenarios described a test with 100% sensitivity and did not report (implicitly or explicitly) the chances of false-negative results. Hence, we expected similar rates of correct distributive judgments in the traditional and positive test framings. In the NT scan test scenario, instead, the traditional framing implicitly reported the chances of false-negative results, while the positive test framing did not. Accordingly, we expected a higher rate of correct distributive judgments with the positive test framing compared to the traditional one.

Results

For both scenarios and framings, the correct distributive judgments were the same as in Experiment 2. The correct relative judgment was “Anna’s child has an anomaly” in the Fetal DNA test scenario, and “Anna’s child has not an anomaly” in the NT scan test one. As in the previous experiments, education and numeracy did not differ in the four experimental conditions (χ²(6) = 3.71, p = .716, and χ²(27) = 18.31, p = .894, respectively). A logistic regression analysis was conducted to predict the accuracy of distributive judgments using framing, scenario, the interaction between framing and scenario, education, and numeracy as predictors.
Framing, scenario, and education were included as categorical independent variables, and numeracy as a continuous independent variable. The logistic regression model was statistically significant (χ²(5) = 27.79, p < .001); it explained 21% (Nagelkerke R²) of the variance in the accuracy of responses and correctly classified 73% of cases. The Wald criterion showed that framing (p < .001; Exp(B) = 8.3, 95% CI Exp(B) = 2.9–23.3), scenario (p < .001; Exp(B) = 4.9, 95% CI Exp(B) = 1.9–13), the interaction between framing and scenario (p = .020; Exp(B) = 0.18, 95% CI Exp(B) = 0.04–0.8), and numeracy (p = .024; Exp(B) = 1.2, 95% CI Exp(B) = 1–1.5) were significant predictors, while education (p = .94) was not. As in Experiment 2, with the traditional framing, respondents were more frequently correct in the Fetal DNA test scenario than in the NT scan test one, χ²(1) = 9.78, p = .002, φ = 0.35 (Table 7). In the latter, the most frequent mistake (39% of all errors) consisted of using the value of the “negative test result & presence of the anomaly” conjunction, rather than that of the “positive test result & absence of the anomaly” conjunction, as the second term of the distributive judgment. As expected, in the positive test framing, the majority of responses were correct, without any significant difference between the two scenarios, χ²(1) = .02, p = .886. The positive test framing of the NT scan test scenario elicited a significantly greater number of correct judgments than did the traditional framing, χ²(1) = 16.24, p < .001, φ = 0.45.

Table 7 Frequencies (and Corresponding Percentages) of Correct Responses in Experiment 3

Framing and scenario      Distributive question    Relative question    Agreement
Traditional
  Fetal DNA test          28/41 (68)               32/41 (78)           0.76
  NT scan test            13/39 (33)               20/39 (51)           0.85
Positive test
  Fetal DNA test          33/43 (77)               36/43 (84)           0.86
  NT scan test            32/41 (78)               39/41 (95)           0.88

The agreement between distributive and relative judgments is also reported. NT = nuchal translucency.

The relative question produced a similar response pattern. Framing, scenario, the interaction between framing and scenario, education, numeracy, as well as the accuracy of the distributive judgments were included as predictors in a logistic regression analysis. The model was statistically significant, χ²(6) = 48.1, p < .001; it explained 39% (Nagelkerke R²) of the variance in response accuracy and correctly classified 84% of cases.
Only framing (p = .002; Exp(B) = 14.7, 95% CI Exp(B) = 2.6–81.8), the interaction between framing and scenario (p = .023; Exp(B) = 0.09, 95% CI Exp(B) = 0.01–0.7), numeracy (p = .049; Exp(B) = 1.2, 95% CI Exp(B) = 1–1.6), and the accuracy of distributive judgments (p < .001; Exp(B) = 6.8, 95% CI Exp(B) = 2.7–17.2) proved to significantly predict the accuracy of relative judgments. With the traditional framing, respondents were more frequently correct in the Fetal DNA test scenario than in the NT scan test one (χ²(1) = 6.29, p = .012, φ = 0.28), in which the accuracy rate did not differ from 50% (p > .05, according to a binomial test). With the positive test framing, however, the very large majority of relative judgments were correct, and no significant difference was observed between the two scenarios (χ²(1) = 2.85, p = .092). As for the distributive judgments, the positive test framing of the NT scan test scenario was associated with a significantly higher number of correct relative judgments than the traditional framing, χ²(1) = 19.84, p < .001, φ = 0.50. To investigate whether respondents understood a basic implication (in terms of endorsed hypothesis) of their distributive judgments, we considered the within-subject agreement “in direction” between distributive and relative judgments. That is, independently of their accuracy, we counted the proportion of distributive judgments in favor of the hypothesis “presence of the anomaly” (“absence of the anomaly”) that were associated with corresponding relative judgments endorsing “Anna’s child has an anomaly” (“Anna’s child has not an anomaly”). The agreement was rather high, ranging from 80% in the traditional framing of the Fetal DNA scenario to 87% in the positive test framing of both scenarios, with no statistical differences among the four conditions, χ²(3) = 2.65, p = .45. Overall, the results of Experiment 3 fully confirmed our hypotheses.
First, the use of a positive test framing increased the rate of correct distributive judgments, regardless of test sensitivity. Second, relative judgments proved to be mainly accurate and highly associated with distributive judgments.

Discussion

The present study has investigated the effectiveness of three simple strategies in improving health-service users’ probabilistic inferences in the domain of prenatal testing. Experiment 1 combined a distributive question with numerical information expressed as number of chances. The large majority of a sample of mothers and mothers-to-be failed to estimate the absolute probability that a pregnant woman with a positive prenatal test result carried a baby with an anomaly (i.e., standard judgment), but, regardless of their numeracy and education levels, they correctly apportioned the numbers of chances for and against that hypothesis (i.e., distributive judgment). The results of Experiment 2 suggested that distributive assessments of chances, although systematically more often correct than standard judgments, depend on test sensitivity: When it was 100%, the majority of respondents gave correct judgments, but when it was lower, the opposite was true, possibly because respondents confused false negatives with false positives. Experiment 3 investigated the effects of presenting information concerning a positive test result (i.e., the positive test framing) without mentioning the prevalence rate of the condition, while making the overall chances of (true and false) positive test results explicit. This framing appears clinically justified in all situations in which patients are interested in the chances of having a certain condition given a positive test result. It is also theoretically justified, since a correct evaluation of the positive predictive value of a test does not require taking into account the chances of false negatives.
The results confirmed the efficacy of the positive test framing: A very high rate of respondents provided correct distributive judgments (77% across scenarios), regardless of test sensitivity. The majority of respondents also made correct relative judgments, that is, they identified the most probable hypothesis. The agreement between distributive and relative judgments was rather high, regardless of the experimental condition. Respondents’ accuracy in the distributive judgment was a significant predictor of the correct endorsement of the more probable hypothesis. While the results of the three experiments are consistent in excluding a role of education, the role of numeracy is less clear-cut. Respondents’ numeracy, indeed, partially accounts for the accuracy of the investigated probability judgments (i.e., in the Fetal DNA scenario of Experiment 2, and in Experiment 3). The effect of numeracy could have been somewhat hindered by the high numeracy of our samples (with a median of 9 correct items out of 11). Since the scores obtained with our samples are in line with those reported in other experiments which employed the same numeracy scale [13, 39], further investigations might better explore the role of numeracy by means of more discriminative measures. These results have major implications for clinical practice. The widespread recommendation of avoiding single-event probabilities to communicate test results [3, 4] appears unjustified since it is exclusively grounded on results concerning percentages. Our findings show that the chance format can be effectively used to communicate a positive test result referring to a single patient. Moreover, they indicate that, in order to reduce difficulties in probabilistic reasoning, communication strategies should prompt extensional reasoning besides focusing on the format in which test results are expressed. 
This can be done by providing health-service users with the overall chances of positive test results (instead of the prior chances of having the disease) and asking them to assign the number of chances for and against the hypothesis under consideration.
Our study has both strengths and limitations. Among the former is the high rate of correct judgments obtained when the three strategies were combined: about three-quarters (77%) of our respondents answered the distributive question correctly. Such a success rate contrasts with the finding that at most one-fourth of health-service users reason correctly about test results when asked to produce a standard probability judgment and numerical information is expressed as natural frequencies [1, 11, 13]. Moreover, unlike other communication strategies (such as the natural frequency format; but see also [40]), whose effects are strictly intertwined with individuals’ numeracy skills, the benefits of using number of chances, distributive assessments, and positive test framing seem less closely related to respondents’ numeracy and, especially, education. This suggests that the strategies proposed in the present paper may be suitable for a large number of different patients dealing with various medical tests. Finally, respondents were able to generate correct inferences concerning a single individual’s test. This indicates that it is possible to help health-service users reason about what really matters to them, that is, interpreting the information for their personal case rather than predicting frequencies in a generic sample of individuals. Among the limitations of the present study, it is worth mentioning respondents’ low personal involvement, since they were asked to evaluate hypothetical scenarios rather than their personal cases.
Furthermore, although this study has documented that our chance-based strategies can boost correct reasoning about a positive test result, it remains to be investigated whether they can be useful in other reasoning tasks, such as interpreting a negative test result or evaluating the informativeness of different screening or diagnostic tests. These and similar issues might be a fruitful focus of future research, which should also investigate how health-service users’ probabilistic judgments ultimately affect decisions in clinically relevant settings.
To summarize, the present study suggests an effective, concrete way to improve the understanding of positive medical test results. It consists of providing health-service users with only two conjunctions expressing the number of chances that a positive test result is and is not associated with a given condition. Relying on this, most laypeople proved able to understand correctly the probability in favor of or against the hypothesis of having a certain condition and, accordingly, to judge correctly whether this hypothesis is more or less likely than its alternative. The three strategies discussed in this paper should not be misunderstood as training to improve general reasoning abilities. More modestly, and in a sense more pragmatically, they represent a first step toward the implementation of a more efficient communication approach tailored to the specific moment in which a probabilistic judgment is needed.
Acknowledgments
Financial support for this study was provided in part by the Cassa di Risparmio of Trento and Rovereto (SMC, Grant 40102595), by the Deutsche Forschungsgemeinschaft as part of the priority program New Frameworks of Rationality (SPP1516, Grant CR 409/1-2), and by a grant of the GAM Ca’ Foscari Foundation.
Compliance with Ethical Standards
Authors’ Statement of Conflict of Interest and Adherence to Ethical Standards Stefania Pighin, Katya Tentori, Lucia Savadori, and Vittorio Girotto declare that they have no conflict of interest. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Authors’ Contributions All authors made substantial contributions to conception and design, and/or acquisition of data, and/or analysis and interpretation of data.
Ethical Approval The entire study was approved by the Ethics Committee of CERME (University Ca’ Foscari of Venice, Venice, Italy).
Informed Consent Informed consent was obtained from all individual participants included in the study.
References
1. Bramwell R, West H, Salmon P. Health professionals’ and service users’ interpretation of screening test results: experimental study. BMJ. 2006;333(7562):284–286.
2. Casscells W, Schoenberger A, Graboys TB. Interpretation by physicians of clinical laboratory results. N Engl J Med. 1978;299(18):999–1001.
3. Hoffrage U, Gigerenzer G. Using natural frequencies to improve diagnostic inferences. Acad Med. 1998;73(5):538–540.
4. Gigerenzer G, Hoffrage U. How to improve Bayesian reasoning without instruction: frequency formats. Psychol Rev. 1995;102(4):684–704.
5. Gigerenzer G, Hoffrage U, Ebert A. AIDS counselling for low-risk clients. AIDS Care. 1998;10(2):197–211.
6. Akl EA, Oxman AD, Herrin J, et al. Using alternative statistical formats for presenting risks and risk reductions. Cochrane Database Syst Rev. 2011;3:CD006776.
7. Elwyn G, O’Connor A, Stacey D, et al.; International Patient Decision Aids Standards (IPDAS) Collaboration. Developing a quality criteria framework for patient decision aids: online international Delphi consensus process. BMJ. 2006;333(7565):417.
8. Gigerenzer G, Edwards A. Simple tools for understanding risks: from innumeracy to insight. BMJ. 2003;327(7417):741–744.
9. McDowell M, Jacobs P. Meta-analysis of the effect of natural frequencies on Bayesian reasoning. Psychol Bull. 2017;143(12):1273–1312.
10. Sloman SA, Over D, Slovak L, Stibel JM. Frequency illusions and other fallacies. Organ Behav Hum Decis Process. 2003;91(2):296–309.
11. Garcia-Retamero R, Hoffrage U. Visual representation of statistical information improves diagnostic inferences in doctors and their patients. Soc Sci Med. 2013;83:27–33.
12. Siegrist M, Keller C. Natural frequencies and Bayesian reasoning: the impact of formal education and problem context. J Risk Res. 2011;14(9):1039–1055.
13. Pighin S, Gonzalez M, Savadori L, Girotto V. Natural frequencies do not foster public understanding of medical test results. Med Decis Making. 2016;36(6):686–691.
14. Chapman GB, Liu J. Numeracy, frequency, and Bayesian reasoning. Judgm Decis Mak. 2009;4(1):34–40.
15. Hill WT, Brase GL. When and for whom do frequencies facilitate performance? On the role of numerical literacy. Q J Exp Psychol. 2012;218(12):1–26.
16. Barbey AK, Sloman SA. Base-rate respect: From ecological rationality to dual processes. Behav Brain Sci. 2007;30(3):241–254; discussion 255.
17. Pighin S, Gonzalez M, Savadori L, Girotto V. Improving public interpretation of probabilistic test results: distributive evaluations. Med Decis Making. 2015;35(1):12–15.
18. Zhu L, Gigerenzer G. Children can solve Bayesian problems: the role of representation in mental computation. Cognition. 2006;98(3):287–308.
19. Lippman A. Prenatal genetic testing and screening: constructing needs and reinforcing inequities. Am J Law Med. 1991;17(1–2):15–50.
20. Malone FD, Canick JA, Ball RH, et al.; First- and Second-Trimester Evaluation of Risk (FASTER) Research Consortium. First-trimester or second-trimester screening, or both, for Down’s syndrome. N Engl J Med. 2005;353(19):2001–2011.
21. Neagos D, Cretu R, Sfetea RC, Bohiltea LC. The importance of screening and prenatal diagnosis in the identification of the numerical chromosomal abnormalities. Maedica (Buchar). 2011;6(3):179–184.
22. Rothenberg KH, Thomson EJ. Women and prenatal testing: facing the challenges of genetic technology. Women Heal Ser. 1994;304.
23. Santalahti P, Hemminki E, Latikka AM, Ryynänen M. Women’s decision-making in prenatal screening. Soc Sci Med. 1998;46(8):1067–1076.
24. Girotto V, Gonzalez M. Solving probabilistic and statistical problems: a matter of information structure and question form. Cognition. 2001;78(3):247–276.
25. Johnson-Laird PN, Legrenzi P, Girotto V, Legrenzi MS, Caverni JP. Naive probability: a mental model theory of extensional reasoning. Psychol Rev. 1999;106(1):62–88.
26. Ayal S, Beyth-Marom R. The effects of mental steps and compatibility on Bayesian reasoning. Judgm Decis Mak. 2014;9(3):226–242.
27. Lesage E, Navarrete G, De Neys W. Evolutionary modules and Bayesian facilitation: the role of general cognitive resources. Think Reason. 2013;19(1):27–53.
28. Pighin S, Girotto V, Tentori K. Children’s quantitative Bayesian inferences from natural frequencies and number of chances. Cognition. 2017;168:164–175.
29. Girotto V, Gonzalez M. Chances and frequencies in probabilistic reasoning: rejoinder to Hoffrage, Gigerenzer, Krauss, and Martignon. Cognition. 2002;84(3):353–359.
30. Pighin S, Tentori K, Girotto V. Another chance for good reasoning. Psychon Bull Rev. 2017;24(6):1995–2002.
31. Brase GL, Pighin S, Tentori K. (Yet) Another chance for good reasoning? A commentary and reply on Pighin, Tentori, and Girotto, 2017. Psychon Bull Rev. 2017. doi: 10.3758/s13423-017-1314-8.
32. Snijders RJ, Noble P, Sebire N, Souka A, Nicolaides KH. UK multicentre project on assessment of risk of trisomy 21 by maternal age and fetal nuchal-translucency thickness at 10–14 weeks of gestation. Fetal Medicine Foundation First Trimester Screening Group. Lancet. 1998;352(9125):343–346.
33. Palomaki GE, Deciu C, Kloza EM, et al. DNA sequencing of maternal plasma reliably identifies trisomy 18 and trisomy 13 as well as Down syndrome: an international collaborative study. Genet Med. 2012;14(3):296–305.
34. Hahnemann JM, Vejerslev LO. Accuracy of cytogenetic findings on chorionic villus sampling (CVS)–diagnostic consequences of CVS mosaicism and non-mosaic discrepancy in centres contributing to EUCROMIC 1986-1992. Prenat Diagn. 1997;17(9):801–820.
35. Dondorp W, de Wert G, Bombard Y, et al.; European Society of Human Genetics; American Society of Human Genetics. Non-invasive prenatal testing for aneuploidy and beyond: challenges of responsible innovation in prenatal screening. Eur J Hum Genet. 2015;23(11):1438–1450.
36. Lipkus IM, Samsa G, Rimer BK. General performance on a numeracy scale among highly educated samples. Med Decis Making. 2001;21(1):37–44.
37. Nicolaides KH. Screening for fetal aneuploidies at 11 to 13 weeks. Prenat Diagn. 2011;31(1):7–15.
38. Navarrete G, Correia R, Froimovitch D. Communicating risk in prenatal screening: the consequences of Bayesian misapprehension. Front Psychol. 2014;5:1272.
39. Galesic M, Gigerenzer G, Straubinger N. Natural frequencies help older adults and people with low numeracy to evaluate medical screening tests. Med Decis Making. 2009;29(3):368–371.
40. Pighin S, Savadori L, Barilli E, et al. Using comparison scenarios to improve prenatal risk communication. Med Decis Making. 2013;33(1):48–58.
© Society of Behavioral Medicine 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Annals of Behavioral Medicine – Oxford University Press
Published: Oct 22, 2018