Cognitive shortcuts in causal inference

Fernbach, Philip M.; Rehder, Bob

doi:10.1080/19462166.2012.682655

Cognitive shortcuts in causal inference

Issue title: Formal Models of Reasoning in Cognitive Psychology

Article type: Research Article

Authors: Fernbach, Philip M.^{a; *} | Rehder, Bob^b

Affiliations: [a] Leeds School of Business, University of Colorado, 19 UCB, Boulder, CO, 80309-0419, USA | [b] Department of Psychology, New York University, NY, USA

Correspondence: [*] Corresponding author. E-mail: [email protected]

Keywords: cognitive science<interdisciplinary links with computational argument, conditionals<interdisciplinary links with computational argument, mental models<interdisciplinary links with computational argument, rationality<interdisciplinary links with computational argument, computational accounts of probabilistic argument, explanation

DOI: 10.1080/19462166.2012.682655

Journal: Argument & Computation, vol. 4, no. 1, pp. 64-88, 2013

Received 8 December 2011

Accepted 28 March 2012

Published: 1 March 2013

Get PDF

Abstract

The paper explores the idea that causality-based probability judgments are determined by two competing drives: one towards veridicality and one towards effort reduction. Participants were taught the causal structure of novel categories and asked to make predictive and diagnostic probability judgments about the features of category exemplars. We found that participants violated the predictions of a normative causal Bayesian network model because they ignored relevant variables (Experiments 1–3) and because they failed to integrate over hidden variables (Experiment 2). When the task was made easier by stating whether alternative causes were present or absent as opposed to uncertain, judgments approximated the normative predictions (Experiment 3). We conclude that augmenting the popular causal Bayes net computational framework with cognitive shortcuts that reduce processing demands can provide a more complete account of causal inference.

The psychology of causal inference is experiencing growing pains. A proliferation of interest in causal reasoning over the last several years is due in large part to the development of causal Bayesian networks, a computational framework for learning, representing and reasoning with causal knowledge. Causal Bayes nets are normative models that are governed by the axioms of probability, and psychological theories based on causal Bayes nets therefore predict that causal judgment should accord with norms. In line with this idea is a variety of evidence that people (including young children) are sophisticated and adept causal reasoners. However, as we detail below, probability judgments based on causal evidence do not always honor the norms associated with Bayes nets, suggesting that understanding such judgments will require considering nonnormative factors imposed by the cognitive processes that implement causal reasoning.

We will argue that causal judgments can be viewed as emerging from an interaction between two competing drives: one towards veridicality and one towards effort reduction. We will illustrate this claim for a paradigmatic task, namely, judging the conditional probability of a hypothesis given causally relevant evidence. We present an analysis of the requirements for optimal performance on this task, one that suggests a number of task variables that may invite reasoners to reduce effort by taking shortcuts that result in inappropriate conclusions. In three experiments we manipulate these variables with an eye to identifying which normative requirements people typically violate and to establishing conditions that support more veridical judgment.

Causal Bayes nets and the normativity of causal inference

People are good at qualitative, contextualised reasoning about the causal systems they interact with in their lives (Sloman 2005). People often have good intuitions for instance about what actions to take to achieve a goal, what caused them to feel ill, or how a new coach will influence the performance of their favorite team. This capability has prompted substantial interest in a theory of mental representation that accounts for causal intuitions: a causal model theory (Waldmann and Holyoak 1992; Glymour 1998; Gopnik, Glymour, Sobel, Schulz and Kushnir 2004). Causal model theories are usually instantiated using causal Bayesian networks, graphs where events or properties and their causal relations are depicted as variable nodes and directed edges (arrows) that point from cause to effect (Spirtes, Glymour and Scheines 1993; Pearl 1988, 2000; Jordan 1999). A causal Bayes net is associated with functions that specify how the probabilities of effects change in the presence of their causes. These functions allow for the calculation of the probability of unknown variables conditioned on known ones and thus support inductive inference.

Since causal Bayes nets are based on probabilistic calculus, psychological theories based on these models predict that human causal inference should respect probabilistic norms when people can apply an appropriate causal model. A variety of evidence supports this idea: People honour many causal reasoning norms not only during simple inferences (Rehder and Burnett 2005) but also more complex causal inferences involving analogies (Lee and Holyoak 2008; Holyoak, Lee and Lu 2010), generalisations (Rehder and Hastie 2004; Rehder 2006, 2009; Shafto, Kemp, Bonawitz, Coley and Tenebaum 2008; Kemp and Tenenbaum 2009), and acts of classification (Rehder and Hastie 2001; Rehder 2003a, b; Rehder and Kim 2006, 2009, 2010). Research on development shows that even children's reasoning is more consistent with causal than associative models by age 4 (Sobel, Tenenbaum and Gopnik 2004; Hayes and Thompson 2007; Opfer and Bulloch 2007; Hayes and Rehder in press). Finally, Causal Bayes nets are also attractive to learning theorists because causal structures and parameters can (in principle) be learned from data (Waldmann et al., 1995; Cheng 1997; Novick and Cheng 2004; Griffiths and Tenenbaum 2005, 2009; Lu, Yuille, Liljeholm, Cheng and Holyoak 2008;, but see Fernbach and Sloman 2009). Causal models provide a better account than associative models for such learning in both adults (Waldmann and Holyoak 1992; Waldmann 2000; for a review see Holyoak and Cheng 2011) and children (Gopnik et al. 2004; Sobel et al. 2004), and sometimes nonhumans (Beckers, De Houwer, Miller and Urushihara 2006; Blaisdell, Sawa, Leising and Waldmann 2006). In summary, causal Bayes nets have provided a valuable organising framework for a large variety of reasoning and learning phenomena.

Despite these successes, other phenomena suggest that causal inference is error-prone. Many counternormative phenomena from the heuristics and biases literature – such as conjunction fallacies (Tversky and Kahneman 1983), subadditive probability judgments (Tversky and Koehler 1994; Rottenstreich and Tversky 1997), simulation effects (Kahneman and Tversky 1982; Wells and Gavanski 1989), and hindsight biases (Fischhoff and Beyth 1975) emerge (and are sometimes exacerbated) in causal scenarios. Tversky and Kahneman (1980) argue that causal reasoning is qualitatively different from a more appropriate evaluation of evidential strength and therefore leads to biased judgment. Moreover, people sometimes confuse the causal role of their actions. This leads to ‘diagnostic self-deception’ (Quattrone and Tversky 1984; Sloman, Fernbach and Hagmayer 2010) and other examples of ‘evidential reasoning’ such as cooperation in the prisoner's dilemma and the voter's illusion (Acevedo and Krueger 2005). People also sometimes feel a false sense of control over outcomes that are actually up to chance or risk (Langer 1975) leading to idiosyncratic superstitions like reluctance to “tempt fate” (Risen and Gilovich 2008; Swirsky, Fernbach and Sloman 2011). In the causal learning literature as well, researchers have documented conditions in which learners depart from the normative acquisition rules specified by Bayes nets (De Houwer and Beckers, 2003; Reips and Waldmann 2008; Waldmann and Walker 2005).

Yet a different kind of error was uncovered in a series of studies by Fernbach, Darlow and Sloman (2010, 2011a). Following Tversky and Kahneman (1980) they compared predictive reasoning – judgment of the conditional probability of an effect given a cause – to diagnostic reasoning – judgment of the conditional probability of a cause given an effect. By varying causal structure and collecting judgments of conditional probability about a variety of scenarios, they were able to evaluate the consistency of judgments in both directions of reasoning. In predicting an effect from a cause, participants systematically neglected the contribution of alternative causes to the probability of the effect. They based their judgments just on the strength of the cause known to be present, and therefore gave conditional probability judgments that were too low. In contrast, diagnostic judgments were sensitive to the strength of alternative causes and approximately consistent with the predictions of a causal Bayes net model.

Fernbach, Darlow and Sloman (2011b) established the robustness of the neglect of alternative causes in a series of experiments assessing conditional probability judgments and gambling decisions in the face of a weak but positive predictive evidence. Ignoring alternative causes can be a serious error when the conditional probability is high, but the contribution of the given cause to that probability is small. Indeed, Fernbach et al. found cases where the conditional probability of the effect given a weak cause is judged lower than the marginal probability of an effect (i.e. the probability of the effect when no evidence is mentioned). For instance, participants told about weak but positive evidence that the Republicans would win the House of Representatives in the 2010 US mid-term election (a newspaper endorsement of a single candidate) were actually less likely to gamble on the Republicans winning than participants given no evidence. Apparently, the focus on the single cause mentioned in a conditional judgment crowded out other causes that would otherwise be considered. Fernbach et al. refer to this as the weak evidence effect.

Effort reduction as a reconciliatory principle

Why might people violate the norms of causal Bayes nets and probability theory more generally? A few authors have tried to bring theoretical organisation to the heuristics and biases literature by appealing to effort reduction as a fundamental drive of human cognition. Kahneman and Frederick (2002) argue that a heuristic is a surreptitious substitution of an easy question for a hard one. More recently, Shah and Oppenheimer (2008) taxonomised a large number of heuristics according to their role in reducing effort relative to what would be required by a full optimal solution to the weighted-additive choice rule (Payne, Bettman and Johnson 1993). They argue that the optimal solution is out of reach because it requires a complex series of processes with many inputs and computational demands. One might argue that the appeal to effort reduction is too vague to provide much explanatory power on its own. We agree with this point, and our goal in this paper is not to litigate this issue but merely to demonstrate the types of shortcuts people make in causal reasoning.

Like the weighted-additive rule, normative causal inference imposes substantial computational demands. Consider the many steps required to render a causal-based judgment of conditional probability: first, a qualitative representation or model of the causal situation must be constructed. This involves not only identifying the causal relations that directly relate evidence and hypotheses but also filling in additional causal variables that may be relevant to the judgment, including alternative causes, enabling conditions, disabling conditions, and so forth. At this step, two sorts of errors may arise. Errors of omission may occur when relevant variables are not included in the model. This may occur because a reasoner's cursory search of long-term memory may fail to yield all relevant knowledge. In contrast, errors of commission occur when relevant knowledge that is readily available (e.g. already retrieved from memory, supplied as part of the reasoning problem, etc.) is nevertheless ignored by the reasoner.

Next, one must identify the functional relations by which causes bring about effects and parameterise those relations (e.g. with the strength of the causal relations). Judgment errors may arise at this stage if causal relations are represented in a simplified form (e.g. as a symmetric associative relation, Rehder 2009) or if causal strengths are represented with low fidelity (e.g. qualitatively rather than quantitatively).

Third, to assess the net influence of hidden variables, the reasoner must integrate over their possible states. To illustrate the importance of integrating over the states of potential alternative causes, it is useful to consider the normative equations for causal inferences based on a reasonably general noisy or parameterisation associated with generative causes between binary variables (for details see Waldmann, Cheng, Hagmayer and Blaisdell 2008; Rehder 2010; Fernbach et al. 2011a). Equation (1) specifies the probability of the effect given the cause assuming that it can be brought about by the focal cause itself (with probability W_C) or by one or more alternative causes (with probability WNetAlt). Equation (2) shows that the probability of the cause given the effect is also a function of W_C and W_NetAlt (in addition to the base rate of the cause, P_C). Finally, Equation (3) specifies how W_NetAlt summarises the net effect of N alternative (independent and generative) causes. Importantly, the effect of an alternative cause A_i depends on the strength of the causal relation linking it with the effect (WAi) times the probability that A_i is in fact present (PAi). Note that when there is only a single alternative cause A, Equation (3) reduces to P_AW_A

P(Effect|Cause)=WC+WNetAlt−WCWNetAlt,(1)P(Cause|Effect)=1−(1−PC)WNetAltPCWC+WNetAlt−PCWCWNetAlt,(2)WNetAlt=1−∏i=1,…,N(1−WAiPAi).(3)

Equation (3) suggests a number of shortcuts that reasoners may take in accounting for the influence of alternative causes. For example, for each alternative cause A_i reasoners may assume values for PAi and WAithat reduce the effort involved in computing PAiWAi: assume the alternative is always present (PAi=1), always absent (PAi=0), or always effective (WAi=1). Each of these three possibilities implies a particular type of judgment error: alternative causes are ignored entirely in the second, their strength is ignored in the third, and their influence is overestimated in the first.

Fourth, once a causal model is constructed, parameterised, and the effect of hidden causal factors computed, reasoners must aggregate the influence of the focal and alternative causes to render a judgment of conditional probability. Errors may be introduced at this stage if they choose to use a qualitative combination rule instead of Equation (1) or (2).

Overview of experiments

The analysis just presented suggests a number of hypotheses regarding why errors arise during causal inferences: (a) relevant variables may not be retrieved from memory, (b) their representation may be deleted from the causal model, (c) integration may rely on shortcuts that misrepresent their influence, and (d) alternatives may not be aggregated appropriately with the focal cause. The aim of the following experiments was to conduct a first assessment of the extent to which each of these shortcuts influence causal reasoning. We taught people novel categories by describing a category's features, its causal model (the structure of its interfeature causal relations), and the model's parameters (the strengths of those relations). After being trained on a novel category, subjects were presented with a category member that possessed one or more features and asked for the likelihood that another feature was present. Following Fernbach et al. (2011a), we varied both the direction of inference (i.e. predicting effects from causes and diagnosing causes from effects) and the strengths of the focal and alternative causes. Specifying causal strengths also allowed us to calculate the normative responses to the inference questions and thus identify conditions that lead to errors. In Experiment 1, we extended Fernbach et al.’s (2011a) design to a task with novel categories where memory retrieval requirements were minimised (participants were provided with a diagram of the category's causal model during the inference task). In Experiment 2, we varied the number and explicitness of alternative causes and the computational difficulty of aggregating parameters. In Experiment 3, we varied whether alternative causes are unknown, known to be present or known to be absent in a particular category member.

Experiment 1

Participants in Experiment 1 were taught one of the two category structures in Figure 1. All subjects learned categories with four binary features. Feature C_w was described as the cause of E_w and C_s was described as the cause of E_s. Examples of features and causal relationships are shown in Table 1. Our central manipulation concerned the different strengths of the alternative causes of the two effect features. To convey the presence of alternative causes, both E_w and E_s were described as also being caused by “one or more” unnamed category features. However, the alternative causes of E_w were described as relatively weak (hence the “w” subscript) by stating that E_w appeared in category members with probability 25% even when C_w was absent. In contrast, the alternative causes of E_s were described as relatively strong (“s”) by stating that it appeared in category members with probability 75% even when C_s was absent.

Figure 1.

Causal structure tested in Experiment 1: (A) strong focal cause condition and (B) weak focal cause condition.

Table 1.

Features and causal relationships for Myastars, an artificial category.

Feature	Causal relationship
Ionised helium	Ionised helium causes the star to be very hot. Ionised helium participates in nuclear reactions that release more energy than the nuclear reactions of normal hydrogen-based stars, and the star is hotter as a result
Very hot temperature
High density	High density causes the star to have a large number of planets. Helium, which cannot be compressed into a small area, is spun off the star, and serves as the raw material for many planets
Large number of planets

After learning, participants were asked a series of conditional probability questions. On predictive questions they were told that a particular category member possessed a cause feature and were asked to judge the likelihood that it possessed the relevant effect feature, and vice versa for diagnostic questions. Subjects were also asked to predict the cause and effect given the absence of the effect and cause, respectively. Finally, subjects were asked to make unconditional (i.e. marginal) judgments by estimating the prevalence of each feature in a category. Effects should be judged to be more prevalent to the extent they have strong vs. weak alternative causes.

Unlike in Fernbach et al.’s (2011a) studies, participants did not have to retrieve causal knowledge from memory. Participants learned the novel interfeature causal relations as part of the experiment and were provided with a diagram of the causal relations during the inference test. Participants were also provided with explicit information regarding both the functional form of the relationship (in the form of a description of the causal mechanism by which the cause generates the effect) between the cause and effect features and the strength of those relationships (by specifying how often a cause, when present, would generate its effect). Finally, the effect of alternative causes was provided in summary form, eliminating the need to integrate over hidden causes. That is, by telling them it was either weak (25%) or strong (75%), reasoners were directly provided with the value of W_NetAlt (the net effect of all alternative causes), relieving them of the need to compute it via Equation (3).

A secondary objective of Experiment 1 was to assess whether the neglect of alternative causes depends on the strength of the focal cause itself. To this end, the strengths of the focal causes (between C_w and E_w, and C_s and E_s) were manipulated as a between-subjects variable. Half of the subjects were told that the strength of these relationships was 40% (Figure 1(A)), whereas the other half was told that their strength was 80% (Figure 1(B)). This manipulation is of theoretical interest because it speaks about the possibility that reasoners exhibit strategic laziness, they neglect alternative causes only when doing so is unlikely to yield large errors in judgments. When a focal cause is strong (e.g. 80%), the maximum error in predictive inference cannot exceed 20% (because the effect cannot be more probable than 100%), whereas it may be as large as 60% for a weak focal cause of 40%. This raises the possibility that subjects in Experiment 1 may be less likely to neglect alternative causes in the weak focal cause condition than the strong one.

We summarise by presenting the normative predictions for this experiment in Figure 2.1 1 First, normative predictions for predictive inferences (computed from Equation (1)) are shown in Figure 2(A). This panel confirms that such inferences ought to be sensitive to alternative cause strength and this sensitivity ought to be larger for weak (40%) vs. strong (80%) focal causes. Second, Figure 2(B) shows that diagnostic inferences (Equation (2)) should also be sensitive to alternative causes and, on the basis of previous research, we predict that they will be in this experiment. The probability of an effect given the absence of the focal cause (panel C) is simply the net strength of the alternatives (W_NetAlt). Finally, Figure 2(D) reveals that predicting a cause given the absence of the effect (panel D) should be sensitive to the strength of the focal cause but not the strength of the alternative cause.2 2 We assess where the inferences made by human causal reasoners diverge from those in Figure 2.

Figure 2.

Normative predictions for Experiment 1. Predictions are generated assuming a base rate of 0.67 for cause features C_w and C_s. (A) Inferring an effect given the presence of its cause, (B) inferring a cause given the presence of its effect, (C) inferring an effect given the absence of its cause, and (D) inferring a cause given the absence of its effect.

Method

Materials

Six novel categories were tested: two biological kinds (Kehoe Ants and Lake Victoria Shrimp), two nonliving natural kinds (Myastars [a type of star] and Meteoric Sodium Carbonate), and two artefacts (Romanian Rogos [a type of automobile] and Neptune Personal Computers). Each category had four binary feature dimensions. One value on each dimension was described as typical of the category and the other was described as atypical. For example, participants who learned Myastars were told that “Most Myastars have very hot temperature whereas some have a low temperature”, “Most Myastars have high density whereas some have a low density”, and so on.

Subjects were also provided with causal knowledge corresponding to the structures in Figure 1. Each causal relationship was described as one typical feature causing another, with one or two sentences describing the mechanism responsible for the causal relationship (see Table 1 for an example). In addition, a sentence describing the strength of the relationship (either 40% or 80%) was worded to convey the fact that the strength represented the power or propensity of the cause to individually produce the effect (rather than a conditional probability of the effect given the cause). For example, for the Myastar causal relationship between high density and a large number of planets, subjects were told “Whenever a Myastar has high density, it will cause that star to have a large number of planets with probability x%”, where x was either 40 or 80. Note that Experiment 2 changes this wording to further emphasise the generative nature of the causal strength information.

Participants were also given information about the possibility of alternative causes of E_w and E_s. For example, participants who learned about Myastars learned not only that high density causes a large number of planets but also that “There are also one or more other features of Myastars that cause a large number of planets. Because of this, even when its known cause (high density) is absent, a large number of planets occurs in x% of all Myastars”, where x was either 25 or 75. The assignment of the four typical category features to the roles C_s, E_w, C_s, and E_s in Figure 1 was balanced over subjects, such that for each category a pair of features played the role of C_w and E_w for half the subjects and C_s and E_s for the other half. The features and causal relationships for all six categories are available from the authors.

Participants

Ninety-six New York University undergraduates received course credit for participating in this experiment. There were three between-subject factors: weak (40%) vs. strong (80%) focal causes, the two assignments of physical features to roles of C_w, Ew, C_s, and E_s, and which category was learned (6 levels). Participants were randomly assigned to these 2×2×6=24 between-participant cells subject to the constraint that an equal number appeared in each cell.

Procedure

Experimental sessions were conducted by a computer. Participants first studied several screens of information that presented the category's cover story, which features occurred in “most” vs. “some” category members, the two causal relationships (their strength and causal mechanism), the presence of alternative causes (strengths of 25% or 75%) for features E_w and E_s, and a diagram similar to that in Figure 1. When ready, participants took a multiple-choice test that tested them on the knowledge they had just studied. While taking the test, participants were free to return to the information screens; however, doing so obligated them to retake the test.

Participants then performed inference and feature likelihood tests. During the inference test, participants were presented with two blocks of eight inference questions. Four of the eight questions involved features C_w and E_w. They were asked to predict the effect, given the presence of the cause and its absence and to predict the cause given the presence of the effect and its absence, that is, to estimate P(Ew|Cw), P(Ew|Cw‾), P(Cw|Ew), and P(Cw|Ew‾). The analogous four questions were asked for features C_s and E_s. For each question, participants were asked to suppose that a category member had been found with one feature and were asked whether it had the other feature. To attenuate memory retrieval demands, participants were provided with a printed diagram of the causal relations similar to the one of those in Figure 1 and told that “To answer these questions you should use the information about causal relationships between features of [category name] that you learned about earlier in the experiment”. Responses were entered by positioning a slider on a scale where the left end was labelled “Sure that it doesn't” and the right end was labeled “Sure that it does”. The position of this was scaled into the range 0–20. The presentation order of test items within a block was randomised for each participant.

During the feature likelihood rating task that followed the inference test, each of the two features on the four binary dimensions was presented on the computer screen and what proportion of all category members possessed that feature was rated by the subjects. The order of these trials was randomised for each participant. Subjects could continue to refer to the printed diagram of causal relationships during this test.

Results

Inference ratings

An initial analysis of the inference ratings revealed no effects of which of the six categories participants learned and the assignment of category features to the roles of C_w, E_w, C_s, and E_s in Figure 1. Thus, the average inference ratings collapsed over these factors and are presented in Figure 3 for each type of inference as a function of the strengths of the focal and alternative causes.

Figure 3.

Inference ratings from Experiment 1 as a function of the strengths of the focal and alternative causes. (A) Inferring an effect given the presence of its cause, (B) inferring a cause given the presence of its effect, (C) inferring an effect given the absence of its cause, and (D) inferring a cause given the absence of its effect. Error bars are standard errors of the mean. ^†p<0.10. *p<0.05. **p<0.01.

The key question of Experiment 1 concerned whether predictive inferences would be sensitive to the strength of the effect's alternative causes, and in turn whether this effect would be moderated by the strength of the focal cause. In fact, the predictive ratings shown in Figure 3(A) indicate that although subjects correctly rated the effect feature to be more likely for stronger (80%) vs. weaker (40%) focal causes (ratings of 17.3 vs. 13.4, respectively), in neither condition were the ratings at all affected by the strength of the alternative causes. A 2×2 mixed ANOVA with focal cause strength as the between-subject factor and alternative cause strength as the within-subject factor revealed a main effect of focal cause strength, F(1, 94)=31.27, MSE=582, p<0.0001, confirming the larger inference ratings for the 80% focal cause, but no effect of alternative strength and no interaction, both F’s <1. These results replicate those reported in Fernbach et al. (2011a) with different materials and when (a) subjects had a diagram of the causal relations (meaning those relations were highly available) and (b) when the information about alternative causes was provided in summary form.

In contrast (and also consistent with the previous results), inferences in the diagnostic direction were sensitive to the strength of the alternative causes. Figure 3(B) reveals not only that ratings were higher for stronger vs. weaker focal causes (average of 16.4 vs. 12.6 in 80% and 40% conditions, respectively), they were lower for stronger vs. weaker alternative causes (13.9 vs. 15.1), consistent with the fact that a cause is less likely when stronger alternative causes are present. A 2×2 ANOVA revealed a main effect of focal cause strength, F(1, 94)=24.67, MSE=712, p<0.0001, a main effect of alternative strength, F(1, 94)=17.36, MSE=90, p<0.0001, and no interaction, F<1.

Figure 3(C) presents the predictive ratings in which the category member was explicitly stated as having the atypical value on the cause dimension (e.g. a Myastar with low rather than high density). These inferences were correctly sensitive to the strength of the alternative cause (average ratings of 8.2 vs. 4.4 in 75% and 25% conditions, respectively). A 2×2 ANOVA revealed a main effect of alternative cause strength, F(1, 94)=40.60, MSE=428, p<0.0001. Unexpectedly, in this analysis there was a marginal effect of focal cause strength, F(1, 94)=3.91, MSE=667, p=0.051, reflecting that ratings were higher for focal strengths of 80% (7.1) vs. 40% (5.6).3 3

Finally, Figure 3(D) presents the diagnostic ratings in which the category member was explicitly stated as having the atypical value on the effect dimension (e.g. a Myastar with a small number of planets). As expected, these inferences were sensitive to the strength of the focal cause (ratings of 5.8 vs. 3.5 in 80% and 40% conditions, respectively). A 2×2 ANOVA confirmed a main effect of focal cause, F(1, 94)=9.75, MSE=664, p<0.01. Unexpectedly, this analysis also revealed an effect of alternative strength, F(1, 94)=8.58, MSE=114, p<0.01, reflecting the fact that ratings were higher for alternative strengths of 75% (5.1) vs. 25% (4.2).4 4

Feature likelihood ratings

The purpose of the feature likelihood test was to confirm that judgments regarding the prevalence of the effect features E_w and E_s reflected the strengths of the focal and alternative causes. The likelihood ratings for the effect features are presented in Figure 4 as a function of the two types of strengths. As expected, the effect features were rated as more prevalent both for stronger vs. weaker focal causes (76.6 vs. 67.3) and stronger vs. weaker alternative causes (75.8 vs. 68.1). A 2×2 ANOVA revealed a main effect of focal cause strength, F(1, 94)=11.62, MSE=353, p<0.001, a main effect of alternative strength, F(1, 94)=16.80, MSE=169, p<0.0001, and no interaction, F<1.

Figure 4.

Likelihood ratings for the effect features (E_w and E_s in Figure 1) from Experiment 1. Error bars are standard errors of the mean. *p<0.05. **p<0.01.

Discussion

Despite the extensive differences in methodology, Experiment 1 replicated the results of Fernbach et al. (2010, 2011a) showing no sensitivity to alternative strength in predictive inferences. This pattern emerged despite the facts that providing participants with a diagram of the causal structure eliminated the need to retrieve causal relations from memory and that providing the strength of alternative causes in summary form eliminated the need to integrate over hidden variables. Diagnostic inferences, in contrast, were appropriately sensitive to the alternative causes, consistent with the asymmetry between predictive and diagnostic inferences also found in previous research.

An important question is whether the neglect of alternatives during predictive inferences reflected subjects’ misunderstanding of the information we provided about alternative cause strength. Two sources of evidence argue against this possibility. First, subjects rated the effect as more likely for strong vs. weak alternative causes when the focal cause was absent. Second, subjects’ unconditional judgments concerning the prevalence of the effect features among category members were also sensitive to the alternatives. That is, subjects were able to make use of the alternative cause information for several types of judgments, but not those involving predicting an effect given the presence of a cause.

Another concern is whether the information we provided about the strength of the focal causes was interpreted as a causal power, that is, the propensity of the cause to produce the effect. For example, neglect of alternatives would be expected if those strengths were instead interpreted as a conditional probability that incorporates the effect of alternative causes. Recall, however, that this strength information was provided as part of a description that emphasised the generative nature of the causal mechanisms and that the causal powers were written on the links on the diagrams given to participants (making it clear that they referred to the individual relations). More importantly, subjects’ unconditional feature likelihood judgments were sensitive to the strength of the alternative causes showing that participants realised that the proportion of category members possessing the effect feature was higher when alternatives were strong. In Experiment 2, the causal strength information was reworded to further emphasise the generative interpretation of the strength information.

A final important result from Experiment 1 was that alternative causes were neglected during predictive inferences regardless of the strength of the focal cause. This resulted in an especially egregious judgment error in the weak focal cause condition in which an effect given the cause should be 30% more likely for a strong vs. weak alternative cause (Figure 2(A)). This suggests that reasoners’ neglect of alternatives does not only arise when the error in judgment is likely to be small. Rather than arising from a reasoner's strategic decision to neglect alternatives, doing so appears to operate as a general heuristic that occurs regardless of the potential loss of accuracy involved.

Experiment 2

In Experiment 1 we found that reasoners fail to attend to alternative causes in predictive inferences even when the need to retrieve them from memory and integrate over their possible states was eliminated. One reason this may have occurred is that even though a representation of the alternative causes was literally right in front of them, reasoners may not have recognised their relevance to predictive inferences. That is, they may have committed what we referred to earlier as an error of commission by excising the alternative causes from the causal model with which they reasoned during predictive inferences. In Experiment 2, we assessed whether changing the representation of the alternative cause would affect participants’ inferences. We varied whether the alternative was described as the net influence of “one or more other features” (the implicit condition) as in Experiment 1 or as a single explicit category feature (the explicit condition).

Participants were taught one of the category structures in Figure 5. All subjects learned categories with six binary features. Features C_w and C_s were described as causes of E_w and E_s, respectively, each with a strength of 60%. Whether or not the alternative causes of the effect features were explicit was manipulated as a between-subjects variable. In the implicit condition, the alternative causes of E_w and E_s were described as “one or more other” category features (Figure 5(A)). In the explicit condition, the alternative causes of E_w and E_s were two of the category's instructed features, namely, A_w and A_s, respectively (Figure 5(B)). Alternative cause strength was again manipulated as a within-subjects variable in both conditions, with the alternative for E_s (75%) being stronger than the alternative for E_w (25%).

Figure 5.

Causal structure tested in Experiment 2: (A) implicit alternative cause condition and (B) explicit alternative cause condition.

Two additional changes to the materials were made. First, recall that in Experiment 1 subjects were given information regarding the likelihood of an effect when the known cause was absent (e.g. “…even when its known cause (high density) is absent, a large number of planets occurs in x% of all Myastars”). To make the explicit and implicit conditions of Experiment 2 comparable, analogous information was provided for the explicit alternative causes Aw→Ew and As→Es. For example, an alternative cause of a large number of planets in Myastars was a very hot temperature. For this causal link, subjects were not only told that “Whenever a Myastar has a very hot temperature, it will cause that star to have a large number of planets with probability 60%’, but also the probability of the effect in the absence of the other cause: “This means that when a Myastar has a very hot temperature and the other cause of a large number of planets (high density) is absent, it will have a large number of planets with a probability of 60%”. The focal causal relations Cw→Ew and Cs→Es were described the same way. Note that this wording further emphasises the generative interpretation of the causal strength information (as opposed to it being a conditional probability).

Second, recall that Experiment 1 found evidence that some subjects interpreted the causal links as having a dual sense (e.g. high density causes a large number of planets and low density causes a small number of planets; see Footnote 3). To address this possibility, subjects in Experiment 2 were told that most Myastars had high density and some had low density, they had either high or normal density. With this wording, we felt that they would be unlikely to assume a causal link between atypical feature values (e.g. that normal density causes a normal number of planets).

Although the intent of Experiment 2 is to assess whether a more explicit representation will yield greater sensitivity to alternative causes, it introduces a computational requirement that was absent in Experiment 1, namely, the need to integrate over hidden variables. Computing the influence of the alternative cause in the explicit condition requires multiplying its causal power (W_A) by the probability it is present (P_A) in the manner specified by Equation (3). Thus, making the alternative cause explicit may result in less sensitivity to alternative cause strength relative to the implicit condition. Experiment 3 will test conditions in which the alternative is explicit but the need for integration is avoided.

Figure 6 presents the normative predictions for Experiment 2.5 5 Both predictive inferences (Figure 6(A) and (C)) and diagnostic inferences in which the effect is present (Figure 6(B)) should be sensitive to the strength of the alternative causes. However, that sensitivity should be weaker in the explicit condition (because of the need to multiply by PAi, the base rate of the alternative cause). In contrast, diagnostic inferences in which the effect is absent (Figure 6(D)) should be insensitive to the alternatives.

Figure 6.

Normative predictions for Experiment 2. Predictions are generated assuming a base rate of 0.67 for both focal (C_w and C_s) and alternative (A_w and A_s) causes. (A) Inferring an effect given the presence of its cause, (B) inferring a cause given the presence of its effect, (C) inferring an effect given the absence of its cause, and (D) inferring a cause given the absence of its effect.