Comments

ribnag t1_iywckl2 wrote

This is a great cautionary tale against looking for simple chains of causality, but the title is misleading - "causal projection" is an extremely specific technical term. Thinking in causal terms is still one of the most powerful tools we have in modern science; we just need to be careful not to fall for our own confirmation biases.

135

owlthatissuperb OP t1_iywo5op wrote

So, I made up the term "causal projection" for this piece. I did look around for previous uses, but couldn't find any consistent usage that implied it was a term of art. Is there a technical definition that I'm missing?

30

ribnag t1_iywpp4b wrote

Your link defines it as iteratively converting a DCG (directed cyclic graph) to a DAG (directed acyclic graph) by removing the lowest-weighted connections until only forward paths exist between the hypothetical cause and the target effect, thereby establishing "causality".

In one sense that's entirely defensible, but the fundamental flaw is that you can do the same between almost any two nodes in the graph, as long as a cycle exists between them - you can't, e.g., prove JFK's assassination caused the Big Bang, because there's no loop that can ever go back to that point.
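
For concreteness, a minimal sketch of the procedure as described (using networkx; the function name and the "weight" attribute are my own choices, not from the article):

    import networkx as nx

    def project_to_dag(g):
        """Repeatedly drop the lowest-weighted edge that sits on a cycle,
        until only forward paths remain. Fine for small graphs only:
        simple_cycles is exponential in the worst case."""
        g = g.copy()
        while not nx.is_directed_acyclic_graph(g):
            on_cycle = {edge for cycle in nx.simple_cycles(g)
                        for edge in zip(cycle, cycle[1:] + cycle[:1])}
            weakest = min(on_cycle, key=lambda e: g.edges[e]["weight"])
            g.remove_edge(*weakest)
        return g

    # hypothetical usage: the weaker back-edge gets dropped
    g = nx.DiGraph()
    g.add_weighted_edges_from([("cause", "effect", 3.0), ("effect", "cause", 1.0)])
    dag = project_to_dag(g)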

/ Edit: My apologies, I didn't realize you're the actual author of TFA. But you're still right!

24

ward8620 t1_iyxf67k wrote

I’m getting my PhD in economics with a focus on empirical estimation, and I want to offer some support for your article, as well as some perspective from one of the disciplines you cited in the post. In general, I think statements of causality like those in the Brooks piece you cited are incredibly misleading and entirely unprovable, usually marred by reverse causality and the omission of other potential explanatory variables. I totally agree that any causal statements, especially those made by political actors, should be viewed with skepticism.

But as far as causal claims within academic research go, I do believe economics takes them more seriously than other, non-quantitative disciplines. Econometrics, the economic sub-discipline of statistics, is almost chiefly concerned with understanding when statistical estimates can be interpreted as causal. In the last few decades, researchers have become even more precise in their understanding of what types of causality we’re measuring (i.e. what portion of the population it’s relevant to). In general, we’re considering the causal effects of policies or behavior rather than the causation of events in history, which is, as you suggest, nearly impossible to parse in most scenarios. Our reach can certainly be limited and we don’t get it right 100% of the time, but every economist I know does not make causality claims lightly.

Perhaps this reveals a bit of my personal bias for a discipline I am fascinated by, and nothing you’re saying directly implies that you disagree with any of this, but I wanted to add the perspective of someone who feels we can all be a bit too quick to state that two things are connected by causality, and who spends most of their time trying to figure out when we can actually make those claims. Great article! :)

17

owlthatissuperb OP t1_iyxpix4 wrote

Thanks!

Yeah I agree with you. Typically, when you get into academic research, the domain experts fully appreciate how complicated the situation is, and know how to properly interpret causal claims.

> Econometrics, the economic sub-discipline of statistics, is almost chiefly concerned with understanding when statistical estimates can be interpreted as causal

IMO (and this is controversial), you can never infer causality from looking passively at data--data alone can't distinguish between causation and correlation. It can only lend support to a working theory (i.e. if you already have a proposed causal mechanism).

The only way to infer causality is to reach into a system and modify it. If you can turn the "cause" knob and consistently observe the effect, you can infer causality. But passively peering in and seeing that "when A changes, B tends to change too" doesn't get you there (even if, e.g., there's a time delay).
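
As a toy illustration of the difference (a made-up linear model; all names and numbers are invented): two variables that merely share a hidden cause look related under passive observation, but turning the A knob directly reveals no effect on B.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    hidden = rng.normal(size=n)                # shared hidden cause

    # Passive observation: A and B both track the hidden cause
    A = hidden + rng.normal(size=n)
    B = hidden + rng.normal(size=n)            # note: A has no effect on B
    print(np.polyfit(A, B, 1)[0])              # ~0.5 -- they correlate anyway

    # Intervention: set the "cause" knob at random, severing the hidden link
    A_do = rng.normal(size=n)
    B_do = hidden + rng.normal(size=n)         # B is unmoved by do(A)
    print(np.polyfit(A_do, B_do, 1)[0])        # ~0 -- the knob does nothing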

But I do think others would disagree with me.

8

ward8620 t1_iyxsjny wrote

I completely agree that you can’t infer causality by “passively” looking at data, in the sense that it sounds like what you’re describing is naively looking at a scatter plot or running a regression of Y on X.

The key insight of causal econometrics is exactly the point you’re making, that in order to understand causality we have to somehow approximate the environment that is present in a lab, where we can randomly assign individuals to treatment and control groups, ensuring that people in both groups are on average the same and thus the only difference in expectation between these groups is the treatment of interest. Of course, we can’t do this with observational data, so we look for natural experiments, or environments where random distribution of treatment may occur among some population by chance.

There are a lot of specific methods, but the essence of them all is that, as long as there is some feature that is as-good-as randomly distributed between people, and that feature is correlated with the treatment we care about, we can use variations in that random factor to estimate the causal effect of changing treatment for those individuals who shift their behavior because of the random variable. An early example in economics is using variation in military participation driven by the Vietnam draft lottery to estimate the causal effect of military participation on lifetime earnings. So in that way, economists really do try to estimate causality by looking for situations in which we might think the “cause” knob is being turned due to historical or institutional quirks.
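
As a toy sketch of that draft-lottery logic (simulated data; the variable names, numbers, and effect sizes are invented for illustration, not Angrist's actual estimates): the lottery is as-good-as-random, it shifts service, and the Wald/2SLS ratio recovers the causal effect that naive OLS gets wrong.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    draft = rng.integers(0, 2, n)              # lottery: as-good-as-random
    ability = rng.normal(size=n)               # unobserved confounder
    served = (0.4 * draft + ability + rng.normal(size=n) > 0.5).astype(float)
    earnings = -2.0 * served + 3.0 * ability + rng.normal(size=n)  # true effect: -2

    # Naive OLS slope is confounded by ability
    ols = np.cov(earnings, served)[0, 1] / np.var(served, ddof=1)
    # Wald / 2SLS estimate uses only the lottery-driven variation in service
    iv = np.cov(earnings, draft)[0, 1] / np.cov(served, draft)[0, 1]
    print(f"OLS: {ols:.2f}  IV: {iv:.2f}")     # IV lands near -2; OLS does not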

I’m just skimming the surface, but if you’re interested you should check out Mostly Harmless Econometrics by Angrist and Pischke (the former of whom won the Nobel Prize last year for these findings) or Causal Inference: The Mixtape. Our capability of being very confident about causality in data is definitely limited to when we can find these “natural experiments,” but researchers have been able to find quite a lot and it really forms the basis of modern empirical economics research.

9

DarkSkyKnight t1_iyz6pft wrote

Put more simply, certain mathematical assumptions required for causality cannot be justified from the data alone; they have to be argued for.

5

My3rstAccount t1_iyy2hk9 wrote

Oh my god, people are experiments, and it's in our money and religions.

−8

YoungXanto t1_iz07hoe wrote

>you can never infer causality from looking passively at data

In this view, causal inference is relegated to a single observation. Extrapolating results to any other similar experimental set-up (even an identical one) is just that: extrapolation. To quote Hume,

>I say, then, that, even after we have experience of the operation of cause and effect, our conclusions from that experience are not founded on any reasoning, or any process of the understanding

There is an epistemological limit to the concept of causation. In statistical inference, which is based on probability theory, a good professor will routinely use this limit to smack undergrads upside the head - be it with regression or p-values.

We assume distributions for the underlying samples, along with the central limit theorem, to do the statistics that support causal inference. We can attempt to control for Type I error via our set-up, but even when our assumptions are not violated, we still can never claim a result with 100% certainty.

Carefully controlled experimentation is better than using some observational data set, but it suffers two drawbacks: it is expensive to obtain, and its uses beyond the experiment are quite limited, necessarily requiring extrapolation. So I argue, pragmatically, that we should use the observational data at hand and the statistical tools at our disposal to understand causation (to the extent it actually exists), with the appropriate limiting caveats.

5

owlthatissuperb OP t1_iz0gzc7 wrote

I agree with you. If we can start with a reasonable hypothesis, looking back at historical data is a valuable way to gather evidence for that hypothesis.

1

passingconcierge t1_iyzm86f wrote

> The only way to infer causality is to reach into a system and modify it.

This seems, to me, to be an unfoundedly strong claim about inference, one that entails that causality must always be empirical. Which, essentially, reduces econometrics, as it exists, to entirely correlative knowledge, because it is composed entirely of historical data.

What if there is no "cause knob" but, also, the set of data, C, at time 0 always results in the specific set of data, E, at time x>0, but in a random set of data, Rn, at any other time n != x? There is nothing to modify, since modifying C changes the set, and so there is no transition C->E. Which means you have frustrated, prevented, blocked - essentially interrupted - the causal connection between C & E. This might not be clearly expressed, but it does require that causality be considered holistically: you have to take all the nodes and arcs of the graph into account.

You might say that is simply a description of correlation and always was, and your claim might seem convincing. But how do you exclude causality? Even at a vanishingly small probability, the statement that "C causes E" is a fact, and a legitimate claim to make, even if you must qualify it with "but only once in a billion". You might say one in a billion means it will never happen, which is not a great claim: the probability of winning the lottery is, say, one in a billion - or tens of billions - yet there has been more than one lottery winner since lotteries started. The point being that a low probability of something happening does not forbid it from happening.

> "when A changes, the B tends to change too" doesn't get you there (even if e.g. there's a time delay).

So, the idea here is not proven by your claims. You can infer causality by passively looking at data; econometrics does it all the time. The deeper problem is that we live in a Universe that is deeply causal, which suggests that starting from the assumption that there is "no causality involved" is a flawed premise. A flawed premise that is easily rejected, because the data was created by a person, not a random process, and therefore you need a good reason to reject the notion that the data "has" causality locked into it.

The idea of causality as purely mechanistic, which is what you seem to be supposing here, is not the only way to reason about causality.

3

owlthatissuperb OP t1_iz0gk1h wrote

Yeah I mostly agree with you. Here's the distinction I'll make:

If you have a starting hypothesis (e.g. an increase in the money supply will cause inflation), you can very much go back and look at historical data to find support for your hypothesis.

But if you have a completely unlabeled dataset (just a bunch of variables labeled x, y, z, ...), and can see how those variables change over time, there's no way to look at the data and say with any confidence that "x has a causal impact on z".

2

passingconcierge t1_iz22jo3 wrote

> If you have a starting hypothesis (e.g. an increase in the money supply will cause inflation), you can very much go back and look at historical data to find support for your hypothesis.

You can express "increase in money supply" and "inflation" as "just a bunch of variable labels". So the two scenarios you sketch are identical in every sense, apart from the first having named variables and the second having anonymous variables. Which gives the appearance that you are attributing causality on the basis of some pre-existing theory about "money supply" and "inflation". Which runs the risk of creating a circular definition. In essence, you are ignoring the insights of Hume, and Kant's response to those insights.

I am happy to agree that if we have two columns of numbers

     1        1
     2        4
     3        9
     :        :
    99    9,801

we could agree that the relationship between the first column and the second is that the second is the square of the first. That establishes that there is a mathematical relationship but that mathematical relationship does not guarantee any kind of causality. Although, if you take the position of Tegmark - the Mathematical Universe Hypothesis - the existence of a mathematical relationship guarantees reality but not necessarily causality. Which leaves you in the same situation: data sets, labelled or not, do not reveal causality. For that you need a theory of knowledge that gives warrant to the knowledge that x=9 therefore y=81 is a causal relationship and simply labelling the numbers with "money supply equals nine therefore inflation equals eighty one" does not establish that.

Which largely points to there being no "causal knobs" inside data sets. There may be something about a data set that has some kind of "establishes causality" about it, but it is not simply doing mathematical manipulations or matching variable labelling. There is something rhetorical going on that you really are not making clear.

2

owlthatissuperb OP t1_iz2pjrl wrote

When I'm talking about labeled vs unlabeled, what I really mean is that we have some intuition for how the labeled dataset might behave. E.g. "an increase in money supply causes an increase in inflation" is a better causal hypothesis than "an increase in the president's body temperature causes an increase in inflation". We can make that judgement having never seen the data, based on our understanding of the system.

Having made that hypothesis, we can look back to see if the data support it. The combination of a reasonable causal mechanism, plus correlated data, is typically seen as evidence of causation.

If you don't have any intuition for how the system works, you don't have the same benefit. All you can see are the correlations.

E.g. in your x->x^2 example, if all you had were a list of Xs and Ys, you couldn't tell if the operation was y=x^2 or x=sqrt(y). Without any knowledge of what the Xs and Ys refer to, you're stuck.

1

passingconcierge t1_iz480o8 wrote

> When I'm talking about labeled vs unlabeled, what I really mean is that we have some intuition for how the labeled dataset might behave. E.g. "an increase in money supply causes an increase in inflation" is a better causal hypothesis than "an increase in the president's body temperature causes an increase in inflation". We can make that judgement having never seen the data, based on our understanding of the system.

What you have here is a circular argument. You are arguing that we can label variables with theory-driven labels and so infer causality between those labels. You have already theorised causality without the data. So the data is not the source of the explanation; it is merely a means to, rhetorically, assert that causality is the explanation. You have a causal explanation in mind, you label the data informed by that explanation, you carry out a mathematical operation on the labelled numbers, and then, because you have labelled them, you infer a causal explanation.

So you are correct: you can make a judgement without seeing the data. The data adds nothing to your understanding of the system, because you have started from a theory, a model, and conducted your activities with the causal relationship in mind. The data does not "contain causal knobs".

> Having made that hypothesis, we can look back to see if the data support it. The combination of a reasonable causal mechanism, plus correlated data, is typically seen as evidence of causation.

I would argue that what you are doing here is establishing rules for a rhetoric. Let us assume that we both accept mathematics is a kind of unbiased source of knowledge. This is a broad and possibly unwarranted assumption that would need refining, but accept it, broadly, for now.

You have a set of data which you recognise as x and y values. You have no theoretical labels to attach to them. But you list them, and you are lazy, so you use a spreadsheet to tell you that the y column can be derived from the x column by

  f(x) = x^2   with R^2  = 1

So you are happy: the coefficient of determination (R^2) tells you that the data "100% supports" the y=x^2 hypothesis. You are happy, that is, until someone comes along and says: have you considered

 f(x) = x * x
 f(x) = sqrt(g(x)), g(x) = x * x
 f(x) = (x * x * x) / x
 f(x) = (x * x * x * x) / (x * x)
 f(x) = (x^n) / (x^(n-2)) for all n > 2

You object that this is all just messing about with variations on squaring things. I agree. But I point out that all I am doing is showing that there is more than one way to express a relationship of x to y while, generally, avoiding the use of y as a label.

So when you have f(x) = sqrt(g(x)), g(x) = x * x, it is an awful circumlocution, but it demonstrates that you can have a whole range of things "happening" to avoid using y. Which raises an interesting point about your notion of labelling data.

For a moment, pretend x can be relabelled "money supply" and y can be relabelled "inflation". We have the data set, as before, {(1,1), (2,4), (3,9), ..., (n, n^2)}, and we are supposing that the relationship is f(x) = sqrt(g(x)), g(x) = x * x, or that it is f(x) = x * x. First things first:

    f(x) is clearly to be relabelled as inflation.
    g(x) is also inflation (see your point^1 below)
    sqrt(g(x)) is money supply

Your point is that labelling clarifies causality. Now, in mathematics it is permissible to rearrange a formula. But you are inferring causality, and the only symbol common to all of the formulations is the equals sign, which you might be holding in place of "causes". That does correspond to your notion of Directed Acyclic Graphs, but it then places a huge constraint on what you can actually say with labels.

So, because we have two formulations that you definitely agree on - the ones in the footnote - you can, rhetorically, say that we cannot tell if the causal case is

 y = x^2         y is caused by x^2
 x = sqrt(y)     x is caused by sqrt(y)

which is then translated into

 inflation is caused by squaring the money supply
 the money supply is caused by square rooting inflation

What this highlights is that you now actually need, back in the labels, some meaningful understanding of what "squaring the money supply" is and what "square rooting inflation" is. Because, to be causally coherent, these cannot just be vacuous utterances. This example is incredibly simple.

Just imagine what would happen if your chosen econometric methodology dictated the use of linear regression. You would then have a philosophical need to explain x and y in terms of a lot of mathematical structure around squares, roots, differences, and so on.

Which might boil down to me saying, "I do not think that the equals sign is a synonym for causality". But it might also be saying that "data adds nothing to causal explanation in economics".

Quite literally, you have shown two possible formulae for a simple relationship. Which suggests, at best, a 1 in 2 chance (50% probability, p=0.5) of randomly selecting the "correct" relationship - where, here, "correct" requires that the relationship expresses something causal. This becomes worse when you realise that it is possible to express x^2 in an infinite variety of ways (rendering, effectively, p=0). This means that you are never really talking about causation.

Which leaves you in the position that econometrics is a good source of rhetorical support for causation, but only really provides evidence of correlation: that there is, indeed, a pattern in the data. That pattern in the data does not, in any way, guarantee your theoretical causal explanation. Even if you label it.

^1 E.g. in your x->x^2 example, if all you had were a list of Xs and Ys, you couldn't tell if the operation was y=x^2 or x=sqrt(y). Without any knowledge of what the Xs and Ys refer to, you're stuck.

2

bildramer t1_iyzm2m7 wrote

Obviously you can infer causation from raw "passive" data. What else could our brains possibly be doing when they learn? We don't affect most things.

One way to imagine how it's possible is to contrast the DAG A->C, A->D, B->C, B->D, C->E, C->F, D->E, D->F with the one where all the arrows are flipped. Then think about conditional independence: P(C|D,A,B) = P(C|A,B), but P(C|D,E,F) != P(C|E,F). Knowing everything about the effects can increase mutual information between C and D; knowing everything about the causes can't. That's how you can distinguish this DAG from the backwards one using only correlations. No need to intervene anywhere.
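
Here's a quick numerical check of this in a linear-Gaussian toy version of that graph (partial_corr is an improvised helper, not a library function): C and D decorrelate once you control for their causes, but not once you control for their effects - and in the flipped graph the pattern reverses.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000
    A, B = rng.normal(size=(2, n))
    C = A + B + rng.normal(size=n)             # A->C, B->C
    D = A + B + rng.normal(size=n)             # A->D, B->D
    E = C + D + rng.normal(size=n)             # C->E, D->E
    F = C + D + rng.normal(size=n)             # C->F, D->F

    def partial_corr(x, y, controls):
        """Correlation between x and y after regressing out the controls."""
        Z = np.column_stack(controls + [np.ones(n)])
        rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
        ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        return np.corrcoef(rx, ry)[0, 1]

    print(partial_corr(C, D, [A, B]))          # ~0: causes screen C off from D
    print(partial_corr(C, D, [E, F]))          # nonzero: effects are colliders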

2

owlthatissuperb OP t1_iz0fsy9 wrote

I haven't followed your technical example yet but I plan on it. Thanks for that!

> What else could our brains possibly be doing when they learn?

I don't think this argument says much--our brains use fuzzy heuristics all the time, and people were really bad at understanding causality (see things like rain dances and voodoo) before experimental science came along (which manipulates the world to see how it reacts).

2

smithsonionian t1_iywa7g5 wrote

Good article.

For further reading, “The Book of Why” by Judea Pearl.

12

wavegeekman t1_iyz60ww wrote

Also "Causality", also by Pearl. More technical.

4

YoungXanto t1_iyzy0jb wrote

There are two technical books he's published. One is Causality (2nd ed., 2009), which is very technical and requires a fair amount of math background to understand and work through. He also has "Causal Inference in Statistics: A Primer", which presents the core concepts with significantly fewer math prerequisites.

His do-calculus is interesting, and he's highly influential in the machine learning literature, but he has a fair number of detractors.

I personally like the concept he presents in which we can reverse causality by re-ordering our equations. It points to the epistemological limits of our ability to understand causation, in a way that Hume elucidated with his billiard-ball examples a couple of hundred years ago.

That said, Pearl is a bit arrogant for my taste, coming across as if he's the sole inventor of concepts that have existed for hundreds of years. His framework is a good one, but it is far from the only one.

2

iiioiia t1_iz6m52s wrote

> His do-calculus is interesting, and he's highly influential in the machine learning literature, but he has a fair number of detractors.

This is a completely uninformed question but I am curious: are there any ML libraries you know of that specifically address causality (like, chains of causality, not simply direct correlation)?

0

YoungXanto t1_iz6vnu0 wrote

I can't think of any off the top of my head, but I'm sure there's a plethora out there, depending on what you want to do. A good starting point for Google is "DAG" (directed acyclic graph); DAGs are basically the basis of Pearl's framework (and wildly useful in many other contexts).

2

Kelli217 t1_iyx3t9a wrote

I seem to recall that the Communications of the Association for Computing Machinery, one of the more well-known journals to have published an article of this nature ("Go To Statement Considered Harmful"), later published an article titled "'Considered Harmful' Considered Harmful."

5

Thirdwhirly t1_iyx5vld wrote

This goes hand-in-hand with the illusion of explanatory depth (IOED), and it's not entirely about being nefarious or ignorant. The article does pose a great example of crazy shit happening in the '50s and '60s that makes it almost parallel to IOED, but they're both ways of missing the point.

That said, anything that reminds me that IOED is a thing is good by me.

3

owlthatissuperb OP t1_iyxmzct wrote

I'd never heard of IOED! Thanks for sharing. Sounds like it's related to Dunning-Kruger.

5

Thirdwhirly t1_iyxotum wrote

Totally! However, the way I've seen it described generally focuses on the topic and not the person saying it. For example, black holes: it is hard to be an expert in this area, but there are so many ways to look at the topic of black holes that any single way is both 1) inadequate, and 2) could be made to sound complete.

6

5-Why-Guy t1_iyx6c01 wrote

Thanks for sharing. It seems like one of the issues that needs to be discussed is simplification vs oversimplification. Feedback loops are a tool that can add nuance to causal relationships (after all, they are just adding more causes) but can still be oversimplified. But simplify we must... you can map out causes all day, but it still will not fully represent the terrain of reality, and we need to simplify ideas to communicate them to others.

3

owlthatissuperb OP t1_iyxobnt wrote

Totally agree! One metaphor I didn't include: a causal explanation is a lens for looking at a particular domain. It clarifies some things, but obscures others.

5

fane1967 t1_iyxfres wrote

What I personally see is a lot of correlation mistaken for causality.

3

colinallbets t1_iyzaavg wrote

You don't have to assume the explanations are true for them to have utility.

3

ddd12547 t1_iyztj5q wrote

An electric fence doesn't always have to be turned on - only once.

3

owlthatissuperb OP t1_iz0f3nn wrote

Are you talking about overbelief? Or a fuzzy notion of truth? I'm all for both.

3

colinallbets t1_iz0l0yd wrote

I mean, we'll never fully specify a causal mechanism, but we can use these methods to reason about (and identify) potential sources of error.

2

jhagen13 t1_iz0ecuh wrote

That's a whole lot of words and pictures to explain the concept of "shit happens." Sorry for my "rudimentary" response to an overanalysis of the understanding that life is shades of gray, not black and white.

1

ddd12547 t1_iz0ol28 wrote

This, but I take it as: the observer of "shit happens" takes issue with the existence of shit without knowing who or what is doing the shitting. If sourcing the shit becomes an all-encompassing preoccupation, it might help to examine that knowing where the shit is coming from will never be useful in stopping, changing, or affecting the shit's source, nor will it affect the shit that's already happened.

1

jhagen13 t1_iz0q1f2 wrote

Exactly. The shitting is always occurring, whether it's a result of our own choices or not. Life is cruel, random and unfair. We just do our best to remedy that or, at a minimum, mitigate it. Find the positive lessons or things to be grateful for and move on.

2

ddd12547 t1_iz0r8wm wrote

Agreed - only the pattern-seeking nature and the human inability to distinguish between, and avoid lumping together or miscategorizing, this shit and that shit... can lead to a mistaken sense of a single shit source, which, while still unknowable, could potentially be deduced (I'm liking this more and more). A single cruel source negates random and amplifies unfair. The cruel-piece-of-shit fact then becomes inescapable: maybe life isn't a toilet; maybe we, the observer, are shit.

1

jhagen13 t1_iz0ts1n wrote

And that last part is the hardest for people to grapple with. Admitting that it is oneself that's wrong, and not reality, is a harsh but necessary (and freeing) epiphany.

1

ddd12547 t1_iz0v34f wrote

It's also ego death, and tough to recover from. Building and maintaining a sense of self after that sort of epiphany is a different kind of ill that philosophy is still trying to cure. Almost every transformative self-narrative or post-crisis identity salience is at best slightly vulnerable and at worst fragile as hell.

I might have lost the plot along the way in this thread. What was the question?

I might

1

jhagen13 t1_iz0w20r wrote

Having lived through a soul-crushing event that destroyed my very identity and forced me to rebuild everything about myself....I can verify what you're saying. Strength comes with practice and surrounding oneself with good, like-minded people to help you steer the course.

2

bornofthebeach t1_izd9u3b wrote

Have you heard of "motivated reasoning"? Your "causal projection" sounds like a very similar idea.

Nicely written article, btw! I hadn't heard of Causal Loop Diagrams before, and will be adding that to my vocabulary :)

One critique: the crux of your argument seems to be that it's too hard to calculate counterfactual outcomes given complex CLDs. But isn't that what multi-input multi-output control systems do? Or, given the state of the world (or each of the variables in your graph at least) at some given time, couldn't you just let the model run from there?

A core part of Causal Inference is not just estimating the magnitude and polarity of the effects, but fitting a function to each effect. If you can quantify the effect each variable has on others, I don't see why you can't answer questions like "how much did the assassinations contribute to civil rights reform?".

1

owlthatissuperb OP t1_izf0h4i wrote

Yes, I do think you can still run CLDs as a model--but they're much more chaotic. They would typically be modeled using differential equations, which can be really sensitive to a slight change in conditions. E.g. even a tiny miscalculation of the weight of one edge might cause the system to enter a totally different equilibrium.
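
As a rough sketch of that sensitivity (a hypothetical two-node feedback loop; the tanh dynamics and the weights are invented for illustration): nudging a single edge weight from just below 1 to just above it sends the system to a qualitatively different equilibrium.

    import numpy as np
    from scipy.integrate import solve_ivp

    def loop(t, state, w):
        """Two-node feedback loop: each variable decays but excites the other."""
        x, y = state
        return [-x + np.tanh(w * y), -y + np.tanh(w * x)]

    for w in (0.99, 1.01):                     # tiny change in one edge weight
        sol = solve_ivp(loop, (0, 500), [0.05, 0.05], args=(w,))
        print(w, sol.y[:, -1].round(3))        # settles near (0, 0) vs. a
                                               # nonzero equilibrium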

2

bornofthebeach t1_izfc0ag wrote

Thanks for the response :)

Fair enough. If we had figured out a way to model society at the level of abstraction in your example, we'd have psychohistory.

It might be worth a clarification that it's not the complexity of the model, but the chaotic nature of the underlying system, that makes it intractable. If you had just as many variables, but they were pool balls being hit, you'd be able to predict the outcome with high accuracy.

It's exogenous random variables and stochasticity in the effects themselves that create the chaos, not the complexity of the model itself. (in my understanding)

If you've tried modeling this with differential equations I'd be super curious to see! I've never made a CLD model before, only the DAG version.

2