One area I’ve gotten interested in lately is causal inference. For those of you not familiar, it’s a methodology that attempts to find and validate cause-effect relationships between variables. The key is that it attempts to do so using data *without* having to rely on controlled experiments. (For an introduction for the casual reader, I highly recommend Judea Pearl’s book *The Book of Why.*)

One concept I found interesting was the implications of something called a *collider.* A collider is a variable that is the effect of two or more variables. As a simple example, consider the following:

The way to read this diagram is that *fame* is a function (or effect) of money, talent, and looks. In other words, fame = f(money, talent, looks). In this example, *fame* is a collider relative to *money, looks, and talent* because they all have arrows pointing into *fame*.

The interesting implication from the book is the following: if you hold the level of a collider constant, the other variables become dependent on each other even though there is no causal influence between them.

To understand this better, let’s use an even simpler example: X + Y = Z. In this case, *Z* is a function of X and Y (i.e. Z = f(X,Y)), so Z is a collider with respect to X and Y:

Here’s the key point: if we fix the value of Z at some specific value (say 10, so we’re left with the relationship X + Y = 10), then X and Y become correlated. In other words, if I know the value of X (say 8), then I can infer Y (i.e. 2).
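To spell out the arithmetic: once Z is pinned to a value, Y is fully determined by X, so within that slice the two are perfectly negatively correlated (as long as X still varies there). The symbol z for the fixed value is notation introduced here; with z = 10 and X = 8 it gives Y = 2, as above.

```latex
\begin{aligned}
Z = X + Y,\quad Z = z \;&\Longrightarrow\; Y = z - X \\
\operatorname{Cov}(X, Y \mid Z = z) &= \operatorname{Cov}(X,\, z - X \mid Z = z) = -\operatorname{Var}(X \mid Z = z) \\
\operatorname{Var}(Y \mid Z = z) &= \operatorname{Var}(X \mid Z = z) \\
\operatorname{Corr}(X, Y \mid Z = z) &= -1 \quad \text{(provided } \operatorname{Var}(X \mid Z = z) > 0\text{)}
\end{aligned}
```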

The interesting finding from causal inference is that this dynamic generalizes. Said another way, for a given level of Z, information about X automatically gives me some information about Y, even if I can’t observe Y directly.
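One way to see that the dynamic generalizes is a small simulation. The sketch below is a toy setup, not anything from the book: X and Y are drawn independently, and Z is their sum plus some noise (the noise term, the Gaussian distributions, and the band 0.9 to 1.1 used to "hold Z constant" are all assumptions made for illustration). Across the whole sample, X and Y should look uncorrelated; within the narrow band of Z, they should look clearly negatively correlated, though no longer perfectly so because of the noise.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def simulate(n=100_000, seed=0):
    rng = random.Random(seed)
    # X and Y are drawn independently: neither causes the other.
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    # Z is a noisy collider: it depends on X and Y plus an assumed noise term.
    zs = [x + y + rng.gauss(0, 0.5) for x, y in zip(xs, ys)]

    overall = pearson(xs, ys)

    # "Hold Z constant" by keeping only the rows where Z falls in a narrow band.
    kept = [(x, y) for x, y, z in zip(xs, ys, zs) if 0.9 <= z <= 1.1]
    within = pearson([x for x, _ in kept], [y for _, y in kept])
    return overall, within, len(kept)


overall_corr, within_corr, n_kept = simulate()
print(f"correlation of X and Y, all rows:       {overall_corr:+.3f}")
print(f"correlation of X and Y, Z near 1 only:  {within_corr:+.3f} ({n_kept} rows)")
```

The exact numbers depend on the seed and on the assumed noise level; the point is only the contrast between the two lines of output.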

### Almost Famous

Why is this interesting? Let’s go back to our fame example. Assuming our causal model is valid, we can say that *for a given level* of fame, if we know something about a person’s wealth, we can infer something about their looks and talent. If we simplify the model for a minute to include just looks and talent, then we could say that **for a given level of fame, we’d expect a person who is more attractive to be less talented.** (Another way to think about this: if they were both attractive *and* talented, they’d be even *more* famous.)

I haven’t done an analysis to verify this yet, but it’d be interesting to test. For example, look on social media for actors who have a similar number of followers (as a proxy for fame). Within that cohort, if the model is valid, you would expect to see a spectrum ranging from the good-looking-but-hacky to the talented-but-ugly.
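Short of collecting real follower counts, the cohort idea can at least be rehearsed on synthetic data. The sketch below is purely hypothetical: it invents a population in which looks and talent are independent, assumes fame = looks + talent + noise, sorts people into five fame cohorts, and within each cohort compares the average talent of the better-looking half against the other half. Every number, name, and distribution here is an assumption for illustration, not data about actual actors.

```python
import random
from statistics import mean, median


def talent_gap(people):
    """Mean talent of the better-looking half minus mean talent of the rest."""
    cutoff = median(p["looks"] for p in people)
    high = [p["talent"] for p in people if p["looks"] > cutoff]
    low = [p["talent"] for p in people if p["looks"] <= cutoff]
    return mean(high) - mean(low)


rng = random.Random(1)
people = []
for _ in range(50_000):
    looks = rng.gauss(0, 1)
    talent = rng.gauss(0, 1)  # independent of looks by construction
    fame = looks + talent + rng.gauss(0, 0.5)  # assumed toy causal model
    people.append({"looks": looks, "talent": talent, "fame": fame})

# Across everyone, better-looking people should be about as talented as the rest.
population_gap = talent_gap(people)

# Group people into cohorts of similar fame (quintiles of the sorted list).
people.sort(key=lambda p: p["fame"])
n_cohorts = 5
size = len(people) // n_cohorts
cohort_gaps = [
    talent_gap(people[i * size:(i + 1) * size]) for i in range(n_cohorts)
]

print(f"talent gap, whole population: {population_gap:+.3f}")
for i, gap in enumerate(cohort_gaps, start=1):
    print(f"talent gap, fame cohort {i}:    {gap:+.3f}")
```

If the toy model behaves as described above, the gap should be near zero for the whole population and negative inside each fame cohort. A real analysis would also have to deal with measuring looks and talent at all, which this sketch sidesteps.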

### Counterintuitive Hiring

This finding has interesting implications in many places. Take hiring, for example. Consider a hypothesis that *seniority_level = f(skill, likability)*. If you think both skill and likability contribute positively to seniority level, then, for a given level of seniority, consider that the most skilled people are likely to be the ones you personally like the least.

These are of course toy examples; the causal structure of real life is likely to be much more complex. But they illustrate both the power of causal analysis and the sometimes counterintuitive truths behind the way the world works.

Love Judea Pearl’s work in the Book of Why. What makes it even more robust is when you add the concept of “feedback” or nonlinearity into causal structures where looks, talent, connections may lead to fame, but it also fosters more opportunities for skill building (building more talent) and potentially more connections. One could go even further to say that all that fame leads to more profit/money, which allows one to invest in one’s wardrobe and/or physical looks–and these reinforcing causal feedback loops create exponential growth over time. Always a limit though. That’s why I really love Jay Forrester’s work on System Dynamics…it takes causal inference to another level.

I agree: feedback, time delays, and non-linearity are key elements in complex systems, whereas the math in causal inference only works assuming no feedback loops. Given your experience, are there purposes for which – or conditions under which – causal inference is a better approach, and vice versa?

For example, causal inference seems useful when you have data with which to (in)validate your causal graph, and it can then be used to assess the effect of a given intervention without having to actually specify the function f(x) that relates x and y (i.e. y = f(x)). In system dynamics, by contrast, it seems like you need to define all of the functions that relate the variables, which seems more complex. On the other hand, system dynamics models seem very helpful for gaining an intuitive understanding of ‘bigger, longer-term’ systems (e.g. macroeconomic systems, population dynamics, etc.). I could also imagine using them together: using causal discovery algorithms to help identify which components cause which, and then using that to guide the building of a system dynamics model. Would love your (and others’) thoughts on this. Ultimately my interest is in understanding these systems both for prediction (what is likely to happen) and intervention (what is likely to happen if we do X).