An Alternate Formulation for Causal Inference Research

Causal inference is an important and active area of artificial intelligence research today. Indeed, no less a figure than Turing Award winner Yoshua Bengio lists causal reasoning as a top priority, as does his fellow Turing Award winner Yann LeCun, who writes that “Lots of people in ML/DL know that causal inference is an important way to improve generalization. The question is how to do it”. And Judea Pearl’s The Book of Why is a groundbreaking advance in this important discipline.

While valuable, these initiatives overlook a much easier formulation of causal reasoning (you might call it “low-hanging fruit”) that can provide immediate value to organizations with very little effort. For this reason, I hope that causal researchers will seriously consider addressing it.

The standard AI formulation of causal reasoning goes something like this:

“We want to improve the accuracy (e.g. reduced false positives and false negatives; better AUC; better R^2) of automated AI systems. One way to do this is to incorporate models of the causal mechanisms driving the real-world phenomena represented by AI models. And we can learn those causal mechanisms from data.”

But there is a related but entirely different formulation of causal reasoning:

“We want to support decision makers the way that they naturally think, and to use AI in situations where there is some outcome(s) to be achieved, and some actions to be taken, even if we have sparse or nonexistent historical “ground truth” data providing the historical outcomes associated with certain actions. We want to simulate the action-to-outcome causal relationships, which may include complex system dynamics. We are willing to obtain this causal information not just from data, but by interviewing human experts, from research studies, and from other text when it is available.”

In this new formulation, the maxim “All models are wrong, but some models are useful” comes into its own. As it turns out, even very low-fidelity models of the path from actions to outcomes can be very valuable. The reason: in many situations, model accuracy is only a proxy for “information that leads to a good decision”. When looked at from this perspective, everything changes. This formulation is the core model within the emerging discipline of decision intelligence (DI).

For a simple example, consider my decision as to when to press the accelerator of my car to move into an intersection. I can make a safe decision even with only an approximate model of the speed of other cars. And my knowledge of the weather, of how many people are occupying the building on the corner, and of many other factors can be poor to nonexistent, and I’m still safe.

This may seem like an extreme example, but it illustrates a real phenomenon where data scientists, unaware of the decisions for which their models will be used, focus on delivering accurate “insights”. This can lead to unnecessary effort in some arenas (e.g. too much time spent modeling the weather in my example) and not enough in others. For data scientists to guess at end users’ decision mental models is, simply put, not enough.
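The intersection example can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than a real driving model: the function names, the 4-second crossing time, the 2-second safety margin, and the distances and speeds are all hypothetical. The point it tries to show is that the go/wait decision stays the same even when the speed estimate is off by as much as 30%, and that factors such as the weather never enter the decision at all.

```python
def safe_to_go(distance_m: float, speed_mps: float,
               crossing_time_s: float = 4.0, margin_s: float = 2.0) -> bool:
    """Return True if the oncoming car arrives after we have cleared the intersection.

    crossing_time_s and margin_s are made-up illustrative values.
    """
    arrival_s = distance_m / speed_mps
    return arrival_s > crossing_time_s + margin_s


def decisions_under_error(distance_m: float, true_speed_mps: float,
                          error_factors=(0.7, 0.85, 1.0, 1.15, 1.3)) -> list:
    """Make the same decision with speed estimates that are off by up to 30%."""
    return [safe_to_go(distance_m, true_speed_mps * f) for f in error_factors]


# Hypothetical scenarios: one oncoming car far away, one close.
far_car = decisions_under_error(distance_m=200.0, true_speed_mps=15.0)
near_car = decisions_under_error(distance_m=60.0, true_speed_mps=15.0)

print(far_car)   # the decision for each speed estimate, far car
print(near_car)  # the decision for each speed estimate, near car
```

In both scenarios every speed estimate leads to the same decision, so a more accurate speed model would add effort without changing the action taken. Accuracy starts to matter only for cars near the decision boundary, which is where modeling effort is best spent.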

In over 35 years delivering AI and DI solutions, I’ve observed this “data/decision mismatch” situation countless times.

Here are four key implications of this different formulation which, again, I hope AI researchers will begin to address:

  1. Most causal work involves inducing causal models from data. But in a large number of use cases, it is rare to find enough causal information in data, so we often need to obtain it from human experts, either through interviews or by extracting knowledge (perhaps using automated methods like NLP) from written sources like research papers. We need NLP research for that purpose, and we need UX / cognitive research to understand how best to extract causal knowledge from people. In particular, gathering, preparing, and learning from data can take months to years, whereas the same causal information could be elicited from a human expert in just a few minutes.
  2. Most causal work seeks a single representational scheme to represent causation. But in a practical setting, most decisions involve a variety of causal links. Most models I’ve worked with include behavioral factors, econometrics, inference, and more. We need research demonstrating how to propagate causation over such heterogeneous representations (not just one, like Bayesian causation).
  3. Most causal work restricts the semantics of “causation” to that which can be proven to be causative rather than correlative. But human decision makers don’t think this way, which creates a barrier between formal methods and human cognitive models. That barrier severely restricts how much causal work actually gets used in practice, along with our ability to elicit expertise. So we need research into how to a) elicit “causal-ish” knowledge from people and b) convert this “causal-ish” knowledge to a form in which it supports decision making. For example, an econometric link (“if there’s a higher interest rate on this product, then that causes the finance charge to go up”) has to live in the same model as a behavioral one (“if we show people three videos telling them to wear masks, then this causes them to be 10% more likely to wear them”).
  4. Most causal work doesn’t integrate with simulation, digital twins, AI, econometrics, behavioral psychology, and more. So we need research that treats multidisciplinary integration as a first-class area of interest, not a secondary topic to be left to later “during implementation, not research”.
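To illustrate what it might mean for different kinds of causal links to live in one model, here is a minimal sketch that chains an econometric link and a behavioral link from actions to outcomes. It is not an established DI method or library API. The `propagate` helper, the variable names, the monthly simple-interest formula, and the reading of “10% more likely” as a relative increase over a baseline rate are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Link = Tuple[str, Callable[[State], float]]  # (name of the effect, how to compute it)


def finance_charge(s: State) -> float:
    """Econometric link: a higher interest rate raises the finance charge.
    Assumes simple monthly interest on the balance (illustrative only)."""
    return s["balance"] * s["interest_rate"] / 12


def mask_rate(s: State) -> float:
    """Behavioral link elicited from an expert: three videos make people
    10% more likely to wear masks (assumed here to be a relative increase)."""
    uplift = 0.10 if s["videos_shown"] >= 3 else 0.0
    return min(1.0, s["baseline_mask_rate"] * (1 + uplift))


def mask_wearers(s: State) -> float:
    """Downstream link that depends on the output of an earlier link."""
    return s["mask_rate"] * s["population"]


def propagate(actions: State, links: List[Link]) -> State:
    """Evaluate the links in order, so each can use earlier effects as inputs."""
    state = dict(actions)
    for name, fn in links:
        state[name] = fn(state)
    return state


links: List[Link] = [
    ("finance_charge", finance_charge),
    ("mask_rate", mask_rate),
    ("mask_wearers", mask_wearers),
]

# Hypothetical inputs: the decision maker's actions plus context values.
result = propagate(
    {"balance": 1000.0, "interest_rate": 0.12, "videos_shown": 3,
     "baseline_mask_rate": 0.5, "population": 1000.0},
    links,
)
print(result["finance_charge"], result["mask_rate"], result["mask_wearers"])
```

Each link is just a function from the current state to one new value, so a rule elicited from an expert in a few minutes can sit beside a formula taken from econometrics or a model learned from data. A fuller treatment would need to handle uncertainty, feedback loops, and link ordering, which this sketch leaves out.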

Even without academic programs addressing causal questions like those above, the field of decision intelligence has grown to the level that it is predicted to be worth US$37B worldwide in the next decade. If for no other reason than this, it’s time for academic research to take this alternative DI formulation seriously, so that we can work together to solve some of the hardest problems of our time. Human decision-making is one of the world’s most underutilized sustainable resources; improving it is easy to do and worth attention from our best and brightest.
