From proportion to causation

Jul 6, 2023

I get confused by regression. You should be confused too.

15 Comments

Jul 6, 2023

Isn't this just the first half of a recommendation algorithm? Use beta to offer the record store owner deals from the wholesaler, and boom, now there *is* a causal link between the color of the cover and the records in stock (assuming that the owner isn't too hipster to use coupons)

Ben Recht

Jul 7, 2023

Do we believe the betas are real in recommender systems though? But yeah, definitely too hipster for coupons.

Sarah Dean

Jul 7, 2023

Once we start making recommendations according to them, they certainly *become* real. Though not in the original sense of modelling the world!

Ben Recht

Jul 7, 2023

Yes. And since observational studies and data are used like cudgels to inform policymaking, I worry the models also become real in a lot of the other cases where logistic regression is abused.

apoorva lal

Jul 6, 2023

> If I add a little storytelling about exogeneity, I can declare that green covers cause records to appear in the last dying record store

A story about cover green-ness being as good as randomly assigned conditional on other covariates would have to be different from sleeve size being as good as randomly assigned conditional on other covariates, no? At best, one of these stories might ring true; usually, none will.

Not saying this is what most observational research does, unfortunately, but that's a sociological problem and not a statistical one. The statistical argument for how, under conditional ignorability and overlap, you can estimate the conditional probabilities of records being kept with and without some attribute, is valid, and you can estimate causal effects from these quantities.
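
For concreteness, the identification argument referred to here is usually written roughly as follows. The notation is an assumption for illustration and does not appear in the thread: $A$ indicates the attribute (say, a green cover), $X$ collects the other covariates, $Y(a)$ is whether the record would be kept if the attribute were set to $a$, and consistency ($Y = Y(A)$) is assumed alongside the two named conditions.

```latex
\begin{align*}
&\text{Conditional ignorability: } (Y(0), Y(1)) \perp A \mid X,
\qquad \text{Overlap: } 0 < P(A = 1 \mid X) < 1. \\
&\text{Then, for } a \in \{0, 1\}:\quad P(Y(a) = 1 \mid X) = P(Y = 1 \mid A = a, X), \\
&\text{so}\quad \mathbb{E}[Y(1) - Y(0)]
  = \mathbb{E}_X\big[\, P(Y = 1 \mid A = 1, X) - P(Y = 1 \mid A = 0, X) \,\big].
\end{align*}
```

The right-hand side involves only observable conditional probabilities, which is the sense in which the argument is valid; whether the two conditions hold in a given dataset is a separate question.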

Ben Recht

Jul 6, 2023

Let me grant you that "the statistical argument for how, under conditional ignorability and overlap, you can estimate the conditional probabilities of records being kept with and without some attribute, is valid." Even if this is valid, you need to show me the assumptions hold before we estimate causal effects. But these assumptions *never* hold.

apoorva lal

Jul 6, 2023

I mean, I sort of agree; this is loosely the design-based causal inference view.

This is why selection on observables is the last refuge of the scoundrel (and we should be doing partial identification in these settings anyway). We have proximate checks for overlap (plot the pscore), but none for unconfoundedness.
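
As a rough illustration of what "plot the pscore" can look like, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the single covariate `x`, the assignment mechanism, and the helper names `fit_logistic` and `overlap_summary` are hypothetical, and a text histogram stands in for an actual plot. It fits a logistic model for the attribute by Newton's method and compares the estimated propensity scores across the two groups.

```python
import numpy as np

def fit_logistic(X, a, steps=50):
    """Fit P(a = 1 | X) by Newton's method; X must include an intercept column."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (p - a)  # gradient of the negative log-likelihood
        # Hessian X^T diag(p(1-p)) X, with a tiny ridge for numerical stability
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        w -= np.linalg.solve(hess, grad)
    return w

def overlap_summary(pscore, a, bins=10):
    """Count estimated propensity scores per bin, separately for each group."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    treated, _ = np.histogram(pscore[a == 1], bins=edges)
    control, _ = np.histogram(pscore[a == 0], bins=edges)
    return edges, treated, control

# Simulated data (hypothetical): one covariate drives who gets the attribute.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
true_p = 1.0 / (1.0 + np.exp(-1.5 * x))
a = (rng.uniform(size=n) < true_p).astype(float)

X = np.column_stack([np.ones(n), x])
w = fit_logistic(X, a)
pscore = 1.0 / (1.0 + np.exp(-X @ w))

edges, treated, control = overlap_summary(pscore, a)
for lo, hi, t, c in zip(edges[:-1], edges[1:], treated, control):
    print(f"{lo:.1f}-{hi:.1f}  with attribute={int(t):4d}  without={int(c):4d}")
```

Bins where one group has almost no units are where overlap is doubtful. Note that nothing in this output speaks to unconfoundedness, which is the point being made above.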

Also, the David Freedman spirit lives on.

Ben Recht

Jul 6, 2023

This substack is basically a tribute to David Freedman. I have a post drafted about him that I'll share soon.

Maxim Raginsky

Jul 6, 2023

Welcome to the dappled world of stochastic realization theory! This sort of thing makes you appreciate more the work of Akaike, Ruckebusch, Lindquist, Picci, and Willems on generative models of Gauss-Markov processes, where everything is nice and linear, and how messed up things get once you step outside that zone.

Maxim Raginsky

Jul 6, 2023

For example, here you can see Picci being really careful about not conflating surface correlations with causation (everyone who purports to do "causal modeling" should read this): https://link.springer.com/chapter/10.1007/978-3-662-08546-2_12

Ben Recht

Jul 6, 2023

Why are the control theorists by and large the most suspicious of stochastics?

Maxim Raginsky

Jul 6, 2023

Because most uses of stochastics outside of pragmatic, operational contexts (randomized algorithms, RCTs) or relatively well-established physical models (device noise) should be treated with suspicion.

Ben Recht

Jul 6, 2023

I mean, yes, we are in total agreement. But why were the control folks in this camp? I imagine there's a good explanation that traces back to Honeywell or some other defense contractor...

Maxim Raginsky

Jul 6, 2023

I don't have a good answer to that! But it must have something to do with the fact that there is a difference between using stochastic building blocks in "making things happen" and imputing them based on observations. You should check out Kalman's curmudgeonly comment on Peter McCullagh's "What Is a Statistical Model?" here: https://projecteuclid.org/journals/annals-of-statistics/volume-30/issue-5/What-is-a-statistical-model/10.1214/aos/1035844977.full

Maxim Raginsky

Jul 6, 2023

I wrote a blog post about this ages ago (incidentally, it was inspired by Larry Wasserman's musings on misinterpretation of p-values as conditional probabilities!): https://infostructuralist.wordpress.com/2013/03/17/stochastic-kernels-vs-conditional-probability-distributions/

arg min