Isn't this just the first half of a recommendation algorithm? Use beta to offer the record store owner deals from the wholesaler, and boom, now there *is* a causal link between the color of the cover and the records in stock (assuming that the owner isn't too hipster to use coupons)

Do we believe the betas are real in recommender systems though? But yeah, definitely too hipster for coupons.

Once we start making recommendations according to them, they certainly *become* real. Though not in the original sense of modelling the world!

Yes. And since observational studies and data are used like cudgels to inform policymaking, I worry the models also become real in a lot of the other cases where logistic regression is abused.

> If I add a little storytelling about exogeneity, I can declare that green covers cause records to appear in the last dying record store

A story about cover green-ness being as good as randomly assigned conditional on other covariates would have to be different from sleeve size being as good as randomly assigned conditional on other covariates, no? At best, one of these stories might ring true; usually, none will.

Not saying this is what most observational research does, unfortunately, but that's a sociological problem and not a statistical one. The statistical argument for how, under conditional ignorability and overlap, you can estimate the conditional probabilities of records being kept with and without some attribute, is valid, and you can estimate causal effects from these quantities.
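For reference, the identification argument being described can be written out roughly as follows. The notation is an assumption of this sketch, not something from the thread: $Y$ indicates that a record is kept, $A$ indicates the attribute (say, a green cover), $X$ collects the other covariates, and $Y(a)$ is the potential outcome under $A = a$, with the usual consistency assumption $Y = Y(A)$.

```latex
\begin{align*}
&\text{Conditional ignorability:} && (Y(0), Y(1)) \perp A \mid X \\
&\text{Overlap:} && 0 < P(A = 1 \mid X = x) < 1 \quad \text{for all } x \\
&\text{Then:} && P(Y(a) = 1 \mid X = x) = P(Y = 1 \mid A = a, X = x), \quad a \in \{0, 1\} \\
&\text{Average effect:} && \tau = E_X\big[ P(Y = 1 \mid A = 1, X) - P(Y = 1 \mid A = 0, X) \big]
\end{align*}
```

The third line is where the assumptions do the work: the left side is a counterfactual quantity, the right side is something a logistic regression can estimate. Nothing in the data alone tells you the first line holds.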

Let me grant you that "the statistical argument for how, under conditional ignorability and overlap, you can estimate the conditional probabilities of records being kept with and without some attribute, is valid." Even if this is valid, you need to show me the assumptions hold before we estimate causal effects. But these assumptions *never* hold.

I mean, I sort of agree; this is loosely the design-based causal inference view.

This is why selection on observables is the last refuge of the scoundrel (and we should be doing partial identification in these settings anyway). We have proximate checks for overlap (plot the pscore), but none for unconfoundedness.
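To make "plot the pscore" concrete, here is a minimal sketch of an overlap check, assuming synthetic data and a plain logistic model for the propensity score. Everything in it is hypothetical illustration: the helper names (`fit_logistic`, `propensity_scores`, `overlap_summary`), the simulated covariates, and the 0.05 trimming threshold are not from the thread. It prints a text histogram rather than drawing a plot so that it only needs NumPy.

```python
import numpy as np

def fit_logistic(X, a, n_iter=50, ridge=1e-6):
    """Fit P(a = 1 | X) by Newton's method; returns weights, intercept first."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (a - p) - ridge * w
        hess = (Xb * (p * (1 - p))[:, None]).T @ Xb + ridge * np.eye(len(w))
        w = w + np.linalg.solve(hess, grad)
    return w

def propensity_scores(X, w):
    """Estimated P(a = 1 | X) for each row of X."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def overlap_summary(pscore, a, bins=10, eps=0.05):
    """Histogram the scores by group and report the share near 0 or 1."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    treated, _ = np.histogram(pscore[a == 1], bins=edges)
    control, _ = np.histogram(pscore[a == 0], bins=edges)
    frac_extreme = float(np.mean((pscore < eps) | (pscore > 1 - eps)))
    return {"edges": edges, "treated": treated, "control": control,
            "frac_extreme": frac_extreme}

# Synthetic example (assumed data-generating process, for illustration only).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
true_p = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.25 * X[:, 1])))
a = (rng.random(n) < true_p).astype(int)

w = fit_logistic(X, a)
ps = propensity_scores(X, w)
summary = overlap_summary(ps, a)

for lo, hi, t, c in zip(summary["edges"][:-1], summary["edges"][1:],
                        summary["treated"], summary["control"]):
    print(f"[{lo:.1f}, {hi:.1f}): with attribute={t:4d}  without={c:4d}")
print("share of units with extreme scores:", summary["frac_extreme"])
```

Bins where one group has many units and the other has almost none are where overlap is in doubt. Note that this only probes overlap; no analogous diagnostic falls out of the data for unconfoundedness, which is the asymmetry being pointed out above.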

Also, the David Freedman spirit lives on.

This substack is basically a tribute to David Freedman. I have a post drafted about him that I'll share soon.

Welcome to the dappled world of stochastic realization theory! This sort of thing makes you appreciate more the work of Akaike, Ruckebusch, Lindquist, Picci, and Willems on generative models of Gauss-Markov processes, where everything is nice and linear, and how messed up things get once you step outside that zone.

For example, here you can see Picci being really careful about not conflating surface correlations with causation (everyone who purports to do "causal modeling" should read this): https://link.springer.com/chapter/10.1007/978-3-662-08546-2_12

Why are the control theorists by and large the most suspicious of stochastics?

Because most uses of stochastics outside of pragmatic, operational contexts (randomized algorithms, RCTs) or relatively well-established physical models (device noise) should be treated with suspicion.

I mean, yes, we are in total agreement. But why were the control folks in this camp? I imagine there's a good explanation that traces back to Honeywell or some other defense contractor...

I don't have a good answer to that! But it must have something to do with the fact that there is a difference between using stochastic building blocks in "making things happen" and imputing them based on observations. You should check out Kalman's curmudgeonly comment on Peter McCullagh's "What Is a Statistical Model?" here: https://projecteuclid.org/journals/annals-of-statistics/volume-30/issue-5/What-is-a-statistical-model/10.1214/aos/1035844977.full

I wrote a blog post about this ages ago (incidentally, it was inspired by Larry Wasserman's musings on the misinterpretation of p-values as conditional probabilities!): https://infostructuralist.wordpress.com/2013/03/17/stochastic-kernels-vs-conditional-probability-distributions/