13 Comments

It might be worth going back to where the whole idea of significance testing started, with plant breeding experiments. While things can go wrong, the method is well suited to the problem. Try out two (putatively) different varieties under identical conditions, and see if one does better. If so, is the improvement too big to be due to chance variation? If it is, you can reject the null hypothesis and recommend adoption of the better variety, at least under the conditions of the test.
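
For what it's worth, here is a minimal sketch of that logic in Python; the yields, sample sizes, and the conventional 0.05 threshold are invented for illustration, not taken from any real trial:

```python
# Minimal sketch of a two-variety trial, assuming simulated normal yields
# (means, spreads, and sample sizes below are invented for illustration).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

variety_a = rng.normal(loc=50.0, scale=5.0, size=30)  # standard variety
variety_b = rng.normal(loc=53.0, scale=5.0, size=30)  # putative improvement

# Two-sample t-test: is the observed difference too big to be chance variation?
t_stat, p_value = stats.ttest_ind(variety_b, variety_a)
print(f"mean difference: {variety_b.mean() - variety_a.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# If p falls below a preset threshold (conventionally 0.05), reject the null
# hypothesis of no difference and recommend the better variety, at least
# under the conditions of the test.
```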

Social science problems are much harder, and significance testing doesn't work as described. But it's a social convention and we haven't found a better alternative. We should either admit this and stop pretending that "significance" means what is claimed, or forget about it completely and become subjective Bayesians.

The role of such "A/B testing" in agronomy is also largely overblown.

I'll say it this way: the randomized trial is a terrible knowledge generator but still a useful regulatory device.

Well done. My cynical take on "causal inference" is that it is no more causal than linear regression estimators, which are themselves causal only *if we believe the structural linear model*, yet it gets wrapped up in exaggerated language. No causal inference tool will provide causal estimates if the proposed DAG is completely wrong!
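
A minimal sketch of that point, assuming a made-up three-variable DAG (Z confounding X and Y); all coefficients and variable names are hypothetical:

```python
# Sketch: "causal inference" as regression under an assumed DAG. The DAG,
# coefficients, and variable names are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Assumed DAG: Z -> X, Z -> Y, X -> Y, with true effect of X on Y set to 2.0.
z = rng.normal(size=n)
x = 1.5 * z + rng.normal(size=n)
y = 2.0 * x + 3.0 * z + rng.normal(size=n)

# Naive regression of Y on X alone: confounded by Z, biased upward (~3.4).
naive = np.linalg.lstsq(np.column_stack([x, np.ones(n)]), y, rcond=None)[0][0]

# Adjusting for Z recovers ~2.0, but only because we believe the DAG above.
adjusted = np.linalg.lstsq(np.column_stack([x, z, np.ones(n)]), y, rcond=None)[0][0]

print(f"naive:    {naive:.2f}")
print(f"adjusted: {adjusted:.2f}")

# If the proposed DAG is wrong (say Z is really a collider, not a confounder),
# the same adjustment returns a biased number with exactly the same confidence.
```

The machinery is just least squares; the "causal" label comes entirely from the structural assumptions fed into it.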

The field would be better off if it rebranded under something like "structural inference" or "explicit structural models".

That's not cynical at all! That's an accurate description of what causal inference is.

But yeah, the culture behind this is very weird. The people who cling tightest to causal inference are the ones most invested in informing government policy. And this is where things get cynical...

Ok, so what exactly do you suggest we do in practice? There are endless critiques, but precious few positive solutions. I say this as someone who works in policy, rather than being an academic, etc.

The policy question is a great one, and I'm still trying to flesh out my thoughts. I'll write more about it when discussing Lecture 8.

For whatever it's worth, I'm not sure you'll like my answer. Policy questions are inherently about politics and power. How scientific and statistical language should inform such power struggles is a moral question. And the last decade has deeply soured me on technocracy.

Ok, that all sounds very high level though. And it does sound like "I'm not going to like your answer" just because I'm already dubious of its relevance. Tax cuts will be weighed up, as will various programs to deal with various social ills, etc. I'm not very well going to put in a brief "this is a moral question". You don't say. How does that help choose between alternatives? "In this brief I will provide a virtue ethical analysis of this tax on cheese..." Yeah, no.

Sorry, that sounded too harsh. I have just been involved in several discussions like this, with people, like yourself, who are far, far smarter than me and far better informed. But they always seem to end with me feeling that the truly erudite think it is impossible to really know anything about the social world. I even asked Andrew Gelman a similar sort of question on his blog, and I came away not really clear that I should trust anything more than a scatter plot. Fine, I suppose, but again, practical policy decisions have to be made. And it already seems we're drifting in a populist, "post-expertise" direction. Unclear if that is good or not.

How much of the causal theory we live and die by actually comes from statistics, versus from the fact that the relationships we observe, confirmed via whatever method, just happen to be very useful and work very well in most scenarios?

What causal theory do you have in mind?

One thing I had in mind was the causal framework connecting power with force and velocity. Power is a scalar, while force and velocity are vectors. You take the dot product of the two vectors to produce a scalar, and that equals power. I remember as a college student this would always baffle me, because it wasn't evident to me why |F||v|cos(theta) = power (where theta is the angle between the force and velocity vectors). And more specifically, how did we find that there is some relationship where A dot B = |A||B|cos(theta)? At the time, I just accepted it as a fundamental law that we need in order to derive more complex dynamics later on. Maybe I am overthinking it, but there is something to be said about how the output is a scalar. Generally, having vector values means you can kind of see what is happening and validate it through your own observation. But when you start describing things with scalar values, the numbers don't have any direct physical picture and are actually quite abstract. It's just a theory that people (James Watt and Gottfried Leibniz, I am presuming?) made up, and I am not even sure they used statistics when doing it. It was probably more philosophy/engineering than statistics.
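
A tiny numerical illustration, with made-up force and velocity vectors:

```python
# Consistency check of the two forms of mechanical power; the force and
# velocity vectors below are made up.
import numpy as np

F = np.array([3.0, 4.0, 0.0])  # force (N)
v = np.array([2.0, 1.0, 2.0])  # velocity (m/s)

# Power as the component-wise dot product: sum of F_i * v_i.
p_dot = float(np.dot(F, v))

# The same quantity via the geometric form |F||v|cos(theta). (theta is
# recovered from the dot product here, so this checks that the two forms
# agree numerically rather than deriving one from the other.)
theta = np.arccos(np.dot(F, v) / (np.linalg.norm(F) * np.linalg.norm(v)))
p_geom = float(np.linalg.norm(F) * np.linalg.norm(v) * np.cos(theta))

print(p_dot, p_geom)  # both 10.0 W: one scalar, two readings of the formula
```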

There are so many cool ideas here! You mention that "theories need to make varied, precise predictions" for us to be even somewhat convinced of their results. But what if we don't have a valid way of prediction, even though it is clear some effect is occurring? How else can we quantify this change besides some sort of significance test?

For example, from my understanding, the economics literature broadly agrees that the Federal Reserve changing interest rates has some causal effect on inflation, but quantifying the exact effect is still very difficult. However, even though we have difficulty predicting inflation from changing interest rates, shouldn't we still be able to conclude that there is some causal effect there?

Hi Ben. I think statisticians get it wrong when they try to infer causation using fancy statistics while completely disregarding what the real world is telling them. They have it backwards. They think they can model reality, but of course you can input anything into a model except reality.

In the case of clinical research it has become a problem. They model clinical variables as if they were neutral, expecting their models to discount everything. But that's not how the real world of clinical medicine works.

They sometimes try to attribute causation to interventions that cannot be causative, either because it's pure biological nonsense or because they don't account for the marginal effects of each condition or intervention. I have written about this, and I invite comments from anyone interested in how to translate biological plausibility into clinical relevance.

https://open.substack.com/pub/thethoughtfulintensivist?r=20qrtz&utm_medium=ios
