5 Comments
Jun 27 · Liked by Ben Recht

On the point of how large a correlation you can get when prevalence is low, you might be interested in concepts like the "switch relative risk": https://arxiv.org/abs/2106.06316v1
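To make the prevalence point concrete, here is a minimal sketch. It is not taken from the linked paper. It assumes the setting is two binary variables with prevalences `p = P(X=1)` and `q = P(Y=1)`. The function name `max_phi` is made up for illustration. It computes the largest Pearson (phi) correlation those marginals allow, which is reached when the rarer event always implies the more common one.

```python
import math


def max_phi(p: float, q: float) -> float:
    """Upper bound on the Pearson (phi) correlation between two binary
    variables with prevalences p = P(X=1) and q = P(Y=1)."""
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("prevalences must lie strictly between 0 and 1")
    lo, hi = min(p, q), max(p, q)
    # The covariance P(X=1, Y=1) - p*q is largest when P(X=1, Y=1) = min(p, q),
    # i.e. when the rarer event never occurs without the more common one.
    return math.sqrt(lo * (1 - hi) / (hi * (1 - lo)))


# Hold one prevalence at 0.5 and shrink the other.
for p in (0.5, 0.1, 0.01):
    print(f"p={p}, q=0.5: max correlation = {max_phi(p, 0.5):.3f}")
```

The bound equals 1 only when the two prevalences match, and it shrinks quickly as one outcome becomes rare relative to the other.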

author

Will take a look. Thank you!

Jun 12 · edited Jun 12 · Liked by Ben Recht

If you allow arbitrary nonlinear transformations of the covariates, then it seems like the crud factor is nicely captured by the maximal correlation, see e.g. https://www.jstor.org/stable/2242042. This, of course, does not resolve any of the epistemological or methodological issues.
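As a rough illustration of maximal correlation (the largest correlation between f(X) and g(Y) over all transformations f and g), here is a minimal sketch for a discrete joint distribution. It alternates conditional expectations, in the spirit of the ACE procedure, which amounts to a power iteration. The function name, the toy distribution, and the starting guess are assumptions made for illustration, and all marginal probabilities are assumed to be strictly positive.

```python
import math


def maximal_correlation(joint, iters=200):
    """Estimate the maximal correlation of a discrete joint distribution,
    where joint[i][j] = P(X = x_i, Y = y_j) and all marginals are positive."""
    nx, ny = len(joint), len(joint[0])
    px = [sum(joint[i]) for i in range(nx)]
    py = [sum(joint[i][j] for i in range(nx)) for j in range(ny)]

    def standardize(h, w):
        # Center and scale h to mean 0, variance 1 under weights w.
        mean = sum(hk * wk for hk, wk in zip(h, w))
        h = [hk - mean for hk in h]
        sd = math.sqrt(sum(hk * hk * wk for hk, wk in zip(h, w)))
        return [hk / sd for hk in h] if sd > 1e-12 else None

    # Arbitrary non-constant starting guess for g (an assumption of this sketch).
    g = standardize([float(j) for j in range(ny)], py)
    if g is None:
        return 0.0
    rho = 0.0
    for _ in range(iters):
        # f(x) = E[g(Y) | X = x], then standardize.
        f = standardize(
            [sum(joint[i][j] * g[j] for j in range(ny)) / px[i] for i in range(nx)], px
        )
        if f is None:  # iteration collapsed; can also happen with an unlucky start
            return 0.0
        # g(y) = E[f(X) | Y = y], then standardize.
        g = standardize(
            [sum(joint[i][j] * f[i] for i in range(nx)) / py[j] for j in range(ny)], py
        )
        if g is None:
            return 0.0
        rho = sum(joint[i][j] * f[i] * g[j] for i in range(nx) for j in range(ny))
    return rho


# Toy example: X uniform on {-1, 0, 1} and Y = X**2 (columns are Y = 0, Y = 1).
# The Pearson correlation of X and Y is 0, yet Y is a function of X.
joint = [[0.0, 1 / 3], [1 / 3, 0.0], [0.0, 1 / 3]]
print(maximal_correlation(joint))
```

In the toy example the ordinary correlation vanishes because the dependence is symmetric, while the transformation f(x) = x**2 recovers Y exactly. For continuous data one would need a smoother or binning step in place of the exact conditional expectations used here.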

author

I don't think this is the crud factor because it's not random. But it's definitely related and interesting. I wonder if there's an analog that somehow lets you compute the average maximal correlation across pairs of variables. I will read the original Friedman and Breiman paper and think about it.


You could randomize over the choice of transformations, which could include random picks from the buckets of theories and covariates.
