Discussion about this post

Lalitha Sankar

Ben, I’ve been enjoying these posts. Thanks to Maxim for sharing his Substack and getting me addicted to quite a few of them.

While you wish to discuss rates and prediction, couldn’t the causal problem that you described in your previous post, and continue here, be simplified to the following setting? The language learning problem is one where the inference depends almost entirely on one feature to which the predictor has access. The depression problem is one where the prediction really depends on a lot of features, but the predictor simply does not have access to some of them (lazy doctors, silent patients, biased drug companies, and possibly even too many features to track, such as what someone eats every day, could all contribute to this lack of data collection). So it’s hard to know whether such features are strongly correlated with the outcome. Absent such data, one may view this as a forced sparse prediction problem. The real issue is that while many of these features could perhaps be controlled in a lab setting, they are impossible to control in the real world unless doctors tell their patients to restrict everything else, or to record it, in order to evaluate the efficacy of the drug. Curious to see where you go with this. :)
