Yes, and conformal prediction is even worse because you get the probability of a conjunction: the guarantees are about the past samples AND the next sample.
With 95% probability, you draw a training sample whose conformal prediction set then contains the next sample; the probability is over both draws at once (sketched below).
That's way more confusing and open to misinterpretation than confidence intervals.
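For what it's worth, here is a minimal split-conformal simulation of my own (synthetic data, not from the thread) showing what that marginal guarantee does and doesn't say: coverage comes out near 95% only on average over redraws of the calibration set AND the test point together, not conditionally on the calibration set you actually got.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.05

def one_trial(n_cal=200):
    # Toy "model": predict the known mean 0; absolute residuals are the scores.
    scores = np.abs(rng.normal(size=n_cal))
    # Conformal quantile with the finite-sample (n + 1) correction.
    k = int(np.ceil((n_cal + 1) * (1 - alpha)))
    qhat = np.sort(scores)[k - 1]
    # Does the prediction set [-qhat, qhat] cover the next draw?
    return abs(rng.normal()) <= qhat

# Averaging over fresh (calibration set, next point) pairs gives ~0.95;
# any single calibration set's conditional coverage can be off from that.
coverage = np.mean([one_trial() for _ in range(20_000)])
print(f"coverage over (calibration, next point) pairs: {coverage:.3f}")
```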
I've got to say that I've rarely run into any kind of prediction-interval-like construct that was what I wanted, as the person interpreting it.
A tolerance interval would usually at least be an improvement imo, with a vaguely PAC-style interpretation: there's a 95% chance this interval really does contain 95% of the population. You still don't know whether you got one of the 5% bad intervals, of course, so it still has the verification problem. But at least you don't additionally have the strange conjunction.
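To make that PAC-style reading concrete, here is a sketch of mine (not from the comment) using the classic distribution-free construction: take the interval to be [min, max] of an i.i.d. sample from any continuous distribution; the population fraction it covers is then Beta(n-1, 2) distributed, so you can solve for the sample size giving "95% chance of covering 95%".

```python
from scipy.stats import beta

p, conf = 0.95, 0.95  # target: P(interval covers >= 95% of population) >= 95%

def confidence(n):
    # Coverage of [X_(1), X_(n)] is Beta(n - 1, 2) for any continuous F,
    # so this is P(coverage >= p) when the interval comes from n samples.
    return beta.sf(p, n - 1, 2)

# Smallest n achieving the (0.95, 0.95) guarantee:
n = 2
while confidence(n) < conf:
    n += 1
print(n, confidence(n))  # n = 93, just over 0.95 -- the classic answer
```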
fwiw, you can get those kinds of guarantees using conformal prediction, but the sample complexity is prohibitively high: https://www.argmin.net/p/cover-songs
Indeed! The only way I can wrap my head around its popularity is that the people using it also don't understand confidence intervals, or it's just some easy-to-run code that lets you claim some form of "AI Safety" via uncertainty quantification.
Given what you say, I can't see why you pooh-pooh severity and error-statistical reasoning. Knowing that a procedure performs well in general is scarcely irrelevant post-data. Severity gives an explicit post-data interpretation. Take CIs. The data warrant inferring that a parameter exceeds the lower CI bound because, were the parameter value less than the lower bound, then with high probability we would have observed a smaller test statistic than we did. It's analogous for the upper bound. This is what all statistical falsification is about, and really all warranted error-prone reasoning. Knowing the capabilities of our methods enables us to learn what is and is not well warranted in the case at hand.
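A small simulation of my own makes that lower-bound reasoning concrete (normal mean, known sigma, made-up numbers): place the true mean right at the lower bound, and a sample mean as large as the observed one would be rare, which is the post-data warrant for inferring the parameter exceeds it.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n = 1.0, 25
se = sigma / np.sqrt(n)
xbar_obs = 0.40                  # hypothetical observed sample mean
lower = xbar_obs - 1.645 * se    # one-sided 95% lower confidence bound

# Were mu at (or below) the lower bound, we would almost always have
# observed a smaller sample mean than we actually did -- hence the
# post-data warrant for inferring mu > lower.
sims = rng.normal(loc=lower, scale=se, size=200_000)
print(f"P(Xbar < observed | mu = lower bound) ~ {np.mean(sims < xbar_obs):.3f}")  # ~0.95
```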
"venerate the scourge of Ronald Fisher" 🤣💀
This meager guarantee is also all you get from conformal prediction.