7 Comments
Kevin M:

I just wanna say I really appreciate the effort you put into turning all this complex math into something that mathematically illiterate idiots like me can understand.

Ben Recht:

Thank you, but you are not an idiot, Kevin.

Onid:

Thank you for this write-up! I'm currently reading the paper, and it wasn't immediately clear to me what role anti-correlation search was supposed to be playing. This exposition made it very clear.

Alex Tolley:

So if the two sequences, actual vs. forecast, are:

10101010101010 actual outcome

* 1010101010101 forecast based on prior outcome

Then, doesn't the defensive forecast maximize the error rate?

Obviously, if you see this pattern, it is a bad idea to mechanically follow the Defensive Forecasting method and bet on the forecast outcome. However, if your betting is probabilistic, 50% on outcome 1 and 50% on outcome 0, assuming no betting cost, you at least do not lose anything.

Is that what you mean by Defensive Forecasting? [I have your paper lined up to read this week, so apologies if I am being stupid with this observation.]

Ben Recht:

Defensive Forecasting is a general method; the example I gave yesterday for mean estimation was a specific instance. In today's example, where the goal is to drive two constraints to zero rather than just one, you get a different forecast.

If the sequence of outcomes is 10101010101010, the sequence of forecasts is

0, 1, 0, 1, 0, 0.5, 0.22, 0.62, 0.27, 0.62, 0.29, 0.61, 0.31, 0.60

If you let it keep going, it eventually settles at 0.5.
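A minimal Python sketch of one way to generate that sequence, assuming the two constraints being driven to zero are the running sums S1 = Σ(y_t − p_t) and S2 = Σ p_t(y_t − p_t), with each forecast chosen so the next increment is anti-correlated with the current state (S1 + S2·p = 0, clipped to [0, 1]):

```python
def defensive_forecasts(outcomes):
    """Two-constraint defensive forecaster (sketch).

    Tracks S1 = sum(y_t - p_t) and S2 = sum(p_t * (y_t - p_t)) and picks each
    forecast p so the next increment is anti-correlated with the current state,
    i.e. S1 + S2 * p = 0, clipped to [0, 1].
    """
    S1, S2 = 0.0, 0.0
    forecasts = []
    for y in outcomes:
        if S2 != 0.0:
            p = min(max(-S1 / S2, 0.0), 1.0)
        elif S1 > 0.0:
            p = 1.0
        else:
            p = 0.0  # S1 <= 0 and S2 == 0: forecasting 0 can't hurt
        forecasts.append(p)
        S1 += y - p
        S2 += p * (y - p)
    return forecasts


# Alternating outcomes 1, 0, 1, 0, ...
print([round(p, 2) for p in defensive_forecasts([1, 0] * 7)])
# [0.0, 1.0, 0.0, 1.0, 0.0, 0.5, 0.22, 0.62, 0.27, 0.62, 0.29, 0.61, 0.31, 0.6]
```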

Onid:

Apologies if this was in the paper, but are there any bounds on how fast defensive forecasting might converge in different scenarios?

Onid:

I had this exact same question when I first read his last post. I think the key intuition is that defensive forecasting isn't achieving the minimum possible error; it's achieving an error comparable to some particular baseline.

In this case, defensive forecasting is guaranteed to be roughly as good as any constant prediction. So if you knew this sequence in advance and had to pick one number to repeat as your prediction throughout, that number would obviously be 0.5, which is what defensive forecasting converges to.
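A quick sanity check on that baseline (assuming squared error, which I believe is the loss used here): on a sequence that is half ones and half zeros, a constant forecast c pays average loss 0.5·((1 − c)² + c²), which is minimized at c = 0.5.

```python
import numpy as np

# Average squared error of a constant forecast c on a sequence that is
# half ones and half zeros: 0.5 * ((1 - c)^2 + c^2), minimized at c = 0.5.
c = np.linspace(0.0, 1.0, 101)
avg_loss = 0.5 * ((1.0 - c) ** 2 + c ** 2)
print(c[np.argmin(avg_loss)], avg_loss.min())  # 0.5 0.25
```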
