Sitemap - 2025 - arg min

an argmin year in review

Statistical Fatalism

Measures as ends

Prompts for Open Problems

Benchmark Studies

There is no data-generating distribution

There's got to be a better way!

Defining Reinforcement Learning Down

Random Search for Random Search

Reformist Reinforcement Learning

Digitally Twinning

Actions from predictions

How to pick a sample size.

The DOI Directorate

A position on positions

Staging Interventions

Learning from losers

Instrumentalized Actuarial Predictions

The fine art of crate digging

Lore Laundering Machines

Maybe You're Wrong

You're probably right

What is the chance of a Beast Quake?

Stop going for 2 down 8

Benchmarking our benchmarks

Reshelving generalization

Henry Was Right

Sunday Never Knows

How do you know so much about swallows?

Highly optimized optimizers

Universal Cascades

Don't be resulting

Boxes of numbers

Changing the meta

Stuck in the middle

How Snake Oil Becomes Normal Technology

Justify your answer

The Actuary's Final Word

Your noise is my signal

Learning from clairvoyance

The Banal Evil of AI Safety

Patterns, Predictions, and Actions (revisited)

Patterns, Predictions, and Actions (2025)

Just When I Thought I Was Out

Inference From the Best Prediction?

Selecting for complexity

All our games turn into Calvinball

The Negroni Variation

Announcing The Irrational Decision

Digging in the crates

The unpredictability conundrum

Metascience of pull requests

Are developers finally out of a job?

An open mindset

You keep using that word

Standard error of what now?

Two years of substacking

Individual experiences and collective evidence

One out of five AI researchers

Probability Is Only A Game

Restatements or Forecasts?

In Defense of Defensive Forecasting

Strunk and White for Science

Milton Friedman's p-values

The Open Marketplace of Ideas

A Defense of Peer Review

The Good, The Bad, and The Science

May you live in boring times

Physics for Synnoets

Computational Mythmaking

Computer science is what computer scientists do

Correlations and Stories

Concrete Abstract Takeaways

To Measure Is to Know

Machine Learning Evaluation - A Syllabus

Machine Learning Evaluation

Rossi's Metallic Rules

Pretending (Not) to Count

Maybe just believing in AGI makes AGI exist.

Evaluation or Valuation

baby, it's cold inside

All bets are off

Stochastic Coherence

I think it's gonna rain...

Gambling on the Richter Scale

Appraising Tea Leaves

Mathematical Pluralism

Nomological Networks

What does test set error test?

In defense of typing monkeys

The Adaptivity Paradox

Overfitting to theories of overfitting

Machine Learning 101?

Holding out for an explanation

Are radiologists finally out of a job?

Prediction Games

Flavors of overfitting

Thou Shalt Not Overfit

Blurred contexts

Theorem: Chance Equals Fate

Machine Learning Evaluation

Millions Now Living Will Never Die

In the year 2000

Acceptable I-V-vi-IV Songs

Bureaucratic Statistics
