The Saga of Highleyman's Data.

The first machine learning benchmark dates back to the late 1950s. Few used it and even fewer still remembered it by the time benchmarks became widely used in machine learning in the late 1980s. In... Continue

Machine learning is not nonparametric statistics.

Many times in my career, I’ve been told by respected statisticians that machine learning is nothing more than nonparametric statistics. The longer I work in this field, the more I think this view is both... Continue

Experiments as randomized algorithms

While every statistics course leads with how correlation does not imply causation, the methodological jump from observation to causal inference is small. Using the same algorithmic summarization and statistical analysis tools that we use to... Continue

Statistics as algorithmic summarization

Though a multifaceted and complex discipline, Statistics’ greatest contribution is a rigorous framework for summarization. Statistics gives us reasonable procedures to estimate properties of a general population by examining only a few individuals from the... Continue

All statistical models are wrong. Are any useful?

Though I singled out a mask study in the last post, I’ve had a growing discomfort with statistical modeling and significance more generally. Statistical models explicitly describe the probability of outcomes of experiments in terms... Continue

Effect size is significantly more important than statistical significance.

A massive cluster-randomized controlled trial run in Bangladesh to test the efficacy of mask wearing on reducing coronavirus transmission released its initial results and the covid pundits have been buzzing with excitement. There have already... Continue

Relative risk is more informative than effectiveness.

The past few weeks have sent a tremendous amount of disappointing news about the Delta variant and the possible waning effectiveness of vaccines. An outbreak among partygoers in Provincetown found a high number of infected... Continue

Digital Witnesses

Doyle derived his LQG counterexample in the time before the ubiquity of numerical computing. This meant that numerical examples did not carry the rhetorical weight of algebraic closed form instances. The need for clean, persuasive... Continue

There are none

In the last post, we showed that continuous-time LQR has “natural robustness” insofar as the optimal solution is robust to a variety of model-mismatch conditions. LQR makes the assumption that the state of the system... Continue

Margin Walker

I want to dive into some classic results in robust control and try to relate them to our current data-driven mindset. I’m going to try to do this in a modern way, avoiding any frequency... Continue