Last week’s blogging and ensuing feedback storm got so many issues swirling in my head that I found myself writing in ten different directions. To address all the issues raised, I needed a logical starting place. I needed to begin at the root of all of academia’s problems: peer review.1
I’ve written here on the blog and elsewhere about my disdain for the peer review system we use to evaluate papers, but a conversation with Kevin Munger earlier this year convinced me I needed far more nuance. Because academia does not exist without peer review.
To explain what I mean, let me distinguish between hierarchical evaluation2 and peer review. In academia, hierarchical evaluation starts with graduate school admissions, which grants access to the first floor of the ivory tower. Graduate students are admitted by the faculty to the departments to which they apply. In the PhD defense, a committee of faculty decides whether a candidate gets blessed as an expert in academic knowledge. They learn the secret handshake if they pass. A step beyond, and far more competitive, there’s “faculty recruiting,” the euphemism for the bizarre, unprincipled practices by which professors decide who gets hired as new professors. There are junior faculty investigator awards, where older professors recognize those who are exemplary in their early professorial careers. Then there’s tenure review, where a department of professors chooses whether or not it wants to keep a professor on its staff for the next half century. There is promotion to “full professor.” No one knows why we do this. Then elections into the National Academy. And so on and so on.
In each of these cases, younger people are evaluated by people of higher rank, ostensibly based on merit. Each stage involves reams of recommendation letters, which always extol the singular genius of their subjects. Older individuals choose who remains in the system, thereby creating a narrowing circle of authority. The review is not “peer” because the hierarchy and power asymmetries are so evident.
By contrast, peer review ostensibly cuts across this hierarchy. The most famous form of peer review is the one I mentioned above, the system used to decide what gets published in academic venues. Editors of journals send a manuscript to three people they believe to have relevant expertise and beg them to read the paper carefully and write a detailed report on whether it should be published. This work is uncompensated. Based on these reviews, the editor makes a publication decision and sends the author the reports as “feedback.” Since there is a potential asymmetry of power, with juniors evaluating seniors, the identity of the reviewers is kept confidential from the authors.3 For some reason, this sort of peer review has to be anonymous, even though in the hierarchical review examples, the person being evaluated knows every department or professor that’s evaluating them.
I’ve previously criticized the system of blind pre-publication manuscript peer review, but I don’t want to dwell on that today.4 Because it’s only one of the many applications of peer review in the elaborate, confusing mechanisms for building academic knowledge. We also use peer review to evaluate grant proposals. Again, in mostly uncompensated work, peers pore over hundreds of pages of grant proposals to decide who gets a share of an ever-dwindling pot of money. More broadly, there’s peer review in every choice we make in our academic careers. Whose papers do we cite? Whose open problems should we work on? Who should we invite to give talks? Who do we flame on Twitter? This is all peer review too.
Academia is built upon a foundation of review. It is messy, decentralized collective decision making that selects a group of people and their writing into a canon of thought. That thought is what we call academic expertise. The academic canon is an artifice entirely constructed out of hierarchical evaluation and peer review. It only exists because we say it exists. Academia is a giant social system of credentialized credentialing.
And so I find myself clarifying something I wrote last week. To the question “Who defines what is ‘good’ and ‘bad’ science?” I responded rhetorically:
“Should it be the scientists themselves? Mark argues that ‘Researchers serve on review panels for grants; they argue against grants being awarded that would not (in their professional opinion) yield good science.’ I disagree with Mark’s assessment of peer review. Our prestige systems fail to determine what is good and bad all the time. Peer review completely fails to improve science and can’t even reliably detect fraud. At this point, it’s clear most scientists don’t work hard at paper reviews, and LLM reviews are rapidly on the rise. Grant panels are frequently plagued by old guys yelling at the clouds that forgot to cite their work. Scientists are prone to herding around hype. That science trudges forward despite all of the apparent ‘irrationality’ is what makes it such a remarkable human endeavor.”
I still agree with this, but want to emphasize its nuance. Academia is by no means a perfect system of deciding what is “good” and “bad.” Although it stumbles and makes many mistakes, it is often quite good at discovering knowledge that the rest of society deems important too.
And the peer review system is all we have. It has its warts, but academic knowledge canonization needs some system by which we are counted, evaluated, promoted, awarded, and canonized. Academia is a social system after all, and social systems love to classify and rank.
These social systems are daunting to change. Individual faculty can be agile, creative forces, but academia in the aggregate is a conservative, turgid mess. Somehow, together, that can work well, because we occasionally stumble upon breakthroughs that change how we see the world. Peer review might not be all we need, but it’s all we have. Our solutions for any future model of academia must start there.
1. Remember when I said I was going to stop posting about academic navel gazing? I lied.
2. You see, today’s post is also about evaluation, and everyone loves that topic.
3. In machine learning, we now let undergrads evaluate full professors. Just like in teaching evaluations. A truly egalitarian system of knowledge curation.
4. Don’t worry. I’ll come back to it.
You and Munger are my favorite writers tackling this problem, describing it in plain language and making visible what is obscured by an elaborate structure of incentives, one that is hard to see even for those climbing it to named professorships. Its complexity makes it difficult to see from the outside. And the incentives prevent those on the inside from understanding what's happening because, to use the Upton Sinclair line, their salaries depend upon not understanding it.
I disagree that "No one knows why we do this." There is a long, and probably too boring, story about guilds that goes all the way back to the founding of Bologna and Oxford. The real question is why these feudal forms of knowledge production and transfer have persisted so long in a modern world dedicated to rationality and objectivity.
My long answer involves what Thorstein Veblen called "trained incapacity," a recognition that the great benefits of academic freedom come with a few drawbacks, and taking seriously what Dan Davies has to say in The Unaccountability Machine. And keep reading you and Munger.
Peer review is incompatible with scale. We need to pick one. My spicy take is that, if we want peer review, then we need to bring back elitism. Otherwise, we're doomed to keep wasting our time peer-reviewing the glut of papers written by every Joe with a GPU out there.