60 Comments
Dec 20, 2023 · edited Dec 20, 2023 · Liked by Ben Recht

As an engineer, I'm constantly disappointed by the amount of work wasted due to misleading claims. We have a feedback loop -- 1) too many papers to review carefully, 2) forcing reviewers to focus on a few hackable signals 3) leading authors to optimize for those signals instead of usefulness.

I.e., you can get a paper published by claiming a new optimizer that beats Adam uniformly. Reviewers don't have time to try it themselves, so they let it through at face value. If they had tested it, they probably would have objected to the claims; an independent reproduction effort found no successes (https://arxiv.org/pdf/2007.01547).

A recent personal example: there's a specific work that several groups spent significant effort trying to reproduce. I ran into an author at a party, and he told me I should stay away because it doesn't really work (!). I couldn't tell that from reading the paper; the evals looked impressive.

Dec 19, 2023 · edited Dec 19, 2023 · Liked by Ben Recht

As Lana put it in a recent talk she gave (with the absolutely glorious title “Non Serviam”), the problem is that Kids These Days have been conditioned to chase and optimize dense rewards instead of sparse rewards like in the olden days:

https://drive.google.com/file/d/1yjBpvvyxwHJvd99NdLk-d7io7dHtp1ZU/view?usp=drivesdk

Also, in the context of overproduction of CS papers, a couple of recent studies by Vladlen Koltun and collaborators:

1) https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0253397

2) https://arxiv.org/abs/2105.08089

Dec 19, 2023 · Liked by Ben Recht

This is a great post, even if it has nothing to do with Prospect Theory. Or it does? :)

The issues you identify are, in my opinion, starting to become critical. They are acute in CS, and even more so in those major conferences with the particle "AI" in their acronym. Their role as the "big tent" venues, where you could choose between sessions covering satisfiability or reinforcement learning, has eroded: over the past ten years they have increasingly come to be considered "second-tier" venues, with "easier" thresholds of acceptance.

This easier threshold of acceptance follows from a simple fact: when 1) every vaguely relevant paper is owed "due diligence" and 2) the PC can and will tweak the paper-assignment algorithm to ensure 1), the logical result is a high chance that 50% or more of the reviewers assigned to a paper have little to no connection to its topic. Even with the Soviet-style quota of a "<25% acceptance rate," this means that if a paper is well written (thanks to Grammarly and the like, this is more and more common), if any technical errors require more than an hour of work and thought to detect, and if it does not disturb well-seated assumptions in its field (so it is unlikely to make anybody knowledgeable intellectually uncomfortable), then you are pretty much guaranteed not to receive any strong rejection recommendations. If, on top of that, the meta-reviewer (if there is one) is not keen on putting in the work to prod the reviewers into exercising some critical thinking, then your chances of acceptance are even better.

Over 2023 I put something north of 200 hours of work into serving as a meta-reviewer at the three major conferences with the "AI" particle in their titles. This was the year in which the following became the norm rather than the exception:

- Having to write to reviewers asking them to provide meaningful and constructive reviews to the best of their expertise. In one memorable incident I had to instruct a reviewer to run a spell checker on their review and remind them that the proper English spelling of the definite article is "the" rather than "teh".

- Writing a meta-review that was, in some cases, longer, more detailed, and better informed than any of the reviews provided.

- Reviewers ignoring requests to substantiate or elaborate on their reviews when challenged to do so.

- Raising concerns with Area Chairs that authors are gaming the keyword system, either avoiding certain keywords to escape attracting the "wrong" attention or using keywords with little or no relevance to the content of the submitted paper.

I do not think I can do this again in 2024; I am going to pick my battles carefully. The notion that too many papers are being written may sound ludicrous, but it is totally real. You have focused on the "supply side" of the problem, but I think pretty harsh and brutal corrective action is necessary on the "demand side" too.

A very simple mechanism by which papers that attract zero non-default affirmative bids are automatically rejected would be a good first step. It might lead some to optimize their titles, abstracts, and keywords, but I would expect such behavior to be met with the equivalent of the fiery user reviews you find for video games that mislead or oversell the goods, and/or with authors being banned from submitting for some time.

Dec 20, 2023 · Liked by Ben Recht

Turing award winner Mike Stonebraker has been the old man shouting at the clouds about this for at least 10 years and I love it. https://www.youtube.com/live/DJFKl_5JTnA?si=201gfCzlHKrSu-o7

He calls this the threat of the “Least Publishable Unit” (LPU). My take is that it is on the leaders of the field to set a higher standard for publication and enforce the kinds of limits you’re talking about. Or stop giving people jobs and funding for papers.


This is one of the nicest posts of yours, and I enjoyed reading every sentence. I am a control theorist (Young, but I'm not that bold; Old, but I'm not that old) with an interest in estimation and filtering, and I was always puzzled by the sheer number of papers and pages in the ML community. It is also getting hard to encourage new students in controls to keep their spirits up and not compare themselves with the productivity of researchers in the ML community.


Interesting thoughts. This feels like a broader societal issue. We're in an age of information abundance where finding relevant, accurate information is becoming increasingly difficult.

I agree with you; in my industry (civil/structural engineering), finding a readable, well-organized research paper is a rare luxury. Substack is good in this regard: you can home in on individuals.

Dec 22, 2023 · Liked by Ben Recht

Ben, I'm not in CS but shared the sentiment, and I got referred to this by a colleague at Berkeley.

I never reply to blogs but am compelled to do so:

Loved your piece!

You should publish it in a journal!

Best wishes.


Just here to say that Google Scholar is effectively unmaintained, and it causes so much Goodhart's-lawing. Really sad.

Dec 20, 2023 · Liked by Ben Recht

I agree wholeheartedly with what you say, but it's worth pointing out the need for balance: The other extreme is fields like economics, where I'm told one can get a good assistant professorship with zero publications and only a good draft, due to the slow pace of publication. That's not healthy either.

My understanding is that CS collectively decided to spurn the glacial journal process in favor of conferences. And now we are reaping what we sowed in the 80s.

I believe that incremental changes can help,* but to truly fix the problem we need action on the scale of moving from journals to conferences. (To be clear, I'm not advocating that journals are the solution.)

[* One incremental change I support is greater transparency by making rejected papers public -- as ICLR does (https://openreview.net/group?id=ICLR.cc/2023/Conference#submitted) -- so that we can analyze where the LPUs are coming from.]

My suggestion for a larger change is to move peer review to be later in the process. I.e., only review papers once they have been on arxiv for a while and have gained some traction. Why should we expend reviewer time on a paper that has zero citations?

The historical reason for having peer review at the beginning of the process is that the limited resource is pages in the journal or conference proceedings (and time slots at the conference). Reviewers should filter out papers that aren't worth the paper they're printed on. This is no longer relevant in the age of online publications and mega conferences with 3000 posters. The limited resource now is reviewer time. Thus we should work to conserve reviewer time, rather than having reviewers work to conserve server time.

My proposal is of course not perfect. E.g. peer review also provides constructive feedback to authors, which is more valuable earlier in the process. So I think we do need to have both early and late peer review. But we should place more weight on late peer review when it comes to things like hiring decisions.

Dec 20, 2023 · Liked by Ben Recht

Well, Ben, as an undergrad currently working as a research intern in computer vision, reading this does make me feel better. I often see other undergrads, or even PhDs with less experience, publishing papers at these top conferences, and it makes me wonder whether I am being productive enough.

But when I try to read or replicate those papers, it feels like it all boils down to merging previous ideas already present in the field. Most papers seem like they could have been written as blog posts, and often it is just manipulation of hyperparameters that leads to the better results and baselines they publish.

All that said, the places you would like to work at, and companies in general, probably follow the easiest way to hire: they look at citations, papers produced, etc. They take the path of least resistance in hiring, and hence all this becomes a vicious cycle.

The ones who work like this eventually get rewarded, and so it continues; ultimately the field might not make much real progress. It just feels like a rigged game, as you say!


Your post reminded me of an interesting parallel between college admissions and conference acceptances. Both processes have ballooned on over-submission, which makes the conference/school look good, but to the detriment of everyone else. So the real question for me is who benefits from this level of irrational enthusiasm? In publishing, as in college admissions, it’s obviously not the students.


A topic definitely deserving of some curmudgeonly cold water.

Dec 22, 2023 · edited Dec 22, 2023 · Liked by Ben Recht

I think it would be worthwhile to blog about research taste given the insane production of AI papers that are being pushed out now. Discussion / rant about research taste in the AI community? How would you arbitrate good research? What would you define as a good paper that is worthwhile and meaningful? Especially now that there are so many… papers?


As you imply, this is certainly a don't-hate-the-player-hate-the-game situation. In one of my grad school interviews, the professor seemed more interested in the number of citations I had than in the research itself (and, weirdly, commented on my GPA from several years ago), which should be completely irrelevant. And there are many such cases. Senior researchers in both industry and academia should be held accountable for perpetuating this mindset, because they should know better.

Reading your last paragraph is a sigh of relief. I would love to step back and breathe and think and let curiosity drive me, because that's why I'm here! But even if some of my mentors and peers encourage that, will I be able to go where I want after grad school? As a student, can resisting the game change it? *shakes fist at Capitalism*

Dec 19, 2023 · Liked by Ben Recht

I didn't have time to write a short letter, so I wrote a long one instead.

Mark Twain


Thanks, Ben, for this insightful post. You’ve made me feel a lot better as a PhD student with no publications so far - I’m clearly doing a great service for the field. More seriously, for a while I considered this topic to be purely a matter of individual research taste. But I now see that the deluge of papers strains the reviewing system, weakens conference publications as a signal, and may weaken the field as a whole. Do you see any fields as good role models for a healthy publication culture? It’s not clear we’d want to go as far as economics. Maybe statistics?
