The DOI Directorate

Articulating a few concrete positions on archives, surveys, and position papers

Nov 12, 2025

Thanks for the great discussion of and feedback on Monday’s post about the arXiv and position papers. In retrospect, I conflated too many points, and I want to clarify a few of my positions on positions in this follow-up. The main tricky question, which deserves a post in itself, is: what counts as research, and how should we disseminate it?

arXiv wants to be an archive, not a preprint server

The main phenomenon I’ve been trying to understand is how arXiv became a universal digital object identifier system for research. Permanent digital archiving was not its original intention, or at least that intention was never clearly articulated. However, the replies from its moderators tell me this role is one arXiv is committed to. I’m not opposed to arXiv embracing such a central role in the research ecosystem, but I’d prefer we all be clear about what we’re doing. Rather than having this archival role be an unwritten rule passed down through mentorship, peer pressure, and gossip, let’s say what we want.

I think the conventional opinion is roughly this:

Whereas before it was on the individual scholar to curate a personal webpage, now they should curate their arXiv listing and Google Scholar page. Google and arXiv will be better at archiving data than a distributed network of conferences, journals, universities, and researchers. For archiving, a central clearinghouse is preferable to a distributed ecosystem. Given this central archival role of the arXiv, we should encourage posting not just preprints, but all published work there.

I flag this as a block quote to highlight my reluctance to fully endorse the position. The position of the previous paragraph relies heavily on centralized trust. We have to trust Google to never kill Google Scholar. We have to trust that arXiv will remain funded because it is too important to fail. I hope this is right.

We should have a public conversation about the costs and benefits of such a system. I personally am uncomfortable with this because I don’t think arXiv moderators should get to set rules on what counts as research. The moderators do an invaluable public service, but if the arXiv is the central academic archive for all work in mathematics, computer science, statistics, and physics, then the respective research communities need more active say in arXiv functions. The arXiv shouldn’t change policies without broadly consulting the impacted stakeholders. This will make neither the arXiv moderators nor the respective communities happy.

We should encourage high-quality survey papers

Along those lines, I’ll publicly voice my dissent to the moderators’ decision making. The arXiv’s decision to ban survey papers is baffling and a net negative for scholarship. There are incredible review papers, and writing them should be encouraged. Review papers can be research. And when well done, they can have major impact.

As an illustrative example, some of my favorite papers by Stephen Boyd have been his surveys. Stephen has a track record of writing cogent, clear review papers that synthesize simple introductions to burgeoning fields. His review papers have consistently stimulated original research, giving grad students a leg up on an area in thirty pages. I’m calling out Boyd in particular because he has long been an advocate of the preprint system and circulates his surveys long before publication. That we’d need to wait three years for his work to be approved by the arXiv systems seems very wrong to me.

I get the tricky position the arXiv is in. Bad actors are generating LLM slop reviews to game Google Scholar statistics. Here’s another example of the problem of central clearinghouses. We have to play whack-a-mole with bots. That LLM slop has ruined the ability to use arXiv to host preprint survey papers is a terrible outcome, and I hope the moderators and impacted fields reconsider it.

Commentary with a DOI

In the comments, most of the people defending position papers were arguing that there should be space for publishing perspectives and commentaries in AI/ML adjacent fields. I very much agree with that.

Fields like statistics and economics have a very strong tradition of what researchers in computer science might consider position papers. One of the things I admire about statistics journals is their commentary system, which invites people to share broader perspectives inspired by a centrally published piece. These commentaries themselves can open up problems, be taught in classes, and form the basis for new research directions.

Again calling out one of my personal favorite examples, the best papers by David Freedman aren’t necessarily hardcore Annals of Statistics papers. They describe issues with standard techniques, flag interesting counterexamples, and provide good discussion points. Do the manuscripts collected in Statistical Models and Causal Inference count as research? The answer is: of course yes.

Some people might disagree, and program committees and journal editors should have full discretion to decide what kind of work is in scope for their venues. The question for the scholar is then what to do when you can’t find a venue for your work. My answer has been to look elsewhere. I’ve been publishing in more unconventional venues, even submitting papers to philosophy journals. Not everything needs to be sent to ICLR.

Another solution is to grow a commentary system in our own fields. While computer science doesn’t have the same tradition of commentary as statistics or econ, CACM is still willing to entertain surveys and perspectives. But if we need more options, we have to create them. You could argue that position paper tracks are a solution for this need. If that’s the case, then let’s model those tracks on the best practices of fields with a strong commentary culture.

However, I still think that if you want to write a commentary that says “we need to build a group of people to do this” or “all research x needs to do y,” you should just do it. Run a workshop, even an informal one. Build a community of peers to do the research that you want to see. Collective action doesn’t happen by position papers. You have to organize and make it happen.

John Quiggin

Nov 12

I (economist) use Google Scholar less and less. That's partly from aversion to Google in general as well as the specific experience of being burned by abandoned Google products, as you mention. But more because the alternatives, including AI literature summary tools, are getting better and better.

As our counterpart to arXiv, we have RePEc, though this is pretty much a one-man show.

Paul Beame

The "universal archive" role of arXiv and its (ideally) permanent links are critical for many open-access journals that act as overlays on arXiv. Maintaining this focus seems critically useful for the field.

arXiv was set up with the main filter being the technical one of being able to produce LaTeX source files. It was also set up without a linking or commenting option, which seems to be a larger drawback. As a result of the first, there are a significant number of papers that have been posted there that are completely worthless and wrong: for example, there continues to be a steady rate of complete garbage P=NP and P!=NP papers posted on arXiv - 20 years ago, these papers actually formed a large proportion of all submissions under certain subject identifiers. Without a linking or commenting feature, how can the community let others know what the problems are? The partial solution that some groups have found is to produce other arXiv papers with refutations of these bogus claims, but there is no way of adding "cited by" links through the system so someone reading the original garbage will see them.

The issue that arXiv has with review articles is clearly one of the rate of garbage production and the costs of serving that garbage rather than the fact that there is garbage. The "review article" limitation might be the current expedient, but it doesn't seem to work longer term as people use agents to swamp the system with other kinds of useless papers. Absent linking/commenting, maybe arXiv could avoid a ban and use levels of serving - the analog of an ordinary library putting some books that nobody seems to have a use for in the basement stacks - to reduce the costs.

arg min
