This post digs into Lecture 3 of Paul Meehl’s course “Philosophical Psychology.” You can watch the video here. Here’s the full table of contents of my blogging through the class.
Imagine trying to logically reconstruct scientific verisimilitude from the pages of Science and Nature without being clued into the complex sociology of publication. Aggregated scientific sociology bleeds into the scientific literature, and it’s impossible to assess publications without considering the context of discovery. Even the most starry-eyed undergrad seems aware of the rampant cynicism of scientific publication. You can’t rationally reconstruct science from the publication record without considering everything you don’t see in the publication record.
Again, what’s striking about Lecture 3 is how Meehl lists the same familiar problems that everyone still moans about today. And in all cases, the problems are worse today than they were in 1989. No matter what program committees and editorial boards do, we have not been able to shake what seem like problems inherent to the scientific method.
First, Meehl discusses the file drawer effect. That is, researchers who get results that don’t agree with the outcome they wanted simply don’t publish them. Meehl says he’s not sure how much of a problem this is, as it’s hard to quantify. But he notes he has abandoned plenty of projects himself, so this effect must at least partially exist.
Second, there’s the problem that everyone has convinced themselves—mostly rightly so—that journals won’t publish null results. This exacerbates the file drawer effect. It also means that the incentives of the publication system bias the publication record, giving clues only about what might work and hiding much of what doesn’t. In general, since my prior is that almost nothing works, this is not a bad policy. But the focus on “original research” leads to a perverse devaluation of replication, as many journals are hesitant to publish replication studies or reanalyses. I’ve received emails from editors stating, “Any submission that presents itself mainly as a critique of a single other published paper would not be appropriate.”
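To see how strongly these two filters can combine, here is a minimal sketch of a toy simulation (my own illustration, not anything from Meehl’s lecture): a hypothetical field runs many studies of an effect whose true size is zero, and only results in the hoped-for direction that clear p < .05 ever escape the file drawer.

```python
# Toy simulation of the file drawer effect plus a no-null-results policy.
# Everything here is hypothetical illustration, not real data.
import numpy as np

rng = np.random.default_rng(0)
n_studies, n_per_study = 1000, 30
true_effect = 0.0  # the effect everyone is chasing does not exist

published = []
for _ in range(n_studies):
    sample = rng.normal(true_effect, 1.0, size=n_per_study)
    std_err = sample.std(ddof=1) / np.sqrt(n_per_study)
    # Only positive, "significant" results agree with the hoped-for outcome
    # and survive both the file drawer and the journal's null-result filter.
    if sample.mean() / std_err > 1.96:
        published.append(sample.mean())

print(f"studies run:           {n_studies}")
print(f"studies published:     {len(published)}")
print(f"mean published effect: {np.mean(published):.2f}  (true effect: {true_effect})")
```

Reading only the published record, you’d see a small stack of mutually “confirming” positive results and no trace of the hundreds of studies that found nothing.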
Third, we know that some topics are trendy. Trendy topics get published in high-profile journals, while less trendy topics sometimes aren’t even worked on. These trends are purely sociological, yet strongly bias the results we see. Lior Fox pointed out a fun example in the comments describing how the latent learning debate I discussed yesterday was never settled. The debate was just abandoned. Scientists got bored, and new grad students moved on to shinier topics. The scientific literature is littered with such abandonment.
This problem bleeds into conferences and journals, where editors and reviewers dock points from papers that aren’t about the current thing. Such gatekeeping again gives a skewed view of the work that scientists are doing, and it also shapes the sorts of questions scientists choose to pursue in the first place.
And yet, despite this aggressive, ideological gatekeeping, we still have the problem of too many papers. Meehl complains about overpublication throughout the course, but the numbers he cites are orders of magnitude smaller than what we deal with today. It used to be the case that PIs had to publish to get funding. But now graduate students have convinced themselves they need to publish to get industry jobs. And undergraduates think they have to publish to get into graduate school. And high school students are now being encouraged to publish to get into college. Meehl saw an alarming trend, but scientists couldn’t stop it. So we’re all now saturated with a mess of PDFs.
As a result, we simply can’t read everything published in a literature that is already irrevocably biased. But perhaps heroic summarizers will come to the rescue. You should be able to get a sense of the literature by reading a related work section, right? Nope. Meehl again notes that you have to take psychology into account. The gap between what a related work section says a paper does and what that paper actually does depends on what the citing authors want you to believe. We’ve all seen examples (usually with reference to our own work) where related work sections completely mischaracterize papers.
Meta-analyses and systematic reviews don’t solve this problem. First, the idea that meta-analytic researchers are free from bias is a romantic illusion. Organizations like Cochrane imprint their values on their reviews. Second, the idea that the tools of meta-analysis are handed down by an unbiased, omniscient agent is arrogant. Meta-analyses have to focus on narrow topics with a sufficient number of publications meeting the reporting bar of mostly arbitrary systematic reviewing standards. The assessment of each paper’s bias also depends on the mindset of the systematic reviewer. And the accepted statistical meta-analytic tools are sociological conventions with no particular validity (ugh, forest plots).
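For what it’s worth, the summary number at the bottom of a forest plot is typically just a fixed-effect, inverse-variance weighted average of whatever studies cleared the inclusion criteria. Here’s a minimal sketch with made-up study numbers, purely to illustrate how mechanically that convention aggregates whatever record it’s handed:

```python
# Fixed-effect (inverse-variance weighted) pooling, the convention behind the
# summary diamond on many forest plots. The effect sizes and standard errors
# below are hypothetical, standing in for a literature already filtered by
# the biases discussed above.
import numpy as np

effects = np.array([0.42, 0.38, 0.55, 0.47, 0.40])   # published effect sizes
std_errs = np.array([0.18, 0.20, 0.19, 0.17, 0.21])  # their standard errors

weights = 1.0 / std_errs**2
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled effect: {pooled:.2f} (95% CI half-width: {1.96 * pooled_se:.2f})")
```

The arithmetic is fine; the problem is that the pooled estimate is only as unbiased as the set of studies it’s fed, and it can’t know about the ones that never made it in.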
Like Meehl, I’m not against meta-analysis. Summaries have clear value to their respective communities, but they are flecked with the dispositions of their authors. Meta-analyses, systematic reviews, and tutorials all have to be read with the understanding that you are still getting a partial view of the full picture of facts.
My proposed solution to this missing data problem is removing the guardrails. Let’s change scientific review so papers aren’t evaluated on their trendiness. What if we just assessed whether results were novel and whether authors maximized the reader’s ability to reproduce and reanalyze the work? Regarding novelty, a clear replication counts as novel as long as it cites the original work. Regarding reproducibility, I realize this is context-dependent and might require specific, expensive equipment; we can be fair-minded about what is sufficient on a case-by-case basis. Then we can just throw things on preprint archives and devise some bean-counting scheme to send to our deans and middle managers for promotions.
You might argue this will produce even more papers. Sure, and we’ll talk about that in the next post. But the bottom line is that humans can’t process everything. They could never process everything. The sin here is hiding information under the guise of validity confirmation. As Meehl says, “No method is a truth-grinding machine.” Access to more information at least ameliorates the problem of not knowing what’s missing.
A partial defense of 'trendy topics'... Aren't 'trendy topics' code for areas of research that still have new, important, low-hanging fruit (often because of some technological breakthrough)? I don't know how to assess whether editors are biased against untrendy topics, though I would anticipate that publications on older, less trendy topics more often inadvertently reinvent the wheel or are far more incremental. I'd be interested in seeing a study that can deconvolve this effect.
Wonderful read as always. This sentiment is reflected in the recent changes at eLife, in that papers are not accepted or rejected but merely passed through review. However, that does come with the condition that papers must first make it past the editors, which brings its own biases.