How computer science has inherited the same mistakes as other disciplines
Peer review is at the heart of academic research: it is an essential step if you want to publish in any scientific journal, whether in medicine, physics, or biology. In computer science it is conferences, not journals, that lead the way in choosing the lucky papers that get published, but they too rely on peer review: program committees conduct it in much the same way.
The result is that research in machine learning and artificial intelligence has inherited the same problems as other disciplines, creating a power system based not on journal editors but on conference committees.
What is peer review? What is it for? Why do some consider it the basis of a toxic culture? How could it be improved? How has this impacted the computer science community? This article attempts to answer these questions.
The eyes that see all the flaws
What is peer review? Giving an unambiguous answer is difficult. Over the years it seems to have become a form of liturgy necessary to propitiate mysterious entities responsible for the fate of a scientific article.
“Peer review is thus like poetry, love, or justice. But it is something to do with a grant application or a paper being scrutinized by a third party” — Richard Smith
No one is capable of evaluating the quality of their work on their own; researchers are often incapable of finding errors and flaws in their own research. The initial idea of peer review is that external researchers (the peers) should evaluate the quality of a scientific article before publication and provide suggestions on how to improve the manuscript.
In the most classic form of peer review, the editor of a scientific journal receives a manuscript and, after reading it, chooses two relevant experts in the field, who provide opinions that remain anonymous to the author. At the end of the review, the author receives the journal’s decision and, in some cases, a set of comments to which they must respond if they want their work published.
The system has since evolved and now each journal uses its own version. Some journals send the manuscript to an advisor first; others select at least three or four reviewers. In other cases, even the reviewers are unaware of who authored the manuscript (double-blind).
This pastiche — which is not far from systems I have seen used — is little better than tossing a coin, because the level of agreement between reviewers on whether a paper should be published is little better than you’d expect by chance — Richard Smith
The value of belonging to a club
Peer review today is used in a variety of areas, whether it is to publish an article or to allocate funds. Agencies call in experts to conduct peer reviews and decide which projects should receive grants.
On the other hand, an expert-reviewed article or project is believed to carry added value. After all, during peer review authors go through a rigorous process, and if there are errors or weaknesses, reviewers request corrections. If the authors do not address the weaknesses in their manuscript, only one fate awaits: rejection.
The most prestigious journals and conferences pride themselves on their high rejection rates (Nature accepts less than 8 percent of the roughly 200 papers it receives per week). Scientists hope to publish in these journals because greater prestige for their research (or at least greater perceived value) is associated with them.
In addition, an article published in a journal or accepted by a prestigious conference will, on average, be cited more, receive more publicity, and be read by a larger audience. All this translates into career advancement, prestigious awards, and so on.
But is it really so? Are the articles rejected or left outside the door of these prestigious journals worthless?
“We’re judged by the company we keep.
And we’re wrong, Monsieur. Judas kept excellent company.” — from the movie Ridicule (1996)
How computer science still has human reviewers
Even an article on the most sophisticated artificial intelligence still passes through the hands of human reviewers. In computer science, the typical path of a manuscript is to appear on arXiv as a preprint, be submitted to a conference, be corrected according to the reviewers’ dictates (both the arXiv version and the conference manuscript), and, if accepted, gain a link to the conference. The value of an article depends on the value of the conference, which in turn is considered more prestigious the lower its acceptance rate.
As in the case of scientific articles, peer review is also blind in computer science. Conferences in general use single-blind reviews where the authors do not know the name of the reviewers. More prestigious conferences, to try to reduce bias, have implemented double-blind reviews where the reviewers are not aware of the authors. In theory, this should prevent peer review from favoring or disfavoring authors.
The most important conferences have introduced a meta-reviewer to ensure the process is scientifically rigorous. The meta-reviewer is an expert in the field who makes the final decision when the reviewers have not reached agreement on the fate of a manuscript.
Who watches the watchmen?
“the practice of peer review is based on faith in its effects, rather than on facts”. — source
Peer review severely impacts the lives of researchers. “Publish or perish” is a common expression in academia: scientists are evaluated by the quality of their publications. Often, however, the evaluation considers not the relevance of their research but the impact factor (perceived importance) of the venue. Since peer review is the key step in selecting which manuscripts get published, you may wonder whether this process works well and what its flaws are.
“A major criticism of peer review is that there is little evidence that the process actually works, that it is actually an effective screen for good quality scientific work, and that it actually improves the quality of scientific literature” — source
The first objection to the peer review process is that we have no precise criteria for defining the quality of an article or a good research proposal. What makes an article valuable? What makes a research proposal capable of leading to interesting findings? In a sense, which research was important is discovered only after the fact (derived applications, how many times it was cited, and so on).
A 2008 study submitted to several journals an article with eight deliberate errors: the 420 reviewers identified, on average, only two of them. Indeed, anyone who has gone through peer review often has the impression that many decisions are arbitrary and that reviewers’ comments are harsh only because anonymity protects them, rather than in the name of good science.
In addition, the number of experts is limited, and with rising submission rates we do not have enough competent people to conduct reviews. Peer review is also unpaid: it is a free service provided by researchers without any reward. As the workload in academia has grown over the years, many researchers turn down requests from publishers. The result is that authors receive poor-quality comments or wait long periods before publication.
Another serious problem with peer review is that it is not free of bias. In fact, several studies have shown that grant-awarding has disadvantaged minorities and women. In addition, renowned authors are much more likely to be accepted into prestigious journals regardless of the quality of their work, thus perpetuating a cycle of inequality, where established authors can receive awards and funding and others share the crumbs.
New science, old sins
Recently, Professor Edward Lee published a blog post criticizing peer review as the source of a toxic culture. He noted that conference program committees maintain a fixed acceptance rate, and that many papers are rejected without a real reason or with the rejection justified only by a “lack of novelty.”
“too many of us use a notion of science, outdated since Thomas Kuhn, which views progress of a discipline as the accretion of new facts” — Edward Lee
This frantic search for novelty stems precisely from this outdated view of science. Besides being a barrier for many researchers, it prevents the publication of work that would be fundamental for future research (different approaches, applications of the same idea in other fields). After all, no work is totally original; all build on the insights of previous researchers.
In addition, being a reviewer brings a considerable advantage. In computer science, reading articles before they are published confers an unfair edge. Anonymous reviewers are often in a conflict of interest with the authors, having submitted similar articles to the same conference. The result is an added incentive to reject competing work.
Lee suggests that poor-quality articles still manage to get published (after a few corrections they are submitted to every conference until one accepts them). Meanwhile, many doctoral students and other young researchers are so frustrated by this culture of rejection that they leave academia.
Moreover, even with a deadline, most articles are submitted within the last 24 hours (or even after it), leading to what is called “emergency review.” The limited reading time and the pressure reduce the attention dedicated to each article (a few researchers must sometimes sift through more than 1000 papers). In such conditions, the reviews can look more like a stochastic choice.
- AAAI: 9000 submissions, 15% acceptance rate
- ICML: 5630 manuscripts, 21% acceptance rate
- NeurIPS: 9122 manuscripts, 25% acceptance rate
With these premises, it is questionable whether reviewers read all these manuscripts or limit themselves to the title and abstract. If so, they cannot properly evaluate a manuscript’s value or fairly flag it for “lack of novelty.” With such numbers, it is hard to believe that so many of the rejected manuscripts are unworthy.
Memories from both sides of the river
Anyone who has pursued a doctorate has had to submit at least one article at some point. Often, at least one article (if not more) must be published in order to defend one’s doctoral thesis.
Each conference or scientific journal has specific guidelines for formatting the manuscript. Although some of these are debatable and not always clear, they must be followed to the letter.
Navigating the different submission guidelines is often a challenge in itself.
The first step is obviously to avoid desk rejection (when the editor decides that, due to “lack of novelty” or lack of relevance, the manuscript will not be forwarded to reviewers). The response is an impersonal email, virtually identical for every rejected manuscript. Typically, the perception is that the manuscript has not actually been read and has been discarded by some stochastic process.
After an anxious wait, the reviews are often disappointing. Much of the commentary is not pertinent to the manuscript, and its tone can be quite aggressive, as if the purpose were to attack the submitted work. Very often, reviewers also try to pad their own citation counts by suggesting that the authors cite the reviewers’ articles, even when these are off-topic.
“The author should cite the seminal work of XXXX which used a similar model in the context of (another totally different field)” — paraphrase of a real reviewer comment I received
In recent years, achieving a stable position in academia has become exceedingly competitive, and obtaining one is usually contingent on the number or quality of publications. This has led to a proliferation of predatory journals and conferences, and of invitations to submit manuscripts to venues that guarantee publication in exchange for payment.
Every time I publish or attend a conference, my inbox is flooded with emails from predatory journals and conferences. These are often unknown venues that claim the attendance of prestigious researchers (from unknown institutions and companies) and seem only vaguely relevant to your field of interest.
“seems NLP is not the top priority at your (company), since at 3 days from the deadline no one registered to (bogus NLP conference)” — from a real email sent by a predatory-conference sales representative after I ignored all the previous ones.
Although these conferences are clearly without value, many desperate researchers attend in the hope of boosting their resumes.
On the other hand, being a reviewer is not easy either. With fewer and fewer people available to conduct peer reviews, you are often contacted even for articles outside your field. While it is true that a reviewer must wade through articles full of methodological errors, quality manuscripts are often flagged for rejection by the other reviewers, and in these cases it is unclear how an editor decides to proceed (rejection or acceptance).
Journals provide a review format that in practice pushes reviewers to pursue novelty and to condense their judgment into a multiple-choice score (1 outstanding, 5 poor), almost as if it were a TripAdvisor review. This gives the impression that editors are more interested in an article’s novelty than in the overall quality of its findings.
Even Albert Einstein had a manuscript rejected by Physical Review; we should not necessarily consider rejection equivalent to poor quality.
In addition, when you serve as a reviewer you have a tight time frame to finish the review. While conferences and journals impose strict text limits on manuscripts, authors put most of the results in the appendix (sometimes definitely abusing it). Personally, I find the constant references to the appendix tiring; they make reading a single article rather painful.
How science can save itself
An interesting article showed that it is difficult to quantify the beneficial effect of peer review. Analyzing 9000 reviews and 2800 submitted contributions in computer science, the authors found no correlation between peer-review rankings and subsequent citation impact. Indeed, there are countless examples of papers rejected by conferences that in the long run have had a profound impact on computer science.
Several attempts to improve peer review are underway. A frequent criticism is that reviewers abuse blind review to judge articles with particular animosity. In some experiments, reviewers’ names are published after the decision on the manuscript; in others, the reviewers’ comments and the authors’ responses are published.
Journals such as Nature offer the opportunity to make inquiries to the editor before submitting a manuscript. Conferences could implement a help desk where authors submit abstracts or request advice well before the deadline; this would help them improve the manuscript before submission.
Conferences that receive thousands of submissions should also increase the number of reviewers to ensure higher-quality reviews, and paying reviewers for their service could help. In fact, selected reviewers often ask members of their own research team to help with or carry out the review (an additional burden on the shoulders of Ph.D. students). At the very least, service as a reviewer should count toward career advancement or when applying for funds.
On the other hand, the power of editors and meta-reviewers is almost absolute and should be balanced, especially when reviewers are divided on the decision. At present, although these attempts are encouraging, they are not sufficient, and the system deserves deeper discussion.
Also, an article submitted to arXiv should count more toward a researcher’s career. Peer review could be conducted on arXiv without a deadline, giving reviewers enough time and no pressure (perhaps granting authors a right of reply). The comments of registered, verified users could even serve as a kind of public peer review. In this way, we might move from an editor-centered model to a community-centered model.
Every day more articles are submitted, many with little relevance, obvious errors, or even signs of scientific misconduct, so we need a way to separate the relevant articles from the rest. For years the solution has been peer review, and we have had blind faith in the authority of journals and conferences, but it is time for a critical analysis of the system.
Peer review, though a pillar of today’s science, is not without its problems. Many articles are rejected simply by arbitrary decision, and reviewers’ comments are often arrogant and not entirely adequate; after all, anonymity guarantees a position of power without accountability. Peer review ultimately is an expensive process that lengthens publication time with no guarantee of success.
Not even computer science and other young disciplines are immune to the problems that have historically affected other fields. At a time when more and more papers are being produced and researchers’ careers depend heavily on publications, we should rethink what has been an untouchable pillar of science: peer review.
I am sure anyone who has submitted an article has encountered frustration; if you would like to share your story in the comments, I would be glad to hear it.
Here is the link to my GitHub repository, where I am planning to collect code and many resources related to machine learning, artificial intelligence, and more.
Or feel free to check out some of my other articles on Medium:
- About publish or perish, check here and here
- About the “Monument to an Anonymous Peer Reviewer” check here, here, and here
- Ph.D. students and academia burnout: here, here, here
- Computer science Ph.D. student struggling: here, here
- more about the seminal papers that have been rejected by conferences: here
- Articles on how to rethink computer science peer review: here
- About predatory conferences: here, here, here, here