Citation statistics in computer science: the evolution of machine learning and the rise of self-attention
Comparing citation counts is fraught with unfairness and inconsistencies. For instance, because the Microsoft work was published a decade ago, it has had more time than younger papers to accrue citations. Computer science also produces an unusually high volume of research. Nature commissioned bibliometricians to run an analysis that controls for some of these factors but, perhaps because only articles with extreme citation counts are considered, similar papers remained on top, with only recent, heavily cited papers on the COVID-19 pandemic entering the lists (see Supplementary information).
The open-source nature of much of the early academic work in machine learning has boosted its citations. The sixth-most-cited paper, titled ‘Random forests’, presents an improved machine-learning method. A statistician at Utah State University, who collaborated with the method’s late author to extend it, says the paper is popular because the method is open source, free and easy to use. It also works extremely well off the shelf, with little or no customization required.
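That off-the-shelf quality is easy to see in code. The sketch below illustrates the method’s two core ideas, bootstrap sampling and random feature selection, in pure Python, using single-split ‘stumps’ instead of full trees and an invented eight-point data set; it is a toy illustration of the technique, not the paper’s actual algorithm.

```python
import random
from collections import Counter

# Toy random forest: many weak classifiers, each trained on a bootstrap
# sample and a randomly chosen feature, voting by majority.
# (Illustrative sketch only; real implementations grow full trees.)

def fit_stump(X, y, feature):
    """Find the best single-threshold split on one feature."""
    best = None
    for t in sorted({row[feature] for row in X}):
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        if not left or not right:
            continue
        l_cls, l_n = Counter(left).most_common(1)[0]
        r_cls, r_n = Counter(right).most_common(1)[0]
        if best is None or l_n + r_n > best[3]:
            best = (feature, t, (l_cls, r_cls), l_n + r_n)
    return best

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    n, d = len(X), len(X[0])
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        feature = rng.randrange(d)                   # random feature choice
        forest.append(fit_stump(Xb, yb, feature))
    return forest

def predict(forest, row):
    votes = []
    for stump in forest:
        if stump is None:          # degenerate bootstrap sample: skip
            continue
        feature, t, (left_cls, right_cls), _ = stump
        votes.append(left_cls if row[feature] <= t else right_cls)
    return Counter(votes).most_common(1)[0][0]

# Tiny separable toy data: class 1 iff both coordinates are large.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict(forest, (0, 0)), predict(forest, (3, 3)))
```

With no tuning at all, the ensemble’s majority vote classifies both test points, which is the property users praise in the paper’s method.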
Hot on the heels of that work came Microsoft’s now top-cited paper. And in 2017, researchers at Google published a landmark paper7 titled ‘Attention is all you need’, which presented the neural-network architecture known as a transformer. This underlies the advances in large language models that power tools such as ChatGPT: it efficiently implements a mechanism called self-attention, which lets networks prioritize the most relevant information when learning patterns. That paper is this century’s seventh most cited.
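The self-attention step can be sketched in a few lines: each token scores its relevance to every other token, and the output for each token is a relevance-weighted mixture of all of them. In this toy version the embeddings are invented, and the learned query, key and value projections of a real transformer are replaced by the identity for brevity.

```python
import math

# Minimal scaled dot-product self-attention (toy, pure Python).
# Assumption for brevity: queries = keys = values = the raw inputs,
# i.e. the learned projection matrices are the identity.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(X):
    """X: list of token vectors; returns one mixed vector per token."""
    d = len(X[0])
    out = []
    for q in X:  # each token queries every token (including itself)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in X]
        w = softmax(scores)  # attention weights: relevance of each token
        out.append([sum(wi * v[j] for wi, v in zip(w, X))
                    for j in range(d)])
    return out

# Three toy token embeddings; each output row is a convex combination
# of the inputs, weighted by similarity.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(X)
for row in out:
    print([round(v, 3) for v in row])
```

Real transformers run many such attention ‘heads’ in parallel with learned projections, but the prioritization mechanism is this weighted mixing.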
Many people credit the deep-learning revolution, in which multi-layered artificial neural networks revealed their wide usefulness, to a paper3 that Hinton co-authored in 2012. It described AlexNet, a neural network that beat other methods in a competition to identify objects in images. That paper is number 8 on the twenty-first-century list, and a review paper4 on deep learning by Hinton and his co-authors is number 16. (The 2009 paper5 ‘ImageNet: A Large-Scale Hierarchical Image Database’, which presented the data sets that researchers used to train classification software, is number 24.)
Most of the century’s top-cited papers also make it into the all-time top 50, although a few took longer to accrue citations. Software descriptions and computer-aided findings have risen notably in the rankings. Bibliometricians say that citation counts should be normalized for year of publication to make fair comparisons between disparate works.
But some older AI studies were still citation magnets in 2023. One paper3, published in 1997, describes an early neural-network architecture for modelling language called long short-term memory (LSTM); it was the tenth most cited in 2023. Because of their efficiency, LSTMs remain popular for processing some types of data. Jürgen Schmidhuber, an AI pioneer and co-author of the 1997 paper, now at King Abdullah University of Science and Technology in Thuwal, Saudi Arabia, says that not all old AI papers get the same recognition, in part because those articles put forward ideas before they were feasible.
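The gating idea that makes LSTMs work can be sketched compactly. The toy single-cell step below uses made-up scalar weights shared across all gates (a simplification; real LSTMs learn separate weight matrices for each gate); it only illustrates how the forget, input and output gates control a persistent cell state.

```python
import math

# One step of an LSTM cell, in pure Python. Assumption for brevity:
# every gate shares the same invented scalar weights; real cells learn
# separate weight matrices per gate.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w=0.5, u=0.5, b=0.0):
    pre = w * x + u * h_prev + b
    f = sigmoid(pre)               # forget gate: how much old memory to keep
    i = sigmoid(pre)               # input gate: how much new input to write
    o = sigmoid(pre)               # output gate: how much memory to expose
    c_tilde = math.tanh(pre)       # candidate memory content
    c = f * c_prev + i * c_tilde   # cell state: the long-term memory
    h = o * math.tanh(c)           # hidden state: the short-term output
    return h, c

# Feed a toy three-step sequence through the cell.
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c)
print(h, c)
```

The multiplicative forget gate is what lets the cell state carry information across long sequences instead of overwriting it at every step.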
The first quarter of the twenty-first century has produced a number of major scientific discoveries, including first-ever measurements and the discovery of the particle known as the Higgs boson. Yet none of those advances is described in the top-cited papers published since 2000.
“This reveals the really important papers for a community,” says Lutz Bornmann, a sociologist of science at the Max Planck Society in Munich, Germany. Bornmann and Haunschild, together with Andreas Thor, a computer scientist at the Leipzig University of Applied Sciences, Germany, conducted the technically challenging analysis for Nature. They relied on a modified version of software that Bornmann and Thor, with others, introduced in 2016 to help researchers to explore cited references1.
The concept behind ResNets was one of the factors leading to AI tools that could play board games (AlphaGo), predict protein structures (AlphaFold) and eventually model language (ChatGPT). One of the paper’s authors, Kaiming He, who now works at the Massachusetts Institute of Technology, says that before their work, deep learning was not that deep.
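The core idea, as a toy calculation: a residual block computes x + F(x), so an identity ‘skip’ path carries the signal through arbitrarily deep stacks, whereas a plain stack of small transformations lets it decay. The `layer` function below is a hypothetical stand-in for a learned transformation, not anything from the paper.

```python
# Toy scalar illustration of a residual connection: each block adds a
# small correction F(x) to its input instead of replacing the input.
# `layer` is an invented stand-in transformation, not from the paper.

def layer(x, weight=0.1):
    return weight * x          # stand-in for a learned transformation F(x)

def residual_block(x):
    return x + layer(x)        # identity path plus correction

def plain_block(x):
    return layer(x)            # no identity path

signal_residual, signal_plain = 1.0, 1.0
for _ in range(50):            # stack 50 blocks of each kind
    signal_residual = residual_block(signal_residual)
    signal_plain = plain_block(signal_plain)

# The plain stack shrinks the signal towards zero; the skip connections
# keep it flowing, which is what makes very deep stacks trainable.
print(signal_plain, signal_residual)
```

In real ResNets the blocks are convolutional layers and the same skip trick keeps gradients flowing backwards during training.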
The top ten of all-time-cited research: how citations are counted, and a pharmaceutical scientist’s unlikely classic
The OpenAlex database (one of Nature’s sources for this article) does attempt to aggregate mentions by merging the preprint and final versions of an article, says co-founder Jason Priem at OurResearch, a non-profit scholarly-services firm in Vancouver, Canada, that developed the database. And Google Scholar tries to group all versions of a work and aggregate citations to them, says co-founder Anurag Acharya, who works for Google in Mountain View, California.
In the 1990s, a pharmaceutical scientist submitted a paper that included data from a technique that can quantify the amount of DNA in samples. He had used equations from a technical manual to analyse his data, and one of the reviewers told him that he couldn’t cite a user manual in a paper. So Schmittgen, now at the University of Florida in Gainesville, contacted the creator of the equations and together they published a paper9 that could be cited.
And cited it was: more than 162,000 times, according to the Web of Science. That is enough to send Schmittgen’s paper into the top ten of all-time-cited research.
Schmittgen’s paper is popular because its formulae gave biologists a simple way to calculate changes in gene activity in response to different conditions, such as before and after treatment with a drug. A related paper describes DESeq2, a software program that calculates such changes in gene activity from sequencing data.
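The calculation the paper popularized, often written 2^(−ΔΔCt), is short enough to sketch directly; the cycle-threshold (Ct) values below are invented for illustration.

```python
# The relative-quantification formula popularized by the Schmittgen
# paper: fold change in gene activity between two conditions,
# normalized to a reference gene. Ct values here are invented.

def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    d_ct_treated = ct_target_treated - ct_ref_treated  # ΔCt, treated
    d_ct_control = ct_target_control - ct_ref_control  # ΔCt, control
    dd_ct = d_ct_treated - d_ct_control                # ΔΔCt
    return 2 ** (-dd_ct)                               # fold change

# Example: the target gene crosses threshold 2 cycles earlier after
# treatment, relative to the reference gene, i.e. ~4x more transcript.
print(fold_change(20.0, 15.0, 22.0, 15.0))  # → 4.0
```

Normalizing against a reference gene is what lets the comparison work across samples with different starting amounts of material.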
The paper at number five was written by George Sheldrick, a British chemist who died in February. He created the SHELX suite of computer programs to analyse the scattering patterns of X-rays after they are shot through crystals of molecules, with the aim of revealing the molecules’ atomic structure. When he began the work in the 1970s, “my job was to teach chemistry, and I wrote the programs as a hobby in my spare time”, he told Nature a decade ago. A paper he wrote in 2008, which suggests that researchers cite it whenever SHELX programs are used, has been cited tens of thousands of times.
Three of the top-cited papers are familiar to cancer researchers. The reports at numbers 9 and 10 are the work of the World Health Organization’s cancer agency, which tracks global cancer statistics every two years. GLOBOCAN data are used by researchers, advocates and policymakers who need to cite the incidence or mortality rate for a specific cancer type, says Freddie Bray, lead author of the papers and a cancer epidemiologist at the International Agency for Research on Cancer in Lyon, France.
Another is a review14 that attempts to distil the complexity of cancer into a few characteristics commonly found in tumours. Co-author Douglas Hanahan, at the Ludwig Institute for Cancer Research in Lausanne, Switzerland, says these ‘hallmarks’ of cancer have helped to shape the field.
The top 100 most-cited papers of all time: from mental illness to research software, according to an analysis by the Nature news team
At number four on the list is what is sometimes called ‘psychiatry’s bible’: the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), which was published in 2013, nearly 20 years after the previous iteration. The book15 describes the criteria for categorizing and diagnosing mental disorders, including addiction and depression, and is used extensively by researchers and health professionals worldwide. In most of the databases analysed, DSM-5 was the only book to make the list.
Psychologists Virginia Braun and Victoria Clarke were used to getting only a handful of citations on their papers about gender and sexuality. So they watched with astonishment as their 2006 paper16 became the third-most-cited article published this century. “It has a life of its own,” says Clarke.
After the paper’s publication, researchers started referring to thematic analysis — as outlined by Braun and Clarke — as the method they used, which sent its citation count off the charts. The paper has “been completely life-changing”, says Clarke. She and Braun have since pivoted much of their work to thematic analysis and have received invitations to meetings across the world. “It was entirely accidental,” adds Braun, who works at the University of Auckland in New Zealand.
Papers on biological laboratory techniques dominate the list of the most highly cited papers of all time, according to data provided to Nature by the US firm Clarivate, which owns the Web of Science. The top-100 list also includes papers on artificial intelligence (AI), research software and statistical methods.
The Nature news team analysed records from two large research databases, because their public-facing versions allow analyses reaching back to 1900. These give slightly different rankings and higher citation counts, but generally feature similar papers. (Full details of the top-100 lists, including an analysis of median rankings across three databases, are in the Supplementary information.)
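A median-rank combination of the kind described can be computed in a few lines; the papers and per-database ranks below are entirely made up for illustration.

```python
import statistics

# Combine per-database rankings by sorting papers on their median rank,
# as in the analysis described above. Papers and ranks are invented.

ranks = {
    "paper A": [1, 2, 1],   # rank in database 1, 2 and 3
    "paper B": [3, 1, 2],
    "paper C": [2, 3, 3],
}
combined = sorted(ranks, key=lambda p: statistics.median(ranks[p]))
print(combined)  # papers ordered by median rank across databases
```

Using the median rather than the mean keeps one outlier database from dragging a paper up or down the combined list.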
All of the lists are still headed by the 1951 paper. The paper2 by Microsoft researchers, which was presented at a computer-vision conference in 2016, was ranked fifth by median ranking across the three databases analysed.
Paul Wouters, a retired scientometrics researcher formerly at Leiden University in the Netherlands, says that rising annual volumes of research papers could explain how some modern papers have risen up the charts.
Researchers advance by standing on giants’ shoulders. So, which research giants are still getting cited frequently today?
Some of the most famous papers were rarely cited when first published, in part because the techniques they described were too expensive to use at the time.
Remarkably, another research paper4, written nearly three decades ago, was the fourth-most-referenced work in papers published in 2023. In 1996, three researchers at Tulane University in New Orleans, Louisiana, published a clever, fast approximation that software can use to help researchers to calculate the interactions of electrons in materials, as a way to understand the materials’ properties.