The Disruptive Effects of Publications, Citation and Authorship Practices: A Regression-Based Analysis of the CD Index for Models 4 and 8
In this supplement, we look at how research institutions and countries are working to build the innovation economies of the future with these challenges in mind. It includes a focus on the emergency response to the pandemic and what lessons this might provide in the future for quickly bridging the gap between basic science and its application; a discussion on the importance of promoting diversity, equity and inclusion in modern innovation; and an analysis of innovation in China, where there are still ambitions to forge a stronger link between blue-sky research and commercial success.
The coefficients on the year indicators are negative and grow in magnitude over time in all of the models shown, which is consistent with the patterns that we reported based on unadjusted CD5 values. In Extended Data Fig. 8, we visualize the results of our regression-based approach by plotting the predicted CD5 values separately for each of the year indicators included in Models 4 (papers) and 8 (patents). To enable comparisons with the raw CD5 values shown in the main text, we present the separate predictions made for each year as a line graph. We continue to observe a decline in the CD index even when we account for changes in publication, citation and authorship practices.
Several recent papers have introduced alternative specifications of the CD index12. We evaluated whether the declines in disruptiveness hold under two alternative variants of the measure. One criticism of the CD index has been that the number of papers that cite only the focal paper’s references dominates the measure13. The \(\mathrm{DI}_l^{\mathrm{nok}}\) variant is less susceptible to this issue. Another potential weakness of the CD index is that it could be very sensitive to small changes in the forward citation patterns of papers that make no backward citations15. DI* has been suggested as an alternative indicator of disruption that addresses this issue. We calculated both variants for randomly drawn papers and patents from the analytic sample. Results are presented in Extended Data Fig. 7a (papers) and b (patents). The blue lines indicate disruption based on Bornmann et al.13 and the orange lines indicate disruption based on Leydesdorff et al.15. Across science and technology, the two alternative measures both show declines in disruption over time, similar to the patterns observed with the CD index. Taken together, these results suggest that the declines in disruption we document are not an artefact of our particular operationalization.
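The relationship between the CD index and the variant that ignores papers citing only the focal work's references can be illustrated with a small sketch. It is not the authors' implementation. It assumes the commonly cited count-based form of the CD index, (n_i - n_j) / (n_i + n_j + n_k), and a reading of the variant from Bornmann et al.13 in which n_k is dropped and a citing paper counts towards n_j only if it shares at least l references with the focal work. The function names and the toy citation data are hypothetical, and the five-year citation window of CD5 is assumed to have been applied already.

```python
def disruption_counts(focal, references, citations, l=1):
    """Classify later papers by how they cite the focal work and its references.

    citations maps each later paper to the set of works it cites (assumed to be
    restricted to the citation window already).
    Returns (n_i, n_j, n_j_l, n_k):
      n_i   cite the focal work but none of its references
      n_j   cite the focal work and at least one of its references
      n_j_l cite the focal work and at least l of its references
      n_k   cite at least one reference but not the focal work
    """
    refs = set(references)
    n_i = n_j = n_j_l = n_k = 0
    for cited in citations.values():
        cites_focal = focal in cited
        shared = len(refs & set(cited))
        if cites_focal and shared == 0:
            n_i += 1
        elif cites_focal:
            n_j += 1
            if shared >= l:
                n_j_l += 1
        elif shared > 0:
            n_k += 1
    return n_i, n_j, n_j_l, n_k


def cd_index(n_i, n_j, n_k):
    """Count-based CD index (assumed form); None when nothing cites the work."""
    total = n_i + n_j + n_k
    return (n_i - n_j) / total if total else None


def di_nok(n_i, n_j_l):
    """Variant without n_k (assumed form); None when the denominator is zero."""
    total = n_i + n_j_l
    return (n_i - n_j_l) / total if total else None


# Hypothetical toy network: focal work "F" cites "R1" and "R2".
toy_citations = {
    "A": {"F"},
    "B": {"F"},
    "C": {"F", "R1"},
    "D": {"R1", "R2"},
    "E": {"R2"},
}
n_i, n_j, n_j_l, n_k = disruption_counts("F", {"R1", "R2"}, toy_citations, l=1)
toy_cd = cd_index(n_i, n_j, n_k)
toy_di_nok = di_nok(n_i, n_j_l)
```

In this toy network, two of the five citing papers cite only the focal work's references, so they lower the CD-style value without affecting the variant that omits n_k.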
“The data suggest something is changing,” says Russell Funk, a sociologist at the University of Minnesota in Minneapolis and a co-author of the analysis, which was published on 4 January in Nature. He adds that there is no longer as much intensity of breakthrough discoveries.
The authors found that the research done in the 1950s was more likely to use words like creation or discovery, but the research done in the 2010s was more likely to refer to incremental progress.
“It’s great to see this [phenomenon] documented in such a meticulous manner,” says Dashun Wang, a computational social scientist at Northwestern University in Evanston, Illinois, who studies disruptiveness in science. He adds that the authors examine the question in many different ways and that he likes the work very much.
Disruptiveness is not inherently good, and incremental science is not necessarily bad, says Wang. He notes that the first direct observation of gravitational waves was both revolutionary and the product of incremental science.
John Walsh is a specialist in science and technology policy at the Georgia Institute of Technology. He says that it might be a good thing to have more reproduction and replication in a world that cares about the validity of findings.
Finding an explanation for the decline won’t be easy, Walsh says. The proportion of disruptive research fell between 1945 and 2010, but the number of highly disruptive studies has not changed. The decline from 1945 to 1970 was steep, whereas the decline from the late 1990s to 2010 was gradual. He says that any explanation also needs to make sense of the levelling off in the 2000s.
Following previous work14, we created ten rewired copies of the observed citation networks for both papers and patents. After creating these rewired citation networks, we recomputed CD5. Owing to the large scale of the WoS data, we base our analyses on a random subsample of ten million papers; CD5 was computed on the rewired network for all patents. For every paper and patent, we then compute a z score that compares the observed CD5 value with the values for the same paper or patent in the ten rewired citation networks. Positive z scores indicate that the observed CD5 value is greater (that is, more disruptive) than would be expected by chance, whereas negative z scores indicate that the observed value is lesser.
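The comparison against the rewired networks can be sketched as a per-work z score. This is a minimal illustration rather than the authors' code: the function name is hypothetical, and the use of the sample standard deviation across the ten rewired values is an assumption, because the text does not say which estimator was used.

```python
from statistics import mean, stdev


def rewiring_z_score(observed, rewired_values):
    """z score of an observed CD5 value relative to its rewired counterparts.

    rewired_values holds the CD5 values of the same work in each rewired
    network (ten in the analysis described above).
    """
    mu = mean(rewired_values)
    sigma = stdev(rewired_values)  # sample standard deviation (assumption)
    if sigma == 0:
        return None  # undefined when the rewired values do not vary
    return (observed - mu) / sigma
```

A positive return value means the observed CD5 is higher than the rewired networks would suggest, and a negative value means it is lower.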
In addition, we include controls for the ‘mean age of team members’ (that is, ‘career age’, defined as the difference between the publication year of the focal paper or patent and the first year in which each author or inventor published a paper or patent) and the ‘mean number of previous works produced by team members’. Increased rates of self-citations may hint that scientists and inventors are more interested in their own work, but they may also be driven by the amount of work available for self-citing. Similarly, although increases in the age of work cited in papers and patents may indicate that scientists and inventors are struggling to keep up, they may also be driven by the rapidly aging workforce in science and technology78,79. For example, older scientists and inventors may be more familiar with or more attentive to older work, or may actively resist change80. These control variables help to account for these alternative explanations.
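The two team-level controls follow directly from the definitions above, as the sketch below shows. The function name and input layout are hypothetical: it assumes that, for each team member, the first publication year and the count of previous works are already known.

```python
def team_controls(publication_year, first_years, prior_counts):
    """Return (mean career age, mean number of previous works) for one team.

    first_years:  first year in which each team member published a work
    prior_counts: number of works each member produced before the focal work
    """
    if not first_years or len(first_years) != len(prior_counts):
        raise ValueError("need one first year and one prior count per member")
    # Career age: focal publication year minus the member's first publication year.
    career_ages = [publication_year - year for year in first_years]
    mean_career_age = sum(career_ages) / len(career_ages)
    mean_prior_works = sum(prior_counts) / len(prior_counts)
    return mean_career_age, mean_prior_works
```

For a hypothetical three-person team publishing in 2010, with first publications in 2000, 2005 and 2010, the career ages are 10, 5 and 0 years, so the mean career age is 5 years.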