Nonreplicable publications are cited more than replicable ones

submitted by
Style Pass
2021-05-21 22:30:15

We use publicly available data to show that published papers in top psychology, economics, and general interest journals that fail to replicate are cited more than those that replicate. This difference in citation does not change after the publication of the failure to replicate. Only 12% of postreplication citations of nonreplicable findings acknowledge the replication failure. Existing evidence also shows that experts predict well which papers will be replicated. Given this prediction, why are nonreplicable papers accepted for publication in the first place? A possible answer is that the review team faces a trade-off. When the results are more “interesting,” they apply lower standards regarding their reproducibility.

The replication crisis in the social sciences refers to the failure to replicate a large fraction of published experiments (1) and to the selective publication of results and specifications (2–4). Three influential replication projects (5–7) tried to systematically replicate the findings in top psychology, economics, and general science journals. In psychology, only 39% of the experiments yielded significant findings in the replication study, compared to 97% of the original experiments. In economics, 61% of 18 studies replicated, and among Nature/Science publications, 62% of 21 studies did. In addition, the relative effect sizes of findings that did replicate were only 75% of the original ones; for failed replications, they were close to 0% [see also (8–10)]. Prediction markets, in which experts in the field bet on the replication outcomes before the replication studies were run, showed that experts could predict well which findings would replicate (11).

Here, we use the findings from these three replication projects to correlate replicability with citations and to test whether papers that failed to replicate are cited significantly more often than those that replicated successfully, both before and after the replication projects were published. We collected two types of measures: (i) replicability measures and prediction market results, which are publicly available for all three replication projects; and (ii) Google Scholar citations from the date of publication until the end of 2019. We additionally collected several proxies for the quality of these citations: how often the citing papers are themselves cited, whether they are published, and the impact factor of the journals in which they appear. We examine the relationship between replicability and both citations and other measures of impact across the three replication projects.
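To make the comparison concrete, here is a minimal sketch of the kind of test described above, in Python. The input file and column names ("replicated", "citations") are illustrative assumptions, not the authors' actual dataset or analysis pipeline.

```python
# Sketch: do papers that failed to replicate accumulate more Google
# Scholar citations than papers that replicated? The CSV file and its
# columns are hypothetical placeholders.
import pandas as pd
from scipy import stats

# One row per original paper: whether its replication succeeded and
# its Google Scholar citation count through the end of 2019.
papers = pd.read_csv("replication_citations.csv")

replicated = papers.loc[papers["replicated"], "citations"]
nonreplicated = papers.loc[~papers["replicated"], "citations"]

print(f"mean citations, replicated papers:    {replicated.mean():.1f}")
print(f"mean citations, nonreplicated papers: {nonreplicated.mean():.1f}")

# Welch's t-test (unequal variances) for the difference in mean citations.
t_stat, p_value = stats.ttest_ind(nonreplicated, replicated, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

The t-test only captures the raw difference in means; a regression of citations on a nonreplication indicator, with controls such as journal and publication year, would be the natural next step.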
