By posting preprints on bioRxiv, authors are able to make their findings immediately available to the scientific community and receive feedback on draft manuscripts before they are submitted to journals.
If this is the intention of bioRxiv, to give authors feedback on draft manuscripts before submission, then it’s worth noting that in a sample of more than 1,200 papers published across a variety of Nature journals, 57% of the preprints were posted after the paper was submitted to the journal that would complete the editorial review, peer review, and publication. A total of 5% of preprints were posted after acceptance, while 0.2% were posted after publication.
Only 29% of published papers were posted as a preprint more than 10 days before being submitted to the journal that published them, while 26% were posted within 10 days before or after submission to that journal. The scattergram below shows the data (negative values denote days after submission, meaning the preprint was posted after submission):
The trend is toward a longer gap between submission and posting, with the gap doubling from 20 days post-submission for 2016 preprints to 40 days post-submission for 2018 preprints. Overall, the data seem to suggest less posting prior to submission and more after (the data are from 2016-2018, reading left to right).
On average, these preprints were posted slightly more than a month (34 days) after being submitted to the journal that would ultimately publish the work. This is usually enough time for a paper to clear initial editorial review and, in some cases, initial peer review.
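For readers who want to run a similar tally on their own sample, here is a minimal sketch of how the figures above could be computed from posting and submission dates. It is not the code behind this analysis. The function names, the `SAMPLE` records, and the bucket labels are all hypothetical, and the 10-day window simply mirrors the cut-off used above. The sign convention matches the scattergram: negative values mean the preprint was posted after submission.

```python
from datetime import date
from statistics import mean


def posting_lead_days(posted: date, submitted: date) -> int:
    """Days by which posting preceded submission (negative = posted after submission)."""
    return (submitted - posted).days


def bucket(lead: int, window: int = 10) -> str:
    """Classify a lead time relative to a +/- `window`-day band around submission."""
    if lead > window:
        return "early"   # posted more than `window` days before submission
    if lead < -window:
        return "late"    # posted more than `window` days after submission
    return "near"        # posted within `window` days before or after submission


def summarize(records):
    """Summarize a list of (posted, submitted) date pairs."""
    leads = [posting_lead_days(posted, submitted) for posted, submitted in records]
    n = len(leads)
    return {
        "mean_lead_days": mean(leads),
        "share_posted_after_submission": sum(d < 0 for d in leads) / n,
        "bucket_shares": {
            name: sum(bucket(d) == name for d in leads) / n
            for name in ("early", "near", "late")
        },
    }


# Hypothetical records for illustration only; these are not the Nature sample.
SAMPLE = [
    (date(2018, 1, 1), date(2018, 2, 1)),   # posted 31 days before submission
    (date(2018, 3, 5), date(2018, 3, 1)),   # posted 4 days after submission
    (date(2018, 6, 10), date(2018, 5, 1)),  # posted 40 days after submission
    (date(2018, 9, 1), date(2018, 9, 1)),   # posted on the day of submission
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Keeping the sign on the lead time, rather than storing absolute gaps, lets the same number drive both the average and the before/after split.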
This all puts a slightly different light on the purpose of bioRxiv and the policies that allow preprints to be posted. Instead of a pre-publication, pre-submission system, bioRxiv appears to be a pre-publication, under-review, Green OA system most of the time, if these data prove generalizable.
The human behavior behind this is easy to understand. Good authors are cautious. They don’t want to be scooped or embarrassed. So, it makes sense they would only post a paper that has received some signal about its fate, and perhaps some review and positive feedback, maybe even a letter noting that only minor changes remain, bolstering their confidence that their paper is in decent shape and bound for publication.
We also can’t tell from these data if any papers were reviewed and rejected elsewhere prior to the ultimately successful submission, review, and publication process.
An incidental finding is that the authors doing this may be among the savviest. The time to publication for these papers was shorter than the average I recently measured for the same Nature journals, suggesting to me that the research is better and the papers more competently written and managed. In the plus column, bioRxiv may be attracting some of the better papers.
But in the minus column, we still have the 32% of preprints that go nowhere, and now perhaps the rising threat that bioRxiv is yet another source of functional Green OA. Preprint policies generally seem to assume that preprints won’t threaten the version of record (VOR), but if the preprints going up on bioRxiv have received the bulk of peer-review and editorial review input, they are closer to the VOR than journals may assume.
I can’t see how bioRxiv is doing what it claims to do — letting authors “receive feedback on draft manuscripts before they are submitted to journals” — with 57% of the preprints being posted after submission to the journal that will ultimately publish them, and usually after some editorial signal has been sent, if not complete reviews. Instead, it seems like bioRxiv is a way for savvy authors to market their work and make a free version of their mostly-reviewed or almost-published paper available.
Functional Green OA is becoming all the more possible thanks to access broker plugins and now preprint servers that may act as post-review Green OA services, so policies about preprints may need revisiting.