Cleaning Up Our Own Mess

"Preprints with The Lancet" shows how the ecosystem has become confusing

Note: In the US this week, we are seeing the bitter fruits of an information space opened to abuse, lax in standards, unrelenting in exploitations, and deleterious to users. Consequences include mask denialism, anti-vaxxers, political polarization, and now violence.

We are being forced to remember that information is power. We are also reminded to handle information with the appropriate amount of care, expertise, and caution.

That is coincidentally a theme in today’s post. Be safe. Take care of one another.


I find myself bewildered by what the modern scholarly publishing landscape has become, with recent findings including a high degree of duplicate preprinting and uncertainty about who stands behind what.

In today’s information ecosystem, it seems like anything goes:

  • DOIs for anyone, even predatory publishers.
  • Unreviewed draft papers indexed in PubMed.
  • Journals reviewing preprints without publishing the articles.
  • White supremacists and alt-right media exploiting preprints.
  • FBI warnings that we’re being exploited.
  • Misinformation posted — and defended — on servers run by major science centers.
  • Abandoned and misleading draft papers archived forever, even defended.
  • Cats and dogs living together. Mass hysteria!

I’ve always thought our collective job was to work to derive order from chaos, sort wheat from chaff, and help readers and authors find firm, common ground with relevance, novelty, and quality as the fundamental coins of the realm. Version of record. Single source of claim. Trusted intermediary. Truth-seeking organizations. All that good stuff.

Some speculate that order is being restored by publishers. In April, and again yesterday, members of the Ithaka S+R team asserted on “The Scholarly Kitchen” that publishers are:

. . . promoting preprints but at the same time working to domesticate them, bringing them within their article submission workflows and linking preprints and versions of record in a way that will over time serve to deprecate the ability of the former to disrupt the latter.

The question posed by the Ithaka S+R team is whether publishers can tame the preprint space.

Based on what I’ve seen, various publisher initiatives are doing the opposite — injecting disorder where order once seemed achievable by flooding the zone with preliminary, duplicative, and unlinked versions of unreviewed papers and data.

It actually takes a lot of work to make things this messy. Doing nothing — no preprint servers, no additional DOIs, no pre-review versions, no haphazard approaches to linking, no duplication — would create less disorder and confusion.

Poorly positioned and poorly executed initiatives just make things unnecessarily messy.


Take, for example, the Lancet, which ran a preprint experiment for a couple of years and decided in September 2020 to make it permanent.

Preprints with The Lancet (which scans oddly like a disappointing acoustic evening with a subset of the Foo Fighters) is an initiative between SSRN and the Lancet — both owned by Elsevier. In one of the most roundabout disclaimers — and, again, why are we disclaiming things we’re branding and publishing? — the boundaries of responsibility are explained as follows:

Preprints with The Lancet is part of SSRN’s First Look, a place where journals identify content of interest prior to publication. Authors have opted in at submission to The Lancet family of journals to post their preprints on Preprints with The Lancet. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early stage research papers that have not been peer-reviewed. The findings should not be used for clinical or public health decision making and should not be presented to a lay audience without highlighting that they are preliminary and have not been peer-reviewed. For more information on this collaboration, see the comments published in The Lancet about the trial period, and our decision to make this a permanent offering, or visit The Lancet’s FAQ page, and for any feedback please contact preprints@lancet.com.

This disclaimer template appears on all the preprints in Preprints with The Lancet, even those that have a peer-reviewed counterpart paper. It’s not a good look for the brand.

There are more than 7,600 preprints in Preprints with The Lancet. To discover how it works and — more importantly — how they link the SSRN preprint with the Lancet-reviewed paper, I got in touch. After multiple exchanges, I was finally told the following was a good example, with the link in the “Suggested Citation” going to the paper in Lancet Global Health. See if you can find it:

The helpful Lancet person assured me that this enabled linking “on both sites,” which I assumed meant SSRN and Lancet Global Health. However, I couldn’t find such a link back to the preprint, with the article information showing this:

As it turns out, for Preprints with The Lancet, it’s not hard to find examples where the linking is executed differently, incorrectly, or not at all.

In this example, a paper in Lancet Psychiatry isn’t linked from its SSRN preprint, even though the preprint dates from 2018 and the paper from 2019. There are others.

Not THAT Lancet

In this case, the SSRN preprint is marked as having a “published” counterpart, with a notice just above the full title in small blue letters, linking to the VoR. However, the wording is inaccurate — the article appears in Lancet Global Health, not The Lancet — and it is the only instance I found in my spot-checking where one of these blue header links appears.

This one also has a slightly different treatment at the bottom of the abstract, linking as before in the “Suggested Citation” box, but adding a “View Published Version” button not seen on the other example.

No Linking to Non-Lancet Published Versions

It’s easy to find examples of Preprints with The Lancet having published counterparts in other journals. None of these is linked from the SSRN preprint. In the short list below, the examples are identified by publisher rather than by journal name:

This was just from the first dozen I checked. Some have not been published at all, but remain on the SSRN Preprints with The Lancet server, disclaimed and, judging by the dates, likely abandoned by their authors.

I thought linking was what the Internet was all about . . .

More Duplication

As dreaded/expected, there are duplicates within Preprints with The Lancet that are not the result of versioning — SSRN gives each version a new DOI, so it takes some digging — and which have not been flagged as duplicates. Here are a half-dozen examples I took the time to chase down:

  • August 2018 and October 2018: Very similar titles, same authors, same abstract, different page count. Published in September 2019 with Wolters Kluwer, with no link to the published version from either SSRN posting.
  • December 2018 and March 2019: Identical titles but for one modifying statement, same authors, same abstract, one-page difference in page count. No identifiable published version.
  • July 2018 and March 2019: Very similar titles, same authors, same trial number, same OR and CI reported in the abstracts. The March version looks like it has undergone some revision, possibly following a rejection after peer review. Possibly published in February 2020, but if this is the paper, it has been changed a lot through editing and peer review.
  • January 2019 and February 2019: Identical titles, identical authors, same abstract, one-page difference in manuscript length. Published in September 2019 with BioMed Central, with no link to the published version from either SSRN posting.
  • May 2019 and August 2019: Very similar titles, identical authors, identical findings in the abstracts. No identifiable published version.
  • April 2019 and April 2019: Identical titles, identical authors, identical abstracts, deposited within days of one another. No identifiable published version.

At this point, I stopped. People at SSRN can figure out how to identify and handle these kinds of things.
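
For what it's worth, the manual checks I used above (same authors, near-identical titles, near-identical abstracts) are not hard to approximate in code. What follows is only a rough Python sketch using the standard library's difflib. The record fields, the similarity thresholds, and the sample records are all my own assumptions for illustration. None of it reflects how SSRN stores or screens its preprints.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, title_min=0.8, abstract_min=0.9):
    """Yield ID pairs with the same authors and near-identical titles and abstracts.

    The thresholds are arbitrary starting points, not tuned values.
    """
    for a, b in combinations(records, 2):
        same_authors = (sorted(normalize(n) for n in a["authors"])
                        == sorted(normalize(n) for n in b["authors"]))
        if (same_authors
                and similarity(a["title"], b["title"]) >= title_min
                and similarity(a["abstract"], b["abstract"]) >= abstract_min):
            yield a["id"], b["id"]


# Made-up records for illustration only; these are not real preprints.
records = [
    {"id": "A",
     "title": "Effect of Drug X on Outcome Y: A Randomised Trial",
     "authors": ["J. Doe", "R. Roe"],
     "abstract": "Background: We tested drug X. Findings: Outcome Y improved."},
    {"id": "B",
     "title": "Effect of Drug X on Outcome Y: A Randomized Controlled Trial",
     "authors": ["R. Roe", "J. Doe"],
     "abstract": "Background: We tested drug X. Findings: Outcome Y improved."},
    {"id": "C",
     "title": "Preprint Use in Political Science",
     "authors": ["A. Poe"],
     "abstract": "We survey preprint use among political scientists."},
]

for pair in likely_duplicates(records):
    print("Possible duplicate:", pair)
```

With these sample records, only A and B are flagged. A pairwise comparison like this would be slow across thousands of preprints, so a real check would probably group records by author list first, and a human would still need to review whatever it flags.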

Pity the Users

So, taking a user-experience standpoint — and remember, last year STM International’s Tech Trends used “Focus on the User” as the planning model for 2024 — I saw that users may encounter one trial’s reporting in a variety of venues, with different review standards, and different or no linking to a final Version of Record. Users may even find disappearing preprints, as the Lancet admits in its note about making its pilot permanent:

. . . Some authors wanted their preprint taken down once the paper had been rejected by a Lancet journal. . . . We have taken these papers down when requested to do so . . .

Conclusion

Based on current evidence, the answer to the Ithaka S+R team’s implicit question (can publishers domesticate preprints, linking them and versions of record in a way that will over time serve to deprecate the ability of the former to disrupt the latter?) seems to be no. If anything, the opposite is happening. Publishers are creating an information space where the version of record is easily subsumed, the value of publishers as trusted intermediaries can be dismissed, and users are exposed to a confusing landscape of uncertain versioning, quality, and trustworthiness.

What could possibly go wrong?

