Google Scholar’s Fake Citations

Seeding Google Scholar with fake authors and citations proves not only easy, but profitable

Reluctantly, I’m covering a preprint. It’s not flawless and would clearly be improved by thorough, expert peer review, but its most interesting findings probably will hold up.

Why the confidence?

  • Because people are already running effective scams that fit the authors’ claims and match the sting they carried out.
  • Because the problem has been described for more than a decade in peer-reviewed papers — and apparently has gone unaddressed.

Here are the topline results from the recent preprint:

  • Citation metrics in Google Scholar can be and are being manipulated
    • Preprints, “special issues,” and bulk publishing can be used to plant fake citations in Google Scholar and illegitimately drive up various citation-related metrics
    • Even post hoc screening fails to eliminate the problem, as Google Scholar indexes and then caches citations that may appear for only a short while before being taken down
    • Google Scholar misses obvious instances of citation manipulation
  • Google Scholar is more widely used for faculty evaluation than previously documented
  • Effective front-end screening can thwart malfeasance, while ineffective screening lets fakes through and creates lasting problems

The authors detail creating fake author profiles and 20 ChatGPT-generated papers, and then using these to seed four services (ResearchGate, arXiv, Authorea, and OSF) to see whether citation metrics in Google Scholar could be manipulated in favor of the fake author.

The fake papers included citations to non-existent papers attributed to the fake author.

In other words, could a fake author become an h-index star via Google Scholar in relatively short order and completely illegitimately?
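
To see why this is plausible, it helps to recall how the h-index works: it is the largest number h such that an author has h papers with at least h citations each. The sketch below uses the standard definition and purely hypothetical numbers, not the preprint's data. The count of 20 citing papers mirrors the 20 generated papers, but the 10 cited papers and the function name `h_index` are assumptions made for illustration.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# A modest, legitimate-looking record (hypothetical numbers).
modest_record = [10, 8, 5, 4, 3]

# Hypothetical sting: 20 planted papers each cite the same 10 papers
# attributed to the fake author, so each of those 10 shows 20 citations.
planted_papers = 20
cited_papers = 10  # assumption for illustration, not from the preprint
fake_record = [planted_papers] * cited_papers

print(h_index(modest_record))
print(h_index(fake_record))
```

Under these assumed numbers, a handful of uploads is enough to push a fabricated profile past a record built on real citations, because the index counts citations without checking where they come from.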