In early 2019, word reached me about a startup attempting to do something near and dear to my heart — qualify citations by intent. My own attempt at this (SocialCite) washed out because it relied on a social media model. This new one, Scite.ai, embraced machine learning, and seemed more likely to scale as ML solved the speed problem — how to describe citations quickly and uniformly at high volume — and the retrospective problem — going back to describe citations from the last year or the last decade or the last century. Intrigued, I spoke with the CEO and founder, Josh Nicholson. We hit it off, and I became an advisor, a role I’ve enjoyed.
Scite is much farther along compared to the first time we talked, so I wanted to check in and share their story from a journalistic point of view. Running a startup is business in the raw. I think you’ll find the following interview informative on many levels. Nicholson has been able to make good progress over the past year and a half, even in the face of some complexities (and symptoms of gastric distress emanating from a loyal companion).
Q: We last spoke in April 2019, so just about 18 months ago. Where is Scite now compared to 18 months ago, in general business terms?
Nicholson: Wow, it feels like just yesterday that we did the last interview, yet at the same time it feels like a decade ago. I think we’ve made a lot of progress since then but I also feel like we are just getting started, which is quite exciting. Some highlights:
- We received a Fast-Track Small Business Innovation Research (SBIR) grant from the National Institute on Drug Abuse (NIDA) of the National Institutes of Health (NIH) for a total of US$1.7M. With this and previous investments and grants, we have a runway for the next two years. This is critical for us to test certain business hypotheses and find product-market fit.
- We added over 514M citation statements (growing from 236M to 750M).
- We have more than doubled the precision of our model classifications, which are now at 0.80, 0.85, and 0.97 for supporting, disputing, and mentioning classifications, respectively.
- We have signed nearly a dozen indexing agreements and are now regularly processing articles from the following sources: John Wiley & Sons, Karger, Rockefeller University Press, Thieme, Microbiology Society, Cambridge University Press, IOP, Frontiers, International Union of Crystallography, arXiv, bioRxiv, medRxiv, PubMed Central, Research Square.
- We have launched the scite Reference Check, which is now available to individual users and is also integrated into the Manuscript Manager submission system. This helps authors, editors, and peer reviewers make sure the articles they are referencing are reliable.
- We have launched scite Visualizations, which allows you to visualize how multiple papers cite one another.
- We launched Journal dashboards and Funder dashboards, which allow you to see journal-level and funder-level metrics.
- We launched Custom Dashboards, which allow you to create a dashboard on any group of articles you wish. This can be used by private R&D to better understand and track how articles related to certain diseases or topics have been cited or for individuals to sync with their reference libraries.
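
For readers unfamiliar with the precision figures quoted in the highlights, here is a minimal sketch of how per-class precision is commonly computed: for each label, the share of the model's predictions of that label that turn out to be correct. The function name `per_class_precision` and the toy labels are assumptions made for illustration; they are not scite's code or data, and the interview does not describe how scite evaluates its model.

```python
from collections import Counter

# The three classes named in the interview.
LABELS = ("supporting", "disputing", "mentioning")


def per_class_precision(true_labels, predicted_labels):
    """Return {label: precision} for each label in LABELS.

    Precision for a label = correct predictions of that label divided by
    all predictions of that label. The value is None when the label was
    never predicted, since precision is undefined in that case.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    predicted_counts = Counter(predicted_labels)
    correct_counts = Counter(
        pred for true, pred in zip(true_labels, predicted_labels) if true == pred
    )
    return {
        label: (correct_counts[label] / predicted_counts[label]
                if predicted_counts[label] else None)
        for label in LABELS
    }


# Toy, made-up labels for illustration only (hypothetical, not scite data).
truth = ["supporting", "mentioning", "mentioning",
         "disputing", "mentioning", "supporting"]
preds = ["supporting", "supporting", "mentioning",
         "disputing", "mentioning", "mentioning"]

precision = per_class_precision(truth, preds)
for label in LABELS:
    print(label, precision[label])
```

On this toy data, "supporting" is predicted twice and is right once, so its precision is 0.5. A precision of 0.97 for "mentioning" would mean that roughly 97 of every 100 statements labeled "mentioning" really are mentions.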
Q: Scite appears to have become more customer-facing in some ways, with new dashboards and visualizations. Why push in this direction now? How did this come to fruition?
Nicholson: Since our last interview, our primary focus has been, and still very much is, on data and infrastructure. Specifically, we are focused on making sure we have good citation coverage, fresh data (rapid processing of new articles), and accurate classifications. As mentioned above, we have made significant improvements on all of these, so we have shifted some of our efforts to more user-facing features. This shift really reflects the transition of the company from an interesting idea and resource to a viable business and core part of scholarly communication.
Q: Startups typically have to pivot in some way to fine-tune their approach and unique value proposition. Has Scite made any significant pivots in the last 18 months?
Nicholson: We haven’t made any significant pivots in the last 18 months. We believe that being able to see how an article has been cited and if it has been supported or disputed is something that is clear to understand and valuable to researchers and organizations. The hardest part has been executing this vision, given that most publications are not openly available; most scientific publications are only available as PDFs, which are hard to text mine; and scientific writing can have a lot of nuanced sentences, so determining if something is supporting or disputing is technically very challenging. We have rethought some business strategy, specifically making our badges openly available to embed instead of charging for this, but there have not been any major pivots.
Q: What has been the most surprising thing that has occurred since we last spoke?
Nicholson: We had one founding team member leave pretty abruptly and without much explanation about half a year ago. This was personally hard, as he was a friend and a major contributor early on. However, in retrospect, I think it has helped the company, as it has given us more room to hire new people, and we have brought on some truly inspiring software engineers who are really like co-founders, providing meaningful contributions to the code, product insights, and even feedback and ideas on sales and marketing. It is actually pretty typical for startups to lose a co-founder early on, but I think it was indeed a big surprise we did not anticipate.
Q: A key measure of success for a new citation tool is researcher acceptance. How is that proceeding?
Nicholson: Overall, we have found researchers to be quite receptive to scite. With that said, I think the biggest challenges we have in terms of adoption are coverage and understanding. When a researcher first hears about scite, about 90% of the time they will search their own publications. If we have fewer citations than Google Scholar, Web of Science, or Scopus, they might think the tool has limited value. What we have found, though, is that if we explain why that is (we need access to the full text to extract citation statements) and the different uses of the tools, they come to see the value. Additionally, there is some confusion over what supporting and disputing actually mean. To clarify, we are not simply looking at positive or negative sentiment; rather, we are classifying citation statements on whether or not they indicate supporting or disputing evidence. Thus, it is not enough to be critical or positive about a paper; there needs to be an indication that there is evidence to support or dispute the claim. We have taken this approach to avoid opinions and focus on data and analyses. I do see value in both approaches, though, so perhaps sentiment is something we will classify citation statements on in the future.
Q: How has the SARS-CoV-2 pandemic affected life at your start-up? Do you hear from other entrepreneurs about unique challenges or opportunities coming from these unforeseen circumstances?
Nicholson: We have been remote since day one of the company's life, so there was no shift from an office to working from home. If anything, the pandemic has helped the community and potential customers realize the importance of what we are doing. With a massive number of publications and preprints on COVID-19 being published over the last 6 months, the need to be able to quickly understand and evaluate publications has become even more evident. Work has also been a solace for many of us – being able to work on something that helps researchers do better research during a pandemic is rewarding. We are hyper-focused on our mission of making research more reliable. The need for that is especially pronounced right now.
Q: What do you hope we’ll be talking about with scite 18 months from now?
Nicholson: We have some exciting partnerships we are working on now. In 18 months I hope to look back on these partnerships and detail how they contributed to the success of scite and the research community at large. Additionally, we have some big announcements coming very soon from our work with academic publishers, which I think will take scite from “an interesting startup” to “a vital piece of scholarly communication.”
Q: What’s the weirdest dispute you’ve found using scite?
Nicholson: I found myself searching “dog flatulence” recently for some flatulence issues my dog Pete (https://scite.ai/dog) was having. It was there I found a disputing citation (https://scite.ai/reports/flatulence-in-pet-dogs-m2D0nr) on the incidence of dog flatulence, something I didn’t think I would ever be using scite for!