Interview: Craig Van Dyck of CLOCKSS

Transitioning out as Executive Director, Van Dyck talks preservation and the future

Craig Van Dyck is a fellow English major, which may be part of the reason I always find him so easy to talk with. Or maybe it’s that laconic Texan air he possesses. Van Dyck earned his BA in English at the University of Texas, Austin, in the 1970s, where he notes it took him nine years to get the degree. Looking back, I wish I’d stretched mine out, too. But let me allow Van Dyck to tell you himself . . .

Q: Please tell us about yourself and your career.

Van Dyck: Leaving college, I had no concept of having a career. I would have been a character in the movie Slacker. My wife and I married and moved to New York City in 1978. I accidentally got an entry-level job in the publishing operation of the American Society of Mechanical Engineers. After I was there for a year or so, I realized it was interesting, and, to my surprise, I was good at it.

After 43 years, I feel extremely lucky to have stumbled into scholarly publishing, probably one of the few professions that I could have succeeded in. I worked at Springer-Verlag New York from 1986 to 1996, then at Wiley until 2015, and since then at CLOCKSS. I have been fortunate to have good bosses and colleagues. Scholarly publishing is a wonderful environment: literate, with an above-average level of humanity (i.e., a lot more good people than charlatans).

Q: You’re stepping down from CLOCKSS after a successful run as Executive Director, and you dealt with related issues at Wiley and Springer-Verlag. How has the archiving and production world changed during your career? How do the two intersect?

Van Dyck: When I started in 1978, “cold type” was in the process of replacing “hot type.” Things have moved on! Around the mid/late 1980s we began using only acid-free paper, at libraries’ insistence. This meant the print record would be preservable for hundreds of years. Thank you, librarians! Then, in the late ’90s, when it became clear that the e-versions of journals were becoming the primary version for end-users, librarians thought, “Wait a minute! We don’t own e-copies! How do we know that these e-publications will be available for the long term?” From this concern was born the idea of trusted third-party preservation service providers, such as “dark archives” like CLOCKSS and Portico. The fact that end-users would not access the content via CLOCKSS made publishers feel OK about allowing CLOCKSS to maintain copies of the publishers’ content. (CLOCKSS only provides access if a journal is otherwise unavailable on the Web, i.e., the publisher no longer makes it available, nor does anyone else.)

I would like to emphasize this point about archiving versus preservation: making a back-up copy or two is not preservation. Preservation means continuously ensuring that the digital bits are healthy, and not only that there is redundancy via multiple copies. It’s also essential to have the right technologies, organizations, governance systems, geographies, security, forward migration, and legacy plans in place. You can read more in a document about the “Levels of Preservation” from the National Digital Stewardship Alliance.

Q: Let’s pull back a little. Why is digital archiving so important? Why should publishers prioritize it as an activity?

Van Dyck: Scholars build their research on the work of their predecessors. Researchers and readers must be able to go back and look at the resources that an author has referred to. And of course authors want to know that their work will continue to be available. In the rare cases that online scholarly content disappears from the Web, then a service like CLOCKSS can step in to ensure the ongoing access to the material.

When e-archiving began in the early 2000s, publishers were somewhat resistant — providing their high-value resources to a preservation service seemed like a potential threat to their own presentation of the content, and it was another cost-of-doing-business. But today, publishers recognize and accept the value of long-term digital preservation. For one thing, it differentiates a proper publisher from one that might not be doing what they should be doing on behalf of the community.

CLOCKSS preserves a growing collection including 43 million journal articles, 250,000 books, and other elements of the scholarly publishing infrastructure such as software and the Crossref metadata database. We aim to preserve the DataCite metadata, and the ORCID public data file, later in 2021.

Q: CLOCKSS works as a coalition. Can you describe that? What are the strengths and weaknesses of this approach?

Van Dyck: The CLOCKSS Board is composed equally of libraries and publishers. The Board members work at the coalface of scholarly communications. They bring a keen sense of the issues confronting their organizations. The Board, all unpaid volunteers, considers CLOCKSS’s priorities and directions, taking into account both the library and publisher perspectives, and gives guidance and direction to the Executive Director. CLOCKSS is truly community-governed, which is a real differentiating strength.

That said, not everyone agrees on everything. But what about authors and readers? In effect, libraries and publishers represent the needs of authors and readers, which they are accustomed to doing in their “day jobs.”

Q: What’s the most surprising or unexpected thing that occurred during your time at CLOCKSS?

Van Dyck: The realization that some fully Open Access journals are more susceptible to disappearing. OA lowers the barriers to entry for journal publishing: e.g., the publisher need not sell subscriptions, nor keep track of who has access to what, nor maintain a log-on and access control system. Free, open source publishing software like Open Journal Systems also lowers the barrier to entry. As a result, many, many small OA journals have been started, often by non-publishing people like professors working as a one-person shop from their academic department, as a “labor of love.” This kind of journal might not enjoy the same kind of long-term commitment to its existence, and could be closed down if it does not gain traction in its community.

Keep in mind, also, that for APC journals, the economic incentive for OA publishing ends as soon as the processing fee is paid, unlike subscription journals, for which publishers can license access to “back files.” So, we find an increasing occurrence of vanishing OA journals, as Laakso et al. have pointed out. Often these journals lack the technical, financial, and organizational wherewithal to participate in a preservation system; thus, unfortunately, the scholarship published in such a journal may disappear if its editors or publisher cease to support it.

CLOCKSS is currently collaborating with the Directory of Open Access Journals (DOAJ), the Public Knowledge Project (PKP), Internet Archive, and ISSN International / Keepers Registry to improve the preservation of OA journals, as described in this announcement from World Digital Preservation Day in November 2020.

Q: CLOCKSS struck up some partnerships during your tenure. Can you describe these, and why they were viewed as important?

Van Dyck: CLOCKSS has several different kinds of partnerships. Of course all of our publisher participants and library supporters are partners with us. Our main partnership is with Stanford University Libraries. CLOCKSS is a stand-alone 501(c)(3) charitable organization, with an arm’s-length relationship with Stanford. But it is a short arm! We use the open source LOCKSS software, which was invented at Stanford Libraries by David Rosenthal and Vicky Reich about 20 years ago. We contract with the LOCKSS team within Stanford Libraries for technical and operational services. We also have partnerships with aggregators who represent multiple publishers to us, and with library consortia who represent multiple libraries to us. These relationships add efficiency for CLOCKSS, as a kind of one-stop shop. Finally, we are always looking for partners who can help us to preserve new forms of digital scholarship, such as software, or data sets, or dynamic/interactive research. We participate in Mellon grants that focus on the challenges of preserving dynamic web-based scholarly outputs, e.g., “digital humanities.”

Q: You’ve worked with and interacted with a number of publishers over the years. Is there something you wish more people knew about publishers?

Van Dyck: Yes. That — in addition to the joy of delivering high-value information to a community that is working to improve the world — publishers do a lot of boring, tedious work that no one else would want to do. Publishing is a publisher’s mission. For others, publishing might be a sidelight, or even just a hobby. The ecosystem of scholarly communications is stronger if there are robust, sustainable, professional organizations that are single-mindedly dedicated to publishing in the best ways possible, and investing in the future development of scholarly expression.

Here are examples of the boring, tedious stuff, which I’m listing here at the end of this answer in case readers want to skip it: copyright registration; format consistency; communications with sometimes recalcitrant authors, suppliers, partners, competitors, customers, administrative agencies, governments; chasing late payers; filing tax returns. Someone named Kent Anderson has compiled a pretty complete list.

Q: What do you think the future holds for digital archiving and scholarly publishing?

Van Dyck: The scholarly record is growing more diverse and dynamic — evolving as the Web evolves. We will find more ways to preserve these new forms of scholarly content, in sustainable and scalable ways. This is an exciting challenge — not only from a technical viewpoint, but also financial and organizational. With e-journals, we have mature and normalized publishing practices, and with e-books, while less so, still manageable. But as scholarship evolves, we confront many diverse approaches that will take years to resolve into a set of best practices. It will require a lot of collaboration — to share the costs and to maximize the benefits.

One of the most gratifying things is to see a new generation of professionals take up the challenges of digital preservation. A few years ago, I had the sense that we as a community were not growing the next generation of colleagues who commit some of their time to working at the industry level — collaborating with competitors and solutions providers, so there is a rising tide to lift all boats. But in the past few years, I have seen that there is such a new generation. This makes me confident that the professionalism of scholarly communication will continue to evolve as the needs of scholarship evolve.

Now that I am retiring, I am very pleased that Alicia Wise has been named as my successor. Alicia and I will work together closely during the transition. I am certain that CLOCKSS is poised to play an even stronger role in the community, with Alicia at the helm.

Finally, to end on a lighter note, and to bring us back to the beginning: When I told a colleague that my retirement plan is to resume being the slacker that I was before I started my career, he told me he has the same plan. I look forward to seeing him in the coffee shop!
