Intro

What are Preprints?

Preprints are preliminary versions, or manuscript versions, of scholarly works – especially journal articles – that are made available to the (professional) public. As a rule, they are non-peer-reviewed versions whose public release primarily serves to expedite the sharing of research findings. Preprints are made freely available to the public on preprint servers, thereby also making an important contribution to green open access.

The key takeaways from this article are:

1. Preprints are early versions of scholarly publications that are made available to the (professional) public – as a rule without prior peer review.

2. Preprints, which are particularly common in physics and other natural sciences, are made freely available to the public on preprint servers and make an important contribution to green open access.

3. Preprint servers enable the publishing of so-called overlay journals – that is, online journals that use the infrastructure of a preprint platform for submissions and publication.

Characteristics of Preprints

Many months may elapse between the submission of a manuscript and its publication in a journal. According to one study (Huisman & Smits, 2017), the average duration of the peer-review process is 17 weeks. In addition, time is needed for typesetting and, if applicable, copy-editing and printing. In order to expedite the dissemination of research findings, preprints are shared in many disciplines.

However, cultures differ greatly across disciplines. In physics, preprints are an indispensable communication tool – depending on the subfield, preliminary versions of over 90% of all journal articles are made available to the public on the preprint server arXiv prior to publication of the final version (Gentil-Beccot et al., 2009). By contrast, authors in other disciplines have not yet adopted a comparable preprint culture. One possible reason is that not all journals in these disciplines accept manuscripts that have previously been posted as preprints.

As a rule, preprints have not yet undergone any scholarly quality assurance processes. Substantively, they may therefore differ considerably from the published versions, or they may not even be accepted for publication. While scholars and scientists are aware of this, problems may arise if, for example, the media pick up findings from preprints and disseminate them to a large audience without indicating that they are preliminary and have not been peer reviewed. To enable non-scientific readers to better understand the nature of preprints, medical preprint servers display caution statements.

Preprint Servers

There are numerous preprint servers. The following figure shows some commonly used ones:

In principle, preprints can be disseminated through all possible channels, for example, email, institutional repositories, or research institution websites. As a rule, they are uploaded to preprint servers, which are usually dedicated to one or more specialist fields. Although no peer review takes place on these platforms, uploaded publications usually undergo a basic screening process to ensure that they are scientific and fit the disciplinary scope of the server. Preprint servers have often emerged from and are financed by scientific communities. Some are operated by research institutions or professional societies. Only rarely are they operated by publishers.

The oldest and largest preprint server is arXiv, which was founded in 1991. It is an open access repository for scholarly articles in eight subject areas, including physics, mathematics, and computer science. As of April 2021, arXiv hosted over 1.8 million documents. Other important preprint platforms are bioRxiv (biology), SocArXiv (social sciences), EarthArXiv (geosciences), medRxiv (medicine), and ChemRxiv (chemistry).

Functionalities differ across platforms. As a rule, revised versions of papers can be uploaded – for example, after acceptance by a journal – and a link can be provided to the final published version (version of record). Some platforms also offer a comments function so that authors can receive feedback, and findings can be discussed in the community.

History

As early as the 1960s, there was a system in biology whereby a central registry reproduced reports of current research findings – especially in the form of preprints – and circulated them by postal mail (Cobb, 2017). A similar system also existed in the field of high-energy physics (Gentil-Beccot et al., 2009). With the advent of the Internet, the sharing of manuscripts became much easier, as they could be circulated via email. In 1991, Paul Ginsparg founded arXiv at Los Alamos National Laboratory as a server for preprints from high-energy physics (Ginsparg, 2017). The manuscripts were thus freely available to all interested parties. The scope of arXiv was later extended to include other areas of physics, as well as mathematics, computer science, and neighbouring fields such as quantitative finance and quantitative biology.

It was only after the turn of the century that other fields followed arXiv’s lead. For example, in 2013, bioRxiv was launched as a preprint server for biology, thereby making preprint sharing more popular in the life sciences. The initiative ASAPbio advocates the use of preprints in biology. Boosted by the success of the Open Science Framework (OSF) – an open source infrastructure that, inter alia, enables preprints to be uploaded and searched for – there has been a veritable boom in preprint servers for the most diverse subject areas since 2016. Nonetheless, the acceptance of preprints and the willingness to share them vary greatly across disciplines. During the COVID-19 pandemic, preprints played an important role in the rapid dissemination, checking, and correction of research findings on the SARS-CoV-2 virus (Fraser et al., 2020; Gianola et al., 2020).

This chart on preprints in the life sciences illustrates how strongly the importance of preprints has increased in recent years:

Preprints and Open Science

In contrast to journal articles, which are often behind paywalls, preprints are freely accessible in the long term and are thus a key component of green open access. When a journal article is not freely accessible, the preprint version can be found via services like Unpaywall, provided it is linked to the publisher’s version (the version of record). Preprints also make the process of gaining and communicating scientific knowledge more transparent. They enable critical discussion of works in the community before final publication, so authors can receive and take into account feedback in addition to the actual peer reviews. Preprints thus form the basis for informal or formalised open peer review (Frick, 2020).
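As a rough illustration of how a freely accessible preprint version might be located for a paywalled article, the following minimal Python sketch filters a record shaped like a response from the Unpaywall API. The field names (oa_locations, host_type, version, url) and the value "submittedVersion" are assumptions based on Unpaywall's published data format and should be checked against its current documentation. The sample record, its DOI, and its URLs are hypothetical, and no network request is made.

```python
from typing import Any


def preprint_urls(record: dict[str, Any]) -> list[str]:
    """Return URLs of repository-hosted, not-yet-reviewed versions.

    Assumes an Unpaywall-style record in which each open access location
    carries a 'host_type', a 'version', and a 'url' field.
    """
    urls = []
    for location in record.get("oa_locations") or []:
        is_repository = location.get("host_type") == "repository"
        is_preprint = location.get("version") == "submittedVersion"
        url = location.get("url")
        if is_repository and is_preprint and url:
            urls.append(url)
    return urls


# Hypothetical record for illustration only; not real data.
SAMPLE_RECORD = {
    "doi": "10.1234/example",
    "oa_locations": [
        {
            "host_type": "publisher",
            "version": "publishedVersion",
            "url": "https://publisher.example.org/article/1",
        },
        {
            "host_type": "repository",
            "version": "submittedVersion",
            "url": "https://repository.example.org/preprint/1",
        },
    ],
}

if __name__ == "__main__":
    for url in preprint_urls(SAMPLE_RECORD):
        print(url)
```

Filtering on the version field rather than on the host alone matters here, because repositories may also hold accepted manuscripts that have already passed peer review.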

Overlay Journals

Preprint servers enable the publishing of so-called overlay journals (Gowers, 2015) – that is, online journals that use the infrastructure of a preprint platform, especially arXiv, for submissions and publication. Authors post their papers on arXiv and submit a link to the arXiv preprint to the journal in question, which then organises the review process. If the paper is accepted, it is included in the journal – that is, it is linked from the journal website, assigned a DOI, and possibly accompanied by a comment from the editor. In this way, the advantages of arXiv (cost-effective, freely accessible, well-known infrastructure) are combined with those of a journal (peer review, reputation, possibly an Impact Factor). Successful examples of overlay journals are SIGMA (Symmetry, Integrability and Geometry: Methods and Applications), Logical Methods in Computer Science (LMCS), Discrete Analysis, and Quantum.
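The workflow described above can be sketched as a small data model. This is an illustrative sketch only, not the implementation of any actual overlay journal: the class, function, and status names are hypothetical, as are the placeholder arXiv identifier and DOI used in the example. The only external convention assumed is that an arXiv abstract page is addressed as https://arxiv.org/abs/ followed by the identifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Submission:
    """A paper submitted to a hypothetical overlay journal by arXiv link."""
    arxiv_id: str
    status: str = "submitted"  # submitted -> under_review -> accepted
    doi: Optional[str] = None
    editor_comment: Optional[str] = None

    @property
    def preprint_url(self) -> str:
        # The journal stores only a link; the paper itself stays on arXiv.
        return f"https://arxiv.org/abs/{self.arxiv_id}"


def start_review(sub: Submission) -> None:
    """The journal organises the review process for a submitted link."""
    if sub.status != "submitted":
        raise ValueError("only newly submitted papers can enter review")
    sub.status = "under_review"


def accept(sub: Submission, doi: str, editor_comment: Optional[str] = None) -> None:
    """On acceptance, assign a DOI and optionally an editor's comment."""
    if sub.status != "under_review":
        raise ValueError("only papers under review can be accepted")
    sub.status = "accepted"
    sub.doi = doi
    sub.editor_comment = editor_comment


def journal_entry(sub: Submission) -> dict:
    """Build the journal website entry that links to the arXiv preprint."""
    if sub.status != "accepted":
        raise ValueError("only accepted papers appear in the journal")
    return {
        "doi": sub.doi,
        "link": sub.preprint_url,
        "editor_comment": sub.editor_comment,
    }


if __name__ == "__main__":
    # Placeholder identifier and DOI, for illustration only.
    paper = Submission(arxiv_id="0000.00000")
    start_review(paper)
    accept(paper, doi="10.1234/overlay.example.1", editor_comment="Editor's note")
    print(journal_entry(paper))
```

The point of the design is that the journal record holds no copy of the paper: it adds the review outcome, DOI, and editorial comment on top of a link to the preprint server.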

References

Further Reading

  • Ettinger, C., Sadanandappa, M. K., Görgülü, K., Coghlan, K., Hallenbeck, K. K., & Puebla, I. (2022). A guide to preprinting for Early Career Researchers [OSF Preprints]. https://doi.org/10.31219/osf.io/e59tk