Published Research Papers
Completed research papers published in serials, monographs or theses which fall under specific collecting policies of research libraries or archives and are managed through dedicated repository infrastructures.
Examples
Published research papers in scholarly E-Books and Electronic Journals; Electronic theses (E-theses)
Hazards
Lack of skills, commitment or policy from publishers; uncertainty over IPR or the presence of orphaned work; embedded complex objects; unstable funding for repository; lack of strategic investment; complex external dependencies; lack of persistent identifiers; bespoke formats; lack of legal deposit mandate
Mitigations
Strong documentation including intellectual property rights; clarity of preservation path and ensuing responsibilities; credible preservation plan; proven capacity of repository; legal deposit preservation copying; post-cancellation access service; persistent identifiers used consistently; non-proprietary formats used and validated; minimal or well managed external dependencies.
Bit List History
Added to list: 2017
Last Review
2023 Review
This entry was added in 2017 under ‘Published research outputs,’ though without reference to the capacity of the repository infrastructure. The 2019 Jury amended it to presume the existence of repository infrastructure and noted that the aggravating conditions (which introduce risks) and good practice enhancements (which reduce them) are most relevant to repository operations.
While the 2020 Jury found no change in trend, the 2021 Jury agreed it should remain Vulnerable and discussed improvements and initiatives towards the preservation of research data and outputs, pointing to a 2021 trend towards reduced risk. The 2022 Taskforce agreed risks were on the same basis as before (no change to the trend).
The 2023 Council agreed with the Vulnerable classification and found that risks remained on the same basis as before (‘No change’ to trend), also noting a slight decrease in imminence of action with no significant trends towards greater or reduced risk. Additionally, the 2023 Council recommended that a nomination received for a new ‘E-theses’ entry would be more valuable as an example within this entry than as a new, standalone entry. The 2023 Council recognized that further scoping and input are needed for this entry and recommended that the next major review revisit and restructure it.
2024 Interim Review
These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).
Additional Information
The 2023 nomination for E-theses highlights distinct risks tied to these digital published materials. E-theses tend to be sole documents. Once published by a university they may be harvested into other aggregators or resources, but in many cases the only copy (with no physical/analogue copy) sits on an institution’s repository. In addition, many are deposited as PDF files of many varieties, often with no attempt to use PDF/A or a similar preservation profile, which puts long-term accessibility and re-use at risk. However, the breadth of risks goes beyond the PDF varieties alone, as e-theses often include databases, audiovisual materials, websites, and more.
The loss of tools, data or services within this group would have an impact on people and sectors around the world, particularly those involved with reproducibility and those wishing to use the datasets for further research.
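As an illustration of the format risk described for e-theses, a repository could make a first-pass triage of a deposit by separating PDFs that at least claim PDF/A conformance from those that do not, and from everything else (databases, audiovisual files and so on). The Python sketch below is a minimal example under stated assumptions: the function names and report categories are hypothetical, and it only looks for the `pdfaid:part` property in a file's XMP metadata, assuming that metadata is stored uncompressed. A declared PDF/A part is a claim, not proof of conformance, so a dedicated validator such as veraPDF would still be needed.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional

# Matches the XMP PDF/A identification property in either serialisation:
#   <pdfaid:part>1</pdfaid:part>   or   pdfaid:part="1"
PDFA_PART = re.compile(rb"pdfaid:part\s*(?:>|=\s*[\"'])\s*(\d)")


def declared_pdfa_part(data: bytes) -> Optional[int]:
    """Return the PDF/A part a file claims in its XMP metadata, or None.

    This detects a declaration only; it does not validate the file.
    """
    match = PDFA_PART.search(data)
    return int(match.group(1)) if match else None


def triage_deposit(folder: Path) -> Dict[str, List[str]]:
    """Group the files of one deposit into three hypothetical categories."""
    report: Dict[str, List[str]] = {
        "pdfa_claimed": [],   # PDFs declaring a PDF/A part
        "pdf_no_claim": [],   # PDFs with no PDF/A declaration found
        "other": [],          # everything else, e.g. data or audiovisual files
    }
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        name = path.relative_to(folder).as_posix()
        if path.suffix.lower() == ".pdf":
            part = declared_pdfa_part(path.read_bytes())
            key = "pdfa_claimed" if part is not None else "pdf_no_claim"
            report[key].append(name)
        else:
            report["other"].append(name)
    return report
```

Calling `triage_deposit(Path("deposit_folder"))` on a local copy of a deposit (the folder name is a placeholder) returns the three lists, which could feed a report of items needing closer assessment. Reading each file fully into memory keeps the sketch short; very large files would call for a chunked scan instead.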
Although there have been improvements in current practice, policies and workflows, there is still a significant corpus of information that was deposited before these improvements came into force. It is unlikely that there will be the time, will or resources to bring this information up to current standards.
Case Studies & Examples
- A recent analysis by Martin Eve of CrossRef shows scholarly content at risk. The findings, based on an assessment of around 7.5 million e-books and articles for which CrossRef provides a fixed identifier, or Digital Object Identifier (DOI), suggest that around a quarter of academic publications are not being preserved for the future. For c. 2 million articles in the study there was no evidence of preservation, while around 4.3 million of the works studied were preserved in at least one place. See Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles, Eve, M. P. (2024), Journal of Librarianship and Scholarly Communication.
- Breaking down barriers in e-only thesis submission: how digital preservation contributes to the conversation at the University of Glasgow, Konstantelos, L. (2021), Digital Preservation Coalition Blog [accessed at 2023-10-24].
- From “research output” to “research data” - a willingness to move forward?, Klungthanaboon, W. (2021), Digital Preservation Coalition Blog [accessed at 2023-10-24].
- Preservation, Trust and Continuing Access for E-Journals, Beagrie, N. (2013), Digital Preservation Coalition.
- Preserving E-Books, Morrissey, S. and Kirchhoff, A. (2014), Digital Preservation Coalition.
- Resources and recent outputs from the Public Knowledge Project (PKP) Preservation Network, which was developed to digitally preserve Open Journal Systems (OJS) journals. See PKP Preservation Network, Public Knowledge Project [accessed at 2023-10-24].