Custom Online Databases
Data collected, presented and disseminated in custom online databases and not stored elsewhere, particularly data that is at risk because it is locked in the database, with no export or harvest options available.
Databases Research Outputs Web
Examples
Custom databases created for project websites for research or citizen science
Hazards
Lack of export options or functionality; lack of system maintenance; expired domain; lack of technical knowledge and skills; limited or dysfunctional data management planning; web capture challenges that mean the database is unlikely to be picked up by automatic crawlers; uncertainty over IPR or the presence of orphaned works
Organisational Change
Mitigations
Backup and documentation; preservation capability in designated repository; use of open formats and open source or other licensing that enables preservation; enabled export options; robust data management planning; professional documentation and management
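As an illustration of the "enabled export options" and "open formats" mitigations, the following is a minimal sketch only. It assumes the project database is available as a SQLite file; the function name export_tables_to_csv and the file paths are hypothetical, and other database systems would need their own drivers. It writes each table to a CSV file and records a SHA-256 checksum for each file so a repository can later verify the export.

```python
import csv
import hashlib
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in a SQLite file to <out_dir>/<table>.csv.

    Returns {csv_filename: sha256_hex} so the export can be verified later.
    Assumes table names are safe to use as file names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    checksums = {}
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        # Skip SQLite's internal bookkeeping tables.
        tables = [n for n in names if not n.startswith("sqlite_")]
        for table in tables:
            # Quote the identifier; names come from the database catalogue.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            target = out / f"{table}.csv"
            with target.open("w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                # Header row taken from the column names of the query.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            checksums[target.name] = hashlib.sha256(
                target.read_bytes()
            ).hexdigest()
    finally:
        conn.close()
    return checksums
```

A call such as export_tables_to_csv("project.db", "export") would produce one CSV file per table. CSV does not carry column types, keys or relationships between tables, so an export like this would still need the accompanying documentation listed above.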
Bit List History
Added to list: 2023
Last Review
2023 Review
This was a new Bit List entry, nominated and approved by the 2023 Council to draw attention to the particular challenges of preserving custom online databases. The entry focuses on the distinct risks relating to online databases that cannot be captured by traditional web archiving tools. While there are challenges to preserving databases both offline and online, the entry was nominated in the context of projects that set up a custom online database to record, present, and disseminate collected data, where this data is not stored elsewhere (e.g. in a long-term digital archive) and is often locked in the database because no export or harvest options are available. Identified areas of risk for these online databases include: maintenance of the system is not ensured after the end of a project, and online databases disappear because of security issues or because the domain expires; not all data is open and, after the end of a project, no one is responsible for granting access; the data is not stored elsewhere (e.g. in a trusted repository); and the data is locked in and cannot be exported (e.g. as CSV) for further re-use.
The nomination of the entry also highlighted a gap in the Bit List for databases more broadly. The 2023 Council agreed that a new higher-level Databases category should be created to address this gap, inviting nominations for other database-related entries to be considered for the next major revision of the Bit List.
2024 Interim Review
These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).
Additional Information
Preservation is highly dependent on the software used but, whatever the software, a database starts to become vulnerable once its project has reached its end.
Often, these online databases are of interest only to a small, sub-discipline-specific group of people, e.g. archaeologists specialising in cuneiform tablets. For that group, however, the material is often invaluable because of the great effort invested in compiling it.
Databases for citizen science are another example: here, information is uploaded directly into the database, which makes its content distinctive.
Emulation can be used to preserve these databases. For example, Yale University is preserving databases, especially SQL databases for websites, using EAASI. There are technical challenges, but the databases can be preserved; the issues found are often around access to data and workforce development of the technical skills needed to undertake preservation actions. There is a risk, however, that some of the databases cannot be exposed to the web because they have no survival time, and/or cannot be made available as they were intended to be used.
Case Studies & Examples
- Academics Retire and Servers Die: Adventures in the Hosting and Storage of Digital Humanities Projects, Cummings, J. (2019), Digital Humanities Quarterly [accessed 2023-10-24].