Endangered

Custom Online Databases

Data collected, presented, and disseminated in custom online databases and not stored elsewhere, particularly where the data is locked in the database because no export or harvest options are available.

Databases Research Outputs Web

Examples

Custom databases created for project websites for research and citizen science

Imminence

3/5
Action is recommended within three years, detailed assessment within one year.

Effort

2/5
It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques.

Hazards

Lack of export options or functionality; lack of system maintenance; expired domain; lack of technical knowledge and skills; limited or dysfunctional data management planning; web capture challenges that make the content unlikely to be picked up by automatic crawlers; uncertainty over IPR or the presence of orphaned works

Organisational Change

Mitigations

Backup and documentation; preservation capability in designated repository; use of open formats and open source or other licensing that enables preservation; enabled export options; robust data management planning; documented and managed professionally
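
As an illustration of the "enabled export options" and "open formats" mitigations, the following is a minimal sketch of a bulk export to CSV. It assumes, purely for illustration, that the database is SQLite and reachable through Python's standard library; the function name export_tables_to_csv and the output layout (one file per table) are hypothetical choices, not a method described in this entry. Other database systems would need their own driver and catalogue query.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(conn: sqlite3.Connection, out_dir: Path) -> list[Path]:
    """Write every user table in a SQLite database to its own CSV file."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # sqlite_master lists all schema objects; skip SQLite's internal tables.
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    tables = [name for name in names if not name.startswith("sqlite_")]

    written = []
    for table in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        path = out_dir / f"{table}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # Header row taken from the column names reported by the cursor.
            writer.writerow([column[0] for column in cursor.description])
            writer.writerows(cursor)
        written.append(path)
    return written
```

A CSV dump of this kind keeps the data re-usable but not the schema, relationships, or web presentation, so it would complement rather than replace documentation and deposit in a designated repository.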

Bit List History

Added to list: 2023
2024: No change.

Last Review

2023 Review

This was a new Bit List entry nominated and approved by the 2023 Council to draw attention to the particular challenges of preserving custom online databases. The entry focuses on distinct risks relating to online databases that cannot be captured with traditional web archiving tools. While there are challenges to preserving databases both offline and online, it was nominated in the context of projects that set up a custom online database to record, present, and disseminate collected data, where that data is not stored elsewhere (e.g. in a long-term digital archive) and is often locked in the database because no export or harvest options are available. Identified areas of risk for these online databases include: maintenance of the system is not ensured after the end of a project, and online databases disappear because of security issues or because the domain expires; not all data is open and, after the end of a project, no one is responsible for granting access; the data is not stored elsewhere (e.g. in a trusted repository); and the data is locked in and cannot be exported (e.g. as CSV) for further re-use.

The nomination of the entry also highlighted a gap in the Bit List for databases more broadly. The 2023 Council agreed that a new higher-level Databases category should be created to address this gap, inviting nominations for other database-related entries to be considered for the next major revision of the Bit List.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Information

Preservation is highly dependent on the software used but, whatever the software, a database starts to become vulnerable once the project has reached its end.

Often, the online databases are of interest to a small, sub-discipline-specific group of people, e.g. archaeologists specialising in cuneiform tablets. For this group, however, the material is often invaluable because of the great effort invested in compiling it.

Databases for citizen science provide a further example: they are distinctive because contributors upload information directly into the database.

Emulation can be used to preserve these databases. For example, Yale University is preserving databases, especially SQL databases for websites, using EaaSI. There are technical challenges, but the databases can be preserved; the issues found are often around access to data and workforce development of the technical skills needed to undertake preservation actions. There is a risk, however, that some of the databases cannot be exposed to the web as they have no survival time and/or cannot be made available as they were intended to be used.

Case Studies & Examples

