Native Cloud Formats
This entry covers all native cloud formats that exist within a cloud system but cannot be exported in their native format. The data for these formats is held within the system and they are rendered within a browser.
Examples
The most widely known examples are Google formats such as Google Docs, Sheets, Slides and Jamboard.
Hazards
Lack of skills, commitment or policy from corporate owners; dependence on proprietary products or formats; lack of export functionality; insufficient documentation; lack of conformance or validation; lack of preservation commitment or planning; inaccessibility to automated web crawlers; uncertainty over IPR or the presence of orphaned works.
Mitigations
Reduction of dependencies; improved export functionality; clear migration pathways; application of records management standards; version control; integrity checking; comprehensive documentation; access to web harvesting; technology watch.
Bit List History
Added to list: 2023
Last Review
This was a new Bit List entry added in 2023 to draw attention to the particular challenges of preserving native cloud content that cannot be exported and preserved in its native cloud format. There are some similarities with the ‘Cloud-based Services and Communications Platforms’ entry, notably the risks relating to dependencies on service and provider business models and the terms and conditions imposed. However, this entry focuses specifically on the distinct risks relating to the preservation of digital content and data in native cloud formats, which are held within cloud-based systems and rendered within web browsers. Currently, in order to view the files outside of the system, an export format has to be chosen (e.g., PDF, Microsoft Office, HTML). This makes it difficult to prove the integrity of the exports, as conventional methods (such as checksums) are not valid for comparing an export against the original held in the cloud system. There is also the issue that the original cloud formats hold all edits and versions, whereas the export may only preserve the current version of the file, without edit history and with misleading revision identifiers. As the cloud formats are browser-based, web archiving options have also been explored, but there is currently no automated way to harvest a large collection of files. For these reasons, major efforts are needed to develop new tools and techniques to capture and preserve the data to prevent or reduce loss.
The 2024 interim review concluded that these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).
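The integrity limitation noted in the 2023 entry can be illustrated with a minimal Python sketch. It assumes the exports (for example a PDF and a Microsoft Office file) have already been downloaded to local disk; the `source_id` value and the manifest field names are hypothetical and are not drawn from any standard or vendor API. The checksums recorded here fix each export from the point of capture onwards, but they cannot show that an export faithfully represents the native cloud object, which has no file-level checksum to compare against.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_export_manifest(source_id: str, exports: list[Path]) -> dict:
    """Describe a set of export files derived from one native cloud document.

    Each checksum supports fixity checking of that export after capture.
    It says nothing about fidelity to the original held in the cloud system.
    """
    return {
        # Hypothetical identifier for the native cloud document.
        "source_id": source_id,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "exports": [
            {
                "file": path.name,
                "format": path.suffix.lstrip("."),
                "sha256": sha256_of(path),
            }
            for path in exports
        ],
    }
```

Two exports of the same document in different formats will produce unrelated checksums, so a manifest like this documents what was captured and when rather than proving equivalence with the source.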
Case Studies & Examples
- Case studies demonstrating good progress in this area, for example from TNA and the University of Sheffield, were shared as part of a DPC event on 14th November 2023. See Does it have to be this hard? Preserving content from Microsoft 365 and Google Workspace, Beking, A., Hooper, B., Leming, R., Tilbury, J., Oeyen, Q., Young, P. and Gardner, R. (2023), Digital Preservation Coalition Event [accessed at 2024-09-06].
- One example, which is part of Google Workspace and illustrates how quickly things can become unsupported, is the announced closure of the ‘Jamboard’ collaborative online whiteboard platform, which will be discontinued from the end of 2024. See Google’s whiteboarding app is joining the graveyard, Shakir, U. (2023), The Verge [accessed at 2023-10-24].
- Google Jamboard is winding down, Google, Google Jamboard Help Center [accessed at 2023-10-24].
- How Can We Preserve Google Documents?, Mitcham, J. (2017), Digital Archiving at the University of York [accessed at 2023-10-24].
- Preserving Google Drive: What about Google Sheets?, Mitcham, J. (2017), Digital Archiving at the University of York [accessed at 2023-10-24].
- PDF/A and read-only in SharePoint, Pinsent, E. (2017), Digital Preservation at UoL.
- Forensic Analysis of Cloud-Native Artifacts, Roussev, V. and McCulley, S. (2016), Digital Investigation Vol. 16.
- How I Reverse Engineered Google Docs To Play Back Any Document’s Keystrokes, Somers, J. (2014) [accessed at 2023-10-24].
- What's up (with Google) Docs? – The Challenge of Native Cloud Formats, Young, P. (2021), Digital Preservation Coalition Blog [accessed at 2023-10-23].