Data Storage and Management

From GRDI2020

This is a GRDI challenge.


Introduction

Data storage and management aims to provide a trustworthy environment in which data are stored according to requirements along the following dimensions:

  • time - temporary, for a fixed term, or perpetually
  • security - open content vs. defined access restrictions
  • bit-preservation - disaster recovery (e.g. a data center burns down, but the data are replicated to multiple geographic locations) and precautions against bit-rot (e.g. data are migrated to new storage media every few years)
  • accessibility/usability - ranging from data that must be fetched from tape backup with proprietary command-line tools to data that are available within a few seconds via a RESTful API. Representational State Transfer (REST) is a style of software architecture for distributed hypermedia systems, such as the World Wide Web.
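
The bit-preservation point above (replication plus precautions against bit-rot) is commonly backed by periodic fixity checks. The following is a minimal sketch, not a description of any particular data centre's practice. It assumes that replicas are reachable as local file paths and that the copy held by a strict majority of replicas is the intact one. The function names sha256_of and find_damaged_replicas are hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_replicas(replicas: list[Path]) -> list[Path]:
    """Return the replicas whose digest differs from the majority digest.

    Assumption: the digest shared by a strict majority is the intact one.
    Raises ValueError when no digest has a strict majority.
    """
    if not replicas:
        return []
    digests = {path: sha256_of(path) for path in replicas}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority digest; manual inspection needed")
    return [path for path, digest in digests.items() if digest != winner]
```

A damaged replica reported by such a check would then be overwritten from an intact copy. In practice, a digest recorded at deposit time is a stronger reference than a majority vote, because it also catches the case where most copies are damaged in the same way.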

Issues of data storage and management include technological issues (hardware- and software-related) as well as organisational issues (e.g., physical access control to servers). Data centres have taken on the responsibility for addressing these issues.

10-year vision

Ten years from now, there will be a distributed infrastructure for data storage that ensures the retention of bit-streams for defined periods of time (e.g. 10 years for good scientific practice or legal reasons). This service is designed and operated in a trustworthy way (i.e. independent of technological and organisational risks; cf. Bit Preservation), and supports the particular access patterns of its dedicated community (cf. the OAIS). From a user perspective, the bit-preservation service is usable and can be embedded in (existing) application environments. From an infrastructure perspective, data centres cooperate across institutional and national borders to make this work. From a policy and funder perspective, a sustainable business model for the bit-preservation service is in place, and a response team (at ministerial level) has been established to rescue the data from those institutions (e.g., national data centres) that fail despite all technological and organisational precautions.
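
The notion of retaining bit-streams for a defined period (a fixed term such as 10 years, or perpetually) can be made concrete with a small sketch. This is an illustration under stated assumptions, not part of the GRDI recommendations: the class RetentionPolicy and its methods are hypothetical, terms are counted in whole calendar years from the deposit date, and a deposit on 29 February expires on 28 February when the target year is not a leap year.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical retention rule: a term in years, or None for perpetual."""

    years: Optional[int]

    def expires_on(self, deposited: date) -> Optional[date]:
        """Return the first day on which retention is no longer required."""
        if self.years is None:
            return None  # perpetual retention never expires
        target_year = deposited.year + self.years
        try:
            return deposited.replace(year=target_year)
        except ValueError:
            # 29 February deposit, non-leap target year: use 28 February.
            return deposited.replace(year=target_year, day=28)

    def must_retain(self, deposited: date, today: date) -> bool:
        """Return True while the bit-stream still has to be kept."""
        expiry = self.expires_on(deposited)
        return expiry is None or today < expiry
```

A storage service could evaluate such a policy before any deletion or migration step, so that data under a running term are never discarded by routine housekeeping.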

Challenges and Recommendations

Challenges in data storage are closely linked to other topics, including Data Curation and Preservation, Data Use - Virtual Research Environments (for embedding into user environments), and Funding, Sustainability and Governance. The following recommendations therefore touch on all these related topics, although they are written from the perspective of data storage.

The more technological aspects of these recommendations could, in principle, be delivered almost immediately. However, structural and financial questions (such as developing a bit-preservation service for trusted data retention) require the concerted effort of data centres and funding institutions. They may also require several years of fine-tuning the structure and business model of such a service.
