Learning unit: Reuse of Research Data

Introduction

The reuse of research data is not a fundamentally new concept; it has always been part of scientific work. However, digitalization has opened new pathways. For example, the goal of the Human Relations Area Files (HRAF), established in the USA in 1949 as a non-profit consortium of universities, colleges, libraries, and research institutions, was to collect (already published) ethnographic texts, images, and later films, categorize them thematically and regionally, and make them available for cross-cultural comparative studies. Originally, these texts and images were stored on microfiche, which required specialized magnifying devices for reading. This technology significantly facilitated material collection for comparative studies and secondary analyses. Today, HRAF documents are digitally accessible and the collection is continually expanded (see website: https://hraf.yale.edu/).

With the advancement of digitalization and the push for Open Science, the reuse of research data has become increasingly central. Research data from publicly funded projects are considered a public good and, in digital form, should ideally be openly accessible and usable. In the best case, data are available free of charge (Open Access) and openly licensed, allowing reuse without the need to contact the original data providers for permission. This requires a reciprocal relationship of trust between data providers and data users. Data providers should prepare their data meticulously, adhering to data protection laws and research ethics, and carefully assess the potential for third-party reuse (see articles on Archiving, Data Protection, Informed Consent). When dealing with sensitive qualitative data, open access may not be possible; in such cases, agreements should clearly outline how the data can be used. Data users, in turn, should handle the data respectfully, acknowledge the original data providers through proper citation, and strictly avoid any form of data misuse (RatSWD, 2023, p. 33; DGfE, 2020, p. 4).

Digitalization: Digital data are created through digitalization, which involves converting analog materials into formats suitable for electronic storage on digital media. Digital data offer the advantage of being easily and accurately duplicated, shared, and machine-processed.

Open Science: “Open Science encompasses strategies and practices aimed at making all components of the scientific process openly accessible and reusable on the internet. This approach is intended to open up new possibilities for science, society, and industry in handling scientific knowledge” (AG Open Science, 2014, translation by Saskia Köbschall).

Open Access: Open Access refers to free-of-charge, unrestricted, and barrier-free access to scientific knowledge and materials. For third parties to reuse these materials legally, the creators must grant usage rights through a licensing agreement. Free CC licenses, for example, specify exactly how data and materials may be reused.

Open license: In a license agreement or through an open license, copyright holders specify how and under what conditions their copyrighted work may be used and/or exploited by third parties.

A responsible and reflective approach to existing data, one that considers both the data providers and the third parties involved, is essential for re-analysis. To avoid arbitrariness and misinterpretation, data users should thoroughly engage with the contextual information (see article on Data Documentation) and understand the background of the original research, as well as the nature and characteristics of the data. Additionally, users should reflect on their own (ethnographic) positions and perspectives and incorporate these into their new analyses and arguments (Huber, 2019, p. 8). Reusing personal and sensitive data, however, presents significant data protection challenges.

Personal data: “any information relating to an identified or identifiable natural person (data subject); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier, or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural, or social identity of that natural person (…)” (EU GDPR Article 4 No. 1, 2016; BDSG §46 para. 1, 2018; BlnDSG §31, 2020).

Sensitive data: Within the category of personal data, there is a subset known as special categories of personal data. Their definition originates from Article 9(1) of the EU GDPR (2016).