Data Protection
Definition
In the context of research, data protection refers to the protection of personal research data from unauthorized or unlawful processing. This ensures the right to informational self-determination. Processing is defined as “any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction” (BlnDSG §31, 2020; EU GDPR Article 4 No. 2, 2016). Data processing therefore includes any form of handling personal data, from collection, organization, and storage to adaptation, retrieval, querying, provision, and deletion (BlnDSG, 2018).

Source: Data Protection, Anne Voigt with CoCoMaterial, 2023, licensed under CC BY-SA 4.0
Introduction
"German law supports the research interest by recognizing the fundamental freedom of science: 'Art and science, research and teaching shall be free' (Article 5, Paragraph 3, Sentence 1, GG). Researchers are – within certain limitations – generally free to determine their research subject and the methods they use. [...] However, the freedom of research may be restricted if it conflicts with other fundamental rights, such as the right to informational self-determination, which is regulated in data protection law."
(RatSWD, 2017, p. 14)
The right to informational self-determination ensures that every person fundamentally has control over the disclosure and use of their personal data (BVerfG, 1983), creating a tension between this right and the freedom of research. It is enshrined as a fundamental right in the European General Data Protection Regulation (EU-GDPR).
Accordingly, all personal data are considered particularly worthy of protection. These include: “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person” (EU-GDPR Article 4 No. 1, 2016). Additionally, special categories of personal data (also referred to as sensitive data) receive heightened protection. These include data revealing ethnic, social, or national origin, political, ideological, and religious beliefs, or sexual orientation, as well as health data and genetic and biometric data that can uniquely identify a person. The processing of such data is generally prohibited (EU-GDPR Article 9, Paragraph 1, 2016).
Data protection regulations, particularly those concerning the processing of personal data for research purposes, are specified in federal and state data protection laws. While the Federal Data Protection Act (BDSG) applies to federal public bodies and non-public entities (§ 1 (1) BDSG), state data protection laws govern public institutions at the state level. Consequently, the Berlin Data Protection Act (BlnDSG) applies to research projects conducted at Freie Universität Berlin.
Metschke and Wellbrock (2002) expand the concept of personal data (direct identification) by introducing another category: personally relatable data (indirect identification):
"Considering this definition, the scope of informational self-determination protection includes not only data concerning a specific (individualized) person but also individual data points that may not explicitly or immediately identify a person but enable their identification when combined with other information. These are referred to as individualizable or personally relatable data."
(Metschke & Wellbrock, 2002, p. 19).¹
Thus, personally relatable characteristics are indirectly identifying attributes – features that (often) only allow for identification when combined. To ensure adequate data protection, it is essential to consider the possibility of identifying individuals using contextual information. Since this type of information is not always immediately apparent, further investigation may be necessary to determine whether datasets require anonymization.
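One way to make such an investigation concrete is to count how many records share each combination of indirectly identifying attributes; combinations that occur only rarely point to individuals who could be singled out. The following Python sketch illustrates the idea under stated assumptions: the field names (`age_group`, `occupation`, `village`), the sample records, and the threshold `k` are hypothetical and do not come from the cited sources. A real assessment would also have to consider contextual knowledge that lies outside the dataset.

```python
from collections import Counter


def rare_combinations(records, quasi_identifiers, k=3):
    """Return attribute combinations that are shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical interview metadata (illustrative only, not from the source).
records = [
    {"age_group": "30-39", "occupation": "teacher", "village": "A"},
    {"age_group": "30-39", "occupation": "teacher", "village": "A"},
    {"age_group": "30-39", "occupation": "teacher", "village": "A"},
    {"age_group": "60-69", "occupation": "mayor", "village": "A"},
]

# Each attribute alone may look harmless; the combination can still be unique.
risky = rare_combinations(records, ["age_group", "occupation", "village"], k=3)
for combo, n in risky.items():
    print(combo, "occurs", n, "time(s)")
```

In this sketch, the village alone does not single anyone out, but the combination of age group, occupation, and village does for one record, which would then be a candidate for coarsening or removal.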
Motivation
In qualitative social sciences, research often involves sensitive data from study participants. Ethical research considerations and data protection regulations require that the identities of researched individuals be safeguarded. It is therefore crucial to address data protection as early as possible and repeatedly throughout the research process, developing strategies to protect personal and especially sensitive data of study participants.
If personal data is processed during research – meaning any form of handling personal data from collection to deletion – the general principles for processing personal data apply (§ 32 BlnDSG, 2018). These principles stipulate:
- The processing of (special categories of) personal data requires explicit consent from the affected person (see article on informed consent). Processing must be conducted only for specified and clear purposes, which should be defined as precisely as possible before data collection and documented within the research project and, if possible, in a consent form. Further processing steps must align with this purpose. If research purposes change or expand – for example, if new research questions arise during data analysis – new consent may need to be obtained.
- The extent of personal data processing must be proportionate to the purpose. This means collecting and processing only the minimum necessary personal data.
- Personal data must be accurate. Individuals have the right to request corrections to incorrect data (§ 44 BlnDSG, 2018).
- Researchers must ensure the security of personal data. This includes protection against unauthorized access or data loss, which can be achieved through secure server storage, backups, and access restrictions (see articles on data storage and data security).
- Personal data must be deleted once it has fulfilled its research purpose. Exceptions include data intended for reuse in future research, which must first be anonymized or pseudonymized unless explicit consent for storing and reusing non-anonymized data has been obtained (see articles on anonymization and pseudonymization).
Since written consent forms themselves contain personal data (such as a signature), they must be stored separately from the actual research data.
If obtaining consent is not possible, personal data must be prepared at the time of collection – e.g., through anonymization – to minimize the risk of identification. This presents an ongoing challenge: balancing the collection of comprehensive and precise data necessary for later analysis with ensuring the protection of study participants (RatSWD, 2020).
Methods
The following measures and activities may be relevant at different stages of the research project. The listed considerations can help ensure proper implementation of data protection (RatSWD, 2020, pp. 33; Meyermann & Porzelt, 2019, pp. 8).
Research Planning
- Define the research purpose as precisely as possible before data collection.
- Assess personal data considerations:
  - Will personal data be collected, and to what extent?
  - Will personal data be stored permanently, or will contact details be deleted after data analysis and data anonymized?
  - Can the collection and processing of personal data be avoided altogether?
  - How can processing be structured to minimize intrusion for the affected individuals?
- If personal data is processed:
  - Prepare an informed consent form (see article on informed consent).
  - Develop a plan for data storage, retention, and technical protection (including specifying retention periods) (see article on data storage).
Data Collection
- When possible and necessary: obtain informed consent in written form (for legal security) or verbally for recorded interviews (see article on informed consent).
- Collect only as much personal data as required for the research purpose (data minimization!).
- Ensure secure storage of collected data (see article on data storage).
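Data minimization can be supported technically by filtering records against an explicit list of fields that the research purpose requires. The following Python sketch shows one possible approach; the field names, the `REQUIRED_FIELDS` list, and the sample record are hypothetical and serve only as an illustration.

```python
def minimize(record, required_fields):
    """Keep only the fields needed for the stated research purpose."""
    return {field: record[field] for field in required_fields if field in record}


# Assumed for illustration: the fields the research purpose actually requires.
REQUIRED_FIELDS = ["interview_id", "age_group", "transcript"]

# Hypothetical raw record containing more personal data than necessary.
raw = {
    "interview_id": "I-01",
    "name": "Jane Doe",
    "phone": "000-0000",
    "age_group": "30-39",
    "transcript": "(interview text)",
}

minimal = minimize(raw, REQUIRED_FIELDS)
print(sorted(minimal))
```

Listing the required fields explicitly, rather than listing the fields to drop, means that any newly added field is excluded by default until it has been justified.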
Data Processing and Analysis
Note: As long as data processing and analysis serve the same research purpose as data collection, further processing steps are generally permitted. If new research goals or methods emerge during analysis, additional consent may be required.
- Store direct identifiers separately from research data (see article on data security).
- Implement an anonymization strategy to protect participants' identities: anonymize and/or pseudonymize data as early as possible (see article on anonymization and pseudonymization).
- Ensure secure data storage (including versioning and backups) (see article on data storage).
- Evaluate personal research data for its intended research purpose:
  - Has consent been obtained for additional purposes?
  - If not, data must be deleted after the designated period (see article on secure deletion in data security).
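Storing direct identifiers separately and pseudonymizing the research data can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the field names in `DIRECT_IDENTIFIERS`, the `P-` pseudonym format, and the sample participants are hypothetical. The sketch also only handles structured fields; free text such as transcripts may still contain identifying details and needs separate treatment.

```python
import secrets

# Assumed field names for directly identifying attributes (illustrative only).
DIRECT_IDENTIFIERS = ("name", "email")


def pseudonymize(records, direct_identifiers=DIRECT_IDENTIFIERS):
    """Split records into a key table and a pseudonymized research table."""
    key_table = {}
    research_data = []
    for record in records:
        # Random pseudonym that cannot be derived from the identifiers.
        pseudonym = "P-" + secrets.token_hex(4)
        while pseudonym in key_table:  # guard against an unlikely collision
            pseudonym = "P-" + secrets.token_hex(4)
        key_table[pseudonym] = {
            field: record[field] for field in direct_identifiers if field in record
        }
        cleaned = {
            field: value
            for field, value in record.items()
            if field not in direct_identifiers
        }
        cleaned["pseudonym"] = pseudonym
        research_data.append(cleaned)
    return key_table, research_data


# Hypothetical participants (not from the source).
participants = [
    {"name": "Jane Doe", "email": "jane@example.org", "age_group": "30-39"},
    {"name": "John Roe", "email": "john@example.org", "age_group": "40-49"},
]

key_table, research_data = pseudonymize(participants)
print(research_data)
```

The key table is the only link between pseudonyms and persons, so it would be kept in a separate, access-restricted location and deleted once re-identification is no longer needed.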
Data Publication
- The publication of personal data is only permitted with prior consent for publication.
- Otherwise, anonymized/pseudonymized data may only be published with explicit consent for publishing such anonymized/pseudonymized data (see articles on anonymization and pseudonymization and informed consent).
- Exception/special case: The publication of personal data may be permitted if it is essential for presenting research findings on historical events (RatSWD, 2020, p. 30).
Data Retention and Archiving
- Research data should generally be stored for at least 10 years after project completion (DFG, 2022).
- Data retention is only permissible if participants have consented to archiving in their consent form (see article on informed consent).
- Store and archive research data in secure facilities such as repositories and research data centers within or outside the institution (see article on archiving).
- Implement access and usage restrictions for potential future reuse.
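To keep track of retention periods, the end of the minimum storage period can be computed from the project completion date. The Python sketch below assumes the 10-year period mentioned above as a default; the function name `retention_end` and the example dates are hypothetical, and whether data may be retained at all still depends on the participants' consent.

```python
from datetime import date


def retention_end(project_end, retention_years=10):
    """Return the date on which the minimum retention period ends."""
    target_year = project_end.year + retention_years
    try:
        return project_end.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 28 February.
        return project_end.replace(year=target_year, day=28)


# Hypothetical project completion date.
print(retention_end(date(2023, 6, 30)))
```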
Data Reuse
- Assess whether data can be reused for research purposes beyond the original scope (purpose limitation).
- According to Article 5(1)(b), second clause of the GDPR (DSGVO), further processing of data for scientific research is not considered incompatible with the original purposes. This means reuse is permitted if a legal basis for further processing exists (RatSWD, 2020, p. 32).
Notes
- ¹ Translated by Saskia Köbschall.
Literature and References
Berliner Datenschutzgesetz (BlnDSG, 2018). Gesetz zum Schutz personenbezogener Daten in der Berliner Verwaltung (Berliner Datenschutzgesetz – BlnDSG) vom 13. Juni 2018. Berliner Vorschriften- & Rechtsprechungsdatenbank. https://gesetze.berlin.de/bsbe/document/jlr-DSGBE2018V1IVZ
Bundesverfassungsgericht. (BVerfG, 1983). Leitsätze zum Urteil des Ersten Senats vom 15. Dezember 1983. Bundesverfassungsgericht (BVerfG). https://www.bverfg.de/e/rs19831215_1bvr020983.html
Deutsche Forschungsgemeinschaft. (DFG, 2022). Leitlinien zur Sicherung guter wissenschaftlicher Praxis. Kodex. https://doi.org/10.5281/zenodo.6472827
Europäische Datenschutz-Grundverordnung. (EU-DSGVO, 2016). Verordnung (EU) 2016/679 des Europäischen Parlaments und des Rates vom 27. April 2016. intersoft consulting. https://dsgvo-gesetz.de
Metschke, R. & Wellbrock, R. (2002). Datenschutz in Wissenschaft und Forschung. Berliner Beauftragter für Datenschutz und Informationsfreiheit. https://www.hu-berlin.de/de/datenschutz/einwilligung/datenschutz-in-wissenschaft-und-forschung
Meyermann, A. & Porzelt, M. (2019). Datenschutzrechtliche Anforderungen in der empirischen Bildungsforschung. Eine Handreichung. Frankfurt am Main: DIPF | Leibniz-Institut für Bildungsforschung und Bildungsinformation. https://doi.org/10.25656/01:21990
Rat für Sozial- und Wirtschaftsdaten. (RatSWD, 2017). Forschungsethische Grundsätze und Prüfverfahren in den Sozial- und Wirtschaftswissenschaften. RatSWD Output, 9(5). https://doi.org/10.17620/02671.1
Rat für Sozial- und Wirtschaftsdaten. (RatSWD, 2020). Handreichung Datenschutz (2nd ed.). RatSWD Output, 8(6). https://doi.org/10.17620/02671.50
Additional Literature
Kienbaum, J., Fischer, P. & Paßmann, S. (2023). Forschungsdatenmanagement bei personenbezogenen Daten – eine Handreichung. Zenodo. https://doi.org/10.5281/zenodo.7428524
Citation
Voigt, A. (2023). Data Protection. In Data Affairs. Data Management in Ethnographic Research. SFB 1171 and Center for Digital Systems, Freie Universität Berlin. https://en.data-affairs.affective-societies.de/article/dataprotection/