Methods: Data Management Plan
In principle, every researcher can draw up an individually formulated data management plan (DMP). A "good" DMP should address technical, organizational, structural, legal, and ethical aspects, as well as sustainability. The scope of a DMP may vary depending on the research project.
A data management plan ideally includes the following elements (Helbig et al., 2020). The first five should always be considered, even for empirical final theses. Points six to nine are primarily intended for larger projects where data reuse is planned from the outset.
- Project Information: Description of the project’s content, administrative project details, funding institutions, project participants, and responsibilities.
- Relevant Guidelines, Recommendations, and Third-Party Requirements: e.g., data handling requirements from professional associations or universities (if available).
- Description of Planned Data Collection Methods and Resulting Data Types and Formats: e.g., field notes, observation protocols, recorded interviews, photos, films.
- Data Storage, Security, and Organization: Type and location of storage, backup routines, data exchange, and measures to prevent data loss.
- Ethical and Legal Aspects: Handling of research ethics questions, implementation of data protection regulations (e.g., use of informed consent, anonymization, and pseudonymization measures).
- Documentation: Planned types of data and accompanying materials (measures to ensure data traceability over time and by third parties).
- Archiving and Data Retention Beyond Project Completion: Selection of suitable data for archiving, definition of conditions for archiving and reuse, choice of an appropriate archiving environment.
- Responsibilities and Roles: Assignment of tasks such as backups and the creation and maintenance of the DMP.
- Costs and Effort: Planned expenses and resources for research data management (e.g., cost of pseudonymization efforts).
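The split between the five core elements and the four extended ones can be sketched as a small checklist. This is only an illustrative sketch, not part of Helbig et al. (2020): the section identifiers, the `DataManagementPlan` class, and the `reuse_planned` flag are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the nine elements listed above.
CORE_SECTIONS = [
    "project_information",
    "guidelines_and_requirements",
    "data_collection_types_formats",
    "storage_security_organization",
    "ethical_and_legal_aspects",
]
EXTENDED_SECTIONS = [
    "documentation",
    "archiving_and_retention",
    "responsibilities_and_roles",
    "costs_and_effort",
]


@dataclass
class DataManagementPlan:
    """Illustrative container mapping section names to free-text entries."""

    sections: dict[str, str] = field(default_factory=dict)
    # Assumption: larger projects with planned data reuse fill in all nine points.
    reuse_planned: bool = False

    def missing_sections(self) -> list[str]:
        """Return the expected sections that are absent or left blank."""
        required = list(CORE_SECTIONS)
        if self.reuse_planned:
            required += EXTENDED_SECTIONS
        return [name for name in required if not self.sections.get(name, "").strip()]


# Example: a draft DMP for an empirical final thesis with two sections filled in.
thesis_dmp = DataManagementPlan(
    sections={
        "project_information": "Empirical final thesis, single researcher.",
        "storage_security_organization": "University storage with regular backups.",
    }
)
print(thesis_dmp.missing_sections())
```

For the thesis draft, the check lists the three core sections that still need to be written. Setting `reuse_planned=True` would extend the check to points six to nine.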
The working group "Greening DH" of the association "Digital Humanities in German-Speaking Countries e.V." has compiled recommendations and suggestions for filling out data management plans (available only in German): https://dhd-greening.github.io/rdm/empfehlungen_dmp. These recommendations aim to promote resource-efficient and sustainable practices wherever possible.
Literature
Helbig, K., Anders, I., Buchholz, P., Favella, G., Hausen, D., Hendriks, S. et al. (2020): Erfahrungen und Empfehlungen aus der Beratung bei Datenmanagementplänen. Bausteine Forschungsdatenmanagement, 2/2020, 29–40. https://doi.org/10.17192/bfdm.2020.2.8283