Exercise 3

Compare the sample Data Management Plan for the DFG grant with the one for the BMBF grant. What differences do you notice? What stands out regarding the structure and level of detail?

Data Management Plan for a DFG grant (full text)

Concept for Handling Research Data (DFG)

(Original in German by FDM@HU-Berlin, licensed under CC0; see: https://cms.hu-berlin.de/de/ueberblick/projekte/dataman/muster-dmp-dfg, translated by Saskia Köbschall)

Data Description:

The research data in project XYZ will be collected through an online questionnaire. The software LimeSurvey, provided by the Computer and Media Service (CMS) of Humboldt-Universität zu Berlin, will be used for this purpose. The survey data will be analyzed using the open-source statistical software R. The results will be stored as a dataset (CSV), an R analysis script (R), and a series of graphics (TIFF). Additionally, a README file (TXT), the questionnaire (PDF/A), and a codebook (PDF/A) will be created to describe the data.

In addition to the research data collected within the project, publicly accessible data will be reused and referenced. These include public statistics (CSV), reports (DOCX, PDF), and legal regulations (HTML, PDF). The total expected file size is a maximum of 50 GB.

Commented [KH1]: How are new data generated in your project? Are existing data being reused? What types of data, in terms of data formats (e.g., image data, text data, or measurement data), are generated in your project, and how are they further processed? To what extent do these data accumulate, and what is the expected data volume?

Documentation and Data Quality:

Metadata will be created using the web form of the GESIS – Leibniz Institute for the Social Sciences following the discipline-specific DDI standard. Additional documentation of the research data will be provided in the form of a README file, the questionnaire, the codebook, and the R syntax. Keywords will be assigned according to the discipline-specific thesaurus TheSoz. The data will be classified under the Social Sciences Classification via the web form.

The quality of the data will be verified using statistical methods, focusing primarily on representativeness and reliability. For instance, participation rates will be compared to the respective proportions in official statistics, and weighting will be applied where necessary. To use the collected data, spreadsheet software, a word processing program, statistical software, and a PDF viewer will be required.
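The weighting step mentioned above can be illustrated with a minimal sketch of post-stratification: each group's weight is its share in the official statistics divided by its share in the sample. The plan itself names R as the analysis software; Python is used here only for illustration. The function name, the group labels, and all numbers are hypothetical and do not come from the plan.

```python
def post_stratification_weights(sample_counts, population_shares):
    """Return one weight per group: population share / sample share.

    sample_counts: respondents per group in the survey (hypothetical input).
    population_shares: share of each group in official statistics, summing to 1.
    """
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        if count == 0:
            # A group without respondents cannot be weighted up.
            raise ValueError(f"no respondents in group {group!r}")
        sample_share = count / total
        weights[group] = population_shares[group] / sample_share
    return weights


# Made-up example: group A is over-represented in the sample.
counts = {"A": 60, "B": 40}
shares = {"A": 0.5, "B": 0.5}
weights = post_stratification_weights(counts, shares)
```

In this made-up example, respondents from the over-represented group A receive a weight below 1 and those from group B a weight above 1, so that the weighted sample matches the population proportions.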

Commented [KH2]: What approaches are being used to ensure that the data are described in a transparent and comprehensible way (e.g., the use of existing metadata or documentation standards, or ontologies)? What measures are being taken to ensure high data quality? Are quality controls planned, and if so, how will they be conducted? What digital methods and tools (e.g., software) are required to use the data?

Storage and Technical Security During the Project Duration:

The secure storage and backup of the data will be ensured by the project management in collaboration with the responsible IT officer of Institute XYZ throughout the project duration. For storing and collaboratively processing data during the project, the university-owned cloud storage "HU-Box" will be used. This enables clear access management and simple usage administration. For sensitive data, encrypted and password-protected folders will be used, which can only be accessed and processed by authorized staff members. A nightly automated backup will be performed.

Commented [KH3]: How are the data stored and secured during the project duration? How is the security of sensitive data ensured during the project period (access and usage management)?

Legal Obligations and Framework Conditions:

Participants will be informed within the online questionnaire about the future publication of the data while maintaining anonymity. The online survey will be conducted in compliance with GDPR and developed in consultation with institutional data protection officers. This includes obtaining informed consent from respondents and a separate consent for the future publication of collected data.

An ethics review will be obtained in advance from the responsible ethics committee of Humboldt-Universität. To clarify copyright ownership of the data, a cooperation agreement will be established with project partner Z, and a data management plan will be developed as part of the project.

Commented [KH4]: What legal particularities exist in relation to handling research data in your project? Are there any expected impacts or restrictions regarding future publication or accessibility? How are aspects of usage rights, copyright, and ownership issues considered? Are there important scientific codes of conduct or professional standards that should be taken into account?

Data Sharing and Long-Term Accessibility:

In addition to direct analysis by the project team, the dataset will also be relevant for other research projects. Since no comparable data is currently available for secondary analysis, the collected research data, R analysis scripts, and the questionnaire will be made available under a CC-BY license via GESIS – Leibniz Institute for the Social Sciences. The GESIS data archive will assign the study a Digital Object Identifier (DOI). As outlined in the DFG guidelines on good scientific practice, project results and all relevant research data will be stored for at least ten years at Humboldt-Universität zu Berlin. For this purpose, a procedure will be established with the institutional IT officer to transfer the data to the CMS backup service. The curation of the data after project completion will be handled by GESIS – Leibniz Institute for the Social Sciences.

Commented [KH5]: Which data are particularly suitable for reuse in other contexts? According to what criteria are research data selected for availability to others? Are you planning to archive your data in an appropriate infrastructure? If so, how and where? Are there embargo periods? When will the research data be accessible to third parties?

Responsibilities and Resources:

In accordance with the Principles for Handling Research Data at Humboldt-Universität zu Berlin (https://hu.berlin/forschungsdaten-policy), the project management is responsible for all aspects of research data management. However, specific sub-tasks will be delegated to project staff. For example, three person-months (PM) are allocated for preparing research data for publication in the repository.

The provision and archiving of data via GESIS – Leibniz Institute for the Social Sciences will be free of charge after consultation with the repository. The long-term storage of the data will also be free of charge via the CMS of HU Berlin.

Commented [KH6]: Who is responsible for the proper handling of research data (description of roles and responsibilities within the project)? What resources (costs, time, or other) are required to ensure appropriate research data management within the project? Who will be responsible for curating the data after the project’s completion?

This template follows the Checklist for Handling Research Data issued by the German Research Foundation (DFG) in its version from December 21, 2021.

Data Management Plan for a BMBF grant (full text)

Data Management Plan (BMBF)

(Original DMP in German by FDM@HU-Berlin, licensed under CC0; see: https://cms.hu-berlin.de/de/ueberblick/projekte/dataman/muster-dmp-bmbf, translated by Saskia Köbschall)

Project Name: Analysis of Inclusive Education Competence of Educators in Brandenburg (AIBEE-BB)
Research Funder: Federal Ministry of Education and Research
Funding Program: Qualification of Educational Professionals for Inclusive Education
Funding Code: 20XXXYZ16
Principal Researcher/Scientist: Kerstin Helbig
Principal Researcher/Scientist ID: http://orcid.org/0000-0002-2775-6751
Contact Person for Data Management: Maxi Musterfrau
Contact Person for Data Management ID: http://orcid.org/andereORCID
Contact: Tel. +49(0)30 2093-70072, Kerstin.Helbig@cms.hu-berlin.de

Project Description: The project examines the competence of educators in the field of inclusive education in Brandenburg daycare centers. Educators will be interviewed in focus groups about their current approaches to inclusive education. Additionally, data from the Federal Statistical Office will be reused. The data collection serves to analyze the significance of inclusion in Brandenburg and to assess the need for support, necessary further training, and services in the field of inclusive education.

Creation Date: Version 1 from 16.03.2016
Modification Date: Version 2.3 from 26.04.2016
Applicable Guidelines: Principles for Handling Research Data at Humboldt-Universität zu Berlin; Open Access Declaration of Humboldt-Universität zu Berlin

Data Collection

Focus groups will be organized and surveyed throughout Brandenburg. The responses will be recorded as video files and subsequently transcribed. The responses will be analyzed using MAXQDA. Excerpts from the videos will also be used for teaching and further education.

Additionally, existing data will be utilized. A secondary analysis of the Statistics on Children and Staff in Daycare Facilities (EVAS 22541) from the Federal Statistical Office will be conducted. This statistic is part of the Child and Youth Welfare Statistics (KJH). The data will be analyzed using the statistical software R. The data is representative, as it is based on a complete census.

Data Storage

Storage and backup will be ensured throughout the project duration by the project leader in cooperation with the responsible IT officer of the university's Computer and Media Service. The infrastructure of Humboldt-Universität zu Berlin will be used for this purpose. The research data will be stored in the HU-Box and secured with a password. Access is restricted to authorized staff members. A backup of the data is performed daily. Version control is automated.

File names follow this convention:

[FocusGroup]_[Location]_[YYYYMMDD].mp4

[Statistics]_[FileType]_[YYYYMMDD]_[Version].csv
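The two patterns above could be applied and checked with a short script. The following is a minimal sketch, not part of the original plan: the helper names, the characters allowed in each component, and the example values are assumptions made for illustration.

```python
import re
from datetime import date, datetime

# Assumption: components consist of letters, digits, and hyphens only,
# so that the underscore stays an unambiguous separator.
FOCUS_GROUP_RE = re.compile(
    r"(?P<group>[A-Za-z0-9-]+)_(?P<location>[A-Za-z0-9-]+)_(?P<date>\d{8})\.mp4"
)
STATISTICS_RE = re.compile(
    r"(?P<name>[A-Za-z0-9-]+)_(?P<filetype>[A-Za-z0-9-]+)"
    r"_(?P<date>\d{8})_(?P<version>[A-Za-z0-9.-]+)\.csv"
)


def focus_group_filename(group: str, location: str, day: date) -> str:
    """Build a name of the form [FocusGroup]_[Location]_[YYYYMMDD].mp4."""
    return f"{group}_{location}_{day:%Y%m%d}.mp4"


def statistics_filename(name: str, file_type: str, day: date, version: str) -> str:
    """Build a name of the form [Statistics]_[FileType]_[YYYYMMDD]_[Version].csv."""
    return f"{name}_{file_type}_{day:%Y%m%d}_{version}.csv"


def follows_convention(filename: str) -> bool:
    """Check a file name against both patterns, including a valid calendar date."""
    match = FOCUS_GROUP_RE.fullmatch(filename) or STATISTICS_RE.fullmatch(filename)
    if match is None:
        return False
    try:
        datetime.strptime(match.group("date"), "%Y%m%d")
    except ValueError:
        return False
    return True
```

A check like `follows_convention` could be run over the project folder before each backup to catch files that deviate from the convention; the values passed to the builder functions (for example a group label such as "FG01") are placeholders.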

Files will be stored in the most open, standardized formats possible. These include PDF/A, CSV, MPEG-4 (audio track WAVE), and TIFF if necessary. Where conversion to an open format is not possible, original formats will be retained.

Data Documentation

Metadata will be created according to the DDI standard through GESIS – Leibniz Institute for the Social Sciences. Additionally, metadata will be recorded in the forschungsdaten-bildung.de portal. Additional research data documentation is planned. The following documents will be created:

  • Transcription manuals
  • Focus group guidelines
  • QDA files
  • R syntax
  • Consent forms
  • Anonymization measures

Keywords will be assigned using the discipline-specific thesaurus TheSoz. The study will be classified by GESIS according to the Social Sciences Classification.

Legitimacy

The data will be handled and made available in compliance with legal regulations. Transcripts will be anonymized before release. Focus group videos will be altered to obscure identities; versions without anonymization will be made available only upon request. Participants will receive an informed consent form before participating. Official approval for conducting the focus groups will be obtained in consultation with the data protection officer.

Data Sharing

The digital research data collected will be published Open Access under a Creative Commons CC-BY license, provided there are no data protection concerns. Additional data will be provided under restricted access. The data will be made available through GESIS – Leibniz Institute for the Social Sciences.

The data holds significant potential for teaching and can serve as a comparative basis for national or regional studies. Therefore, a maximally open access policy is pursued.

Data Preservation

Research data underlying publications, as well as other significant project milestone files, will be archived for at least ten years. Data without a legal archiving basis will be deleted shortly before the project ends. The data protection officer of Humboldt-Universität zu Berlin will be involved in this process. The total expected data volume is approximately 100 GB. Long-term archiving will be carried out by GESIS – Leibniz Institute for the Social Sciences for a minimum of 10 years. Additionally, the project results and all relevant research data will be stored for 15 years on the SAN of Humboldt-Universität zu Berlin.

Responsibilities and Resources

The project leader is responsible for the secure storage and long-term archiving of the generated digital research data, in collaboration with the institute's IT officer. An additional 3 PM (person-months) are allocated for preparing the research data for publication and accessibility. Data provision and archiving through GESIS – Leibniz Institute for the Social Sciences are free of charge. Similarly, no additional costs arise from using the HU-SAN storage system.

Overall, the information in the BMBF-DMP is significantly more detailed than in the DFG-DMP. The structure is also more granular, as each section addresses only one element at a time.

The BMBF-DMP also contains more detailed information on ethical and legal aspects (section: Legitimacy). This could be intended to increase the chances of success when applying for funding, or it could indicate that the DMP was already more thoroughly developed at this stage.

It is also noticeable that the BMBF-DMP already gives very clear specifications for the file types of the expected research data (section: Data Storage) and for the planned types of documentation (section: Data Documentation). The final section (Responsibilities and Resources) notes the personnel resources required, expressed in person-months.