Methods: Data Documentation and Metadata

Source: Types of Data Documentation, Anne Voigt with CoCoMaterial, 2023, licensed under CC BY-SA 4.0

Metadata and Metadata Standards

Metadata describe (research) data. They provide structured information about the research context, methods and analysis procedures used, research teams, available datasets, and more. Typically, metadata can be categorized into:

  1. Bibliographic Metadata (e.g., title, author, thematic focus of the subject)
  2. Administrative Metadata (e.g., file format, access rights, licenses)
  3. Process Metadata (e.g., methods used in data collection)
  4. Descriptive Metadata (e.g., additional information about the content and origin of the data) (Forschungsdaten.info, 2023d)
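
As a minimal sketch of how these four categories could be kept apart in a machine-readable record, the following Python example groups metadata fields by category and serializes them to JSON. The class name `MetadataRecord`, the field names, and all example values are hypothetical placeholders chosen for illustration; they are not taken from DDI or any other metadata standard.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class MetadataRecord:
    """Hypothetical container with one dictionary per metadata category."""
    bibliographic: dict = field(default_factory=dict)   # e.g., title, author
    administrative: dict = field(default_factory=dict)  # e.g., file format, license
    process: dict = field(default_factory=dict)         # e.g., collection methods
    descriptive: dict = field(default_factory=dict)     # e.g., content and origin

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII characters (names, places) readable
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


# All values below are placeholders, not real project data.
record = MetadataRecord(
    bibliographic={"title": "Example study (placeholder)", "author": "N. N."},
    administrative={"file_format": "txt", "license": "CC BY-SA 4.0"},
    process={"collection_method": "semi-structured interviews"},
    descriptive={"origin": "placeholder description of content and origin"},
)

print(record.to_json())
```

Keeping the categories as separate top-level keys makes it easier to later map each group onto the fields a repository form or metadata standard expects.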

Metadata can be summarized and published in pre-structured templates such as ReadMe files. Archives, repositories, or research data centers also often provide forms with metadata structures for data archiving (see, for example, Qualiservice: https://www.qualiservice.org/en/the-helpdesk.html#downloads).

For social sciences, discipline-specific metadata standards have been established:

  1. Data Documentation Initiative (DDI): https://ddialliance.org/
  2. Dara Metadata Schema: https://www.da-ra.de/downloads#version-3-0

These standards are integrated into digital databases and available for free download. However, they are primarily geared toward quantitative research data and are less suitable for qualitative research data, which predominate in social and cultural anthropological research. For empirical ethnographic research, supplementary documentation through data reports, study reports, and context materials is therefore indispensable.

ReadMe Files

ReadMe files are simple text or TEI-XML files stored in formats such as .txt, .md, or .xml. They present key metadata in a compact, structured form, such as project name, team members, funding, naming conventions, folder structures, and abbreviations. They can also record changes and data versioning. ReadMe files are typically machine-readable and can be published independently. As a practical overview, a ReadMe file may look like this:

Examples

Documentation of research project XYZ
Creator(s):
Research context and hypotheses (reason(s) for data analysis):
Creation date of file(s):
Data collection/creation method(s):
Software used (incl. version and add-ons), tools or devices:
Data (file names (incl. version), content, methods for data cleansing, language of data):
Software code (file names (incl. version), content, programming language):
Additional documentation files (e.g. codebook, lab notebook, questionnaire):
Information on access and terms of use (license):
Notes:
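
A template like this can also be filled in programmatically. The following is a minimal sketch, assuming the field labels of the template above; the function name `render_readme`, the constant `README_FIELDS`, and the example values are hypothetical and only serve as illustration.

```python
# Field labels follow the ReadMe template above; everything else is illustrative.
README_FIELDS = [
    "Creator(s)",
    "Research context and hypotheses (reason(s) for data analysis)",
    "Creation date of file(s)",
    "Data collection/creation method(s)",
    "Software used (incl. version and add-ons), tools or devices",
    "Data (file names (incl. version), content, methods for data cleansing, language of data)",
    "Software code (file names (incl. version), content, programming language)",
    "Additional documentation files (e.g. codebook, lab notebook, questionnaire)",
    "Information on access and terms of use (license)",
    "Notes",
]


def render_readme(project: str, values: dict[str, str]) -> str:
    """Return the ReadMe text; fields without a value stay empty."""
    lines = [f"Documentation of research project {project}"]
    for name in README_FIELDS:
        # rstrip() removes the trailing space left behind by empty fields
        lines.append(f"{name}: {values.get(name, '')}".rstrip())
    return "\n".join(lines) + "\n"


# Placeholder values, not real project data.
text = render_readme(
    "XYZ",
    {"Creator(s)": "N. N.", "Creation date of file(s)": "2023-01-01"},
)
print(text)
```

The resulting string could then be saved as plain text, for example with `pathlib.Path("README.txt").write_text(text, encoding="utf-8")`. Leaving unanswered fields visible in the output keeps gaps in the documentation easy to spot.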

Data Reports and Study Reports

Metadata alone are often insufficient for documenting qualitative research data. A data or methods report – referred to as a study report by Qualiservice – offers an alternative documentation method. Researchers can describe contexts, connections, and additional information in free text or bullet points, as well as record changes.

Like the metadata templates, the report should name the institution and persons involved, the research question, and the preliminary work and conceptualization of the topic. It should also describe the methods and the further steps of data processing and analysis (such as transcription, evaluation procedure, interpretation, and the perspective of the researcher). In addition, it can point to further contextual information and to potential for reuse. It is recommended to keep the report concise, limiting it to essential information, notes, and descriptions, so that it serves as a practical summary of the research. As such an overview, the report is a useful orientation aid both for the researchers themselves and for potential future research teams, and it can be used as a stand-alone publication (RatSWD, 2023, p. 27).

Context Documents and Materials

For qualitative research data or data collected through ethnographic research, providing context materials aids documentation and reuse scenarios. These materials include artifacts such as written documents, images, videos, and objects of mundane, sacred, or artistic origins that are collected (not generated) by ethnographers. They can be analyzed and contextualized according to the research question (see article Data in ethnographic research).

For data documentation, this understanding can be extended to include contextual documents that “arise” during the research: questionnaires, interview guidelines, systematic observation protocols and other related data collection instruments, field and method reports, the respective transcription rules, anonymization measures and evaluation programs, etc. Context documents of this kind serve as data documentation and lead to a better understanding of the research and the research results.

For archiving, it is important to curate the documents and materials carefully and sort them by type (such as interview data, surveys, observation data, or media data) in advance, i.e. to consider which of one’s own research data are suitable for archiving and reuse and which are not. This decision is closely linked to ethical and data protection aspects. Once the selection criteria have been clarified, the materials and documents used, as well as the “tools” and “instruments” used in the research, can be listed precisely. This makes the research context, the perspective of the researcher, as well as the method, topic, research question, etc. comprehensible and interpretable.
