Methods: Anonymization and Pseudonymization

Researchers in disciplines that employ ethnographic methods must fundamentally address questions of pseudonymization, as "qualitative researchers typically focus on sensitive topics that are reconstructed from the subjective perspectives of respondents. Thus, the very personal details and individual references, institutions, organizations, and third parties that need to be anonymized are at the heart of the research" (Kretzer, 2013, p. 21; translated by Saskia Köbschall).

A common practice in social and cultural anthropology is to replace personal names with pseudonyms when writing up research results (Imeri, Klausner & Rizzolli, 2023, p. 243) – for example, changing "Marta" to "Barbara" – and, if necessary, to modify other identifying characteristics to protect research participants or third parties mentioned in the material. However, the latter is not always straightforward. For instance, it is not always feasible to replace or omit the name of a significant office, yet mentioning it makes the person who held that office at the time of the research identifiable.

A common strategy in social and cultural anthropology is to further obscure specific research locations by using fictitious place names, often without explicitly specifying the research regions. Additionally, methods and strategies of fictionalization are used in the field, wherein personal information is replaced, supplemented, or narratively transformed with fictional elements to protect research participants. Through creative pseudonymization adapted to local conditions, ethnographers can still provide a dense description of the living conditions they study.

However, it is seldom acknowledged that, while these strategies effectively prevent the identification of individuals by outsiders – such as readers of publications or users of archived datasets – they often fail to conceal identities within the communities being studied. This is particularly true of extended case studies, which are often used to reconstruct complex sequences of events. In such cases, even if personal and place names are pseudonymized, those involved typically know who the respective protagonists were. Ethnographers must therefore approach the preparation and presentation of case studies with particular care and sensitivity. Careful fictionalization often remains the only means to ensure data protection in these instances. Pseudonymization and anonymization are therefore not merely mechanical procedures, but rather complex and creative processes.

The Extended Case Method (ECM) was developed in British social anthropology during the 1950s and 1960s and has become one of the standard qualitative methods in the field. It can be defined as the detailed documentation and analysis of specific events or event sequences observed in the field, from which general theoretical principles can be derived. Unlike a singular, time-limited case study, the ECM investigates the interconnectedness of multiple social events over extended periods in which the same actors are involved. This method allows for the capture of social negotiation processes. ECM data typically consist of "thick descriptions" that contain numerous sensitive personal references, which require particularly careful protection and anonymization.

Moreover, pseudonymization is difficult to implement "on the run" during the ethnographic documentation process, since ethnographers typically work with the real names of their research participants in observation protocols (which later serve as the basis for reconstructing complex case histories during analysis). This practice is tied to the fact that personal names are strong identifiers, greatly facilitating ethnographers' orientation within their material, whereas using pseudonyms during the documentation process tends to create significant distance. Most social and cultural anthropologists therefore pseudonymize their material only prior to publication or when preparing it for archives. This means, however, that they must store their primary material with heightened security measures (see the article on Data Protection).

Dealing with multimedia data (images, audio, video) poses a particular challenge, as such data are difficult to pseudonymize or anonymize. Consequently, data protection must be handled with extreme care in these cases (see interview with M. Kramer). The Research Data Centre (RDC) Qualiservice addresses this issue by significantly restricting access to multimedia data containing personal references, allowing access only on-site in Bremen. For photographs, blurring or pixelating faces has become a common practice in publications; for audio data, voices can be distorted, although this often results in an unpleasant sound. Overall, such distortion techniques significantly impair the informational value of visual and audio documents, although they remain indispensable in certain contexts.

Anonymization and pseudonymization demand particular attention in light of the growing importance of social media content in ethnographic research.

Tips and Tools

There are various software tools for anonymization and pseudonymization (e.g., IQDA Qualitative Data Anonymizer or eAnonymizer) which can automatically pseudonymize data such as transcribed interviews during their creation. It is recommended that only excerpts of interviews be published in pseudonymized form, to prevent third parties from reconstructing complete contexts.
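To make the basic idea of automatic pseudonymization concrete, the following is a minimal sketch in Python of consistent name replacement in a transcript excerpt. It does not reproduce how any of the tools named above work internally; the function `pseudonymize`, the mapping table, and the surnames "Schmidt" and "Keller" are assumptions chosen purely for illustration (only the pair "Marta"/"Barbara" comes from the example earlier in this article).

```python
import re


def pseudonymize(text, mapping):
    """Replace every real name in `mapping` with its pseudonym.

    Hypothetical helper for illustration only. Names are matched as
    whole words, so "Marta" is replaced but "Martha" is left alone.
    """
    if not mapping:
        return text
    # Try longer names first so that "Marta Schmidt" is matched
    # before the shorter name "Marta" it contains.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b"
    )
    # A single pass over the text means inserted pseudonyms are never
    # themselves replaced a second time.
    return pattern.sub(lambda match: mapping[match.group(1)], text)


# Illustrative mapping table (real name -> pseudonym).
mapping = {
    "Marta": "Barbara",
    "Marta Schmidt": "Barbara Keller",
}

excerpt = "Marta Schmidt arrived late. Marta apologized; Martha did not."
result = pseudonymize(excerpt, mapping)
print(result)
```

A mapping table of this kind is itself highly sensitive, because it allows the pseudonyms to be reversed, and would need the same protection as the primary material. Simple name replacement also leaves indirect identifiers such as offices, places, or distinctive events untouched, which is why it can only support, not replace, the careful manual work described above.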

A particularly useful tool is QualiAnon, an anonymization tool developed as open-source software by the Research Data Centre (RDC) Qualiservice in Bremen (Nicolai et al., 2021). QualiAnon supports the anonymization/pseudonymization of text data. As open-source software, it is freely available for researchers to use (for further information, see the QualiAnon User Manual; Nicolai & Mozygemba, 2023).

The German Network of Educational Research Data also provides helpful guidance and examples of distortion strategies (Meyermann & Porzelt, 2014).
