Learning unit: Anonymization and Pseudonymization

Exercise 3

This exercise relates to Example 4 from the practical examples.

Example 4: Anonymization through Aggregation, Asher & Jahnke (2013)

In the article “Curating the Ethnographic Moment” (Asher & Jahnke, 2013), the authors discuss challenges and practices in research data management concerning ethics, obtaining informed consent, and anonymization/pseudonymization strategies. Researchers were interviewed about their experiences. A sociologist described the following dilemma:

“I wanted to do life histories with priests [in central Pennsylvania], and part of the problem was . . . we got into a situation where people might tell me things about their personal lives that are sort of not confidential in the IRB sense but that might be upsetting to their congregations – like I talked to one priest who had been married three times, where if the congregation had known about that they would have been very upset. There’s nothing illegal about it; this person’s not shy about telling that, but it could have been damaging.” (2-13-111411).

(Asher & Jahnke, 2013)

This excerpt includes a reference to the research location, central Pennsylvania. A reader comments on this in the article’s online discussion:

“I am finding this article to be extremely useful and interesting. However, I noticed this “I wanted to do life histories with priests [in central Pennsylvania]” and I think you should remove the geographic reference. There can’t be that many priests in central Penn. and marriage certificates are public records. By including this geography in your article, you may yourself be compromising the privacy of the potential respondents”.

http://www.archivejournal.net/essays/curating-the-ethnographic-moment/

The authors justify the choice as follows:

“Your point is well taken. We chose to replace a specific geographic reference (in this case a town) with the more general and nonspecific “central Pennsylvania” in order to retain contextual information while expanding the population of potential people to a large enough degree to make identification difficult. Since “central Pennsylvania” can be used to refer to almost anywhere between Philadelphia and Pittsburgh, it would take a very committed person to compile a list of priests and cross reference it with marriage records–both very difficult tasks, especially since the person in question could have been married anywhere. However, as an added precaution, we have also omitted information about when the researcher was conducting this work and denomination of the priest the researcher was discussing, which further expands the population that would have to be investigated. We therefore believe the risk of identification is very low, but you are correct in noting that researchers and archivists need to be aware that seemingly innocuous details can result in breeches of confidentiality.”

(Asher & Jahnke, 2013)

How do the authors justify mentioning central Pennsylvania as the research location? What are the arguments for and against removing the geographical reference (Pennsylvania)?

The authors justify mentioning central Pennsylvania by explaining that they wanted to retain contextual information about the approximate locality of the research while keeping it unspecific enough to make identifying the priests difficult. They replaced the name of a town with the broader region, which can refer to almost anywhere between Philadelphia and Pittsburgh, and additionally omitted the period of the research and the priest’s denomination. On this basis they considered the risk that someone would investigate and trace the priest in question to be very low. The argument for removing the reference is the one the reader raises: the number of priests in the region is limited and marriage certificates are public records, so the location could contribute to identification. The argument against removing it is that, although the location could have been omitted entirely, Pennsylvania potentially marks the cultural and social framework and background of a study on the biographies of priests in the eastern United States.