Introduction: Online Ethnography

“…it is difficult to describe yourself as doing ‘anthropological research’ if you are sitting at a computer and typing back and forth with invisible people.”

(Tratner, 2016, p. 178)

Online ethnography is increasingly becoming an integral part of socio-cultural anthropological research, as analog and digital worlds are becoming more intertwined and the boundaries of the “field” are shifting into virtual spaces. Accordingly, methodological and ethical aspects of online research are receiving growing attention (Hine, 2006; Boellstorff et al., 2012; Sanjek & Tratner, 2016; Franken, 2023).

“To what extent are the procedures and assumptions that are currently taken for granted in ethnography suitable for online research? How does one take ethnographic fieldnotes on a social network site (SNS), a multiuser dungeon (MUD), or a community blog? How do we deal with the large amount of data available online?”

(Schrooten, 2016, p. 80)

In the following, we address some exemplary challenges associated with online ethnographic research. It should first be emphasized that we are not discussing digital methods, which use computational analysis techniques to collect and process large datasets (Big Data). Digital methods are an emerging interdisciplinary field whose two main branches are Digital Humanities (DH) and Computational Social Sciences (CSS). Instead, we focus on the ethnographic investigation of virtual worlds and communities as well as digitized everyday life. Franken (2023) provides an excellent overview of the possibilities of digital methods within qualitative social research.

A milestone in online ethnographic research is the book Coming of Age in Second Life: An Anthropologist Explores the Virtually Human by Tom Boellstorff (2008). The American socio-cultural anthropologist spent two years conducting participant observation in this virtual world through his avatar, Tom Bukowski.

Image: Dr. Tom Boellstorff’s Talk. Source: Blue Myanamotu, licensed under CC BY 2.0

He was interested in what ethnographic research could reveal about virtual worlds. His goal was not to compare the analog and digital worlds or to identify the real-world individuals behind the avatars he interacted with in Second Life. Instead, his research focused on decoding the structures and patterns created by the avatars in Second Life.

“I took their activities and words as legitimate data about culture in a virtual world.”

(Boellstorff, 2008, p. 61)

Boellstorff consistently applied ethnographic methods: participant observation, interviews, informal conversations, and focus group discussions with the avatars. He obtained their informed consent to record their conversations and document his observations. The names of the avatars – which were not real names – were pseudonymized, and information about individual figures was altered to such an extent that they could not be easily identified in Second Life. He reports having created over a thousand pages of field notes, mostly handwritten, which, alongside saved chat logs and audio and video recordings of scenes and conversations in Second Life, served as the data foundation for his monograph.

Boellstorff’s study demonstrates that it is generally advisable to approach online ethnographic research in a conventional manner – observing, participating, asking questions, taking notes, and creating scratch notes and field notes. In addition – and this is crucial – it is important to save data produced online, such as chat logs, images, and video materials, for later analysis. In this context, Boellstorff’s comments on the ease with which data can be obtained and stored in virtual contexts are noteworthy:

“The ease of obtaining data in virtual worlds can also be a curse, because the very process of memory and handwriting force ethnographers to focus on what seem to be the most consequential incidents encountered during participant observation”.

(Boellstorff, 2008, p. 71)

Many ethnographers conduct research both offline and online, following their research participants into and within the digital realm. This occurs, for example, when they study virtual communities often created by migrants to provide each other with guidance and tips regarding their new environments, or when they examine how different political, ethnic, or religious minorities present themselves online. Many researchers continue their analog studies via messaging services after returning from the field. Depending on which virtual area or extension of analog life is being studied, different data collection and recording techniques may be appropriate.

A fundamental question is how researchers document or store online content. It is important to distinguish between research-induced and process-induced digital data (Baur & Graeff, 2021; Franken, 2023). Research-induced data are explicitly generated by researchers, e.g., through interviews, questions in chat groups, surveys, or personal conversations on platforms such as WhatsApp. In contrast, process-induced data already exist online, such as posts in forums, blogs, or social media. Within process-induced data, there is a further distinction between trace data – unintentionally left by people online – and social media data, which are consciously created by individuals, for instance, when setting up a profile on a platform and filling it with content (Franken, 2023, p. 67).

Research-induced digital data can be handled similarly to analog data and are subject to the same data protection, security, and ethical requirements. They can be analyzed just like analog interviews or conversation protocols and included in publications in anonymized or pseudonymized form.

But what about process-induced data? How should information extracted from social media be handled? There is a widespread assumption that information circulating in social media is freely available, as the data producers (i.e., platform users) have agreed to the platform’s terms and conditions, thus legally permitting data use through official platform access. But is this usage ethically justifiable? Users consented to the platform’s terms, not to participating in research (Franken, 2023, p. 31). Since obtaining informed consent is not possible with automated data storage, it is necessary to carefully evaluate on a case-by-case basis whether using the data for research purposes is ethically acceptable (Franken, 2023, p. 31).

In closed or semi-public chat groups and forums that require registration with administrators, informed consent can be more easily obtained – either directly from administrators or by posting consent requests in the chat (see also Schrooten, 2016). Ideally, researchers should also ask whether threads may be saved. Such data should be securely stored and not published as complete datasets. Even though users in chat groups rarely operate under real names, it is still possible – especially in specific groups – to identify the individuals behind profiles, even when ethnographers have replaced usernames with pseudonyms. This is because digital data can be easily de-anonymized through “reverse searches” online:

“When we quote literally, it often only takes entering this text into a search engine to reveal the source.”

(Franken, 2023, p. 31; translated by Saskia Köbschall)
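The replacement of usernames with pseudonyms in saved threads can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended tool: the message format (a list of dicts with `user` and `text` keys), the function name `pseudonymize_chat`, and the `Participant_N` labels are all hypothetical. The sketch replaces each username consistently, including mentions inside message text, and returns the mapping separately so that it can be stored apart from the data.

```python
import re


def pseudonymize_chat(messages):
    """Return (pseudonymized_messages, mapping) for a saved chat log.

    `messages` is assumed to be a list of dicts with "user" and "text" keys;
    this format is hypothetical and chosen only for illustration.
    """
    # Assign each username a stable pseudonym in order of first appearance.
    mapping = {}
    for msg in messages:
        if msg["user"] not in mapping:
            mapping[msg["user"]] = f"Participant_{len(mapping) + 1}"

    if not mapping:
        return [], {}

    # Match usernames only as whole tokens, longest first, so that "tom"
    # is not replaced inside "tomorrow".
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(n) for n in names) + r")(?!\w)"
    )

    cleaned = [
        {
            "user": mapping[msg["user"]],
            "text": pattern.sub(lambda m: mapping[m.group(1)], msg["text"]),
        }
        for msg in messages
    ]
    # The mapping re-identifies participants: keep it apart from the data.
    return cleaned, mapping


log = [
    {"user": "anna_berlin", "text": "Hi all"},
    {"user": "tom", "text": "Thanks anna_berlin!"},
    {"user": "anna_berlin", "text": "tomorrow works, tom"},
]
cleaned, mapping = pseudonymize_chat(log)
```

Note that this step alone does not address the reverse-search problem described in the quote: a verbatim message can still be traced through a search engine, so paraphrasing may still be needed before publication.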

In short, data protection considerations must be carefully observed when handling digital information. A useful rule of thumb is provided by von Unger et al. (2022):

“1) Publicly Accessible Data: If the data are completely publicly accessible, informed consent is not strictly required. However, quoted text passages should be heavily anonymized. In cases involving ethically sensitive topics or high risks to individuals if traced, paraphrasing the content is recommended.

2) Semi-Public Data: For data from forums requiring login credentials, informed consent should be obtained via gatekeepers, such as forum moderators or website administrators. The same anonymization and citation standards as for public data should apply.

3) Private Data: For private data – such as content visible only to friends on Facebook – informed consent is mandatory. Because such data are difficult to trace back, less strict anonymization is sufficient and direct quoting is generally acceptable. Nevertheless, it is advisable not to use usernames and to modify other personal details when necessary. For non-text-based data like images or videos, the same anonymization principles apply, including anonymizing visual and audio elements such as faces, surroundings, or voices.”

(von Unger, Franken & Egger, 2022, p. 8; translated by Saskia Köbschall)
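The three-tier rule of thumb can be written down as a small lookup, which some researchers may find useful as a checklist when cataloguing sources. The following Python sketch is only an illustration: the names `Access`, `Handling`, and `recommend`, the `sensitive` flag, and the wording of the returned strings are assumptions made for this example and are not part of the cited guideline. It does not replace the case-by-case ethical evaluation discussed above.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    """Accessibility tiers from the rule of thumb (names are hypothetical)."""
    PUBLIC = "publicly accessible"
    SEMI_PUBLIC = "semi-public (login required)"
    PRIVATE = "private (e.g. friends-only)"


@dataclass(frozen=True)
class Handling:
    consent: str
    quoting: str


def recommend(access, sensitive=False):
    """Look up the handling the rule of thumb suggests for one data source.

    `sensitive` is a simplifying assumption standing in for "ethically
    sensitive topic or high risk to individuals if traced".
    """
    if access is Access.PRIVATE:
        return Handling(
            consent="mandatory",
            quoting="direct quotes acceptable; omit usernames",
        )
    consent = (
        "not strictly required"
        if access is Access.PUBLIC
        else "obtain via gatekeepers"
    )
    # Public and semi-public data share the same citation standard.
    quoting = "paraphrase" if sensitive else "heavily anonymize quotes"
    return Handling(consent=consent, quoting=quoting)


print(recommend(Access.SEMI_PUBLIC, sensitive=True))
```

Encoding the tiers as an `Enum` rather than free text keeps the classification of each source explicit and makes it easy to record alongside the saved data.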

When socio-cultural anthropologists engage in virtual communities, pose questions, participate in discussions, and document these activities, their approach and perspective become transparent and traceable, which can be advantageous. Some suggest that in online participant observation, fieldnotes seem to write themselves (Nardi, 2016): the ethnographer asks a question or comments in a chat and receives responses that only need to be saved. This highlights the dialogical nature of the research, but also raises the question of who ultimately owns the data (Jackson, 2016, p. 57).

This transparency also means that all “fieldwork errors or missteps are visible in these communications and preserved openly for all who may view the website” (Tratner, 2016, p. 188), as ethnographers usually identify themselves in their usernames. For example, Susan Tratner identified herself as “anthropologistmom” in her study of American parenting websites to remind chat participants of her role as a researcher (Tratner, 2016, p. 177).

Literature

  • Baur, N. & Graeff, P. (2021). Datenqualität und Selektivitäten digitaler Daten. Alte und neue digitale und analoge Datensorten im Vergleich. In Blättel-Mink, B. (Ed.), Gesellschaft unter Spannung. Verhandlungen des 40. Kongresses der Deutschen Gesellschaft für Soziologie. https://publikationen.soziologie.de/index.php/kongressband_2020/article/view/1362

  • Boellstorff, T. (2008). Coming of Age in Second Life: An Anthropologist Explores the Virtually Human. Princeton: Princeton University Press.

  • Franken, L. (2023). Digitale Methoden für qualitative Forschung. Computationelle Daten und Verfahren. UTB Münster. (UTB Studium). https://www.utb.de/doi/book/10.36198/9783838559476

  • Jackson, J. (2016). Changes in Fieldnote Practices over the Past Thirty Years in US Anthropology. In Sanjek, R. & Tratner, S. (Eds.), eFieldnotes. The Makings of Anthropology in the Digital World. Philadelphia: University of Pennsylvania Press. https://doi.org/10.9783/9780812292213

  • Nardi, B.A. (2016). When Fieldnotes Seem to Write Themselves: Ethnography Online. In Sanjek, R. & Tratner, S. (Eds.), eFieldnotes. The Makings of Anthropology in the Digital World. Philadelphia: University of Pennsylvania Press. https://doi.org/10.9783/9780812292213

  • Schrooten, M. (2016). Writing eFieldnotes. Some Ethical Considerations. In Sanjek, R. & Tratner, S. (Eds.), eFieldnotes. The Makings of Anthropology in the Digital World. Philadelphia: University of Pennsylvania Press. https://doi.org/10.9783/9780812292213

  • Tratner, S. (2016). New York Parenting Discussion Boards: eFieldnotes for New Research Frontiers. In Sanjek, R. & Tratner, S. (Eds.), eFieldnotes. The Makings of Anthropology in the Digital World. Philadelphia: University of Pennsylvania Press. https://doi.org/10.9783/9780812292213

  • von Unger, H., Franken, L., & Egger, N. (2022). Digitale Daten in der qualitativen Lehrforschung. Handreichung zum digitalen Datenmanagement für Studierende. https://www.qualitative-sozialforschung.soziologie.uni-muenchen.de/ressourcen/hinweise_qualitativ1/digitale-daten.pdf
