ERIC Identifier: ED470201
Publication Date: 2002-07-00
Author: Brem, Sarah K.
Source: ERIC Clearinghouse on Assessment and Evaluation College Park MD.

Some Ethical Considerations and Resources for Analyzing Online Discussions. ERIC Digest.

Online discussions can be a valuable source of information for researchers and educators studying the interactions of a community of interest (Klinger, 2000), looking for ways to improve interactions within a working group (Ahuja & Carley, 1998), or assessing student learning (Brem, Russell, & Weems, 2001). Discussion forums frequently offer automated tracking services, such as a transcript or an archive, that allow detailed analyses of conversations that may have taken place months or even years ago. While online discussions present new opportunities to teachers, policymakers, and researchers, their analysis also presents new concerns and considerations. This Digest introduces ethical considerations related to acquiring and analyzing online data and provides resources to support sound practice.


Because online conversation is relatively new and unfamiliar and takes place at a distance, it is easy to overlook possible ethical violations. People may not realize that their conversations could be made public, may not realize that they are being monitored, or may forget that they are being monitored because the observer's presence is virtual and unobtrusive. Some participants may feel invulnerable because of the distance and relative anonymity of online exchanges and may use these protections to harass other participants. Researchers working with online exchanges of information must provide the same levels of protection as they would in the case of face-to-face exchanges, including clearance with an institutional review board, privacy and confidentiality assurances, and informed consent.


A researcher affiliated with a university or similar institution requires the approval of an institutional review board (IRB) created for the protection of human beings who participate in studies. Teacher-researchers and others who do not have an IRB and are not associated with any such institution should nevertheless follow ethical principles and guidelines such as those laid out in The Belmont Report, available at <>. Other useful resources include Sales and Folkman (2000) and the NIH ethics resources at <>.


The least problematic conversations are those that take place entirely in the public domain; people know they are publishing to a public area with unrestricted viewing, as if they were writing a letter to the editor. Newsgroups are an example of such exchanges; anyone with web access can read conversations from the past twenty years. In many cases, this sort of research is considered "exempt" under federal guidelines for the protection of human subjects; for researchers at institutions with an IRB, the board must confirm this status. Still, even public areas may contain sensitive information that the user inadvertently provided; novices are especially prone to giving out personal information accidentally or including it without considering possible misuse. In addition to the usual procedures for anonymizing data (e.g., removing names and addresses), there are some additional concerns to address. Every post must be scoured for both intentional and unintentional indicators of identity. Here are some common ways that anonymity is compromised:

* Usernames like "tiger1000" do not provide anonymity; people who are active online are as well known by their usernames as by their traditional names. Usernames must be replaced with identifiers that provide no link to the actual participant.


* Be vigilant in removing a participant's signature file appended to a post and any other quotes, graphics, and other idiosyncratic inclusions that are readily identifiable as belonging to a particular individual.

* Identifying information is often embedded in a post through quoting; for example, if I were quoted by another participant, my email address might be embedded in the middle of his or her message as "tiger1000 (<>) posted on 1 February 2002, 11:15."
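The three checks listed above can be partly automated before a manual review. The following Python fragment is a minimal sketch, not a complete anonymization procedure: the function names, the "P001"-style labels, the "--" signature delimiter, and the email pattern are all assumptions made for illustration, and real archives will contain identifiers that no pattern anticipates.

```python
import re

# Rough email pattern; an assumption, not a full address grammar.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def build_pseudonyms(usernames):
    """Map each username to an opaque label (P001, P002, ...) in first-seen order."""
    mapping = {}
    for name in usernames:
        if name not in mapping:
            mapping[name] = f"P{len(mapping) + 1:03d}"
    return mapping


def strip_signature(body):
    """Drop everything from the first '--' delimiter line onward.

    Assumes signatures follow the common newsgroup convention of a line
    containing only two hyphens; other signature styles need manual removal.
    """
    kept = []
    for line in body.splitlines():
        if line.strip() == "--":
            break
        kept.append(line)
    return "\n".join(kept).rstrip()


def anonymize(body, pseudonyms):
    """Remove the signature, redact email addresses, and replace known usernames."""
    text = strip_signature(body)
    # Redact addresses first so a username inside an address is not left half-replaced.
    text = EMAIL_RE.sub("[email removed]", text)
    # Replace longer usernames first so one name cannot clobber part of another.
    for name in sorted(pseudonyms, key=len, reverse=True):
        pattern = rf"(?<!\w){re.escape(name)}(?!\w)"
        text = re.sub(pattern, pseudonyms[name], text, flags=re.IGNORECASE)
    return text
```

A post that quotes "tiger1000" along with an email address and ends in a signature would come back with the label "P001", the marker "[email removed]", and no signature. The username-to-label mapping is itself identifying, so it should be stored apart from the data or destroyed, and every post still needs to be read by a person afterward.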


If a domain establishes any degree of privacy through membership, registration, passwords, etc., or if a researcher wishes to contact participants directly, then the communications should be considered privileged to some degree. In addition to the safeguards required for public domain data, using these conversations in research requires at the very least the informed consent of all participants whose work will be included in the analysis, with an explicit description of how confidentiality and/or anonymity will be ensured. The procedures for informed consent, recruitment, and data collection will require "expedited" or "full" review by an institutional review board. Once approval has been given, consent forms will have to be distributed to every participant, and only the contributions of consenting members can be stored and analyzed.
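The requirement that only consenting members' contributions be stored can be enforced as a filtering step before any data are saved. The Python fragment below is a minimal sketch under stated assumptions: the Post record, the function names, and the convention that quoted text is marked with a leading ">" are all hypothetical and would need to be matched to the forum actually being studied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str  # pseudonymous identifier, not the real username
    text: str


def consenting_only(posts, consenting_authors):
    """Keep only posts written by participants who returned a consent form."""
    allowed = set(consenting_authors)
    return [post for post in posts if post.author in allowed]


def drop_quoted_lines(post):
    """Remove '>'-prefixed lines, which may repeat a non-consenting member's words."""
    own_lines = [
        line for line in post.text.splitlines()
        if not line.lstrip().startswith(">")
    ]
    return Post(post.author, "\n".join(own_lines))
```

The second function is included because a consenting member's post may quote someone who did not consent; dropping every quoted line is a cautious default, not the only defensible choice, and it should be described in the protocol submitted to the review board.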


If a researcher sets up a site for collecting data, regardless of how much privacy and anonymity is promised, he or she is ethically bound to inform all potential participants that their contributions will be used as data in research. One example of how to provide this information has been implemented by the Public Knowledge Project. To see how they obtained consent, visit <>. Likewise, if participants are to be contacted directly, a researcher needs to make their rights clear and obtain their permission to use the information they provide for research purposes before engaging in any conversation with them.

In addition to preserving the safety and comfort of participants, researchers must also consider their intellectual property rights. All postings are automatically copyrighted under U.S. and international laws. Extended quotes could violate copyright laws, so quoting should be limited, or permission should be obtained from the author prior to publication. For more about U.S. and international laws, visit <>.


Online research is similar to, but not identical to, face-to-face (f2f) research. With online research, participants in a conversation may not know at what point they are being monitored, or, because the monitoring is so unobtrusive, they may forget about it; in f2f research, participants are aware of the research setting. In f2f interactions, researchers can examine body language and intonation as well as the words spoken; in an online interaction, researchers have to look beyond the words written to the electronic equivalents of gestures and social conventions (e.g., "emoticons" such as :-) for smiling, or all capital letters as the equivalent of shouting).
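As an illustration of how such electronic cues might be flagged for a human coder, the Python fragment below is a minimal sketch. The emoticon pattern covers only a few common Western-style forms, and treating any fully capitalized word of two or more letters as "shouting" is an assumption that will also catch acronyms; neither rule is a validated coding scheme.

```python
import re

# A few common emoticons such as :-) ;) :( =D :P ; an assumed, incomplete set.
EMOTICON_RE = re.compile(r"[:;=][-^']?[)(DPp|]")


def paralinguistic_cues(message):
    """Return the emoticons and fully capitalized words found in a message."""
    emoticons = EMOTICON_RE.findall(message)
    words = re.findall(r"[A-Za-z]{2,}", message)
    shouted = [word for word in words if word.isupper()]
    return {"emoticons": emoticons, "shouted_words": shouted}
```

Output of this kind is best used to direct a coder's attention to particular messages, since the meaning of a smiley or a capitalized word depends on the surrounding conversation.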

Instead of collecting data using audio and video recording as in f2f conversations, researchers interested in preserving online conversations require ways to download or track the electronic files in which the information is stored. They must then analyze the data, just as those conducting f2f research do.

Common approaches to data analysis of online discussions include grounded theory, quantifying techniques, experimental manipulations, and ethnography. For information about analyzing discourse, see Stemler (2001); techniques and considerations that are specific to online discourse can be found in the 1997 special issue of the Journal of Computer-Mediated Communication, "Studying the Net." Information about tools for theory-based data manipulation is available at:

<> and

< gy/ Qualitative/Tools/>.


The study of online discourse is still quite new, and there is much about the treatment and analysis of these data that has not yet been addressed. When faced with a situation for which there is no standard procedure, the best course of action is to begin with established techniques and then adapt these to the online environment. Researchers should have a rationale for any adaptations or deviations they decide to make in order to establish credibility with editors and peers and allow others to adopt, recycle, and refine their approach.


Ahuja, M.K., & Carley, K.M. (1998). Network structure in virtual organizations. Journal of Computer-Mediated Communication, 3(4). Available online: <>.

Brem, S.K., Russell, J., & Weems, L. (2001). Science on the Web: Student evaluations of scientific arguments. Discourse Processes, 32, 191-213. Available online: < tm>.

Klinger, S. (2000). "Are they talking yet?": Online discourse as political action. Paper presented at the Participatory Design Conference, CUNY, New York. Available online: <>.

Sales, B.D., & Folkman, S. (2000). Ethics in research with human participants. Washington, DC: American Psychological Association.

Stemler, S. (2001). An overview of content analysis. Practical Assessment, Research & Evaluation, 7(17). Available online: <>.

