For Researchers Using Social Media Data, Consent Is Ethical Worry
Public health researchers are using social media data, but there is no clear guidance on how to use this information ethically.
Heidi E. Jones, PhD, MPH, and colleagues conducted a mixed-methods evaluation of a program designed to train physicians in leadership and advocacy for sexual and reproductive healthcare, including abortion and contraception. A secondary objective was to understand how alumni from the program used social media, including Twitter, for advocacy. “We had a long discussion with the funder and the program implementers about whether we should seek informed consent from program alumni to review their Twitter advocacy activities,” says Jones, director of the doctoral program in epidemiology at the City University of New York School of Public Health (CUNY SPH).
Researchers erred on the side of caution by asking for consent. Specifically, after describing how the data would be used, the online survey asked participants whether they wanted to opt in by providing their Twitter handle.
This led to the broader question: What are the ethical research standards? A group of researchers from the CUNY SPH, including students and faculty, conducted a systematic review of public health research using Twitter data to examine the extent to which the studies require ethical oversight.1 Investigators often interacted with different application programming interfaces (APIs) to select the sample of tweets to include in their studies. “But it was not clear whether they understood that which API they used could impact the selection process and the extent to which their sample was a representative sample of the target population of interest,” explains Courtney Takats, MPH, lead author of the study.
Researchers should make sure they understand how the different APIs work. This way, investigators will understand the strengths and limitations of their samples, according to Takats.
Of 367 eligible studies, only 119 sought IRB approval. Forty-nine percent of studies included some type of discussion about the need to anonymize quotes to protect the identity of Twitter users. None of the studies sought any type of informed consent.
“On the one hand, many researchers consider the data publicly available. On the other hand, many are uncomfortable presenting identifying information of Twitter handles,” Jones notes.
Researchers struggle with the ethical implications of sharing information that enables Twitter users to be identified. “More guidance is needed from IRBs, as there does not seem to be consensus on best practice in terms of research ethics,” Jones observes.
When public health researchers use social media, ethical issues revolve around an obvious potential public benefit and less obvious potential individual and group harms, says Bonnie Kaplan, PhD, FACMI, a scholar at the Yale Interdisciplinary Bioethics Center. The person sharing information on social media might not intend for those postings to be widely available. Social media postings and the metadata connected with those posts generally are not privacy protected, even if the posting is about someone’s health.
“The click-through agreements required to use social media commonly allow data to be used for many purposes. These agreements are far from informed consent,” Kaplan notes.
The data are identified, so they also might include identifiable information about others, such as recipients of the messages, friends of the person posting, or other individuals in photos.
“The person posting and all these other people are having data collected about them without their knowing about it,” Kaplan says.
The individuals subject to this data collection may be lumped into groups that characterize people in inaccurate or stigmatizing ways, or their data may be used for messages they disapprove of. These data can be combined with other information and used in ways they might not like, all without their knowledge or permission.
“These purposes may have nothing to do with public health. Even when the data are used for public health research, these individuals may not wish to participate in that research,” Kaplan says.
Additionally, the data could be available forever. They can be shared with other organizations, all for purposes the person using social media (let alone an IRB) will not know about.
“Even if identifiers are stripped, data can be subject to re-identification, especially when combined with other data,” Kaplan warns.
If IRBs are reviewing study protocols, they still may not be aware of the potential problems and how to guard against them. “All this is done without knowledge or consent of the people creating and responding to the social media postings,” Kaplan says.
Public health researchers routinely collect and analyze data from social media platforms, using key words and phrases to identify illness outbreaks, observes Tara Coffin, PhD, MEd, CIP, IRB chair and vice chair at WCG IRB. For example, Twitter content and accompanying geolocation data have been used in the research setting to track COVID-19 and other influenza-like illnesses. Similarly, Google searches for “food poisoning” combined with publicly available Yelp reviews have been used to identify foodborne illnesses.
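The keyword-based screening Coffin describes can be illustrated with a minimal sketch. This is not the method of any study cited here. The keyword list, the function names, and the sample posts are all hypothetical, and real surveillance pipelines involve far more than phrase matching.

```python
import re

# Hypothetical keyword list, chosen only for illustration.
KEYWORDS = ("food poisoning", "stomach flu", "fever")


def build_pattern(keywords):
    """Compile one case-insensitive pattern that matches any whole keyword or phrase."""
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def flag_posts(posts, keywords=KEYWORDS):
    """Return the posts whose text mentions at least one keyword."""
    pattern = build_pattern(keywords)
    return [p for p in posts if pattern.search(p)]


if __name__ == "__main__":
    sample = [
        "Got food poisoning after lunch",
        "Great weather today",
        "Running a FEVER since Monday",
    ]
    for post in flag_posts(sample):
        print(post)
```

Even a filter this simple reads every post it is given, which is why the consent and privacy questions raised in this article apply from the first step of data collection.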
“In the context of mental health surveillance, social media usage has been used as a marker for depression and secondary trauma,” Coffin says.
Scientists also use social media platforms to support research recruitment efforts. “While online outreach like this comes in many different forms, online behavioral advertising stands out because of its reliance on stored user data,” Coffin explains.
Essentially, investigators pay for an advertisement service through a social media platform. This allows for targeted outreach based on online behavior.
“Information about a user’s age, sex, occupation, race/ethnicity, education, income, and internet search history is leveraged through this service,” Coffin says. This ensures only people who meet specific criteria set by the research team will see the advertisement. “There is often a discordance between what a social media user perceives to be private vs. what actually is,” Coffin notes.
Any online action, social media post, or scrolling behavior is recorded in some capacity. Thus, it may be used for disease surveillance or research recruitment purposes, without the internet user understanding how their online data are stored and mined. “The absence of informed consent within online surveillance systems is frequently noted as a significant ethical concern,” Coffin reports.
When designing study protocols that include online behavioral advertising for outreach, researchers should clearly disclose these plans to the IRB.
“Importantly, partnering with patient advocacy groups to drive outreach can function as an effective substitute, benefiting from established social networks instead of relying on information mined without user consent,” Coffin offers.
Researchers also should treat online, user-generated data as “private,” even when it is not, according to Coffin. Individuals posting to their own social media pages typically operate with the assumption that the content they generate belongs to them.
“Given the multitude of ways that our online data can, and is, used, this assumption isn’t 100% accurate,” Coffin stresses.
For example, Twitter data can be mined retrospectively or in real time, including geolocation, post content, and other information connected to the user’s profile. Anyone can collect these data, save them on their personal computer, and use them.
“Researchers should take steps to de-identify, securely store, and scrub non-health-related content from these data sources in the interest of preserving user autonomy and mitigating potential harm,” Coffin says.
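As a rough illustration of the de-identification step, the sketch below replaces a handle with a keyed hash, masks @mentions and links in the text, and drops every other field (such as geolocation). The record layout (`handle`, `text`, `geo`), the function names, and the secret key are assumptions made for this example, not part of any platform's API or of Coffin's recommendations.

```python
import hashlib
import hmac
import re

MENTION_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+")


def pseudonymize(handle: str, secret_key: bytes) -> str:
    """Map a handle to a keyed hash so posts can be linked without storing the handle."""
    digest = hmac.new(secret_key, handle.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify_post(post: dict, secret_key: bytes) -> dict:
    """Keep only a pseudonymous author ID and masked text; all other fields are dropped."""
    text = MENTION_RE.sub("@user", post["text"])
    text = URL_RE.sub("[link]", text)
    return {"author": pseudonymize(post["handle"], secret_key), "text": text}


if __name__ == "__main__":
    # Hypothetical record; the field names are assumptions for this sketch.
    record = {
        "handle": "@jane_doe",
        "text": "Feeling sick, thanks @dr_smith https://t.co/abc",
        "geo": (40.7, -74.0),
    }
    print(deidentify_post(record, secret_key=b"replace-with-a-real-secret"))
```

Masking of this kind does not make a post anonymous. As Kaplan notes, data can be re-identified when combined with other data, and verbatim post text can often be found with a simple search, which is one reason some studies paraphrase or withhold quotes.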
1. Takats C, Kwan A, Wormer R, et al. Ethical and methodological considerations of Twitter data for public health research: Systematic review. J Med Internet Res 2022;24:e40380.