Like a geologist identifying strata of rock, Elizabeth Buchanan, PhD, describes three distinct eras of the internet as a way of coming to grips with its profound implications for human research: the Old Ways, Social Media, and Big Data.
Unlike our traditional view of geology with eons of shifting lands and tides, we are witnessing rapid change as new ways to manipulate and aggregate internet and digital data threaten to outpace our understanding of their research implications. By way of example, Buchanan cites the 2013 internet research guidelines [1] by the Secretary’s Advisory Committee on Human Research Protections (SACHRP), which were issued amid ongoing data changes that continue to escalate.
“Many IRBs drafted their guidelines around those,” says Buchanan, interim IRB research administrator at the University of Wisconsin-Stout in Menomonie. “But even in a three-year time span (the SACHRP document took a couple of years to write, so it started probably in 2011 and then was actually published in 2013), a lot had changed. Right as we think we have this figured out, here comes big data. We are again rethinking some of these data concepts and what these issues mean. Because we are not just seeing new forms of data; we are seeing all new methodologies and technologies that did not exist five years ago. The sophistication of these technologies just keeps increasing so we are continually seeing [change].”
Buchanan froze this blur to a snapshot recently in Long Beach, CA, at the annual conference of the Association for the Accreditation of Human Research Protection Programs (AAHRPP). In addressing the rapid evolution of the internet and its implications for human research, she traced cyber history back to the old days, when there was relative anonymity and a perception of control and ownership over the sites we visited and the data we downloaded.
Today, all bets are off. Ninety percent of the data in the world was created in the last two years, she estimates. The 10-year labor of decoding the human genome can now be done in roughly a week, she adds. IRB Advisor recently reached Buchanan at a conference in Dublin, Ireland, and asked her about her AAHRPP presentation and the implications for a rapidly changing future.
IRB Advisor: Can you talk a little about how we have moved from this era of internet anonymity to one of identifiability?
Buchanan: I started doing this work in the mid-1990s and at that point there were many avenues and opportunities to remain anonymous to some degree in online experiences. And that’s not that long of a time — we are talking 15 to 20 years here — when what I call the “second phase” of social media really took hold. That was 2005 to 2006. If you think about the nature of social media, you can’t be anonymous. The whole point of social media is its interconnectedness, interrelations, and [showing] one’s presence and persona. So the whole idea is to be visible.
That is where the shift comes from anonymity — and let me say that some hardcore computer scientists would say, “We never had anonymity,” but that’s a different conversation. For our purposes, this shift is really significant and it pushed us into this third phase of big data.
IRB Advisor: You note in your presentation that big data research relies on algorithms and predictive analytics, but there have been a few public surprises and attendant outrage with the way the numbers can be crunched. We have recently seen dating site details about people released, with the researchers arguing it is already public data. Researchers have also shown that internet search patterns can reveal people with disease by their queries about symptoms.
Buchanan: We are starting to see big data now take this kind of next step. I think that people initially thought with big data that these data sets were so huge, and you needed such powerful computing and skill to be able to drill down to an individual level, that we didn’t have to worry about it from a human subject perspective. We are starting to rethink that and starting to see the human subjects’ research level [can be revealed] even in these massive data sets.
We shouldn’t be quite so surprised anymore. Think of the analogy of when [a retail business] has a data breach. We’re like, “Oh, no!” We get upset for a minute and then we go right back to doing the exact same thing; our online shopping habits don’t change. So I think the public needs to realize that this is the new reality. The researchers, on the other hand, need to really respect the privilege that we have in engaging in research. I think it’s vastly important that we use these data in responsible ways. There’s something like 5 quintillion bytes of data right now, and I know that’s a lot of data. We want to really think carefully about the research questions that we are asking and the methods that we are using to answer them. It really comes down to making sure our methods and our ethics are intact.
IRB Advisor: Is informed consent possible in such a research environment?
Buchanan: I think there are many levels and it is complicated. That’s why we are having this discussion everywhere for IRBs. We’re participants first and foremost in these other tools — in Facebook, and using Google and Bing as search engines. So at the first level we are participating, consenting to those terms. It’s only at the second level that the research consent becomes a consideration. So we are getting to this point [where researchers may say,] “Well, it’s already public data. Facebook, Google or Bing has already collected it according to their terms.”
And yet it is not necessarily [public] in terms of research. I think that’s where we are really conflating the two and saying, “Because you consented to use some product, that automatically transfers to consent in a research project to use those data.” That’s kind of a high-risk assumption. I think you have to look at it on a case-by-case basis. I couldn’t say all internet research should have consent waived, but I think there are many times where obtaining documents and consent might be truly impractical. But I can’t say, simply, “Because you are doing internet research, you can’t possibly get consent.” I don’t think that is true, either.
IRB Advisor: Is the primary role of the IRB in this type of research as a watchdog, assuring the protocol is ethical and if informed consent is waived, the subjects cannot later be identified?
Buchanan: Our role is to ensure that, first and foremost, persons are protected, and that we are sure that the research is scientifically valid and generalizable and contributes to some knowledge base. So with that, I think it is going to be harder and harder for IRBs to serve in that role when we are not talking about the human subject in the traditional regulatory sense or even in the philosophical sense of a human subject. We are talking about a “data subject,” something that exists external to us that we partially created based on our data inputs and outputs every day. But it is also based on these other operations and machinations that are always going on behind the scenes in third-party software.
So there is only so much the IRB can control in all of the things that are happening in this concept of the data subject. As we look more and more at this kind of research, which is in every discipline now, I think IRBs have a role in really reminding individuals and researchers about the basic ethical principles. Our traditional research ethics principles — all the issues with trust and dignity, because so much of what is happening is seemingly out of our control. A good majority of U.S. IRBs have or rely on some form of internet research guidance. (See editor’s note below for resources.)
IRB Advisor: This is starting to sound like Future Shock, the book by the late futurist Alvin Toffler that predicted change would occur faster than our ability to adapt.
Buchanan: I think we are very unsure of what we all want this to look like. Right now, I am in Dublin for a research ethics workshop and they are facing this whole new slate of EU regulations around data privacy. They are having a very different conversation than we are having in the states. We hear about being globally connected — the global web — and yet we have very different philosophical approaches to privacy. We have very different political approaches to how we enact these laws. So it continues to be really challenging. The best approach we find [may be] best practices for like cases and continuing to talk to each other. I think one of the worst things that can happen is that we get so scared and so shocked that we just stop doing this. I don’t think that can happen at this point. The cat’s out of the bag. I think it would be dangerous to try to shut down a lot of these new forms of research.
IRB Advisor: Of course, with all the risks comes the potential for great reward if this data can be harnessed and used ethically.
Buchanan: I think we have a lot of opportunities in science and medicine right now with these forms of data and the power of big data for computing. We’re able to do things in a day now that would have taken years not too long ago. There are tremendous ways in which this is going to be beneficial to society.
We are in some growing pains right now. Think about our own lifetimes. I was not born a digital native. I grew up and learned how to use a computer when I went to college. It’s very different now, and I think societally and culturally we are kind of learning what it means to have a generation that grew up digitally, one that experiences life in a very different way than what a large portion of our society has grown up with. There are going to be more of these “shock” cases that will continue to pop up, but I hope we can continue doing the best we can as educators for ethical research.
Editor’s note: Buchanan has compiled a list of IRB internet human research policies from a wide variety of institutions at: http://bit.ly/29sz45Y.
1. SACHRP. Considerations and Recommendations Concerning Internet Research and Human Subjects Research Regulations, with Revisions. Final document approved at SACHRP meeting March 12-13, 2013: http://bit.ly/29CoDyi.