By Melinda Young

IRBs and researchers should change their old habits when it comes to assessing studies for privacy and confidentiality. When HIPAA’s privacy rules first went into effect, it was not possible for the average armchair sleuth to drill down to specific people based on little more than some health and demographic characteristics and a study site’s location.

Now, it is. Researchers recently showed that de-identified data could be used to find a specific person. Using a mathematical model in databases of more than 200 populations, researchers found they could correctly re-identify 99.98% of Americans, using 15 demographic attributes.1

The study authors concluded that even heavily sampled, anonymized data sets are not protected from re-identification, challenging the adequacy of the de-identification release-and-forget model.1
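The intuition behind that finding can be shown with a toy example: each added demographic attribute splits people into smaller groups until many records are one of a kind. The sketch below is a simplified illustration, not the generative model used in the cited study. The records, the attribute order, and the `unique_fraction` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birth_year, sex, marital_status).
# These values are invented for illustration only.
records = [
    ("30301", 1980, "F", "married"),
    ("30301", 1980, "F", "single"),
    ("30301", 1980, "M", "married"),
    ("30301", 1975, "F", "married"),
    ("30302", 1980, "F", "married"),
    ("30302", 1980, "F", "married"),
]


def unique_fraction(rows, n_attrs):
    """Return the fraction of rows whose first n_attrs values match no other row."""
    keys = [row[:n_attrs] for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


# Report how the share of one-of-a-kind records changes as attributes are added.
for n in range(1, 5):
    print(n, "attribute(s):", round(unique_fraction(records, n), 2))
```

In this invented data set, no record is unique on ZIP code alone, but four of the six are unique once all four attributes are combined. A unique record is one that an outsider with the same details could match to a single person.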

“This is astounding information,” says Michele Russell-Einhorn, JD, chief compliance officer and institutional official for Advarra.

Even if identifying individuals is not as easy as that sounds, it is far more likely that research subjects’ identities could be discovered in 2019 than it was in 1999.

“Today, you have banks with cameras, streets with cameras,” Russell-Einhorn says. “If you have Alexa [Amazon Echo] in your house, you have lost privacy protection.”

Regulatory language on privacy and the way IRBs view privacy are not adequate for current privacy challenges, she says. “We need to get together and come up with the best language and best approaches for our research participants,” she adds. “I’m concerned that the non-research landscape is eliminating privacy and confidentiality, and I think that we need to make a decision about how we want to impact the research landscape.”

IRBs cannot lose sight of the fact that research participants are volunteers. “We’re faced with the fact that people choose to participate in research. We need to be respectful of their choices, and we need to maintain the public’s trust in research,” Russell-Einhorn says.

It also is the IRB’s responsibility to ensure investigators have considered privacy and confidentiality thoroughly.

“Investigators are incredibly overwhelmed with different responsibilities in conducting research, and they might not think about the exact words they use in informed consent about privacy,” Russell-Einhorn says. “They think they can say, ‘We guarantee your privacy and confidentiality, and no one but study team members will ever have access to your data.’ That’s simply not true; no one can guarantee this because secondary research could see the data.”

If investigators are not thinking through this kind of wording in informed consent and what the implications are, then it is up to IRBs to think it through, she says.

“I don’t fault them because they have way too much on their plates,” she adds. “But we need organized discussions on how we want privacy and confidentiality handled in research, and we need to include investigators, IRBs, and industry sponsors.”

IRBs and investigators also should consider the reality that their past ways of de-identifying data might no longer be adequate.

“If you think about research and data — the diagnosis, height, weight, and lab counts — there’s lots of other information that could be identifiable depending on the disease or condition and what’s being studied,” Russell-Einhorn says.

IRBs should consider all the risks and benefits of a study, including the potential for a breach of privacy. They also should ensure that privacy and confidentiality are not guaranteed in the informed consent, she says.

“The first thing to do is make sure the consent form is understandable and honest,” she adds. “The worst thing an informed consent form could say is ‘We guarantee your information will never be shared and de-identified’ because I don’t know how anyone can guarantee that.”

Instead, IRBs should encourage language that looks more like: “We’ll do our best and set up the research in such a way to keep your information as restricted as is possible,” Russell-Einhorn says.

It also is fine to say in an informed consent form, “Notwithstanding our best efforts, it is always possible that de-identified information will be accessed.”

IRBs also could give extra consideration to the risks of re-identification in studies that enroll a very specific minority population, Russell-Einhorn says. For example, a sociobehavioral study that enrolls a transgender population could be at risk of inadvertent identification of subjects simply because this population in any given research institution or area could be very small.

“That’s where an IRB could say, ‘If you put down where the study takes place and you talk about transgender issues, it could be identifiable,’” Russell-Einhorn explains. “If you don’t list certain demographic information, then you are more likely to have data that is not identifiable, which — in the case of transgender research — might be a better and more proactive way of protecting those individuals.”
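One rough way to see the effect of withholding a detail such as study site is to count the smallest group of participants who share the same combination of reported attributes. This is the idea behind k-anonymity, which is not named in the interview and is used here only as an illustration. The rows, column layout, and `smallest_group` helper below are hypothetical.

```python
from collections import Counter

# Hypothetical de-identified rows: (study_site, gender_identity, age_band).
# These values are invented for illustration only.
rows = [
    ("Site A", "transgender", "30-39"),
    ("Site A", "cisgender", "30-39"),
    ("Site A", "cisgender", "30-39"),
    ("Site B", "transgender", "30-39"),
    ("Site B", "cisgender", "30-39"),
    ("Site B", "cisgender", "30-39"),
]


def smallest_group(data, columns):
    """Return the size of the smallest group sharing the same values in `columns`."""
    counts = Counter(tuple(row[i] for i in columns) for row in data)
    return min(counts.values())


# With the study site reported, at least one participant stands alone.
with_site = smallest_group(rows, columns=(0, 1, 2))
# With the study site withheld, every participant shares a profile with someone.
without_site = smallest_group(rows, columns=(1, 2))
print("smallest group with site:", with_site)
print("smallest group without site:", without_site)
```

A smallest group of one means a record points to a single participant. In this invented data set, withholding the site raises the smallest group from one to two. Real studies would need far larger groups to offer meaningful protection.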


  1. Rocher L, Hendrickx JM, de Montjoye YA. Estimating the success of re-identifications in incomplete datasets using generative models. Nat Commun 2019;10:3069.