One of the more complicated issues that social, behavioral, and educational research (SBER) investigators and IRBs face is how to de-identify data from qualitative studies.

De-identifying data in a medical record for a clinical trial is very different from de-identifying data in a SBER study, says Laura A. Henderson, MA, senior IRB administrator of the committee on the use of human subjects at Harvard University in Cambridge, MA.

Removing identifiers in biomedical trial data requires a different set of considerations than removing identifiers in qualitative SBER research.

“Quantitative data can be much more readily de-identified and in a much cleaner way,” Henderson says. “In quantitative research, you might have an Excel spreadsheet and have separate fields for names, addresses, ZIP codes, demographic variables, and those kinds of things.”
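As a rough illustration of the spreadsheet case Henderson describes, the sketch below drops identifier columns from tabular data. The column names, the sample rows, and the `strip_identifier_columns` helper are all hypothetical and not taken from any study. The sketch only removes the listed fields. It does not assess whether the remaining variables could identify someone in combination.

```python
import csv
import io

# Hypothetical column names; a real spreadsheet would have its own.
DIRECT_IDENTIFIERS = {"name", "address", "zip_code"}


def strip_identifier_columns(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the listed identifier fields."""
    return [
        {field: value for field, value in row.items() if field not in identifiers}
        for row in rows
    ]


# Made-up sample data standing in for an exported spreadsheet.
SAMPLE_CSV = """name,address,zip_code,age_group,score
Jane Doe,1 Main St,02138,30-39,4
John Roe,2 Elm St,02139,40-49,2
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
deidentified = strip_identifier_columns(rows)
print(deidentified)
```

Because each identifier sits in its own field, removal is a matter of naming the columns to drop. Free-text field notes have no such structure.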

Stripping data of those identifiers is simple. “Qualitative data is way different,” Henderson says. “In my field of cultural anthropology, you might look at extremely detailed field notes: life histories, for instance.”

Investigators and IRBs must ask what de-identification looks like in this type of research. How do they take something that is, basically, defined by its context and richness of story and personal meaning, and de-identify it?

“In my field, if we’re interviewing somebody, it’s all about how their story is reflective of larger social realities,” Henderson says.

This information can be useful in research studies beyond the original study, she notes.

“We want to use it to its greatest capacity,” Henderson says. “People can examine what you’ve done and hold it up for peer review and take data and think about it in different ways and enrich common knowledge through new analysis.”

Researchers could replicate a study, building on existing data. They could also analyze one data set in light of another researcher’s data set.

“There’s a growing interest in pushing people to share their data and enabling them to provide an infrastructure to share their data,” Henderson says.

But this is where it becomes challenging. There are no easy answers to the dilemma of how to de-identify this type of information without rendering it useless.

“If you take out all of the stuff that can betray the identity of somebody, do you have anything left that can be used, or is it eviscerated of its core meaning?” Henderson asks. “Can you contribute something that actually has integrity and meaning if it’s that deeply qualitative?”

For example, a researcher could replace the name of the person’s hometown with that of another town, or describe the person as an aesthetician rather than a manicurist. These types of changes will mask the person’s identity, she explains.
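The kind of substitution described above could be sketched as a simple find-and-replace over a field note. The town names, the sample note, and the `mask_details` helper are invented for illustration. The sketch assumes the researcher has already decided, by hand, which details to swap. It does nothing to detect identifying details on its own.

```python
import re

# Hypothetical substitutions chosen for illustration only.
SUBSTITUTIONS = {
    "Springfield": "Riverton",
    "manicurist": "aesthetician",
}


def mask_details(text, substitutions=SUBSTITUTIONS):
    """Replace each listed term (whole words, case-sensitive) with its stand-in."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in substitutions) + r")\b"
    )
    return pattern.sub(lambda match: substitutions[match.group(1)], text)


# Made-up field note.
note = "Grew up in Springfield; worked for years as the only manicurist in town."
masked = mask_details(note)
print(masked)
```

A swap like this hides the listed details, but it also replaces whatever meaning the original town or occupation carried in the person’s story.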

“But can it then be misunderstood because too much meaning has been stripped from it?” Henderson asks. “Could something be misconstrued because it has lost all of that context?”

For example, what if the person being studied lived in a town near a former nuclear test area and this person’s childhood traumas were influenced by that regional context? Stripping the person of that place identity would change the way researchers might view the person’s social-behavioral issues.

Another issue to consider is how the de-identification is described in the informed consent.

Sometimes investigators anticipate an IRB requiring data destruction and will add to the informed consent that all identifiers will be destroyed in one or two years, Henderson says.

“They box themselves in to do it,” she says. “They don’t think through the other ways they could collect data and organize it for use further down the road.”

Also, the process of de-identification might conflict with what the study participant agreed to in the informed consent process.

“This might be a person who spent hours pouring their heart out about stuff deeply personal to them,” Henderson says. “If they saw that the next step of representation is to share the data, and all those things are missing from it making it much more generic in nature, then that could be a problem in how it might not jibe with the spirit of the consent process.”

IRBs will encounter this type of issue with long-term qualitative research, where the researcher is building up a relationship with informants over time, she says.

“That process of sharing information back to them and checking back with them is growing in acceptance within the framework from an IRB perspective, as part of a continual informed consent process,” she adds.

Sometimes, investigators might share with participants how their information has been changed. Henderson has seen this type of process described in protocols and sometimes intertwined with a research orientation that is informed by certain social theories.

“The best example of this is participatory action research,” she says. “It cuts across disciplines, but people who use that approach tend to be more on the humanistic side: public health, anthropology, smaller groups of individuals.”

Sharing research sometimes comes up when investigators want to return something of value to the community and to speak accurately on its behalf, Henderson says.

Also, some people who are interviewed for SBER studies have a personal interest in staying identified.

For example, Henderson’s thesis project was about trafficking and bonded labor. She met with youths at a rehabilitation center, which also served as a political action camp. The rescued young people were given literacy training, food, and housing, and they were taught to view their experiences differently. Rather than think of themselves as victims with no control over their lives, they were taught to transform their own narratives and stories to fit into a larger human rights narrative, she says.

As a result, the children, who were of various ages, wanted to use their real names. They had no parental supervision and they had universally been abused. “They were seeking the public eye,” Henderson says.

The research took place in the 1990s in India, and Henderson received approval after a full board review.

In qualitative research, the IRB’s role in de-identification will depend on the phase of data use. If de-identification occurs for a secondary use, then the original IRB might no longer be involved, Henderson says.

“Investigators might take out certain identifiers before putting the data in a repository,” she says. “De-identification is time-consuming and very difficult to accomplish.”