“Big data” is becoming increasingly important in healthcare, with the Precision Medicine Initiative and numerous other quality initiatives seeking de-identified information to improve care. Some ethical concerns include the following:

  • Even de-identified data carries privacy risks.
  • Individuals who volunteer their data don’t always benefit.
  • With “blanket” consents, data could later be used in a way that is objectionable to the individual.

President Barack Obama’s Precision Medicine Initiative seeks 1 million volunteers to share genetic data and biological samples. The goal is to develop targeted approaches to diseases, but there are important ethical implications as well, says Melinda C. Hall, PhD, an assistant professor of philosophy at Stetson University in DeLand, FL.

“While we clearly know the importance of an individual’s right to privacy and informed consent, we know less about the benefit to the public good of massive, aggregated health data,” says Hall.

Hall names the following two important ethical questions:

  • Will those who volunteer their data benefit or be fairly compensated?
  • Who controls the data once it is collected?

The promise of huge advances in health and medicine from whole genome sequencing is still relatively unfulfilled. “And it may remain so,” adds Hall. “We should not be duped into skipping over concerns about informed consent and privacy rights for speculative goods.”

Current regulations on research and patient privacy don’t require consent for de-identified data. “So if it’s de-identified, you can use it. But patients may still object that they don’t want their data used,” says Sharona Hoffman, JD, professor of law and bioethics at Case Western Reserve University School of Law in Cleveland.

It is unclear whether data can ever truly be de-identified. Publicly available information, such as voter registration records, might enable experts to link de-identified data back to individuals in some cases.

“People should be aware that their data can be put to lots of uses that they might not know about,” says Hoffman. One example is “data brokers” who mine data from numerous publicly available sources, including hospital discharge records, and sell it to interested parties. “There are questions about whether privacy is adequately maintained,” says Hoffman. Inadequate privacy protection can lead to discrimination or to people being harassed by aggressive marketers.

Some argue that if individuals choose to withhold their data from research, they become “free riders” benefiting from medical research without contributing. “People are grappling right now with whether, and to what degree, the common good should prevail over individual autonomy,” says Hoffman.

One question is whether it’s ethically permissible for researchers to obtain a “blanket” consent for de-identified data that’s likely to be used for numerous purposes in the future.

“You are basically providing consent and not knowing what it will be used for,” says Hoffman. “Is that meaningful, or should you be contacted every time a researcher wants to use the data for a new project?” A patient might willingly consent to his or her data being used for a particular research project. That same person might vehemently object to the data being used years later for cloning or stem cell research.

“There will have to be more public education,” says Hoffman. “The public needs to understand the huge amount that we can do in terms of records-based research, but also understand the risks.”

Distributive justice is the primary ethical issue with use of de-identified health data, in Hall’s view. “The individuals and populations from whom data is collected should, eventually, benefit from the collection of that data,” she says.

In some cases, individuals aren’t even aware their health data is being collected. This raises concerns about informed consent. “When data is collected from individuals who have not consented to the data’s use in research, that data is unethically used,” says Hall.

Participating in a clinical trial, which itself has no therapeutic value, is a different matter from allowing one’s data to be collected after lab work or during clinical visits, says Hall.

“Yet, the data collected is a primary driver of profit for corporations, including those in the pharmaceutical industry,” Hall says. Those from whom the data is collected are not being paid. In some cases, they pay to share it.

“So one distributive question is: Why don’t those who provide data profit from data?” says Hall.

For example, thousands of consumers pay to share their genomic data with 23andMe, a company that sells limited genetic analysis, but also acts as a massive biobank of that genetic data. Individual users send in a saliva sample and can choose to answer a variety of survey questions regarding lifestyle. “This valuable genetic data and related health data contributes to 23andMe’s billion-dollar valuation, while users do not profit-share,” says Hall.

Data should benefit those from whom it is collected, says Hall, either in terms of monetary compensation or health-related benefits. The widely publicized 2013 publication of Henrietta Lacks’ genome without permission from her descendants brought this issue into the spotlight.¹

“A highly profitable immortalized cell line resulted from her biopsy,” notes Hall. “This health data is used in research, yet does not benefit those from whom the data was collected.”

De-identified health data should benefit the public and future generations, says Hall, “while at the same time providing tangible benefit for the person whose data is collected.”


  1. Andrews BJ, DePellegrin T. HeLa sequencing and genomic privacy: The next chapter. G3: Genes, Genomes, Genetics 2013; 3(8):vii.


  • Melinda C. Hall, PhD, Assistant Professor of Philosophy, Stetson University, DeLand, FL. Phone: (386) 740-2507. Fax: (386) 822-7582. Email: mchall@stetson.edu.
  • Sharona Hoffman, JD, Professor of Law & Bioethics, Case Western Reserve University School of Law, Cleveland. Phone: (216) 368-3860. Email: sxh90@case.edu.