New research from the Johns Hopkins Armstrong Institute for Patient Safety and Quality in Baltimore suggests that most of the measures used by government agencies and public rankings to rate the safety of hospitals are not accurate or reliable. Of 21 quality measures studied, only one was deemed valid.
The study, published in the journal Medical Care, assessed the Agency for Healthcare Research and Quality (AHRQ) patient safety indicators and the CMS hospital-acquired condition (HAC) measures, both used in public rating systems. (An abstract of the study is available online at http://bit.ly/1ZBixPZ.)
The research was conducted by Bradford Winters, MD, PhD, associate professor of anesthesiology and critical care medicine at Johns Hopkins Medicine, and Peter Pronovost, MD, PhD, director of the Johns Hopkins Armstrong Institute for Patient Safety and Quality.
Their study notes that, as part of efforts in recent years to make healthcare more transparent, hospitals have increasingly been called on to publicly report their performance on quality-of-care measures. Some of the most prominent measures come from AHRQ and CMS: patient safety indicators (PSIs) and HACs.
The accuracy of those measures is compromised because they are derived from billing data entered by hospital administrators rather than clinical data obtained from patient medical records, the authors wrote. Hospitals code medical errors and other quality metrics inconsistently, making the resulting scores unreliable for comparing one hospital with another, they explained in the report.
The scores end up being more a reflection of how hospitals code data than a measure of quality, the authors concluded. To reach that conclusion, Winters and Pronovost analyzed 19 studies conducted between 1990 and 2015 that directly addressed the validity of the HAC and PSI measures, as well as information from the websites of CMS, AHRQ, and the Maryland Health Services Cost Review Commission. They compared errors listed in medical records to billing codes found in administrative databases, deeming a measure valid if the medical record and the administrative database matched at least 80% of the time.
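The 80% criterion can be illustrated with a short calculation. The sketch below is not the authors' method. It assumes that "matched" means the share of events coded in the billing data that a chart review confirms, which is one plausible reading of the article. The `CaseReview` record, the function names, and the sample cases are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# The 80% cutoff reported in the article; how "match" is defined here is an assumption.
VALIDITY_THRESHOLD = 0.80


@dataclass
class CaseReview:
    """Hypothetical record pairing a billing code with a chart review finding."""
    case_id: str
    billing_flagged: bool   # event appears in the administrative (billing) database
    chart_confirmed: bool   # event is documented in the medical record


def agreement_rate(cases: List[CaseReview]) -> Optional[float]:
    """Share of billing-flagged cases confirmed by chart review (None if no flagged cases)."""
    flagged = [c for c in cases if c.billing_flagged]
    if not flagged:
        return None  # insufficient data to evaluate the measure
    confirmed = sum(1 for c in flagged if c.chart_confirmed)
    return confirmed / len(flagged)


def meets_threshold(cases: List[CaseReview], threshold: float = VALIDITY_THRESHOLD) -> bool:
    """True when the agreement rate reaches the threshold."""
    rate = agreement_rate(cases)
    return rate is not None and rate >= threshold


if __name__ == "__main__":
    # Made-up sample: 10 billing-flagged cases, 7 confirmed on chart review.
    sample = [CaseReview(str(i), True, i < 7) for i in range(10)]
    print(agreement_rate(sample), meets_threshold(sample))
```

Under this reading, a measure for which only seven of 10 coded events are confirmed in the chart would fall short of the cutoff, and a measure with no coded events to compare could not be evaluated at all.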
Sixteen of the 21 measures developed by AHRQ and CMS had insufficient data and could not be evaluated for validity, leaving five measures with enough information for analysis: iatrogenic pneumothorax (PSI 6/HAC 17), central line-associated bloodstream infections (PSI 7), postoperative hemorrhage/hematoma (PSI 9), postoperative deep vein thrombosis/pulmonary embolus (PSI 12), and accidental puncture/laceration (PSI 15). Of those five, only PSI 15 was found to be valid.
Even PSI 15 warranted skepticism because the data was so heterogeneous, the authors reported. For all five measures analyzed, coding errors were the most common reason for discrepancies between medical records and administrative databases.
The researchers said they hope their work will lead to reform and encourage public rating systems to use measures that are based on clinical rather than billing data.
“This systematic review finds that there is limited validity for the PSI and HAC measures when measured against the reference standard of a medical chart review,” the authors concluded. “Their use, as they currently exist, for public reporting and pay-for-performance, should be publicly re-evaluated in light of these findings.”
Earlier research has also cast doubt on the validity of quality measures. In a 2015 report, researchers from the University of Massachusetts Medical School in Worcester and Swedish Cherry Hill Family Medicine Residency in Seattle determined that there was little evidence that common quality measures lead to improved outcomes. (The study is available online at http://bit.ly/1WXyO55.)
“These measures are often based on easily measured, intermediate endpoints such as risk-factor control or care processes, not on meaningful, patient-centered outcomes; their use interferes with individualized approaches to clinical complexity and may lead to gaming, overtesting, and overtreatment,” the authors concluded. They called for more focus on patient-centered performance measures such as medication reconciliation in the home after discharge, screening for and addressing fall risks, and the patient’s self-assessment of health status over time.
“Quality measures should reflect that a provider has elicited, explored, and honored patient values and preferences, and not merely indicate whether a test or intervention has been performed,” the authors wrote. “To do otherwise strikes at the heart of patient-centered care. Because most healthcare interventions carry risk of causing harms, measures should reflect overutilization as well as underutilization of care.”