Reliability science: Ensure system success even when components fail

Industrial-engineering approach applicable to health care, experts say

Is your health care facility reliable? If it wasn’t, would you know it — and would you know how to turn things around? While most of us would be inclined to reply in the affirmative, recent studies indicate that when judged by the more rigorous quality standards being applied today, few facilities in the United States would pass muster.

If a health care system is reliable, argues the Boston-based Institute for Healthcare Improvement (IHI), all patients can expect to always receive evidence-based, effective care when they need it. In a recent study published by the RAND Corp., however, it was reported that for many clinical conditions with known best practices for quality care, only 50% of patients received care consistent with such recommendations.1 In fact, in a follow-up paper, the researchers asserted that "performance was not better in areas with outstanding medical institutions."2

"Almost any study that talks about evidence-based care concludes that [it is delivered] at best at somewhere near 90%, and usually at 70%," says statistician Tom Nolan, PhD, a senior fellow with IHI and a member of Associates in Process Improvement, a small consulting firm with offices in several cities across the country.

Nolan became involved with the subject of health care reliability several years ago and has been particularly impressed with the Robert Wood Johnson Foundation’s "Pursuing Perfection" project, for which IHI is a national program office. "The whole idea is to raise the performance bar; 90% is not good enough anymore," he asserts. "So it becomes even more imperative to think about the best science around."

There is actually a science of reliability, Nolan says. "There are two bodies of knowledge relevant to the issue of reliably matching patient needs and delivery of care," he explains. "The first might be reliability engineering, of the type made famous by the space program: Can you design a system so that if a component fails, the whole system doesn’t?" Reliability engineering is concerned with redundancy, the practice of having backup systems that can come on-line when primary systems fail.
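To make the redundancy idea concrete, the short sketch below estimates how often a system works when it has one or more backups. It is an illustration only, not a method Nolan describes: it assumes every component works with the same probability and that components fail independently of one another, and the function name `redundant_reliability` is hypothetical.

```python
def redundant_reliability(component_reliability: float, copies: int) -> float:
    """Probability that at least one of `copies` independent components works.

    Assumes identical, independently failing components (an illustrative
    simplification; real backups often share failure causes).
    """
    if not 0.0 <= component_reliability <= 1.0:
        raise ValueError("component_reliability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system as a whole fails only if every copy fails.
    return 1.0 - (1.0 - component_reliability) ** copies


# One, two, and three copies of a component that works 90% of the time.
for n in (1, 2, 3):
    print(n, round(redundant_reliability(0.9, n), 4))
```

Under these assumptions, each added backup cuts the chance of total failure by the same factor, which is why a second check on a 90% process can matter so much.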

"Reliability science is also concerned with very complex systems and trying to understand their potential failure modes," says Nolan, citing the second critical body of knowledge.

IHI has been sufficiently impressed with reliability science that it scheduled a seminar on the topic for a June 28 session in Boston. The one-day seminar was to provide an overview of the key concepts of reliability science and share how organizations working with IHI’s IMPACT network will be applying these key concepts to improve outcomes in five diagnoses (acute myocardial infarction, coronary artery bypass graft [CABG], heart failure, community-acquired pneumonia, and hip and knee replacement). It was offered to quality officers and directors, chief medical officers, chief nursing officers, physicians, and senior leaders.

"Other industries have used these approaches very successfully; we need to see what they’ve done and how it can be applied to health care," says Frances A. Griffin, RRT, MPA, a director with IHI.

So what, exactly, constitutes reliable health care, and how does IHI plan to work with health care institutions to improve their performance? "I would say it is ensuring that health care processes occur consistently and safely so all patients all the time receive the right care at the right time," says Griffin. "So, for example, if a patient has pneumonia, the appropriate antibiotic is ordered every time and they get the appropriate dosage every time."

Even when we look at what we would call today’s high-reliability health care, "We aren’t even close to that," says Griffin. "When we look at our processes, we can’t guarantee the patient will get the right treatment at the right time every time."

But doesn’t this assume that human beings, who are fallible, can achieve perfection? "Look at an aircraft carrier," Griffin replies. "It’s a very complex system, with a tremendous amount of variables. It’s at sea; the weather conditions are changeable; the deck is wet; there’s oil on the deck from planes, which take off every 15 to 20 seconds, some of them loaded with nukes; there are a large number of employees, many of whom are very young. You have all these variables, so if something goes wrong and is not properly managed, the potential for catastrophe is huge — but how often do you hear about a catastrophe on an aircraft carrier?"

What organizations like the military have, she explains, are systems in place so that all the right things happen all the time under changing conditions. "Health care is very similar," she asserts. "It is very complicated, things are always changing, and every patient is different."

The bottom line, says Griffin, is that the "variability" argument posed by many in health care "is a nice denial excuse. Of course we are different, but health care organizations still have processes. Like those other organizations, we rely on processes, procedures, equipment, and human beings to run that equipment. You can’t say there’s nothing we can learn from industry."

While the move to reliability is in its early stages, Griffin and Nolan have some well-formed ideas on how to proceed. Nolan has come up with a three-level design for improving reliability. "It involves first preventing errors or defects; then you surface defects when you can’t prevent them," he notes. "For example, if a patient comes into the ED [emergency department] with community-acquired pneumonia [CAP], he should be put on a protocol. If he isn’t, what mechanisms do you have in place to recognize that the defect is there and correct it?" The third level, he adds, is mitigating the effects.
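One simplified way to picture how the three levels compound is sketched below. The model, the function name `unmitigated_defect_rate`, and the sample numbers are all assumptions made for illustration; they are not figures from IHI or Nolan.

```python
def unmitigated_defect_rate(defect_rate: float,
                            detection_rate: float,
                            mitigation_rate: float) -> float:
    """Fraction of patients affected by a defect that slips through all three levels.

    defect_rate:     share of cases where prevention fails (level 1)
    detection_rate:  share of those defects that are surfaced and corrected (level 2)
    mitigation_rate: share of the remaining defects whose effects are mitigated (level 3)
    All three rates are hypothetical inputs between 0 and 1.
    """
    for rate in (defect_rate, detection_rate, mitigation_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    undetected = defect_rate * (1.0 - detection_rate)
    return undetected * (1.0 - mitigation_rate)


# Hypothetical example: 20% of patients miss the protocol, 80% of those
# misses are caught and corrected, and half of the rest are mitigated.
print(round(unmitigated_defect_rate(0.20, 0.80, 0.50), 4))
```

The point of the sketch is only that each level works on what the previous level missed, so modest performance at each level can still yield a low overall rate.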

"You must have a preoccupation with failures," adds Griffin. "Look for any little failure. If a failure happens, study it very intensively. Health care today is not even close to that. We talk about routine conditions.’ You’ve got to get your arms around the system, know it inside and out, so you can prevent failures."

Both Griffin and Nolan draw upon the work of Karl E. Weick and his book Managing the Unexpected.3 "Karl Weick has done some of the best thinking on high reliability in organizations in industry, though some translation [into health care] needs to be done," says Nolan.

"He outlines the five characteristics of high-reliability organizations in fields such as nuclear power and aviation, where failures can be catastrophic," Griffin says. (See an outline of Weick’s five characteristics, below.)

In terms of moving forward with health care organizations, "The first thing we have to look at is standardization," says Griffin. "If you look at any of the five diagnoses, many organizations will say they have a protocol or a standard order set. But if you look at how they are being used, very few organizations have 100% of their patients being placed on the proper protocol or order set, either because the physician chooses not to, or the ED staff misses it, and so on. Or, if the protocol is initiated, one or some of the pieces may not be used. So, you have to actually standardize. This is a critical first step."

For example, the Centers for Medicare & Medicaid Services (CMS) has four quality measures for heart failure. "Everyone should get them," asserts Griffin. As for the first organizations IHI is working with, "We tell them they need to get to 80% to 90% before we move on to what we should look at on the next level."

Nolan offers this outline for applying reliability to the five diagnoses: "First, you need to have an overall structure," he advises. "Can we prevent the defects? Can we have a redundant component, so if we do not get on the right protocol, we can find out and correct it?"

One possible approach is to use markers, he says. "If someone is hospitalized for CHF [congestive heart failure], they almost always get Lasix or a strong diuretic," he notes. "The lab can do a check to see if the patient got the right meds, and if they find people not on the protocol, get them on it."
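A marker check of this kind could look roughly like the sketch below. The record layout, the `Admission` class, the `flag_missing_protocol` function, and the one-drug marker list are hypothetical; a real marker list would be set by clinicians and drawn from the hospital's own systems.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Admission:
    patient_id: str
    medications: List[str]
    on_chf_protocol: bool


# Illustrative marker list only. Lasix is a brand name for furosemide.
MARKER_DRUGS = {"furosemide", "lasix"}


def flag_missing_protocol(admissions: List[Admission]) -> List[str]:
    """Return IDs of patients who received a marker drug but are not on the protocol."""
    flagged = []
    for admission in admissions:
        has_marker = any(med.lower() in MARKER_DRUGS for med in admission.medications)
        if has_marker and not admission.on_chf_protocol:
            flagged.append(admission.patient_id)
    return flagged


# Made-up sample records.
sample = [
    Admission("A1", ["Lasix", "aspirin"], on_chf_protocol=True),
    Admission("A2", ["furosemide"], on_chf_protocol=False),
    Admission("A3", ["amoxicillin"], on_chf_protocol=False),
]
print(flag_missing_protocol(sample))
```

Here the check serves as the redundant component Nolan describes: it does not prevent the missed protocol, but it surfaces the miss so someone can correct it.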

The evidence-based option should be made the default, he argues. "The default is, use a protocol unless a doctor orders something else," Nolan explains. "It connects to habits and patterns."

He offers another method for improving reliability. "For certain surgeries, delivering antibiotics within one hour of surgery has been shown to be effective in preventing surgical site infections," he notes. "One common approach is, ‘The surgery is scheduled for 10:00, so sometime around 9:00 we will give the meds on the unit.’ But if the surgery gets delayed an hour, you miss the window, so this is an unreliable approach."

Nolan has seen some hospitals use much more reliable approaches. "One hospital has a sign just above the door in the holding area. It reads, ‘Has the patient been given their antibiotic?’ The meds are given when the patient goes through the door. Others have said that when the anesthesiologist puts the patient to sleep, you can concurrently give the medication. Both of these are much more reliable methods."
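The difference between tying the dose to the clock and tying it to an event can be shown with a small worked example. The times, the 15-minute gap between the holding-area door and the start of surgery, and the `within_window` helper are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)  # antibiotic should be given within one hour before surgery


def within_window(dose_time: datetime, surgery_start: datetime) -> bool:
    """True if the dose was given no more than one hour before surgery began."""
    gap = surgery_start - dose_time
    return timedelta(0) <= gap <= WINDOW


# Hypothetical day: surgery scheduled for 10:00 but delayed by one hour.
scheduled_start = datetime(2024, 1, 1, 10, 0)
actual_start = scheduled_start + timedelta(hours=1)

# Clock-based approach: dose given on the unit at 9:00 regardless of delays.
clock_dose = scheduled_start - timedelta(hours=1)

# Event-based approach: dose given at the holding-area door,
# assumed here to be 15 minutes before surgery actually starts.
event_dose = actual_start - timedelta(minutes=15)

print("clock-based dose in window:", within_window(clock_dose, actual_start))
print("event-based dose in window:", within_window(event_dose, actual_start))
```

The clock-based dose falls outside the window as soon as the schedule slips, while the event-based dose moves with the surgery, which is the property that makes the door sign and the anesthesia trigger more reliable.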

One of the biggest challenges facing quality professionals is how to make a whole hospital reliable, Nolan says. "That really is on the leading edge of where we are now; I don’t think anyone has the answer," he concedes.

He does, however, offer some suggestions. "When we look at these five conditions, some patients have more than one, and so, for example, they may have smoking cessation counseling and also need some kind of vaccination. So, rather than do everything one time for each condition, some hospitals might say that whenever someone comes in over a certain age, you should offer them that vaccination and not wait for that disease to appear, so at one time you become reliable across the whole hospital. If you’re looking at the antibiotics needed for just hip and knee surgery, couldn’t you ask, ‘What are all the surgeries that need this antibiotic one hour before?’ Or you might look at the fact that three of the five conditions (AMI [acute myocardial infarction], congestive heart failure, and CAP) almost always go through the ED, so you might institute some mechanism in the ED to get them going. You might start thinking fundamentally about the use of protocols in the ED."

IHI is not alone in its pursuit of greater reliability, Nolan notes. "A project worth mentioning involves CMS partnering with Premier on a project relating to five acute conditions: CAP, CHF, AMI, hip and knee replacement, and coronary artery bypass graft," he reports. "They have designed certain process, time, and outcomes measures. Organizations that have signed up for the project will get paid a premium if they can get into the top 10% on all measures for all conditions."

As for IHI, it is preparing to learn more about additional reliability tools. "We are looking at finding ways to identify the failures," says Griffin, who notes that she is currently working with a small group of hospitals from among Premier’s partners for the June seminar, which covers what has been learned.

"Starting in the fall, we will have an innovation community as part of IMPACT, and organizations will be able to join us," Griffin says. "If any of the ideas we test turn out to work, this could be very exciting."


1. McGlynn E, et al. The quality of health care delivered to adults in the United States. N Engl J Med 2003; 348:2635-2645.

2. McGlynn E, et al. Profiling the quality of care in twelve communities: Results from the CQI study. Health Affairs 2004; 23:247-255.

3. Weick KE, Sutcliffe KM. Managing the Unexpected: Assuring High Performance in an Age of Complexity. San Francisco: John Wiley & Sons, 2001.

Need More Information?

For more information, contact:

• Frances A. Griffin, RRT, MPA, Director, Institute for Healthcare Improvement. Telephone: (732) 869-0533. E-mail:

• Tom Nolan, PhD, Associates in Process Improvement. Telephone: (212) 265-0353. E-mail: