August 7, 2007 By News Staff
By Sally K. Ride, Ph.D., former NASA astronaut and the first American woman in space.
Today's government agencies face an expanding array of threats that make effective risk management fundamental to their success.
From directing homeland security and managing national transportation systems to running weapons labs and research centers, government agencies oversee some of the nation's most critical activities. Maintaining these operations in the face of natural and man-made hazards is vital to the country's well-being.
Government agencies also hold some of the most sensitive information about the nation's citizens - Social Security numbers, health records, financial information and other personal data - all of which must be protected. That job grows harder each day as computer hackers become more sophisticated and mobile workers carry vast amounts of information in easily misplaced laptops and handheld computers.
In this environment, all public agencies must strengthen their ability to spot, evaluate and eliminate potential safety and security risks.
Focus on safety and risk management starts at the top and must permeate every layer of an organization. Government leaders must understand the importance of managing risk - and they must communicate that importance to all employees and empower them to make the right decisions despite budget pressure, deadlines and other factors.
My experience with risk management stems from my involvement in the U.S. space program. As a mission specialist aboard space shuttle flights in 1983 and 1984, I was extremely confident in NASA's safety procedures.
I also had the unfortunate duty of serving on accident investigation boards for the 1986 Challenger explosion and the 2003 Columbia disaster. Despite NASA's reputation for technical and operational excellence, our investigations revealed an agency with serious risk-management problems.
Our findings are instructive for public-sector leaders and managers because they highlight a particularly insidious risk-management shortcoming: the tendency to develop a false sense of security when everything is working fine. It's human nature to interpret the lack of problems as a lack of risk. When this happens, people and organizations can become complacent.
This phenomenon was a key factor in the space shuttle disasters. Although the Challenger and Columbia were destroyed by completely different technical malfunctions, both accidents were rooted in risk-management failures. NASA managers, blessed by the good fortune of many successful shuttle launches, began to downplay the importance of significant technical challenges, with tragic results.
If this type of risk-management failure can occur at NASA - an organization engaged in one of the riskiest endeavors known to mankind - it can happen anywhere. The space agency's experience shows that effective risk management demands leadership, communication and constant vigilance - particularly when everything seems to be OK.
What Went Wrong?
Through years of space exploration, NASA finely honed procedures for spotting and mitigating potential dangers. But cultural changes triggered by cost and schedule pressures of the space shuttle program in the '80s prompted NASA to lose focus and discount the seriousness of known design flaws. The longer operations continued without mishap, the more "acceptable" these flaws became.
Then on an unusually chilly morning in January 1986, Challenger thundered away from its Florida launch pad, commencing NASA's 25th shuttle mission. Seventy-three seconds later, the orbiter was destroyed in a massive fireball. Our accident investigation determined that failure of a rubber O-ring in one of the shuttle's massive solid rocket boosters triggered events that literally tore the craft apart.
At the time of Challenger's launch, NASA managers and engineers were well aware of problems with the O-rings, which sealed seams between sections of the rocket boosters. They'd seen evidence from earlier flights that hot exhaust gases from the rocket motors had nearly burned through the O-ring seals - particularly during cold-weather launches - endangering the shuttle and crew.
When the seal "erosion" was first discovered, NASA considered the problem quite serious, but the shuttle continued to fly as engineers worked on a solution. As flights continued without serious incident, the O-ring problem became less urgent. Deterioration of the critical booster seals came to be viewed as a nearly normal occurrence.
NASA fell into a trap. The O-rings functioned well enough to avoid disaster for 24 shuttle missions, and the agency became complacent. But the danger remained.
With the Challenger launch, NASA's luck ran out. An O-ring in the shuttle's right rocket booster - hardened by subfreezing temperatures during the morning of the launch - gave way, sending flames into the external fuel tank. Seconds later, the craft burst apart, scattering debris into the Atlantic Ocean.
Our investigation concluded that NASA's organizational culture and decision-making processes were key contributors to the accident, and our report included nine recommendations to be implemented before shuttle flights resumed. The shuttle program was halted for 32 months while NASA implemented the changes.
Yet history repeated itself 17 years later.
After a string of successful shuttle missions, Columbia lifted into the sky from Cape Canaveral on Jan. 16, 2003. Eighty-two seconds into the flight, a chunk of insulating foam broke free from the external fuel tank and slammed into the leading edge of the orbiter's left wing.
Similarities to the Challenger accident were alarming. As with the O-ring problem, NASA had long known of the falling foam. Almost from the beginning of the shuttle program, pieces of insulation had been separating from the fuel tank and striking the shuttles, causing varying degrees of damage. At first considered serious, the problem didn't cause catastrophic results for more than 100 shuttle flights, leading NASA to minimize the importance of the potentially fatal flaw.
Yet the flaw remained. The chunk of foam that struck Columbia's wing damaged critical thermal panels designed to protect the shuttle during its return to Earth. The damaged panels triggered an in-flight breakup when Columbia re-entered the atmosphere 16 days later.
How could the nation's premier scientific and research organization twice underestimate serious space shuttle design flaws? NASA was coping with challenges familiar to agency managers at any level of government: schedule pressure, budget pressure and political pressure. These forces led to flawed decision-making in both accidents.
The space shuttle was touted as a vehicle to make space travel routine, able to deliver payloads into orbit quickly and relatively inexpensively. Therefore, NASA was forced to control costs and meet launch schedules despite the shuttle's huge complexity and inherent risks.
Over time, NASA "normalized" the O-ring and foam problems, drastically underplaying their potential risk. Investigations of the Challenger and Columbia disasters showed that concern over cost-efficiency and deadlines blurred NASA's focus on safety. Furthermore, NASA had developed a culture that emphasized procedure and chain of command, and stifled communication.
Before the Challenger accident, engineers raised concerns about O-ring performance due to unusually cold temperatures the morning of the liftoff. Our investigation found that management didn't listen to the engineers' warnings, and some engineers who had important information didn't speak up. Ultimately, NASA decided launching in such cold temperatures was an acceptable risk despite the concerns.
Similarly, NASA management overruled a request by worried engineers to use Department of Defense satellite imagery to study the damage to Columbia's wing. Instead, NASA administrators grumbled about the engineers' failure to follow proper protocol in requesting the images.
In the events leading up to both accidents, management didn't recognize that unprecedented conditions demand flexibility and democratic process, not bureaucratic response. In both investigations, we found that budget shortages prompted NASA to cut safety personnel, and those remaining lacked the clout and independence they needed to be effective.
Leadership Is Key
How do you avoid a similar situation? Effective risk management starts with leadership.
Challenger and Columbia weren't doomed by technical problems. The accidents stemmed from cultural failures within NASA that encouraged complacency, silenced communication between levels of the organization, and allowed cost and schedule pressures to eclipse safety concerns. As we pointed out in the Columbia accident investigation, leaders create culture, and it's their responsibility to change it.
Agency leaders must create an environment where any employee - regardless of title or status - can bring legitimate concerns to management and have those concerns taken seriously. If, for example, a software programmer discovers a potential vulnerability in computer code, he or she should be able to alert someone who will take the appropriate action - even if that means shutting down a Web site or spending a significant amount of money to correct the problem.
I recently spoke at a management retreat organized by the CEO of a large medical center. He's working to develop formal risk-management procedures, and, more importantly, he's instilling a culture that values open communication of risk factors. More executives should do the same.
After serving on investigation boards for both space shuttle accidents - and seeing up-close the results of risk management gone awry - I believe procedures are important, but people make the difference.
Developing mechanisms to spot potential problems is clearly necessary. But as NASA's experience shows, that's not enough. The space agency created excellent mechanisms to spot risks, and for the most part, they worked. The mechanical flaws that destroyed both space shuttles were well known before the accidents, but those flaws were improperly - if at all - addressed by the people running the organization. Ultimately, it's not enough to spot risks; it's how you address them that counts.
NASA may have a specialized mission, but lessons from the space shuttle accidents apply to public agencies at all government levels. Almost every agency manages risk - from regulating hazardous materials to protecting sensitive data on computer networks - and can learn from NASA's experience.
Sally K. Ride, Ph.D., a former NASA astronaut and the first American woman in space, is the president and CEO of Sally Ride Science, a company dedicated to supporting girls' interests in math, science and technology, and a professor of physics at the University of California, San Diego (currently on leave). She also serves as chair of Deloitte & Touche USA LLP's Council for the Advancement of Women. Ride is the only person to have served on the commissions investigating both the Challenger and Columbia space shuttle accidents.