How Engineers Lose Touch
Engineering students have been taught to rely far too completely on computer models, and their lack of old-fashioned, direct hands-on experience can be disastrous
Until the 1960s a student in an American engineering school was expected by his teachers to use his mind’s eye to examine things that engineers had designed—to look at them, listen to them, walk around them, and thus develop an intuitive “feel” for the way the material world works and sometimes doesn’t work. Students developed a sense of form and proportion by drawing and redrawing. They acquired a knowledge of materials in testing laboratories, foundries, and metalworking shops. Students took field trips to power plants, steel mills, heavy machine shops, automobile assembly plants, and chemical works, where company engineers with operating experience helped them grasp the subtleties of the real world of engineering.
These young engineers’ picture of the material world continued to be enlarged after graduation. As working engineers they routinely looked carefully at many features of the built world as they expanded and refined their repertoire of nonverbal and tacit knowledge. They also seized opportunities to see unusual structures or machines being erected, and they studied accidents and equipment failures on the spot.
By the 1980s engineering curricula had shifted to analytical approaches, and visual and other sensual knowledge of the world seemed much less relevant. Computer programs spewed out wonderfully rapid and precise solutions of obviously complicated problems, making it possible for students and teachers to believe that civilization had at last reached a state in which all technical problems were readily solvable.
As faculties dropped engineering drawing and shop practice from their curricula and deemed plant visits unnecessary, students had no reason to believe that curiosity about the physical meaning of the subjects they were studying was necessary. With the National Science Foundation and the Department of Defense offering apparently unlimited funds for scientific research projects, working knowledge of the material world disappeared from faculty agendas and therefore from student agendas, and the nonverbal, tacit, and intuitive understanding essential to engineering design atrophied. In this new era, with engineering guided by science, the process of design would be freed from messy nonscientific decisions, subtle judgments, and, of course, human error.
Despite the enormous effort and money that have been poured into creating analytical tools to add rigor and precision to the design of complex systems, a paradox remains. There has been a harrowing succession of flawed designs with fatal results—the Challenger, the Stark, the Aegis system in the Vincennes, and so on. Those failures exude a strong scent of inexperience or hubris or both and reflect an apparent ignorance of, or disregard for, the limits of stress in materials and people under chaotic conditions. Successful design still requires expert tacit knowledge and intuitive “feel” based on experience; it requires engineers steeped in the understanding of existing engineering systems as well as in the new systems being designed.
The science writer James Gleick, in relating the development of the “new science” of “chaos,” points out that computer simulations “break reality into chunks, as many as possible but always too few,” and that “a computer model is just a set of arbitrary rules, chosen by programmers.” You, the programmer, have the choice, he says: “You can make your model more complex and more faithful to reality, or you can make it simpler and easier to handle.” For engineers a central discovery in the formal study of chaos is that a tiny change in the initial conditions of a dynamic system can result in a major unexpected departure from the calculated final conditions. It was long believed that a highly complex system, such as all automobile traffic in the United States, is in principle fully predictable and thus controllable. “Chaos” has proved this belief wrong.
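Gleick’s point about initial conditions is easy to demonstrate numerically. The short Python sketch below uses the textbook logistic map, a standard example of a chaotic system; the parameter value and the starting points are arbitrary illustrative choices, not figures from any simulation discussed here. Two runs that begin a billionth apart are, within a few dozen iterations, doing entirely different things.

```python
# A minimal sketch of sensitivity to initial conditions, using the
# textbook logistic map x(n+1) = r * x(n) * (1 - x(n)).  The value
# r = 3.9 puts the map in its chaotic regime; the starting points
# below are arbitrary illustrative choices.

def logistic_trajectory(x0, r=3.9, steps=60):
    """Iterate the logistic map from x0 and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.400000000)  # nominal initial condition
b = logistic_trajectory(0.400000001)  # perturbed in the ninth decimal place

for n in (0, 10, 20, 30, 40, 50):
    print(f"step {n:2d}: {a[n]:.6f} vs {b[n]:.6f}   difference {abs(a[n] - b[n]):.6f}")
```

The model is perfectly deterministic, yet the tiny difference in the inputs swamps the calculation well before the final steps; no amount of added precision in the program rescues a prediction whose starting data are only approximately known.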
Alan Colquhoun, a British architect, argues convincingly that no matter how rigorously the laws of science are applied in solving a design problem, the designer must still have a mental picture of the desired outcome. “[Scientific] laws are not found in nature,” he declares. “They are constructs of the human mind; they are models which are valid as long as events do not prove them wrong.” A successful new design combines formal knowledge and experience and always contains more judgment than certainty. Judgment is brought to bear as designers repeatedly modify their means to reach desired ends. Design is thus a contingent process. It is also a creative process that, as Robert W. Mann, a leader in engineering-design education, observes, “is, virtually by definition, unpredictable. The sequence of the steps is never known at the beginning. If it were, the whole process could be accomplished by the computer since the information prerequisite to the computer program would be available. Indeed, the creative process is the process of learning how to accomplish the desired result.”
Engineering design is usually carried on in an atmosphere of optimistic enthusiasm, tempered by the recognition that every mistake or misjudgment must be rooted out before any plans are turned over to the shops for fabrication.
Despite all the care engineers exercise and all their systems for ensuring correct engineering choices, evidence of faulty judgment shows up again and again in some of the most expensive and (at least on paper or on a computer screen) most carefully designed and tested machines of the twentieth century.
Of course there is nothing new about wrong choices and faulty judgments in engineering design. More than a hundred years ago the editors of Engineering News tried to track down the reasons for failures of bridges and buildings so that civil engineers might learn from others’ mistakes. “We could easily,” they wrote, “if we had the facilities, publish the most interesting, the most instructive and the most valued engineering journal in the world, by devoting it to only one particular class of facts, the records of failures…. For the whole science of engineering, properly so-called, has been built up from such records.”
Such a journal of failures was never published; however, Engineering News and its successors have presented many valuable reports of engineering failures. One of these reports—careful, comprehensive, knowledgeable, and fair to all parties—was published in Engineering News just a week after a cantilever railway bridge being built over the St. Lawrence River near Quebec City collapsed on August 29, 1907, killing seventy-four workmen (see “A Disaster in the Making,” Invention & Technology, Spring 1986).
“Long and careful inspection of the wreck,” wrote the reporter, “shows that the material was of excellent quality; that the workmanship was remarkably good.” But because the members were much larger than those used in ordinary bridges, he questioned the judgment that led to the design of the built-up compression members: “We step up from the ordinary columns of ordinary construction, tried out in multiplied practice, to enormous, heavy, thick-plated pillars of steel, and we apply the same rules. Have we the confirmation of experiment as a warranty? Except in the light of theory, these structures are virtually unknown. We know the material that goes into their make-up, but we do not know the composite, the structure.”
Whereas with the Quebec Bridge disaster the fault was found to lie in a lack of experience supporting the analytic theory behind the structure, there is today an unfortunate belief that the newer analytic techniques available to designers will prevent failures in the future. The report in Science of the collapse of a twenty-seven-year-old radio telescope, three hundred feet in diameter, in Green Bank, West Virginia, in 1989 implied that such a failure could not occur in a radio telescope designed today. The “cause” of the collapse was pinpointed in “the fracture of a single highly stressed steel plate” (which had survived for more than twenty-five years). “An independent panel appointed by the National Science Foundation” declared that “parts of the telescope were under far higher stresses than would be permitted today” and that “computerized stress analysis would identify potential failure points in telescopes built today, but these methods were not available when the instrument was built in 1962.” One wonders what explanation will be given for the collapse, some years hence, of a structure designed today with the help of a “computerized stress analysis.”
A much more sensible and realistic outlook on design failures may be found in a book titled To Engineer Is Human: The Role of Failure in Successful Design, written by Henry Petroski, a professor of civil engineering who graduated from engineering school in the early 1960s. Toward the end of his book, Petroski has a chapter called “From Slide Rule to Computer: Forgetting How It Used to Be Done.” He describes the Keuffel & Esser Log Log Duplex Decitrig slide rule that he purchased when he entered engineering school in 1959 to emphasize that the limits of a slide rule’s accuracy—generally three significant figures—are no disadvantage, because the data on which the calculations depend are seldom better than approximations.
Petroski uses the 1978 collapse of the modern “space-frame” roof of the Hartford Civic Center under a snow load as an example of the limitations of computerized design. The roof failed a few hours after a basketball game that had been attended by several thousand people, and providentially nobody was hurt in the collapse. Petroski explains the complexity of a space frame, which suggests mammoth Tinkertoys, with long, straight steel rods arranged vertically, horizontally, and diagonally. Designing a space frame with a slide rule or a mechanical calculator was so laborious, and left so many uncertainties, that space frames were seldom built before computer programs became available. With a computer model, however, analyses can be made quickly. The computer’s apparent precision, says Petroski—to six or more significant figures—can give engineers “an unwarranted confidence in the validity of the resulting numbers.”
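The kind of unwarranted confidence Petroski describes is easy to reproduce in a few lines of arithmetic. The figures below are invented purely for illustration and have nothing to do with the actual Hartford analysis: a program reports a stress to six significant figures even when the load behind it is only a ten-percent estimate, which is exactly the false comfort a three-figure slide rule never offered.

```python
# A sketch of false precision: axial stress in a member under an
# estimated load.  All numbers are invented for illustration and are
# unrelated to the Hartford Civic Center roof.

load_estimate = 112_000.0   # newtons; in practice known only to within ~10%
area = 0.00484              # square metres, from nominal section dimensions

stress = load_estimate / area
print(f"reported stress: {stress:.6g} Pa")   # six significant figures

# The same calculation at the plausible extremes of the load estimate
# shows how little those trailing figures mean.
for load in (0.9 * load_estimate, 1.1 * load_estimate):
    print(f"load {load:,.0f} N  ->  stress {load / area:.3g} Pa")
```

The six-figure answer carries no more information than the ten-percent guess that went into it; the slide rule’s three figures simply made that limitation impossible to forget.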
Who makes the computer model of a proposed structure is of more than passing interest. If the model is worked out on a commercially available analytical program, the designer will have no easy way of discovering all the assumptions made by the programmer. Consequently, the designer must either accept on faith the program’s results or check the results—experimentally, graphically, and numerically—in sufficient depth to be satisfied that the programmer did not make dangerous assumptions or omit critical factors and that the program reflects fully the subtleties of the designer’s own unique problem.
To underline the hazards of using a program written by somebody else, Petroski quotes a Canadian structural engineer on the use of commercial software: “Because structural analysis and detailing programs are complex, the profession as a whole will use programs written by a few. These few will come from the ranks of structural ‘analysts’ … and not from the structural ‘designers.’ Generally speaking, their design and construction-site experience and background will tend to be limited. It is difficult to envision a mechanism for ensuring that the products of such a person will display the experience and intuition of a competent designer…. More than ever before, the challenge to the profession and to educators is to develop designers who will be able to stand up to and reject or modify the results of a computer aided analysis and design.”
The engineers who can “stand up to” a computer will be those who understand that software incorporates many assumptions that cannot be easily detected by its users but that affect the validity of the results. There are a thousand points of doubt in every complex computer program. Successful computer-aided design requires vigilance and the same visual knowledge and intuitive sense of fitness that successful designers have always depended on when making critical design decisions.
Engineers need to be continually reminded that nearly all engineering failures result from faulty judgments rather than faulty calculations. For instance, in the 1979 accident in the nuclear power plant at Three Mile Island, the level of the coolant in the reactor vessel was low because an automatic relief valve remained open while, for more than two hours after the accident began, an indicator on the control panel said it was shut. The relief valve was opened by energizing a solenoid; it was closed by a simple spring when the solenoid was shut off. The designer who specified the controls and indicators on the control panel assumed that there would never be a problem with the valve’s closing properly, so he chose to show on the panel not the valve position but merely whether the solenoid was on or off. When the solenoid was off, he assumed, the valve would be closed. The operators of the plant assumed, quite reasonably, that the indicator told them directly, not by inference, whether the valve was open or closed.
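The logic of that mistake can be captured in a toy model. The sketch below is a deliberate simplification for illustration, not the actual plant instrumentation: the panel lamp is driven by the solenoid command, so it can only echo what the operators ordered, never what the valve actually did.

```python
# A deliberately simplified sketch of the Three Mile Island indicator
# flaw -- not the actual plant logic.  The lamp reports the solenoid
# command; the valve's true position is a physical fact the panel
# never measures.

class ReliefValve:
    def __init__(self):
        self.solenoid_energized = False  # what the control system commands
        self.stuck_open = False          # physical fault the panel cannot see

    def command_open(self):
        self.solenoid_energized = True

    def command_close(self):
        self.solenoid_energized = False

    @property
    def actually_open(self):
        # The spring closes the valve when the solenoid drops out --
        # unless the valve sticks.
        return self.solenoid_energized or self.stuck_open

    @property
    def panel_says_open(self):
        # The designer wired the indicator to the solenoid, not the valve.
        return self.solenoid_energized


valve = ReliefValve()
valve.command_open()
valve.stuck_open = True   # the fault that occurred at Three Mile Island
valve.command_close()

print("panel indicates open:", valve.panel_says_open)   # False
print("valve actually open: ", valve.actually_open)     # True
```

One sticky valve is enough to make the lamp lie: the inference “solenoid off, therefore valve shut” is precisely the judgment that failed.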
The choice made in this case may have seemed so simple and sensible as to be overlooked in whatever checking the design underwent. It might have been re-examined had the checker had experience with sticky relief valves or comprehension of the life-and-death importance of giving a nuclear power plant’s operators direct and accurate information. This was not a failure of calculation but a failure of judgment.
A cluster of newspaper articles that appeared in the first half of 1990 (a similar crop may be harvested in any half-year) has fattened my “failure” file folder and has led me to expect only more of the same under the accepted regimen of abstract, high-tech design. The magnitude of the errors of judgment in some of the reported failures suggests that engineers of the new breed have climbed to the tops of many bureaucratic ladders and are now making decisions that should be made by people with more common sense and experience.
The first oil spill that year occurred on January 1, when a transfer line from the Exxon Bayway refinery in Linden, New Jersey, spilled more than 500,000 gallons into the Arthur Kill, which separates New Jersey from Staten Island. A few feet of a side seam had split in a section of pipe, and an automatic alarm valve, intended to shut off the flow, detected the leak but had been wedged open for twelve years because the shut-down alarm was “too sensitive” and kept interrupting flow in the pipeline. In all those years, according to Exxon, the pipe had never leaked.
On May 7 the Wall Street Journal gave a careful account of the expensive problems that poor design judgment and unreasonable production deadlines had caused when General Electric introduced a new and insufficiently tested compressor in its domestic refrigerators in 1986. The new refrigerators featured rotary compressors rather than the reciprocating compressors that had been employed since the 1920s.
Rotary compressors, common in air conditioners, were attractive to GE managers because they were expected to be much cheaper to build. Many engineers learn in school a bit of folklore about the invariable superiority of rotating machinery over reciprocating machinery. Rotating gas compressors, however, require substantially more power than reciprocating compressors, and their high rotative speeds make them difficult to cool and lubricate.
The designers of the new compressors ignored the significant difference in performance requirements between air conditioners and refrigerators. In air conditioners a convenient stream of air keeps the body of the compressor, and thus the lubricating oil sealed inside, cool. Refrigerators lack an equivalent airstream, and none was provided to cool the new compressors.
A consultant suggested a joint venture with a Japanese firm experienced in rotary-compressor design. Although the designers had had little experience with rotary compressors, they rejected the advice and proceeded to develop a design that required tolerances smaller than those found in mass-produced machines of any kind. According to one of his former associates, the chief design engineer “figured you didn’t need previous compressor-design experience to design a new compressor.”
The first of the new compressors were to be tested for the assumed lifetime of a refrigerator; however, the tests were cut short long before a lifetime had elapsed, and the misgivings voiced by the experienced technician who ran the tests were disregarded. This senior technician—who had worked in the testing lab for thirty years—reported that although the compressors did not actually fail in the truncated testing program, “they didn’t look right, either.” Discoloration from high temperatures, bearing surfaces that looked worn, and a black, oily crust on some parts pointed to eventual trouble with overheating, wear, and a breakdown of the sealed-in lubricating oil. The experience-based assessment was discounted because it came from a mere technician.
The new refrigerators sold well, and trouble didn’t begin for almost a year. After the dimensions of the design debacle began to be clear, the company found itself replacing more than a million rotary compressors with reciprocating compressors at a cost of about $450 million.
In May of 1990 the National Aeronautics and Space Administration returned to the front pages with two blunders less chilling than the Challenger explosion but likely to waste hundreds of millions of dollars. The Hubble space telescope, launched on April 24, had been confidently advertised as the answer to the problem of the atmosphere’s interference with extremely faint light waves from far-distant heavenly bodies. The space telescope was expected to increase the diameter of the known universe by a factor of seven. The first pictures were to be transmitted to earth a week after the launch. But several unexpected problems postponed the first transmission to the end of the year, about eight months behind schedule. Most significantly, an error had been made in grinding the large mirror, and it was impossible to bring any heavenly body into sharp focus. Computer experts fell back on proposing programs that would “enhance” the distorted images.
The first smaller-scale mishap occurred when the satellite carrying the telescope was launched from the shuttle vehicle. An electrical cable, connecting an adjustable antenna dish to the television transmitter, was kinked as it exited the shuttle, causing a significant reduction in the antenna’s adjustability. Transmission to earth was interrupted by the inability to point the antenna continuously at the receiving station.
A few days later newspaper readers learned that the telescope could not be pointed accurately at stars and planets. The controlling computer program had been based on an outdated star chart and introduced a pointing error of about half a degree. Furthermore, the telescope developed a tendency to drift and to pick up other nearby stars just slightly brighter or dimmer than those it was supposed to hold in focus.
Finally, vibrations of the entire telescope satellite raised questions about its ability to obtain any information that is not available to ordinary telescopes on the ground. An unanticipated (i.e., unthought of in the design) cycle of expansion and contraction of the solar-panel supports, as the spacecraft moved into and out of the earth’s shadow, caused the panels to sway “like the slowly flapping wings of a great bird,” as a newspaper report put it. The computer program for stabilizing the spacecraft, confused by the unexpected vibrations, called for corrective measures that only exacerbated the vibration.
Further deficiencies turned up in Hubble’s second year in orbit. Two gyroscopes (of six) have failed, and two others exhibit signs of incipient failure. The Goddard high-resolution spectrograph may have to be shut down because of intermittent loss of connection with its data computer. The flapping solar panels are attached to booms that have developed a jerky motion that may lead to their collapse and a catastrophic power loss. Although NASA hopes to send repair missions to Hubble in 1994, one wonders whether the repair missions will be able to keep ahead of the failures of one component after another. These blunders resulted not from mistaken calculations but from the inability to visualize realistic conditions. They suggest that although a great deal of hard thinking may have been done to accomplish the stated missions of Hubble, the ability to imagine the mundane things that can go wrong remains sadly deficient at NASA.
Richard P. Feynman, the maverick physicist who served on the official panel reviewing the Challenger explosion, argued that more failures and embarrassing surprises would be inevitable if NASA did not radically change the way its big projects were designed. He accused the space agency of “top-down design” and contrasted this with sensible “bottom-up” design, which has been normal engineering practice for centuries.
In bottom-up design the various components of a system are designed, tested, and, if necessary, modified before the design of the entire system has been set in concrete. In the top-down mode (invented by the military), the whole system is designed at once, before resolving the many questions and conflicts that are normally ironed out in a bottom-up design. The system is then built before there is time to test all its components, so that deficient and incompatible ones must ultimately be located (often a difficult problem in itself), redesigned, and rebuilt—an expensive and uncertain process.
Furthermore, as Feynman pointed out, the political problems faced by NASA encourage, if not force, it to “exaggerate” when explaining its needs for large sums of money. It was, he wrote, “apparently necessary [in the case of the shuttle] to exaggerate: to exaggerate how economical the shuttle would be, to exaggerate how often it could fly, to exaggerate how safe it would be, to exaggerate the big scientific facts that would be discovered. ‘The shuttle can make so-and-so many flights and it’ll cost such-and-such; we went to the moon, so we can do it!’”
Until the foolishness of top-down design has been dropped in a fit of common sense, the harrowing succession of flawed designs in high-tech, high-cost public projects will continue.
In the mid-1960s a prominent British structural engineer, Sir Alfred Pugsley, made a wise and important prescription for the design of pioneering projects. He said that in such projects the chief engineer should be given a “sparring partner,” a senior engineer who would be privy to essentially all the information available to the chief engineer and whose status would be such that the chief could not ignore his comments and recommendations. This sparring partner would be given ample time to follow the design work and to study and think about the implications of details as well as the “big” decisions made by the chief engineer.
The hazards of permitting a chief engineer to determine all aspects of a complex project, without critical review, are insidious and far-reaching, but Pugsley also warned against an even worse commonplace hazard: the adoption of a faulty doctrine by a whole profession.
Pugsley cited as an example of misplaced enthusiasm for a new doctrine the collapse of the Tacoma Narrows suspension bridge in 1940, the “major lesson” of which was “the unwisdom of allowing a particular profession to become too inward looking and so screened from relevant knowledge growing up in other fields around it.” Had the designers of the Tacoma Narrows Bridge known more of aerodynamics, he thought, the collapse might have been averted. It is fairly certain, however, that if the relevance of aerodynamics to that design had been suggested by a person outside the network of “leading structural engineers,” the advice would have been considered an attack on the profession of civil engineering.
The experience of two engineers who published historical articles on the collapse of the bridge supports my surmise. The professional reaction to an article in Engineering News-Record by James Kip Finch of Columbia University prompted him to virtually retract its contents. David Billington, an unorthodox professor of civil engineering at Princeton University, was excoriated by several prominent bridge engineers when his paper on events leading to the collapse was published in a journal of the American Society of Civil Engineers.
Billington, in a historical study of suspension bridges, argues convincingly that a design decision made in the 1920s by O. H. Ammann, designer of the George Washington Bridge over the Hudson River, “led directly to the failure of the Tacoma Narrows bridge.” Ammann decided that the deck of his bridge could be built without vertical stiffening, and he omitted the stiffening trusses that John Roebling and other suspension-bridge engineers had felt were necessary to keep winds from causing undulation of the bridge deck. Ammann’s reasoning appealed to many in the civil engineering profession, and they applied it to several longer, narrower, lighter, and disturbingly flexible suspension bridges they built in the 1930s, including the Golden Gate Bridge, which was stiffened after a harrowing experience with crosswinds in 1951.
After the Tacoma Narrows Bridge fell, structural engineers found that a sense of history might have tempered their enthusiastic acceptance and extension of Ammann’s design precept. They learned, as Billington points out, that published records of suspension bridges in Europe and America “described nineteenth-century failures that were amazingly similar to what they saw in the motion pictures of the Tacoma collapse.”
Billington’s article was characteristically greeted by engineers as “an attack on the leading figures of the period and especially upon O. H. Ammann” (in the words of Herbert Rothman, one of Ammann’s defenders). Rebuttal was necessary, he continued, in order to “remove the undeserved blame” leveled at several bridge designers and to “preserve their proper position in the history of engineering.”
The need to justify the way engineers do things is unfortunately often felt even when ill-considered systems lead them to make fatally wrong judgments. The missile cruiser USS Vincennes was equipped with a billion-dollar “state-of-the-art” air defense system called Aegis. On July 3, 1988, the ship shot down an Iranian civilian airliner, killing 290 people. The Aegis system had received IFF (Identification, Friend or Foe) signals for both military and civilian planes, yet the ship’s radar indicated only one plane, and the decision was made to destroy it. (No radar or any other existing equipment will identify a plane by its physical shape and size alone.) Later the Navy decided that an enlisted man had misinterpreted the signals on his visual display and that therefore the captain was not at fault for ordering the destruction of the civilian airplane.
As with most “operator errors” that have led to major disasters, the operators aboard the Vincennes had been deluged with more information than they could assimilate in the few seconds before a crucial decision had to be made. It is a gross insult to the operators who have to deal with such monstrous systems to say, as the Navy did, that the Aegis system worked perfectly and that the tragedy was due to “operator error.”
The designers of the Aegis, which is the prototype system for the Strategic Defense Initiative, greatly underestimated the demands that their design would place on the operators, who often lack knowledge of the idiosyncrasies and limitations built into the system. Disastrous errors of judgment are inevitable so long as operator error rather than designer error is routinely considered the cause of disasters. Hubris and an absence of common sense in the design process set the conditions that produce the confusingly overcomplicated tasks that the equipment demands of operators. Human abilities and limitations need to be designed into systems, not designed out.
If we are to avoid calamitous design errors—as well as those that are merely irritating or expensive—it is necessary for engineers to understand that such errors are not errors of mathematics or calculation but errors of engineering judgment, of judgment that is not reducible to engineering science or to mathematics.
Here, indeed, is the crux of all arguments about the nature of the education that an engineer requires. Necessary as the analytical tools of science and mathematics most certainly are, more important is the development in student and neophyte engineers of sound judgment and an intuitive sense of fitness and adequacy.
No matter how vigorously a “science” of design may be pushed, the successful design of real things in a contingent world will always be based more on art than on science. Unquantifiable judgments and choices are the elements that determine the way a design comes together. Engineering design is simply that kind of process. It always has been. It always will be.