What Can We Learn From Systems Engineering Failures?

In the world of Silicon Valley startups, failure is a state which, if not cherished, is at least celebrated. “Fail early, fail often” is the startup mantra. There’s even an annual event where you can commiserate with other losers: FailCon. The key is to fail quickly and at low cost.

So what happens when you fail slowly, with mounting costs, escalations, gnashing of teeth, and rounds of finger pointing? Let’s examine two well-publicized failures and, on a much smaller scale, my own personal nightmare—my first job out of MBA school—and see what lessons can be learned.

Case # 1: The FBI virtual case file

The Federal Bureau of Investigation (FBI) Virtual Case File (VCF) project, the subject of a case study published by SEBoK, is a near-perfect example of doing systems engineering the wrong way. Initiated in 2000 to deliver a secure intranet and update the FBI’s antiquated case management system, the project was abandoned five years and $170 million later.

This case study has every excuse for not doing systems engineering the right way: There was no time to develop formal requirements. Scheduling focused on what was desired, not what was possible. The new FBI Intranet was specified with little understanding of the network traffic that would result from information sharing. By early 2003, the FBI began to realize how taxing the network traffic would be once all 22,000 users came online.

By the time the project was canceled, the lead contractor had amassed 700,000 lines of software based upon an incomplete set of requirements that filled an 800-page document.

Case # 2: The Hubble telescope

Today, the Hubble telescope is an engineering marvel, churning out 814 gigabytes of high-resolution photos per month that are expanding our insight into the universe. But it wasn’t always that way. When the Hubble launched in 1990, it quickly became apparent that something was very wrong with the 13-year, $1.5 billion project.

The pictures coming back to earth were fuzzy, causing at least one science writer to compare the telescope to “a patient with a bad case of astigmatism.” It took another three years and $1 billion to fix the arduously crafted mirrors used in the telescope. The project demonstrates how an error that could have been fixed for $1,000 at the design stage, or with a $10 million investment in end-to-end testing, ended up costing $1 billion to fix when the system was in service.
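The arithmetic behind that comparison can be sketched in a few lines. The three dollar figures are the ones quoted above; the dictionary and function names are only illustrative.

```python
# Cost to fix the Hubble mirror error at each stage (figures from the text).
FIX_COST_USD = {
    "design stage": 1_000,
    "end-to-end testing": 10_000_000,
    "in service": 1_000_000_000,
}


def cost_multiplier(late_stage: str, early_stage: str) -> float:
    """How many times more expensive the fix is at late_stage than at early_stage."""
    return FIX_COST_USD[late_stage] / FIX_COST_USD[early_stage]


if __name__ == "__main__":
    for early in ("design stage", "end-to-end testing"):
        ratio = cost_multiplier("in service", early)
        print(f"In-service fix vs. {early}: {ratio:,.0f}x")
```

Even the testing investment, which looks expensive on its own, is two orders of magnitude cheaper than the in-service repair.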

Also instructive are details about Michael Griffin, a chief engineer on the Hubble project. A Booz Allen paper for which he was interviewed reports: “What stood out to Griffin was that each contractor could prove that they fully met the requirements of their side of the interface control document. Yet it was completely obvious to Griffin, even as a young systems engineer, that the device, when fielded, would not work.”

Case # 3: My first marketing job

My first post-MBA job was working for a New York City-based marketing group that was part of a phone company. The group was a marketing think tank of sorts, tasked with incubating new ways to make money from phone company assets. The goal of my project, Fast Track, was to monetize the value of up-to-date telephone listings (name, address, telephone number).

While we couldn’t release this data to marketers, there were three other groups willing to spend money for it. Police, firefighters, and other emergency responders needed to quickly identify phone numbers based on a location and also identify the phone numbers of neighbors, which wasn’t available any other way.

Large corporations, who were spending so much on directory assistance that they often blocked its use, were interested in setting up a lower cost, internal directory lookup service. And banks, insurers, and other application processors needed to quickly verify that applicants’ names, phone numbers, and addresses were current and valid.

My job was to incubate the production of the software and data that would be delivered monthly to our Fast Track customers. We signed a contract with one of the leading minicomputer companies of the day, which set us up with a small minicomputer to do the indexing and wrote both the customer-facing software application and the back-end indexing applications.

That’s where the law of unintended consequences kicked in. While we had tested each of the cleansing, indexing, and publishing applications required to publish the 40 million names, we hadn’t looked at the end-to-end view. A month before our projected release date, we discovered it would take 40 days and nights to index the data we needed to deliver on a monthly basis.
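A back-of-the-envelope check like the one below would have flagged the problem early. It is a minimal sketch, not the tooling we had: the 40 million records and the 40-day indexing run come from the story above, while the 30-day cycle length and the function names are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(record_count: int, deadline_days: float) -> float:
    """Records per second needed to process record_count within deadline_days."""
    return record_count / (deadline_days * SECONDS_PER_DAY)


def days_needed(record_count: int, records_per_second: float) -> float:
    """Days of continuous running needed at a given throughput."""
    return record_count / (records_per_second * SECONDS_PER_DAY)


RECORDS = 40_000_000   # names to index each cycle (from the text)
OBSERVED_DAYS = 40     # measured end-to-end indexing time (from the text)
CYCLE_DAYS = 30        # assumption: "monthly" treated as a 30-day cycle

observed_rate = required_rate(RECORDS, OBSERVED_DAYS)  # throughput implied by the 40-day run
needed_rate = required_rate(RECORDS, CYCLE_DAYS)       # throughput a monthly release demands
shortfall = needed_rate / observed_rate                # how much faster the system must be

if __name__ == "__main__":
    print(f"Observed throughput: {observed_rate:.1f} records/s")
    print(f"Needed throughput:   {needed_rate:.1f} records/s")
    print(f"Required speed-up:   {shortfall:.2f}x")
```

The implied throughput is about 11.6 records per second against the roughly 15.4 needed just to break even. That figure covers indexing alone and leaves no time in the cycle for cleansing, publishing, or reruns, so the real gap was wider.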

No problem, I reasoned, we could lease a bigger hunk of metal. But going from a computer the size of a dorm refrigerator to a computer the size of a walk-in closet creates its own problems. Those bigger computers came with a whole new set of space, electrical, fire hazard, redundancy, networking, and cooling requirements. You couldn’t plop them into a corner of the room—you had to build a whole new room for them. My $100,000 project was quickly becoming a $1,000,000 nightmare.

In the end, the minicomputer company traded in our too-small computer for an outsourcing contract and handled the production themselves. The product was launched, won the Information Industry Association’s “New Product of the Year” award, and was made obsolete a few years later (along with the minicomputer company).

What are our key take-aways from these systems engineering failures?

  1. Project discipline is required. “We don’t have time for requirements! We can’t afford system testing! No time for project planning!” All of this thinking must be avoided. Effective project management is crucial for mission accomplishment in systems engineering.
  2. “Whole systems thinking” must be encouraged. It’s instructive that every sub-contractor met their contractual obligations for the Hubble project—yet the project as a whole failed. The FBI intranet, designed before its case sharing technology came online, failed to account for the resulting increase in network traffic. You must understand how systems influence one another within a whole in order to succeed.
  3. Set realistic project measurements. Sometimes a “failure is not an option!” mindset buries the truth, rather than exposing poor outcomes earlier, when they can be fixed at lower cost. And the longer the truth is buried, the more costly the ultimate resolution.

Ultimately, the main lesson is that a systems engineering approach requires unifying your teams with a common language and approach. It should also give them a consistent, intuitive way to design more innovative products, validate that the products work as expected, and engineer product lines to maximize reuse.

Learn more on how you can achieve systems engineering success and save projects—including your own—from disaster.
