Friday, November 20, 2009

Broken Pieces of the Puzzle

I bet you came here today looking for an explanation of what went wrong yesterday. You’re going to get more than you bargained for.

First, the FAA’s statement:

“The failure was attributed to a software configuration problem within the FAA Telecommunications Infrastructure (FTI) in Salt Lake City. As a result FAA services used primarily for traffic flow and flight planning were unavailable electronically.

The National Airspace Data Interchange Network (NADIN), which processes flight planning, was affected because it relies on the FTI services. During the outage air traffic controllers managed flight plan data manually and safely according to FAA contingency plans.”

I hope you heeded my previous advice.

“Speaking of radar outages, here are three letters for you to remember -- FTI. When the next major ATC outage occurs, those are the three letters to listen for -- FTI. It stands for Federal Telecommunications Infrastructure.”

In that same blog entry, I told you to listen to PASS (Professional Aviation Safety Specialists, AFL-CIO). Is it live or is it Memorex?

“Despite a rosy picture painted Tuesday by the Federal Aviation Administration, the FAA Telecommunications Infrastructure (FTI) network is unreliable, lacking suitable backups, and continues to be a source of great frustration and deep concern for the FAA technicians and air traffic controllers who must deal with the fallout of the FAA’s decision to cut corners and costs on this project and run it on the razor’s edge despite a lengthy list of failures and outages.”

It’s Memorex. That wasn’t from yesterday. It was from April 2008.

Here’s the PASS press release from yesterday.

“The Professional Aviation Safety Specialists, AFL-CIO (PASS), the union that represents Federal Aviation Administration (FAA) technicians, is extremely concerned about the resolution process in reaction to today's outage of the Federal Telecommunications Infrastructure (FTI). The outage occurred due to a corrupt router card for the FTI server at the Salt Lake Center in Utah and had a rippling effect that caused significant delays across the country.

FTI, which provides all telecommunications used to transfer critical data used by the FAA for air traffic control, is owned by Harris Corporation. As such, the system is not maintained by the FAA. When the outage occurred at the Salt Lake Center, Harris Corporation attempted to troubleshoot the problem remotely but eventually a Harris FTI technician had to be dispatched to the scene in order to fix the problem. In the end, it took four hours for Harris to rectify the situation.”

(Emphasis added)

Please, I’m begging you. If you have the time, read the whole press release -- critically. It just keeps getting better and better (or worse and worse). Look for 4 letters -- ADS-B.

Here’s another piece for the technical aspects from Computerworld.

Okay, to sum it up. FTI corrupts a router at the Salt Lake NADIN facility that handles flight plans. FTI has been a bad program from the start. NADIN is just old. The NADIN facility in Atlanta is supposed to back up the NADIN facility in Salt Lake. But to do that, FTI has to work. I haven’t seen an explanation of that missing link (literally) yet. But that is what FTI -- the FAA Telecommunications Infrastructure -- does. It provides data communications. Without communications between the two NADIN facilities, the whole system breaks down.

FTI is run by Harris Corporation. You might remember they are wrapped up with our old friend Congressman John Mica. In other words, we told you so. At least as far back as 2007.

“Despite the known safety-related problems directly attributed to the FTI program, Congressman John Mica was adamant then that the program should continue.”

“This is such a ‘target-rich environment’ I hardly know where to start. Seriously, you could spend all day just reading the links I found in ten minutes and you don’t have to take my word for it. Go to Google and type in ‘+FAA +FTI’ (or just click on the link) and you can read all day.”

Make that two days now. And we haven’t even scratched the surface. It’s all connected. The politics, contracting out, lack of manpower, nonexistent redundancy, inadequate training and on and on and on. It isn’t just one piece of the puzzle that is broken. It’s a lot of pieces.

If you think the “I told you so”s are uncomfortable now...wait until somebody dies. In case you haven’t figured it out, that is how this all ends. That’s how it always ends. Somebody dies and we fix it. It doesn’t have to be that way. But that is the way it always is.

Don Brown
November 20, 2009

