Sunday, October 6, 2013

Reader Feedback - Smart Autopilots, and Playing with Fire!

Reader feedback on Understanding Air France 447.

A reader wrote:

I am perplexed and at times aggravated by this statement on page 106, where you say "a system that shuts itself off when it is designed to shut itself off, due to lack of data or failure of another part, does not necessarily constitute a 'failure.'" Here's why I dispute that statement.

I know from my days as a private pilot that when you are in trouble at altitude (vertigo, flying into a cloud, a stall at altitude, etc.) you look immediately at the artificial horizon. That saved me from a terrible accident when I had vertigo. And we know that if you use that device to: 1) level the wings, 2) put the nose on the horizon, and 3) keep cruise power on the engine, THEN THE PLANE IS GOING TO FLY SAFELY. It's practically a law of physics.

So we have this amazingly capable and intelligent flight control system (including autopilot) on the A330 which senses and measures every conceivable detail on the airplane (it probably knows when the toilet is flushed), BUT WHEN IT ENCOUNTERS MULTIPLE SPEED INPUTS IT QUITS LIKE A BIG BABY "OH I DON'T HAVE MY DATA....HERE, YOU FLY THE PLANE". What it should do is go into a high-altitude safe mode where it levels the wings, maintains power, and puts the nose 2 degrees above the horizon and tells the pilots "I'VE LOST SPEED INPUTS, BUT I'VE GOT YOUR BACK AND WILL MAINTAIN STRAIGHT AND LEVEL FLIGHT FOR YOU UNTIL YOU FIGURE OUT THE UNDERLYING PROBLEM."  I'd say that's a design fault/omission!!!!

In addition, I'd say that any plane that allows multiple stick inputs is playing with fire!

So my conclusion is that hundreds of people die because the autopilot that thinks it is soooo smart when it is in actuality a stupid sh&t!

Well I am probably all wet here, for some technical reason I don't yet understand, but I needed to vent a little bit about what I perceive to be a huge technical omission in the A330.

I knew that statement would cause some discussion when I wrote it. My point was that the airplane didn't break. It performed as designed, albeit in response to a phenomenon that probably was not anticipated.

That doesn't mean I think it is the perfect design. Sure, an attitude-hold sub-mode prior to total disconnection would have been great at that point - and no doubt would have saved the day. I'm not aware of any commercial aircraft with that autopilot logic, though. It's my opinion that a simultaneous triple pitot-tube incident was not anticipated at design time. There's a lot of work to be done in the what-happens-when-stuff-goes-wrong department before pilotless passenger airplanes are a reality.
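For illustration only, here is a minimal sketch of the kind of attitude-hold fallback the reader describes. Every name and number in it is hypothetical - this is not Airbus logic, just the shape of the idea: when the airspeed channels disagree beyond some limit, command a known-safe pitch and power instead of disconnecting.

```python
# Toy sketch of an "attitude-hold fallback" autopilot mode.
# All thresholds and targets below are illustrative assumptions,
# not real Airbus (or any manufacturer's) values.

SAFE_PITCH_DEG = 2.0     # nose slightly above the horizon at cruise
SAFE_N1_PERCENT = 85.0   # a plausible cruise thrust setting (assumed)
DISAGREE_KNOTS = 10.0    # spread between channels that we treat as "unreliable"

def autopilot_mode(airspeed_votes):
    """airspeed_votes: airspeed values from the redundant air-data channels.
    Returns (mode, pitch_target_deg, thrust_target_pct) for this cycle."""
    spread = max(airspeed_votes) - min(airspeed_votes)
    if spread < DISAGREE_KNOTS:
        # Channels agree: normal autopilot operation, no forced targets.
        return ("NORMAL", None, None)
    # Channels disagree: rather than disconnecting, hold wings level,
    # pitch slightly up, and cruise power, and annunciate the degradation.
    return ("ATTITUDE_HOLD", SAFE_PITCH_DEG, SAFE_N1_PERCENT)
```

With agreeing sensors it stays in normal operation; with one channel reading wildly low (an iced pitot tube, say) it falls back to the fixed pitch-and-power targets instead of handing the airplane back cold.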

However, it depends on what failed. In this case the FMGEC saw a disagreement in data from the ADIRUs. The question becomes: which data is wrong? While your suggestion sounds great, I don't think one can always assume which data is correct, either. In the case of Qantas Flight 72 in October 2008, it was a spiking AOA value at just the wrong frequency of occurrence: "If either AOA 1 or AOA 2 significantly deviated from the other two values, the FCPCs (flight control primary computers) used a memorized value for 1.2 seconds. The FCPC algorithm was very effective, but it could not correctly manage a scenario where there were multiple spikes in either AOA 1 or AOA 2 that were 1.2 seconds apart."
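To make that quoted failure mode concrete, here is a toy reconstruction of the memorized-value logic, based only on the ATSB description quoted above. The sampling rate, the deviation threshold, and especially the detail of how the window re-arms are my assumptions - but it shows how two spikes spaced about 1.2 seconds apart can slip through a filter that is otherwise very effective:

```python
# Simplified model of a memorized-value AOA filter. Assumptions (mine, not
# the ATSB's): mean of AOA 1 and AOA 2 is normally used; a deviation from
# the three-channel median triggers a 1.2 s hold of the last good value;
# and the first sample after the hold expires is accepted without being
# re-checked - the gap a well-timed second spike can slip through.

MEM_HOLD_S = 1.2   # seconds the memorized value is used after a deviation
LIMIT_DEG = 5.0    # deviation threshold in degrees (assumed)

def fcpc_aoa(aoa1, aoa2, aoa3, dt=0.1):
    """Return the AOA value acted on at each sample time."""
    used, hold, accept_next = [], 0, False
    for a1, a2, a3 in zip(aoa1, aoa2, aoa3):
        median = sorted((a1, a2, a3))[1]
        deviating = abs(a1 - median) > LIMIT_DEG or abs(a2 - median) > LIMIT_DEG
        if hold > 0:
            hold -= 1
            used.append(used[-1])          # memorized value still in effect
            if hold == 0:
                accept_next = True         # hold window just expired
        elif accept_next:
            accept_next = False
            used.append((a1 + a2) / 2)     # accepted unchecked: the gap
        elif deviating:
            hold = round(MEM_HOLD_S / dt)  # start the memorized window
            used.append(used[-1] if used else median)
        else:
            used.append((a1 + a2) / 2)
    return used
```

Feed it a single spike and the memorized value covers it completely; feed it a second spike timed to arrive just as the 1.2-second window expires, and the corrupted value is used as-is. One spike is a nuisance; the right rhythm of spikes defeats the filter.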

If the engineers had known in 1991 what would fail in 2009 (think of your computer in 1991), perhaps a lower mode of functionality could have been built in. But after all, weren't two pilots always going to be there to fly the plane anyway? I think you'd agree that anyone who thinks their system can anticipate any occurrence, no matter how remote, is kidding themselves. Won't there always be one of those "oh, we didn't think of that!" moments?

On the A330, there are a number of cases where the airplane can be in Alternate law but the autopilot can be reengaged, depending on what failed - and it almost always takes at least two failures.

There are other instances where the autopilot shuts off, such as a glide-slope transmitter failure occurring below 400 feet. Let's face it, the airplane has nothing to go on at that point. When this happens, the GS "needle" goes away, the flight director pitch bar flashes, and the pilots can push the thrust levers up to execute an automatically flown go-around. However, if the GS doesn't return in a few seconds and the pilots don't go around (a disappearing needle and a flashing FD bar are very subtle visual signals), the airplane is unable to continue in that mode, shuts off the autopilot, flashes the lights, sounds the horn, and the pilots need to execute a manual go-around.

The Wikipedia article cited above has an interesting couple of paragraphs relating to the design-time thought process:

As with other safety-critical systems, the development of the A330/A340 flight control system during 1991 and 1992 had many elements to minimize the risk of a design error. These included peer reviews, a system safety assessment (SSA), and testing and simulations to verify and validate the system requirements. None of these activities identified the design limitation in the FCPC’s AOA algorithm.
The ADIRU failure mode had not been previously encountered, or identified by the ADIRU manufacturer in its safety analysis activities. Overall, the design, verification and validation processes used by the aircraft manufacturer did not fully consider the potential effects of frequent spikes in data from an ADIRU.
Airbus has stated that they are not aware of a similar incident occurring previously on an Airbus aircraft.

Perhaps as we move into a future where pilots can no longer be assumed to have thousands of hours of actual hands-on instrument flying experience before taking command of transport aircraft, autopilot systems will have to be designed with more robust failure modes that assume less about the crew's ability to take over unexpectedly. The fact that those systems are not there yet is one of the reasons that drove me to write this book. We still need to be pilots, able to take over at any moment.

This is not a new debate. Were it not for the original Mercury astronauts demanding manual controls, the spacecraft would have been controlled only automatically or remotely. Manual controls and competent pilots proved their worth when the automatic stabilization and control system malfunctioned, allowing the spacecraft to drift about a degree and a half per second to the right. Glenn switched to the manual-proportional control mode and moved Friendship 7 back to the proper attitude. He switched back to automatic, which began having problems again, and then switched back to the manual fly-by-wire system and flew the spacecraft in that mode for the remainder of the flight.

"...I'd say that any plane that allows multiple stick inputs is playing with fire!"

Wouldn't we all have liked to have been in the room when those design discussions were going on? But the system is what it is, and we are trained to deal with it. There's discipline in who's flying the airplane ("I've got it" - "OK, you've got it"), and when that fails, the DUAL INPUT warning calls out the breakdown. That gets everybody's attention, in my experience. The complexities of a fly-by-wire system design present all of these issues. I don't know that there is a perfect answer to them.

Even fire can be handled safely, when you understand and respect it. Lose either one, and you can get burned. After all, there's fire in use in most of the buildings in the world!

"... the autopilot that thinks it is soooo smart when it is in actuality a stupid sh&t!"

Remember, the autopilot doesn't think anything. It IS stupid. It only does what it is told to do, and when it can't do that - it's pilot time! However, if the pilot gets lulled into believing that it is the autopilot that is smart, then it is HE that might be the "stupid sh&t!"


Karlene Petitt said...

Bill, this is a fascinating discussion. I think it's a hard thing to wrap your head around when a plane does an "auto" shutdown of something that has not failed, but has done exactly what it was supposed to do. The component is simply no longer there. A challenging concept for sure. These are the types of discussions that make us think. And then there is the thought of the machine "thinking" ... hmm. Perhaps the reason we will always need pilots is that machines don't think. They do as they are programmed to do. But when the unexpected happens... will it know what to do? Not sure.
Excellent post.

Bill Palmer: flybywire said...

The pilot must always know what the automation is doing.
The moment the pilot is working for the automation, instead of the automation working for the pilot, it's time to shut it off!