Tuesday, March 07, 2006

Safety Critical Software

I have started studying the development and test requirements for safety-critical software. Safety-critical means that an error or malfunction could lead to loss of property or, worse, loss of life. Some quick thoughts:

For a comprehensive account of accidents in software-based systems, see "Safeware: System Safety and Computers", Nancy G. Leveson, Addison-Wesley, 1995.

Wikipedia entry for DO-178B, Software Considerations in Airborne Systems and Equipment Certification.

Another link to DO-178B: RTCA/DO-178B, Software Considerations in Airborne Systems and Equipment Certification


"...there are two key issues with safety-critical systems. First, you have to understand all the situations in which a hazardous condition might occur. The way to discover all of the safety issues in a system is to get a lot of knowledgeable people in a room and have them imagine scenarios that could lead to a breach of safety.
The second issue with safety is to be sure that once dangerous scenarios are identified and controls are designed to keep them from happening, future changes to the system take this prior knowledge into account. The lethal examples my friend on the airplane cited were cases in which a new programming team was suspected of making a change without realizing that the change defeated a safety control. The point is, once a hazard has been identified, it probably will be contained initially, but it may be forgotten in the future."
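One way to keep an identified hazard from being forgotten is to encode its control as an automated regression test, so that a later change which defeats the control fails loudly. Below is a minimal sketch in Python. The `Heater` class, the 90.0 degree limit, and the hazard label H-1 are hypothetical illustrations of mine; they do not come from DO-178B or from the quoted sources.

```python
MAX_SAFE_TEMP_C = 90.0  # hypothetical limit from an imagined hazard analysis


class Heater:
    """Toy controller with one safety control: never heat at or above the limit."""

    def __init__(self):
        self.on = False

    def request_heat(self, current_temp_c):
        # Safety control for hypothetical hazard H-1 (overheating):
        # refuse to switch on at or above the limit.
        self.on = current_temp_c < MAX_SAFE_TEMP_C
        return self.on


def test_hazard_h1_overheat_interlock():
    """Regression test tied to hazard H-1; fails if the interlock is removed."""
    heater = Heater()
    assert heater.request_heat(25.0) is True
    assert heater.request_heat(MAX_SAFE_TEMP_C) is False
    assert heater.request_heat(120.0) is False
    assert heater.on is False


test_hazard_h1_overheat_interlock()
```

Naming the test after the hazard is the point of the sketch: a new team that edits `request_heat` sees which safety control they broke, and why it existed.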

"The word “process” has become tainted among developers; it means something imposed by people who have lost touch with the realities of code development. Developers accuse ‘them’ of imposing processes because they sound good in theory, and are a foolproof way of passing an auditor’s comparison of the practice to a particular standard. Developers find themselves overloaded with work that they feel they don’t have to do in order to produce good code. The unfortunate consequence of this is that anything said by the “process camp” tends to be disregarded by the “developer camp.” This leads to an unwillingness to adopt a good practice just because the process people support it."


Real Time and Linux:

"The example of the paint nozzle and the average framerate are examples of what we call hard real-time and soft real-time constraints, respectively. Hard real-time applications must have their deadlines met, otherwise an unacceptable result occurs. Something blows up, something crashes, some operation fails, someone dies. Soft real-time applications usually must satisfy a deadline, but if a certain number of deadlines are missed by just a little bit, the system may still be considered to be operating acceptably."
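The distinction in the quote can be sketched as two different acceptance checks over measured response times. This is only an illustration under my own assumptions: the function names, the miss budget, the overrun margin, and the sample latencies are all made up, and a real system would define these from its requirements.

```python
def hard_ok(latencies_ms, deadline_ms):
    """Hard real-time: a single missed deadline is a failure."""
    return all(t <= deadline_ms for t in latencies_ms)


def soft_ok(latencies_ms, deadline_ms, max_misses, max_overrun_ms):
    """Soft real-time: tolerate a few misses, each by a small margin."""
    misses = [t for t in latencies_ms if t > deadline_ms]
    small = all(t - deadline_ms <= max_overrun_ms for t in misses)
    return len(misses) <= max_misses and small


samples = [8.0, 9.5, 10.4, 9.9, 10.2]  # made-up measurements, in ms

print(hard_ok(samples, 10.0))                                    # False
print(soft_ok(samples, 10.0, max_misses=2, max_overrun_ms=0.5))  # True
```

The same set of measurements fails the hard check and passes the soft one, which is exactly the difference between the paint nozzle and the average framerate.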

"Real-time applications have time-related requirements. Real-time operating systems can guarantee performance to real-time applications."
