Brief informal notes from a quick wrap-up position statement talk I gave at a workshop today.
Safety and security have a lot in common in how they mature over time. Without getting into a religious debate about the difference between them, I note that their trajectories seem to include the following steps, especially for autonomous systems. I'd argue that each step is in a sense more mature than the previous one.
- Get the system to work. Safety/security can come later.
- Get the system to work almost all the time. Conflate this with safety/security even though you're still really just getting it to work in the common cases (safety for a vehicle is "doesn't hit stuff" while security is "doesn't get taken down by the usual continuous stream of automated attacks")
- Brute force problem fixes: fly/crash/fix/fly (air) and drive/crash/fix/drive (ground)
- Create a set of best practices in the nature of a building code ("build your system this way")
- Create a useful fiction that you have completely characterized the requirements and operational environment and that your building code will always work.
- Any failure is an embarrassing piece of bad news that violates the fiction of complete understanding.
- As the system matures, complain about false alarm safety/security shutdowns
- It might feel like this means your system has problems, but in fact you're a lot safer and more secure than systems that operate oblivious to their vulnerabilities
- Start permitting exceptions to the building code rules by arguing that the exceptions still result in equivalent safety/security
- Evolve to full-up deductive assurance cases to argue safety/security beyond building codes
- Still maintain the fiction of complete knowledge of requirements and environment
- Start operating in more open environments and admit you didn't really understand either the requirements or the environment
- Spend a lot of time chasing down problems that reveal defects in your safety case (safety case does not match environmental assumptions, or might not even match deployed system)
- Switch to an inductive safety case approach:
  - Account for risk from epistemic uncertainty (unknown unknowns)
  - Instrument system for failure precursors (e.g., safety performance indicators tied to safety case claims)
  - Treat incidents as an opportunity to fix problems before there is a loss event.
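
To make the instrumentation idea in the last step a bit more concrete, here is a minimal Python sketch of a safety performance indicator tied to a safety case claim. Everything in it is a hypothetical illustration rather than anything from the talk or from a standard: the class name, the events-per-operating-hour unit, the thresholds, and the example claims are all assumptions. The point is only that each indicator names the claim it monitors, so a threshold violation flags that claim for review before a loss event occurs.

```python
from dataclasses import dataclass


@dataclass
class SafetyPerformanceIndicator:
    """Hypothetical SPI: an observed event rate checked against one safety case claim."""
    claim: str          # safety case claim this indicator monitors (illustrative text)
    threshold: float    # assumed unit: maximum tolerated events per operating hour
    events: int = 0     # failure precursors observed so far
    hours: float = 0.0  # operating hours observed so far

    def record(self, events: int, hours: float) -> None:
        """Accumulate field observations."""
        self.events += events
        self.hours += hours

    def rate(self) -> float:
        """Observed events per operating hour (0.0 before any exposure is recorded)."""
        return self.events / self.hours if self.hours > 0 else 0.0

    def violated(self) -> bool:
        """True when the observed rate exceeds what the claim assumed."""
        return self.rate() > self.threshold


def claims_needing_review(spis):
    """Return the claims whose indicators are violated, i.e. incidents to act on."""
    return [spi.claim for spi in spis if spi.violated()]


# Made-up example claims and numbers, purely for illustration.
spis = [
    SafetyPerformanceIndicator("Vehicle keeps safe clearance from pedestrians", threshold=0.01),
    SafetyPerformanceIndicator("Perception detects relevant obstacles", threshold=0.001),
]
spis[0].record(events=3, hours=100.0)  # 0.03 per hour, above the 0.01 threshold
spis[1].record(events=0, hours=100.0)  # 0.0 per hour, within the threshold

print(claims_needing_review(spis))
```

A violated indicator here is not a loss event. It is evidence that a claim in the safety case may be wrong, which is exactly the kind of incident the inductive approach treats as an opportunity to fix things early.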