Developers should create a transparent safety argument based on testing, simulation, and good engineering practices.
The race to deploy self-driving cars is in full swing, with fleets operating on public roads in increasing numbers. While some vehicles require driver supervision, others are being deployed with no driver controls at all. The big question is: will these vehicles be safe enough?
Now that self-driving car technology is maturing, there is increasing recognition that demonstrating adequate safety with on-road testing alone is impractical. Billions of miles would be needed for a credible statistical safety argument, which simply costs too much and takes too long. Other safety-critical application areas (e.g., aircraft, trains, and even home appliances) already address this problem with internationally accepted safety standards. While there is an international safety standard for conventional cars (ISO 26262), its use is not currently required by the US government.
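The scale of the statistical problem can be illustrated with a standard failure-free-testing calculation (a sketch added for illustration, not from the article): to show with confidence C that a rare-event rate stays below r events per mile after observing zero events, roughly ln(1/(1-C))/r miles of testing are needed, assuming events follow a Poisson process.

```python
import math

def miles_needed(max_rate_per_mile: float, confidence: float) -> float:
    """Miles of failure-free driving needed to demonstrate, at the given
    confidence level, that the true event rate is below max_rate_per_mile.
    Assumes rare events follow a Poisson process and zero events are observed."""
    return math.log(1.0 / (1.0 - confidence)) / max_rate_per_mile

# Human-driven fatality rate is on the order of 1 per 100 million miles.
fatality_rate = 1e-8
print(f"{miles_needed(fatality_rate, 0.95):,.0f} miles")  # roughly 300 million
```

Even this best case (no fatalities observed at all, and merely matching human performance rather than exceeding it) requires hundreds of millions of miles, which is why on-road testing alone cannot carry the safety argument.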
The stakes are high. Even a single bad line of software code can, and has, caused catastrophic failures. While nobody should expect this technology to be perfect, it is important to ensure that it follows accepted practices and is appropriately safe. Even assuming we can quantify how much safer than a human driver the technology needs to be, technologists still need a way to measure and demonstrate that self-driving cars actually meet that bar.
A starting point for understanding self-driving car safety is defining an effective role for on-road data gathering. Contrary to common discussions about the purpose of vehicle testing, on-road testing should not be primarily used to find bugs in software. (Bugs — which really should be called defects — should be found in simulation and development.) Rather, the critical need for on-road testing is to understand the driving environment and collect data to make sure that simulation will include all the typical, unusual, and just plain weird driving situations that a full-scale self-driving car fleet will experience on real-world roadways. On-road vehicle testing can be used as a graduation exercise before deploying production vehicles to make sure nothing was missed in development, but it should not be the primary strategy for finding defects in the vehicle’s autonomy.
Ensuring adequate safety requires two approaches beyond collecting on-road data. The first approach is extensive simulation across many different levels. Closed course testing is a form of simulation in that a real car’s response is measured within an environment that has artificially created scenarios designed to stress key aspects of the system. Pure software simulation can be used in addition to closed course testing to scale up to hundreds or thousands of simulated cars running 24 hours a day in a cloud computing environment. These and other types of simulation can help improve testing coverage. But experience with other types of critical systems shows that even extensive simulation is not enough to catch all the software defects that can cause avoidable deaths.
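The scenario-sweep style of software simulation described above can be sketched in miniature (a toy illustration only; the scenario parameters and the braking model are invented for this sketch, not taken from any real simulator):

```python
import random
from dataclasses import dataclass

@dataclass
class Scenario:
    """One simulated driving situation (parameters are illustrative)."""
    pedestrian_distance_m: float   # distance at which a pedestrian appears
    vehicle_speed_mps: float       # vehicle speed when the pedestrian appears
    road_friction: float           # 1.0 = dry pavement, lower = slippery

def stopping_distance_m(speed_mps: float, friction: float) -> float:
    # Simple kinematic model: reaction distance plus braking distance.
    reaction_time_s = 0.5
    decel_mps2 = 7.0 * friction    # braking limited by available friction
    return speed_mps * reaction_time_s + speed_mps**2 / (2 * decel_mps2)

def run_scenario(s: Scenario) -> bool:
    """Pass if the vehicle can stop before reaching the pedestrian."""
    return stopping_distance_m(s.vehicle_speed_mps, s.road_friction) < s.pedestrian_distance_m

# Sweep many randomized scenarios, as a cloud simulation farm would at scale.
random.seed(0)
scenarios = [Scenario(random.uniform(5, 60), random.uniform(5, 25), random.uniform(0.3, 1.0))
             for _ in range(10_000)]
failures = [s for s in scenarios if not run_scenario(s)]
print(f"{len(failures)} of {len(scenarios)} scenarios failed")
```

A real simulation pipeline would replace the toy vehicle model with the actual autonomy stack and feed the scenario library with situations harvested from on-road data gathering, which is exactly the role for road testing argued for above.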
In addition to testing and simulation, rigorous design engineering approaches must be used to ensure that the software running self-driving cars is well designed and can deal safely with the inevitable glitches, malfunctions, faults, and surprises that happen in real world systems. Following ISO 26262 and other relevant safety standards is a necessary starting point. Additional methods such as system robustness testing and using a safety net architectural approach can help develop a sufficient level of assurance for the AI-centric vehicle functions. These technical approaches should be created in parallel with the control system side of self-driving car technology. While the design and validation of self-driving cars are still maturing, it is likely that we already have the means to ensure an appropriate level of safety. However, it is crucial to ensure that developers actually use these state-of-the-art techniques rather than skimping on safety validation in the race to deploy.
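One common form of the safety net architectural approach mentioned above pairs a complex "doer" (possibly AI-based) that proposes commands with a much simpler "checker" that enforces a conservative safety envelope and forces a safe action when that envelope is violated. A minimal sketch, with all names and limits invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Command:
    speed_mps: float   # requested vehicle speed
    brake: bool        # emergency brake request

def complex_planner(obstacle_distance_m: float) -> Command:
    """Stand-in for the AI-based 'doer': sophisticated but hard to verify.
    (Here it requests cruising speed regardless of obstacles, simulating
    a planner defect that the checker must catch.)"""
    return Command(speed_mps=20.0, brake=False)

def safety_checker(cmd: Command, obstacle_distance_m: float) -> Command:
    """Simple 'checker' enforcing a conservative safety envelope.
    Small enough to be verified with traditional engineering rigor."""
    MIN_SAFE_GAP_M = 50.0   # illustrative envelope limit
    if obstacle_distance_m < MIN_SAFE_GAP_M:
        return Command(speed_mps=0.0, brake=True)   # override: fail safe
    return cmd                                       # envelope OK: pass through

# The checker vetoes the defective planner when an obstacle is too close.
safe_cmd = safety_checker(complex_planner(30.0), obstacle_distance_m=30.0)
print(safe_cmd)
```

The design intent is that only the small checker must be assured to the highest integrity level, sidestepping the difficulty of directly verifying the AI-centric planner.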
As self-driving car technology continues to be developed, there are fleets of test vehicles operating on public roads. Public on-road testing clearly has the potential to put pedestrians, other vehicles, and the general public at risk if the test vehicle misbehaves. While safety drivers are charged with ensuring vehicle safety, there are challenges to make this approach effective in practice. For example, continuous safety driver vigilance turns out to be difficult to maintain if the vehicle doesn’t fail very often. And it can be difficult for even a vigilant driver to know if a vehicle is about to misbehave without having some sort of indication of the vehicle’s near term plans. In general, it is important to ensure that a test vehicle (or any partially autonomous vehicle that requires human supervision) never paints the safety driver into a corner, leaving that driver in an untenable situation from which recovery within the constraints of reasonable human performance is difficult or impossible. When the driver does intervene, it’s important that the vehicle actually responds to the driver and gracefully relinquishes relevant vehicle controls.
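The requirement that the vehicle gracefully relinquish control can be sketched as override logic in the control loop (a hypothetical sketch; the thresholds and signal names are invented for illustration): any detected driver input immediately disengages autonomy, and the disengagement latches so the vehicle never fights the safety driver for control.

```python
class AutonomyController:
    """Sketch of driver-override handling. Any meaningful driver input
    immediately and persistently disengages autonomy (latched until an
    explicit re-engagement, not modeled here). Thresholds are illustrative."""

    STEERING_OVERRIDE_NM = 3.0   # driver torque that counts as an intervention
    BRAKE_OVERRIDE_PCT = 5.0     # brake pedal travel that counts as an intervention

    def __init__(self) -> None:
        self.autonomy_engaged = True

    def update(self, steering_torque_nm: float, brake_pedal_pct: float) -> str:
        """Return who controls the vehicle this control cycle."""
        driver_input = (abs(steering_torque_nm) >= self.STEERING_OVERRIDE_NM
                        or brake_pedal_pct >= self.BRAKE_OVERRIDE_PCT)
        if driver_input:
            self.autonomy_engaged = False   # latch the disengagement
        return "driver" if not self.autonomy_engaged else "autonomy"

ctrl = AutonomyController()
print(ctrl.update(0.5, 0.0))   # no meaningful driver input: autonomy retains control
print(ctrl.update(4.0, 0.0))   # steering intervention: control passes to driver
print(ctrl.update(0.0, 0.0))   # stays with the driver even after input ceases
```

The latching behavior matters: a system that re-engages on its own the moment driver input fades is exactly the kind of design that can paint the safety driver into a corner.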
In the longer term, it will be important that vehicle makers create a transparent argument as to why their AV is sufficiently safe. That argument must be reviewed by credible, independent safety experts. Well-chosen safety performance metrics can give some insight into how AV technology is progressing, but tend to come with many disclaimers and limitations. Simplistic metrics such as disengagement reports are no substitute for a thorough understanding of whether an AV has been engineered to an appropriate level of safety.
A lesson learned a long time ago in other safety areas is that independent audits of some type are an absolute requirement for designing safe systems. Transparency in safety arguments does not necessarily mean design information must be made public, nor that the government must perform reviews. But some credible independent party must assess whether an AV has been designed to an appropriate level of engineering rigor to be safe. The safety reports being published by some AV developers are an encouraging first step, but more details are required for independent assessment. A good starting point is to say exactly which appropriate international safety standards are being followed for what aspects of safety, and get external assessment according to those standards.
Incorporating AI technology into life critical computer-based systems such as AVs presents unique challenges. Whether safety is ensured within the scope of existing standards or is supported by arguments in addition to those standards, a transparent safety argument should be checked by an independent assessor to ensure that vehicles really are as safe as they need to be for use on public roads. That safety argument will need to include a lot more than just on-road test results.
About the author:
Dr. Philip Koopman has been working on autonomous vehicle safety for more than 20 years. As a professor at Carnegie Mellon University he teaches safe, secure, high quality embedded system engineering. As co-founder of Edge Case Research LLC he finds ways to improve AV safety, including robustness testing, architectural safety patterns, and structured safety argument approaches. He has a blog on self-driving car safety at: https://safeautonomy.blogspot.com