Replace 'images' with 'sensor data' and adversarial examples can still be generated. They might not be as easy to feed into the vehicle's hardware (e.g. requiring speakers to fool an acoustic sensor), but the same principles apply.
It's also not necessary for the recognition algorithms to use gradient descent themselves. As long as they are differentiable (or can be approximated by a model that is), you can use gradient descent to find adversarial examples.
Adversarial examples exist for any model with a high input dimension (relative to the available training data); differentiability only helps with finding them.
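To make that concrete, here is a rough sketch of a single signed-gradient step against a toy logistic model. Everything in it is made up for illustration: the weights, the 1000-dimensional "sensor reading" and the step size eps are assumptions, not taken from any real perception stack. For a linear model like this one, a per-coordinate nudge of eps shifts the logit by eps times the sum of the absolute weights, which is one common way of explaining why high input dimension makes these attacks cheap.

```python
import math

def predict(w, b, x):
    """Probability of class 1 under a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def signed_gradient_step(w, b, x, y, eps):
    """Move every coordinate by at most eps in the direction that raises the loss."""
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

# Hypothetical model: 1000 inputs, small weights of alternating sign.
d = 1000
w = [0.05 if i % 2 == 0 else -0.05 for i in range(d)]
b = 0.0

# Hypothetical reading: every value near 0.5, slightly correlated with w,
# so the model assigns it to class 1.
x = [0.54 if i % 2 == 0 else 0.46 for i in range(d)]
y = 1

eps = 0.05  # maximum change allowed per coordinate
x_adv = signed_gradient_step(w, b, x, y, eps)

print("clean prediction:      ", predict(w, b, x))
print("adversarial prediction:", predict(w, b, x_adv))
print("largest change per coordinate:", max(abs(a - c) for a, c in zip(x_adv, x)))
```

With these made-up numbers, no input value moves by more than 0.05, yet the predicted probability of class 1 falls from roughly 0.88 to roughly 0.38, so the label flips. For a model that isn't differentiable, you would run the same step against a differentiable approximation of it and then check whether the result also fools the original.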