Probably the same as anything involving computer vision: edge cases, and a lack of real artificial intelligence or of data to categorize.
I think we will get there with reasonable safety, but it will take a while until we see it implemented on a larger scale. I think current solutions are a risky venture.
It would be far easier without human drivers on the roads, and with simple supporting infrastructure like signs and markings...