The same YouTuber has done more comprehensive testing. HW3 with FSD v12 consistently failed, while HW4 with v13 consistently passed the same test.
I didn't like Mark Rober's video because he only tested Autopilot, which is different software.
Autopilot = simple lane-keeping assistant from 7 years ago
FSD = the real deal: it can do all the maneuvers, and it is what is supposed to become unsupervised, according to Tesla
I am interested in the future of Tesla's technology, so testing Autopilot feels irrelevant to me.