Like any spec, this one is underspecified for basic edge cases, and it's very hard to get consensus across implementations.
This stuff is supposed to improve the UX. Yet the reality is that even when building basic forms, every website has to test and solve the sort of problems shown in TFA.
How much of the spec does every web developer in the world have to read to know whether password managers should or shouldn't try to fill in credit card expiry/CVV in a hidden input? Does the spec even say anything about that? 1Password will ignore a `display: none` input, by the way. Can this be quick-fixed by ensuring hidden inputs also have `display: none`? That's something every website that cares about good autofill UX gets to figure out for itself.
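For what it's worth, here is roughly what that quick fix could look like as a check. This is only a sketch under assumptions: `FieldState`, `visibleToUser`, and `fieldsNeedingDisplayNone` are names I made up, the site is assumed to already know which fields it hides from the user, and whether any given password manager actually skips `display: none` fields is exactly the thing you'd still have to test yourself.

```typescript
// Hypothetical description of a form field. Not from any spec or
// password manager API; the site fills this in from its own knowledge.
interface FieldState {
  name: string;           // e.g. "cc-exp", "cc-csc"
  visibleToUser: boolean; // false if the site hides it (off-screen, later step, etc.)
  display: string;        // the field's computed CSS `display` value
}

// Return the names of fields that are hidden from the user but are not
// `display: none`, i.e. the ones the quick fix would have to touch.
function fieldsNeedingDisplayNone(fields: FieldState[]): string[] {
  return fields
    .filter((f) => !f.visibleToUser && f.display !== "none")
    .map((f) => f.name);
}
```

In a browser you'd presumably populate `display` from `getComputedStyle(input).display`, and even then this only tells you which fields to change, not whether every autofill implementation will respect it.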
Unfortunately, "just follow the spec" does very little to block off the rabbit holes you'll find if you try to perfect UX on even basic forms; otherwise I might agree with you.