I think your question was rhetorical, but there's an answer:
Because it means that modifying the sign is not useful. (Unlike, say, modifying a speed limit sign.) Also, the fiducial can encode its orientation or position, perhaps in relation to other nearby signs. To save space, that could be a hash or checksum of the position rather than the position itself. The vehicle would then be able to detect a mismatch and mark the sign as suspect/unreliable if it was in any other spot or orientation.
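To make the checksum idea concrete, here's a rough sketch in Python. Everything in it is my own assumption for illustration, not any real fiducial standard: the grid size, the heading bins, the 4-byte truncated SHA-256, and the function names are all made up. The sign carries a short hash of its quantized pose, and the vehicle recomputes that hash from where it thinks the sign is, checking neighbouring grid cells too so that ordinary localization error doesn't cause false alarms.

```python
import hashlib
from itertools import product

CELL_DEG = 1e-4      # assumed grid size, roughly 11 m of latitude
HEADING_BIN = 45     # assumed orientation bin width, in degrees
N_BINS = 360 // HEADING_BIN

def cell(lat, lon, heading_deg):
    """Quantize a pose to an integer grid cell (lat index, lon index, heading bin)."""
    return (round(lat / CELL_DEG),
            round(lon / CELL_DEG),
            round(heading_deg / HEADING_BIN) % N_BINS)

def checksum(cell_ids, n_bytes=4):
    """Truncated SHA-256 of the cell, short enough for a small fiducial payload."""
    data = ",".join(str(c) for c in cell_ids).encode()
    return hashlib.sha256(data).digest()[:n_bytes]

def sign_is_plausible(encoded, lat, lon, heading_deg):
    """True if the encoded checksum matches the observed cell or an adjacent one."""
    i, j, h = cell(lat, lon, heading_deg)
    for di, dj, dh in product((-1, 0, 1), repeat=3):
        if checksum((i + di, j + dj, (h + dh) % N_BINS)) == encoded:
            return True
    return False

# What gets printed into the fiducial when the sign is installed.
encoded = checksum(cell(52.5200, 13.4050, 90))

# Vehicle sees the sign roughly where it was installed: accepted.
print(sign_is_plausible(encoded, 52.52003, 13.40502, 92))

# Vehicle sees the same sign about 1 km away: flagged as suspect.
print(sign_is_plausible(encoded, 52.5300, 13.4050, 90))
```

Note that a plain hash like this only catches an existing sign being moved or rotated. Anyone who knows the scheme could print a new fiducial with a matching checksum for a new location, so it's a consistency check and not authentication.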
There are other solutions to these problems too. But the same problem you describe occurs if someone moves a human-readable sign, and in that case there's no checksum to catch it.
I think fiducials are not a panacea. They are just one additional data source in what needs to be a robust sensor-fusion approach. But they make a whole bunch of stuff in machine vision a LOT easier to solve. Machine-learning approaches have the same problems but with less opportunity to address them, less robustness, and more overhead.