Several other protocols solve this in a layering agnostic way by simply having a header length field. The header bytes can then be copied without any understanding of the format. This is even how IP's own ICMP protocol knows how much of an IP packet it should (at least) include in an error message so that the sender can know what triggered the error.
TCP, UDP, ICMP and IP were all designed contemporaneously; UDP fragmentation could also easily have just been specified for. It's just an odd regrettable quirk.