And I'm talking about both things.
First, integer arithmetic that produces pointers that are just out of bounds of an object. Why can't this work? Why can't the compiler assume that, since I explicitly converted a pointer to an integer, the pointed-to object can't be put into a register, or made to go out of scope early?
Second, fabricating pointers. If I have a pointer to mmap/sbrk memory, shouldn't I be allowed to “fabricate” arbitrary pointers from integers that point into that area? If not, why not?
Finally, Wasm. The linear memory is addressable from address 0 to __builtin_wasm_memory_size * PAGESIZE. Given this, and except maybe the address at zero, why should it be undefined behavior to dereference any other address?
What's the actual advantage to making these undefined behavior? What do we gain in return?
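To make the first case concrete, here is a minimal sketch of the kind of code I mean. The helper name `element_via_integer` is made up for illustration. It assumes a flat address space where adding N to a `uintptr_t` moves the address by N bytes, which the C standard does not promise: the integer-to-pointer conversion is implementation-defined. The sketch stays in bounds so that the dereference is at least plausible on mainstream compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reach element `index` of an int array by doing the
 * arithmetic on an integer instead of on the pointer.
 * Assumption: flat address space, so integer addition matches byte offsets. */
static int *element_via_integer(int *first, size_t index) {
    assert(first != NULL);
    uintptr_t base = (uintptr_t)first;            /* explicit pointer-to-integer cast */
    uintptr_t addr = base + index * sizeof(int);  /* plain integer arithmetic */
    return (int *)addr;                           /* implementation-defined conversion */
}
```

Called as `element_via_integer(a, 2)` on an `int a[4]`, this yields a pointer that compares equal to `&a[2]` on the usual flat-memory targets. The open question is whether the compiler must treat the result as a legitimate pointer into `a`.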
Formally it is undefined, as there is no way to give it sane semantics and it will definitely trip sanitizers and similar memory safety tools.
You get a bunch of memory from mmap. There are no “objects” (in C terminology) in there, and a single “provenance” (if at all, you got it from a syscall, which is not part of the language).
If arbitrary integer/pointer math inside that buffer is not OK, how do you get heterogeneous allocations from the arena to work? When do slices of it become “objects” and gain any other “provenance”?
Is C supposed to be the language you can't write an arena allocator in (or a conservative GC, or…)?
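For reference, this is roughly the arena I have in mind, as a hedged sketch rather than a claim about what the standard blesses. The `Arena` type and the `arena_*` functions are hypothetical names. `malloc` stands in for mmap to keep the example portable. The integer math is used only to compute an aligned offset, and the returned pointer is derived from the base pointer by ordinary pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bump arena; names are illustrative, not from any library. */
typedef struct {
    unsigned char *base; /* start of the backing memory */
    size_t size;         /* total bytes available */
    size_t used;         /* bytes handed out so far, including padding */
} Arena;

/* Returns 1 on success, 0 if the backing allocation failed.
 * Assumption: malloc stands in for mmap/sbrk here. */
static int arena_init(Arena *a, size_t size) {
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

static void arena_free(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}

/* Allocate `size` bytes aligned to `align` (must be a power of two).
 * Returns NULL when the arena is exhausted. */
static void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t start = (uintptr_t)a->base + a->used;
    /* Round the absolute address up to the next multiple of align. */
    uintptr_t aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t offset = (size_t)(aligned - (uintptr_t)a->base);
    if (offset > a->size || size > a->size - offset)
        return NULL;
    a->used = offset + size;
    /* Derive the result from base, not from the integer. */
    return a->base + offset;
}
```

A `char` and a `double` can then come out of the same buffer via `arena_alloc(&a, 1, 1)` and `arena_alloc(&a, sizeof(double), sizeof(double))`. Whether each slice counts as its own “object” once something is stored in it is exactly what I am asking.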
Fabricating pointers in an implementation-defined way is obviously OK. This would cover mmap / sbrk or memory-mapped I/O. Note also that historically, the C standard left things as UB precisely so that implementations could use them for extensions (e.g. mmap). The idea that UB == invalid program is fairly recent misinformation, but we have to react to it and make things at least implementation-defined (a term which also used to mean something slightly different).
I'll just finish by saying: yes please. And thank you for bearing with me.