Passing nothing is surprisingly difficult

(davidben.net)

180 points | kingkilr | 2y ago | 70 comments

There is no problem with memcpy other than that you can't use a null pointer. You can memcpy zero bytes as long as the pointer is valid. This works in a good many circumstances; just not circumstances where the empty array is represented by not having an address at all.

For instance, say we write a function that rotates an N-byte array: it moves the low M bytes to the top of the array, and shuffles the remaining N - M bytes down to the bottom. This function will work fine with the zero-byte memmove or memcpy operations in the special case when M == 0, because the pointer will be valid.
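
A minimal sketch of such a rotate, assuming a temporary heap buffer is acceptable. The name rotate_bytes and its return convention are made up for illustration. The point is that every memcpy/memmove call below receives valid pointers, so the zero-byte cases need no special handling:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: rotate the n-byte array at buf so that its low m
   bytes end up at the top and the other n - m bytes slide down.
   buf must be a valid pointer even when n == 0.
   Returns 0 on success, -1 on bad arguments or allocation failure. */
int rotate_bytes(unsigned char *buf, size_t n, size_t m)
{
    if (buf == NULL || m > n)
        return -1;

    /* Ask for at least one byte so that success always yields a
       non-null pointer, which keeps the zero-byte memcpy calls valid. */
    unsigned char *tmp = malloc(m ? m : 1);
    if (tmp == NULL)
        return -1;

    memcpy(tmp, buf, m);            /* save the low m bytes (m may be 0) */
    memmove(buf, buf + m, n - m);   /* slide the rest down (may be 0 bytes) */
    memcpy(buf + (n - m), tmp, m);  /* put the saved bytes at the top */

    free(tmp);
    return 0;
}
```

Rotating {1, 2, 3, 4, 5} with m == 2 gives {3, 4, 5, 1, 2}; with m == 0 all three calls copy nothing and the array is unchanged.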

Now say we have something like this:

  struct buf {
    char *ptr;
    size_t size;
  };

We would like it so that when the size is zero, we don't have an allocated buffer there. But we'd like to support a zero-sized memcpy in that case: memcpy(buf->ptr, whatever, 0), or likewise in the other direction.

We now have to check for buf->ptr being buf in the code that deals with resizing.
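
One way the extra checking might look, as a sketch only. It assumes the empty state is represented by a null ptr, and the helper names buf_resize and buf_store are hypothetical. The zero-size cases are handled before memcpy or realloc ever sees them:

```c
#include <stdlib.h>
#include <string.h>

struct buf {
    char *ptr;
    size_t size;
};

/* Hypothetical helper: resize b to new_size bytes. Size 0 releases the
   storage and leaves ptr null. Returns 0 on success, -1 if allocation
   fails (b is left unchanged in that case). */
int buf_resize(struct buf *b, size_t new_size)
{
    if (new_size == 0) {
        free(b->ptr);              /* free(NULL) is a no-op */
        b->ptr = NULL;
        b->size = 0;
        return 0;
    }
    /* realloc(NULL, n) behaves like malloc(n); n is nonzero here. */
    char *p = realloc(b->ptr, new_size);
    if (p == NULL)
        return -1;
    b->ptr = p;
    b->size = new_size;
    return 0;
}

/* Hypothetical helper: copy n bytes from src to the start of b.
   memcpy is skipped when n == 0 because b->ptr may be null then. */
int buf_store(struct buf *b, const void *src, size_t n)
{
    if (n > b->size)
        return -1;
    if (n != 0)
        memcpy(b->ptr, src, n);
    return 0;
}
```

The cost is exactly the extra branch per operation that the struct was hoping to avoid.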

Here is a snag in the C language related to zero sized arrays. The call malloc(0) is allowed to return a null pointer, or a non-null pointer that can be passed to free.

Oops! In the one case, the pointer may not be used with a zero-sized memcpy; in the other case it can.

This also goes for realloc(NULL, 0) which is equivalent to malloc(0).

And, OMG I just noticed ...

In C99, realloc(ptr, 0) was valid, where ptr is a valid, allocated pointer. You could realloc an object to zero.

I'm looking at the April 2023 draft (N3096). It states that realloc(ptr, 0) is undefined behavior.

When did that happen?

LegionMammal978 2y ago

N2464 [0]: there was lots of implementation divergence on what realloc(ptr, 0) did (especially with BSD, which allegedly doesn't free the memory at all?), so they just declared it UB.

[0] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf

planede 2y ago

I read this as "there were a lot of buggy implementations of the C standard, so we imported the bug into the standard". Crazy. Don't make the language less defined going forward.

When those implementations eventually pick up C23, they surely could fix the bug as well. At best this should have been an errata/defect for the previous standard, so that the previous standards document behavior of implementations of said standards.

kazinator 2y ago

The BSD people don't understand what little standards they do read. It's unfortunate that we have to spoil the language for their sake.

The requirements in C99 and before are perfectly clear. realloc is described as liberating the old pointer, and then allocating a new one as if by malloc. (Except that it magically has access to both objects so it can transfer the necessary bytes that must be transferred from the old to the new.)

It is perfectly clear what happens when size is zero. No byte can be copied from the old object, if any. The behavior is like free(oldptr) followed by return malloc(newsize).

Your IQ would have to be well below 85 to misunderstand the requirements.

And those requirements are still there; there is still the description of realloc in terms of freeing the old pointer and allocating a new object with malloc.

There was no need to insert a gratuitous removal of definedness for the size zero case, given that malloc handles it.

Applications now have to do this:

  void *sane_realloc(void *ptr, size_t size)
  {
    if (size == 0) {
      // behave literally as required in C99
      free(ptr);
      return malloc(0);
    }

    return realloc(ptr, size);
  }

Supposedly because a few vendors were not able to code this logic in their realloc functions?

wruza 2y ago

Just a reminder to myself how happy I am to leave this academic level of snobbery behind and migrate to languages that help getting things done and save you from nonsense like this. I did C for about ten years. Hope I’ll never have to do a single line of it again.

torstenvl 2y ago

C99 and C11 have no special treatment for a size of zero. Since "memory for the new object [of size zero] cannot be allocated, the old object is not deallocated and its value is unchanged." (emphasis added). This is exactly what BSD does.

C17 says "If size is zero and memory for the new object is not allocated, it is implementation-defined whether the old object is deallocated" (emphasis added).

What standard, exactly, is BSD violating?

adrianN 2y ago

I don't think insulting the BSD folks is very nice. They probably had good reasons for their decisions.

matheusmoreira 2y ago

Is there a rationale for a memory allocator to support zero sized allocations? Is this really just about providing a "technically" valid pointer for the pointer/size pair structure? To me it seems any address is a potentially valid pointer to a zero-sized object. Do allocators really keep track of these null allocations? That would require keeping state for every single address in the worst case...

It's very strange. I wrote my own memory allocator and I can't figure out the right way to handle this. Eliminating the need for these "technically" valid pointers that can't actually be accessed because they're zero sized seems like the better solution.

> When did that happen?

More importantly, why did that happen? People have told me that I should care about the C standards committee because they take backwards compatibility very seriously. Then they come out with breaking changes like these.

kazinator 2y ago

> Is there a rationale for a memory allocator to support zero sized allocations?

Mainly, that it has supported that before and programs rely on it.

Programs written to the C99 standard can resize a dynamic vector down to empty with a resize(ptr, 0). The pointer coming from that will be the same as if malloc(0) had been called.

So now, that has been taken away; those programs can now make demons fly out of your nose.

Thank you, ISO C!

> Do allocators really keep track of these null allocations? That would require keeping state for every single address in the worst case...

Implementations of malloc(0) that don't return null are required to return a unique object. To do that, all they have to do is pretend that the size is some nonzero value like 1 byte. (The application must not assume that there is any byte there that can be accessed).
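
The "pretend the size is 1" trick can also be applied from the application side. A minimal sketch, with malloc_nonzero as a made-up wrapper name:

```c
#include <stdlib.h>

/* Hypothetical wrapper: never asks malloc for 0 bytes, so a successful
   call always returns a unique non-null pointer that can be passed to
   free(). The caller must still treat a size-0 result as having no
   accessible bytes. */
void *malloc_nonzero(size_t size)
{
    return malloc(size ? size : 1);
}
```

With this, a null return always means allocation failure, regardless of which choice the underlying malloc(0) makes.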

torstenvl 2y ago

> Programs written to the C99 standard can resize a dynamic vector down to empty with a resize(ptr, 0).

C99 has no resize() function. Assuming you mean realloc(), C99 does not guarantee you can use realloc() in this manner.

https://stackoverflow.com/questions/16759849/using-realloc-x...

https://wiki.sei.cmu.edu/confluence/plugins/servlet/mobile?c...

https://developers.redhat.com/articles/2023/07/26/checking-u...

tom_ 2y ago

If storing the metadata in the heap, 0 bytes often doesn't even end up a special case. You need to have a case for allocations of some arbitrary number of bytes, and 0 is an arbitrary number of bytes.

Another option is to treat them as being of size 1.

(In theory you could do endless allocations of size 0, and eventually you'd run out of space, even though you've allocated 0 bytes in total. But you end up in exactly that situation, whatever the allocation size, if you don't take bookkeeping overhead into account!)
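
A toy illustration of that point, not a real allocator: a bump allocator that stores a size header in front of each block. The names toy_alloc, round_up and ARENA_SIZE are invented for this sketch. Size 0 takes the same path as any other size, still yields distinct pointers, and still uses up space through the headers:

```c
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 1024

/* Toy bump allocator, for illustration only: no free(), not thread-safe. */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* Round x up to a multiple of the strictest fundamental alignment. */
static size_t round_up(size_t x)
{
    size_t a = _Alignof(max_align_t);
    return (x + a - 1) / a * a;
}

void *toy_alloc(size_t size)
{
    size_t header = round_up(sizeof(size_t));
    if (size > ARENA_SIZE)                    /* also rules out overflow */
        return NULL;
    size_t payload = round_up(size);          /* 0 stays 0: no special case */
    if (header + payload > ARENA_SIZE - arena_used)
        return NULL;

    unsigned char *block = arena + arena_used;
    memcpy(block, &size, sizeof size);        /* bookkeeping lives in the heap */
    arena_used += header + payload;
    return block + header;                    /* distinct for every call */
}
```

Calling toy_alloc(0) in a loop eventually returns NULL even though zero payload bytes were ever handed out.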

cbarrick 2y ago

Useful context on the Rust side is this issue [1]. It sounds like some of the author's concerns are addressed already.

[1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/4...

steveklabnik 2y ago

thayne 2y ago

This is basically the "define pointer arithmetic for invalid pointers" option, which, as pointed out in that section, doesn't completely solve the FFI problem.

kevingadd 2y ago

A fun additional twist to this is that dereferencing nullptr is valid in WebAssembly, and actual data can in fact end up there, though ideally it never will.

If you ensure that the 'zero page' (so to speak) is empty you can also exploit this property for optimizations, and in some cases the emscripten toolchain will do so.

i.e. if you have

  struct MyArray<T> {
    uint length;
    T items[0];
  }

you can elide null pointer checks and just do a single direct bounds check before dereferencing an element, because for a nullptr, (&ptr->length) == nullptr, and if you reserve the zero page and keep it empty, (nullptr)->length == 0.

This complicates the idea of 'passing nothing' because now it is realistically possible for your code to get passed nullptr on purpose, and it might be expected to behave correctly when that happens, instead of asserting or panicking like it would on other (sensible) targets.

Joker_vD 2y ago

Because WASM is not C and there is no "nullptr" in WASM. In WASM, zero is just an address, as valid as any other. And C actually doesn't require the null pointer value to have bit pattern "all zeros", precisely to allow for architectures where treating zero address as invalid would be way too cumbersome. And some implementations actually took that option.

kevingadd 2y ago

I wasn't aware the spec allowed for nullptr to not be 0, that's fascinating! In that case you could probably use 0xFFFFFFFF as long as you limit the size of the WASM heap to below 4GB, then. You'd risk having addresses wrap-around though.

Joker_vD 2y ago

Nothing stops you from having your null pointer in the middle of the address space. Some C compiler for DOS or early Windows did that IIRC (it was 0xB800 or something?.. so that it wouldn't accidentally corrupt the interrupt table). Also, C explicitly prohibits address wrap-around problems for pointers:

    Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

But this is fine, since pointer comparisons (as in, less/greater comparisons) are actually both pretty restricted and required to have reasonable semantics when comparing pointers that point into the same object/array:

    When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.

By the way, this means that, among other things, if you use number N to represent a null pointer then number N-1 can not ever be a valid pointer to anything: adding 1 to a valid pointer is always allowed, and this addition should produce a non-null pointer — because the resulting pointer is required to be well-behaved in comparisons, and comparisons with null pointer are UB.

nmilo 2y ago

I'm not sure... wasm is an assembly, not a C implementation. It can define what happens if you load from 0 but it doesn't get to define if the C code `*nullptr` actually loads from 0. Whether or not it does is defined by your compiler, which is probably the clang frontend if you're on emscripten. But then again I think there's a clang flag to disable optimizing away reads/writes to nullptr.

lanstin 2y ago

HPUX must have had something similar, as when AOL backend code was ported to Solaris, which does segv on null dereference, we found all kinds of places where code that had been running without notable incident on HPUX started dropping core.

vlovich123 2y ago

I’m kind of surprised it’s not defined that the first page must be 0-mapped read only… this sounds like a security vulnerability, because it’s unlike what any other machine code would be written against and thus violates all sorts of safety assumptions.

deathanatos 2y ago

Do you mean that as written? I'd find that extremely surprising, and it would, in my mind, violate all sorts of safety assumptions, primarily that deref'ing NULL traps¹.

E.g., I am pretty sure Go relies on some of the behavior described here: that the 0 page is unmapped, and that accesses will trap. This is why Go code will sometimes SIGSEGV despite being an almost memory-safe language: Go is explicitly depending on that trap (and it permits Go, in those cases, to elide a validity check). (Vs. some memory accesses will incur a bounds check & panic, if Go cannot determine that they will definitely land in the first page; Go there must emit the validity check, and failing it is a panic, instead of a SIGSEGV.)

IIRC, Linux doesn't permit unprivileged processes, at least, to map address 0. (Although I can't find a source right now for that.)

¹Yes, in most languages this is UB … but what I'm saying is that having it trap makes errors — usually security errors — obvious & fail, instead of really letting the UB just do whatever and really going off into "it's really undefined now" territory.

vlovich123 2y ago

Ideally it would be an unmapped trap considering it’s literally how every other runtime works. The next best option is to make it read only. The dumbest option is to make it read/write as that’s going to be a vector for security vulnerabilities.

matheusmoreira 2y ago

I'm dealing with the exact same issues right now in my project, this post is very enlightening.

> But suppose we want an empty (length zero) slice.

So is there an actual rationale for this? I've written the memory allocator and am in the process of developing the foreign interface. I've been wondering if I should explicitly support zero length allocations. Even asked this a few times here on HN but never got an answer. It seems to be a thing people sort of want but for unknown reasons.

anderskaseorg 2y ago

It is extremely common to have a collection that might or might not be empty at runtime, and we don’t want to force every programmer who allocates a slice to manually write an alternate code path for the empty case.

matheusmoreira 2y ago

Are memory blocks collections though? An empty list or table makes intuitive sense. Does a zero sized memory block make sense? I'm having trouble understanding that. If the memory has size zero, then by definition there is nothing to point to, nothing whose address can be taken.

I definitely see the benefits of well-defined arithmetic on null pointers. As a data type though it seems to me that any pointer could be a zero sized allocation.

anderskaseorg 2y ago

When you allocate memory for an empty array with malloc(num * size) where num == 0, you get a zero-sized memory block. As discussed in the article, representing this with a null pointer causes problems, because that results in undefined behavior in memcpy (despite asking it to copy 0 bytes). So we want it to be a real memory block that can be safely passed to free().

swiftcoder 2y ago

It's obviously too late to change this in Rust's case, but I wonder whether being able to differentiate between None and the empty slice is actually a necessary property in general?

There are a bunch of languages where empty arrays are "falsy", and in those it's not advisable to use the two to differentiate valid states. Feels like the same could apply here.

tialaramex 2y ago

The main complaint in the post is basically that Rust's actual bona fide slice type doesn't work the way cobbled together library types for this purpose in C or C++ do.

The C++ type discussed is much newer than Rust (std::span was standardized in C++ 20).

Yes, in many cases what C++ APIs mean here isn't a slice of zero Ts at all but instead None, and Rust has an appropriate type for that, Option<&[T]>, which works as expected. So in many cases where people have built an API which they think takes &[T] and are trying to make it work with the unsafe functions mentioned, it's actually Option<&[T]> they needed anyway; they don't even have a type-correct design.

swiftcoder 2y ago

I guess I'm inclined to go the other way. I tend to object to wrapping arrays in Option, because while semantically similar, the empty slice supports the full set of array APIs, whereas Option requires unwrapping.

tialaramex 2y ago

But that's not (a reference to) an array; that would be [T; N]. It's a reference to a slice, hence the syntax [T].

Arrays know their size, so the "I'll interpret it as zero Ts" makes even less sense for an array where we know up front the size as it is part of the type.

anonymoushn 2y ago

Unfortunately empty slices are pretty useful, particularly for strings. For example, if you want to represent HTTP response headers, you might include a bunch of nigh-ubiquitous headers in a struct and punt the others to a hash table, and you would then have to represent both empty-valued-and-present headers and missing headers for those headers you placed in the struct.

cozzyd 2y ago

I thought the justification for Rust having an unstable ABI is so that things like binary representations of types could change?

swiftcoder 2y ago

The binary representation could change, yes, but collapsing two distinct types into one would cause actual logic errors in existing code (i.e. this is an API change, not an ABI change)

vardump 2y ago

Fun times with buggy kernel drivers.

Pass something with a 0 length, pointing to NULL. Enjoy your blue screens and kernel panics.

pizlonator 2y ago

It’s so silly to talk about C not allowing null on memcpy. That’s a thing the spec says, I guess?

The solution is clear: just ignore the C spec. It’s total garbage. Of course you can memcpy between any ptr values if the count is zero and those ptr values don’t have to point to anything.

JonChesterfield 2y ago

Better be rolling your own compiler in that case. Or your own memcpy with a different name.

Passing null to memcpy being UB means that after that call, the pointer is assumed to be non-null. So if(ptr) can constant fold. Maybe faster.

I'm in agreement with you on this but your compiler probably isn't.
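
A small sketch of what that folding can mean in practice. The function names are hypothetical, and whether a given compiler actually removes the late check depends on its version and flags:

```c
#include <stddef.h>
#include <string.h>

/* Risky order: the null check comes after memcpy. A compiler is allowed
   to assume dst and src are non-null once they have been passed to
   memcpy, so it may treat the check below as always false and drop it. */
int late_check_copy(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
    if (dst == NULL || src == NULL)   /* may be folded away */
        return -1;
    return 0;
}

/* Safer order: test first, so memcpy is never reached with a null
   pointer, not even for n == 0. */
int early_check_copy(char *dst, const char *src, size_t n)
{
    if (dst == NULL || src == NULL)
        return -1;
    memcpy(dst, src, n);
    return 0;
}
```

Only early_check_copy has defined behavior when called with a null pointer and n == 0.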

matheusmoreira 2y ago

> Better be rolling your own compiler in that case.

No need for that.

> the pointer is assumed to be non-null

Just give us an option to tell the compiler to stop assuming nonsense like that. I'm gonna make it standard on my makefiles just like -fno-strict-aliasing and -fwrapv.

There's no use trying to work around C standard problems. Compilers should just be told to define the undefined and to disable everything that can't be defined. Then we can write code on solid foundations instead of quicksand.

> Or your own memcpy with a different name.

I wish. I couldn't escape that function even on my freestanding nolibc project. The compilers will happily emit calls to memcpy and memset all by themselves whenever they feel like it and god help you if you don't provide them because for some reason this nonsense can't be disabled.

JonChesterfield 2y ago

LLVM's handling of libc is roughly "assume libc always exists and is linked as machine code". This is deeply unhelpful when that is not true, such as when you're implementing libc. -ffreestanding and -fno-builtin should kill the pattern match into memcpy/memset logic; if they don't, we have another bug.

I don't trust the clang -fno-strict-aliasing -fno-pointer-whatever strategy. There's too many ways for that to go wrong. Code needs to be correct/safe by default and opt into optimisations to have a chance of working, otherwise it is really easy to fail to check that flag.

There are a few fairly simple C compilers out there. LCC, the one that derives from the obfuscated project, one in gnu mes associated with guix. There's a grammar from the compcert people.

I haven't convinced myself writing a working C compiler is a weekend project but it's surely less than a year, seriously considering it on paranoia grounds. Idea being use it as a reference - when I suspect clang to be breaking things, run against the dumb one that doesn't really do optimisations as a comparison.

garaetjjte 2y ago

-fno-delete-null-pointer-checks

anonymoushn 2y ago

This is a bad idea if your memcpy starts by adding len to src or dst, since nullptr + 0 is nasal demons in C.
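
For a hand-rolled copy, one way around that is to return before any pointer arithmetic happens. A minimal sketch, with copy_bytes as a made-up name for a memcpy-like routine that tolerates null when len is 0:

```c
#include <stddef.h>

/* Hypothetical memcpy-like routine. When len == 0 it returns before
   dst or src is used in any arithmetic, so null pointers are fine in
   that case. For len > 0 both pointers must be valid and the regions
   must not overlap. */
void *copy_bytes(void *dst, const void *src, size_t len)
{
    if (len == 0)
        return dst;

    unsigned char *d = dst;
    const unsigned char *s = src;
    const unsigned char *end = s + len;   /* safe: src is valid here */

    while (s != end)
        *d++ = *s++;
    return dst;
}
```

copy_bytes(NULL, NULL, 0) simply returns NULL without evaluating nullptr + 0.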

SonOfLilit 2y ago

What a wonderfully subtle issue.

pyrolistical 2y ago

How does zig handle this? Does it just have its own slice representation that gets compiled away? Or does it disallow zero length slices?

anonymoushn 2y ago

Zig slices are (start, count) where start's type is non-nullable pointer.

My impression is that Zig doesn't have a documented memory model that cares about things like whether an address corresponds to an allocation or not, so problems relating to this sort of thing cannot come up yet :)

brabel 2y ago

I am not entirely sure, but it seems they chose to return a null pointer:

https://github.com/ziglang/zig/commit/32e0dfd4f0dab351a024e7...

anonymoushn 2y ago

This is a commit that changes the now-defunct Zig compiler written in C++ to be careful when it calls malloc. So it has nothing to do with the semantics of the Zig language or the compatibility of zero-sized Zig slices with Rust, C, or C++ APIs.

stealthcat 2y ago

In ML the problem is passing 0D scalar tensor as 1D 1-element tensor.

bhakunikaran 2y ago

Quite intriguing

hackyhacky 2y ago

> Passing nothing is surprisingly difficult

From the title, I assumed that this article was going to be about either (a) permissive grading standards at university or (b) chronic constipation.

dmvdoug 2y ago

In fairness, both of those are also surprisingly difficult.
