They don't, though. If you try to allocate too much VRAM, it will either hard fail or everything suddenly runs like garbage because the driver is constantly swapping or falling back to shared memory.
The reason this flag exists in the first place is that many of the models are larger than the VRAM available on most consumer GPUs, so you have to "balance" the model by running some layers on the GPU and the rest on the CPU.
What would make sense is a default auto option that uses as much VRAM as possible. It would assume the model is the only new thing running on the GPU, and leave alone whatever VRAM is already in use at the time it is started.
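
To make that concrete, here is a rough sketch of what such an auto heuristic could look like in Python. Everything in it is hypothetical: `auto_gpu_layers` and its parameters are made-up names, not any real tool's API, and it assumes every layer is the same size. Real code would have to ask the driver how much VRAM is free, and `overhead_mib` is just a placeholder for whatever else needs to live on the GPU besides the layers.

```python
def auto_gpu_layers(total_vram_mib, used_vram_mib, layer_mib, n_layers, overhead_mib=0):
    """Guess how many layers fit in the VRAM that is free right now.

    Hypothetical helper for illustration only. All sizes are in MiB.
    Assumes uniform layer size and that nothing else will grab VRAM later.
    """
    if layer_mib <= 0:
        raise ValueError("layer_mib must be positive")

    # Leave alone whatever was already in use when we started,
    # plus a fixed reserve for non-layer buffers.
    free_mib = total_vram_mib - used_vram_mib - overhead_mib
    if free_mib <= 0:
        return 0  # nothing fits, run everything on the CPU

    # Offload as many whole layers as fit, capped at the model's layer count.
    return min(n_layers, free_mib // layer_mib)
```

For example, on an 8 GiB card with 1.5 GiB already in use, a 32-layer model at 400 MiB per layer and a 512 MiB reserve, `auto_gpu_layers(8192, 1536, 400, 32, 512)` returns 15, and the other 17 layers stay on the CPU. On a card with room for everything it just returns the full layer count.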