While the JDK's collectors don't work quite like this, a simple way to picture it is as a contiguous memory buffer, say, 100MB large. The GC compacts the live objects to the bottom of that buffer, and at that point it knows where the end of the used memory is. Say that after compaction, the live objects occupy the bottom 20MB of the buffer (overwriting any dead objects that may have been there). The GC can then choose to uncommit some of the free space above them, say the top 30MB of the buffer, and return it to the OS.
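To make the picture concrete, here is a toy model of that buffer in Java. It is only a sketch of the mental model above, not how any JDK collector is implemented; the class `ToyCompactingHeap`, its methods, and the use of plain integers as "megabytes" are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: sizes are abstract "MB" units and nothing here touches real memory.
final class ToyCompactingHeap {
    // Hypothetical object record: where it sits in the buffer and how big it is.
    static final class Obj {
        int offset;
        final int size;
        boolean live = true;
        Obj(int offset, int size) { this.offset = offset; this.size = size; }
    }

    private final List<Obj> objects = new ArrayList<>();
    private int top = 0;     // end of the used part of the buffer
    private int committed;   // how much of the buffer is still backed by the OS

    ToyCompactingHeap(int capacity) { this.committed = capacity; }

    Obj allocate(int size) {
        if (size > committed - top) throw new OutOfMemoryError("toy heap full");
        Obj o = new Obj(top, size);
        objects.add(o);
        top += size;
        return o;
    }

    // Slide live objects to the bottom, overwriting dead ones.
    // Returns the new end of used memory.
    int compact() {
        List<Obj> survivors = new ArrayList<>();
        int next = 0;
        for (Obj o : objects) {
            if (o.live) {
                o.offset = next;
                next += o.size;
                survivors.add(o);
            }
        }
        objects.clear();
        objects.addAll(survivors);
        top = next;
        return top;
    }

    // A separate policy decision: give back part of the free tail.
    void uncommit(int amount) {
        if (amount < 0 || committed - amount < top) {
            throw new IllegalArgumentException("cannot uncommit used memory");
        }
        committed -= amount;
    }

    int top() { return top; }
    int committed() { return committed; }

    public static void main(String[] args) {
        ToyCompactingHeap heap = new ToyCompactingHeap(100);
        List<Obj> all = new ArrayList<>();
        for (int i = 0; i < 10; i++) all.add(heap.allocate(10)); // buffer is now full
        for (int i = 2; i < 10; i++) all.get(i).live = false;    // only 20 "MB" stay live
        System.out.println("used after compaction: " + heap.compact());
        heap.uncommit(30);
        System.out.println("committed after uncommit: " + heap.committed());
    }
}
```

Note that compaction and uncommitting are separate steps here: after `compact()` the heap knows exactly where used memory ends, and how much of the free tail to give back is a policy choice.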
With malloc/free there's also this separation. A free operation records the memory of the freed object in a data structure called a free list. If all the objects in a certain page have been freed, the allocator can choose to uncommit that page and return it to the OS (although many modern allocators choose not to do this promptly).
The way moving collectors operate, and where their cost lies, is completely different from free-list approaches. With free-list approaches there's some work done to allocate an object and some work done to free it. With moving collectors, there's very little work to allocate an object (typically, just a pointer bump) and no work to free it; there is, however, work to keep the object alive. That's why, especially with generational collectors, moving collectors do very little work to manage the memory of objects that don't live for long.
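The difference in where the work happens can be sketched with two toy allocators over integer "addresses". This is a simplified illustration, not the code of any real allocator or collector; `AllocatorSketch`, `BumpAllocator` and `FreeListAllocator` are hypothetical names, and the free list is a naive first-fit list with no coalescing.

```java
import java.util.ArrayList;
import java.util.List;

final class AllocatorSketch {
    // Bump allocation: one bounds check and one addition per object.
    static final class BumpAllocator {
        private final int limit;
        private int top = 0;
        BumpAllocator(int limit) { this.limit = limit; }

        int allocate(int size) {
            if (size > limit - top) return -1; // a real collector would run a GC here
            int address = top;
            top += size;
            return address;
        }
        // There is deliberately no free(): dead objects cost nothing,
        // because the collector only does work for objects it keeps alive.
    }

    // Naive first-fit free list: work on allocate (search) and on free (bookkeeping).
    static final class FreeListAllocator {
        private static final class Block {
            int address;
            int size;
            Block(int address, int size) { this.address = address; this.size = size; }
        }

        private final List<Block> free = new ArrayList<>();
        FreeListAllocator(int capacity) { free.add(new Block(0, capacity)); }

        int allocate(int size) {
            for (int i = 0; i < free.size(); i++) {
                Block b = free.get(i);
                if (b.size >= size) {
                    int address = b.address;
                    b.address += size;   // carve the request off the front of the block
                    b.size -= size;
                    if (b.size == 0) free.remove(i);
                    return address;
                }
            }
            return -1; // no block large enough
        }

        void free(int address, int size) {
            free.add(0, new Block(address, size)); // most recently freed block is tried first
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(64);
        System.out.println(bump.allocate(16) + " " + bump.allocate(16));

        FreeListAllocator list = new FreeListAllocator(64);
        int a = list.allocate(16);
        list.allocate(16);
        list.free(a, 16);
        System.out.println(list.allocate(8));
    }
}
```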
This is why, when working in Java, it's important not to think about the heap as we do in C. For short-lived objects, the cost of allocating and "deallocating" them is often not significantly higher than the cost of allocating and deallocating an object on the stack in C. On the other hand, mutating an existing long-lived object could sometimes require some bookkeeping work by the GC, and could be much more costly than allocating (and "deallocating") a new object. That's why Java programmers are discouraged from pooling objects to "help the GC", something that Go developers often do when they run into issues with their non-moving, non-generational collector.
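One common form that this bookkeeping takes in generational collectors is a write barrier that marks a "card" whenever a reference field is updated, so the collector later knows which regions of the old generation to scan for pointers into the young generation. The following is a toy sketch of that idea under stated assumptions: `CardTableSketch`, its method names, and the 512-byte card size are illustrative choices, and real JVM barriers are emitted by the JIT compiler rather than written in Java.

```java
final class CardTableSketch {
    // Assumed card size for illustration: 2^9 = 512 bytes of heap per card.
    static final int CARD_SHIFT = 9;

    private final byte[] cards;

    CardTableSketch(int heapBytes) {
        cards = new byte[(heapBytes >>> CARD_SHIFT) + 1];
    }

    // Conceptually runs on every reference store into a heap object.
    // This is the extra work a mutation pays that a fresh allocation does not.
    void onReferenceStore(int fieldAddress) {
        cards[fieldAddress >>> CARD_SHIFT] = 1; // mark the containing card dirty
    }

    boolean isDirty(int address) {
        return cards[address >>> CARD_SHIFT] != 0;
    }

    // The collector later scans only the dirty cards instead of the whole old generation.
    int dirtyCount() {
        int count = 0;
        for (byte c : cards) {
            if (c != 0) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);
        table.onReferenceStore(1000); // a store into a field at "address" 1000
        System.out.println("dirty cards: " + table.dirtyCount());
    }
}
```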