So the remote user packages up 100ms of mouse movement with timestamps and sends it with ~100ms of latency. Your side then has a buffer of ~100ms of positions to start playing back.
This also removes jitter in the playback caused by varying latency, as long as the variation stays under 100ms.
All of the above numbers are made up for this example. You can adjust the playback delay as much as needed for smooth playback.
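
To make the idea concrete, here is a minimal sketch of such a playback buffer in Python. Everything in it is an assumption for illustration, not something from a particular library: the `PlaybackBuffer` class, its method names, the `(timestamp_ms, x, y)` sample layout, and the choice to map sender time to local time from the arrival of the first packet. It then plays positions back `playback_delay_ms` behind that mapping, interpolating linearly between samples, so a later packet can arrive up to roughly that much late without causing a stall.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical sample layout: (sender_timestamp_ms, x, y)
Sample = Tuple[float, float, float]


class PlaybackBuffer:
    """Replays timestamped remote positions a fixed delay behind arrival."""

    def __init__(self, playback_delay_ms: float = 100.0) -> None:
        self.playback_delay_ms = playback_delay_ms
        self.samples: List[Sample] = []
        # Local time minus sender time, fixed when the first packet arrives.
        self.clock_offset_ms: Optional[float] = None

    def receive(self, packet: List[Sample], local_now_ms: float) -> None:
        """Store one packet of samples that arrived at local_now_ms."""
        if not packet:
            return
        if self.clock_offset_ms is None:
            # Assumption: the first packet's arrival defines the time mapping.
            self.clock_offset_ms = local_now_ms - packet[0][0]
        self.samples.extend(packet)
        self.samples.sort(key=lambda s: s[0])

    def position_at(self, local_now_ms: float) -> Optional[Tuple[float, float]]:
        """Return the interpolated position to draw at local_now_ms."""
        if self.clock_offset_ms is None:
            return None  # nothing received yet
        # Sender-side time we should be showing right now.
        target = local_now_ms - self.clock_offset_ms - self.playback_delay_ms
        times = [s[0] for s in self.samples]
        i = bisect_right(times, target)
        if i == 0:
            # Playback has not reached the first sample yet: hold it.
            return (self.samples[0][1], self.samples[0][2])
        if i == len(self.samples):
            # Buffer ran dry (packet later than the delay allows): hold last.
            return (self.samples[-1][1], self.samples[-1][2])
        t0, x0, y0 = self.samples[i - 1]
        t1, x1, y1 = self.samples[i]
        frac = (target - t0) / (t1 - t0)
        return (x0 + (x1 - x0) * frac, y0 + (y1 - y0) * frac)
```

You would call `receive()` whenever a packet arrives and `position_at()` once per rendered frame. A real version would also drop samples that have already been played, and might re-estimate the clock mapping over time; both are left out here to keep the sketch short.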