It's giving a reply that "sounds right to a human" in context. Which, on a small scale, is exactly what they, you and I are also doing when writing (or speaking), except for the infrequent cases when we force ourselves to reason through stuff very. slowly.
(This is why I believe LLM performance is best judged against human inner voice/system 1 reasoning, not the entirety of human thinking. When thinking with system 1, people don't really have an idea what they're doing either - they're just doing stuff that feels right.)
Also note that "sounds right to a human" is literally the loss function on which LLMs are trained, so between heaps of training inputs and subsequent extensive RLHF, the process is by its very construction optimizing for above-human-average performance across the board.