For example, "squeeze a tube to cut water flow" and "put pressure on a deep wound to stop blood loss" can only be related, when the connection is not already in the training data, through understanding and intelligence.
The ability to do that is intelligence.
Otherwise, it's just a search and optimization problem.
Secondly, this poses the problem of 1) finding examples like the one you gave that the model can't relate (regarding your specific example, see [1]); 2) ruling out that the training data contained something "too close" from which to draw the equivalence; 3) ruling out that the failure to draw the equivalence is structural (the model cannot have real understanding) rather than qualitative (it has real understanding, but it just isn't smart enough to understand the specific given problem).
So I'm back to my original question: how would we know whether these are structurally different things in the first place?
[1] vidarh: How does squeezing a tube to cut water flow give you a hint as to what to do about a deep wound?
ChatGPT (GPT4): Squeezing a tube to cut off water flow demonstrates the basic principle of applying pressure to restrict or stop the flow of a fluid. Similarly, applying pressure to a deep wound can help control bleeding, which is a critical first aid measure when dealing with serious injuries.
[followed by a long list of what to do if encountering a deep wound]