I usually tell the model that I will be testing its reasoning capabilities by describing a scenario and then asking questions about the evolving scenario.
I typically give it a description of a limited environment with objects in it, and say that "we" are in this environment. I then describe actions that I take within the environment and ask questions about the updated world-state that must be inferred from those actions. This tests a lot of "common sense" reasoning skills, which I find to be more important for real-world tasks than logic-puzzle-type reasoning.
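To make the shape of such a test concrete, here is a minimal sketch of how the scenario, actions, and world-state questions might be scripted. The environment, the action/question pairs, and the `build_prompts` helper are all illustrative, not a fixed protocol; the resulting strings would be fed turn by turn to whatever chat interface is being tested.

```python
# Illustrative sketch of the scenario-based reasoning test described above.
# The specific environment and questions are made up for demonstration.

scenario = (
    "We are in a small kitchen. On the counter there is an empty glass, "
    "a pitcher of water, and a closed jar of coffee beans."
)

# Each action changes the world-state; each question probes whether the
# model has tracked that change.
turns = [
    ("I pour water from the pitcher into the glass, filling it halfway.",
     "Is the glass still empty?"),
    ("I knock the glass over onto the counter.",
     "Where is the water now?"),
]

def build_prompts(scenario, turns):
    """Flatten the scenario plus action/question pairs into per-turn prompts."""
    prompts = [scenario]
    for action, question in turns:
        prompts.append(f"{action} {question}")
    return prompts

prompts = build_prompts(scenario, turns)
for p in prompts:
    print(p)
```

The point of separating actions from questions is that the model is never told the updated state directly; it has to infer, for example, that the knocked-over glass means the water is now on the counter.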