If you give the LLM the distribution directly, it is not doing anything interesting: it is not working out a strategy, just sampling from the one you told it to play.
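To make the point concrete, here is a minimal sketch of what "sampling from the strategy you tell it to play" amounts to without any LLM involved. The rock-paper-scissors actions, the probabilities, and the `sample_action` helper are all hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def sample_action(strategy, rng):
    """Draw one action from a mixed strategy given as {action: probability}."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical mixed strategy (illustrative numbers, not from any experiment).
strategy = {"rock": 0.5, "paper": 0.3, "scissors": 0.2}

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_action(strategy, rng) for _ in range(10_000)]

# Empirical frequencies should land close to the probabilities we supplied.
freqs = {a: c / len(draws) for a, c in Counter(draws).items()}
print(freqs)
```

If the LLM is handed the same dictionary in its prompt, the best it can do is reproduce this sampler, so its output says something about how well it samples and nothing about whether it can find the strategy itself.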