That's a very weak form. The way I use "agentic," the model is trained to optimize an agent's success, not just predict the next token.
The obvious way to do that is for it to plan a set of actions and evaluate each possible way to reach some goal (or avoid an anti-goal). Kind of like what AlphaZero does for games. Q* is rumored to be a generalization of this.
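To make the "plan and evaluate" idea concrete, here's a minimal toy sketch (all names and the number-line task are illustrative, not anyone's actual method): instead of picking one next action, the agent enumerates whole action sequences and scores each by how close its end state gets to a goal.

```python
from itertools import product

# Toy planning problem: an agent on a number line wants to reach a goal state.
ACTIONS = {"+1": 1, "-1": -1, "+3": 3}  # available moves

def rollout(start, plan):
    """Apply a sequence of actions and return the final state."""
    state = start
    for a in plan:
        state += ACTIONS[a]
    return state

def plan_to_goal(start, goal, horizon=3):
    """Exhaustively evaluate every action sequence up to `horizon` steps
    and return the one whose end state lands closest to the goal."""
    best_plan, best_score = (), abs(start - goal)
    for length in range(1, horizon + 1):
        for plan in product(ACTIONS, repeat=length):
            score = abs(rollout(start, plan) - goal)
            if score < best_score:
                best_plan, best_score = plan, score
    return best_plan

plan = plan_to_goal(0, 5)
print(plan, "->", rollout(0, plan))
```

Real systems like AlphaZero obviously can't enumerate every sequence; they use guided tree search (MCTS) plus a learned value function to prune, but the objective is the same: score whole plans against a goal rather than predict one step at a time.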