Only if you are doing it wrong; search >>> summarization.
Then there's the other question: is it deterministic between runs, or am I going to get a different summary each session, turn, or tool call? And depending on that frequency, am I using more tokens than I save by doing summarization for N tools?
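As a rough back-of-envelope (a sketch with made-up numbers, not measurements from this tool):

```python
# Hypothetical break-even: summarization only pays off if the summary is
# reused enough times to amortize the tokens spent producing it.
tool_output_tokens = 2_000   # raw tool output size (illustrative)
summary_tokens = 300         # summarized version (illustrative)
summarization_cost = 2_500   # tokens burned generating the summary, input + output (illustrative)

reuses_needed = summarization_cost / (tool_output_tokens - summary_tokens)
print(f"summary must be reused ~{reuses_needed:.1f} times to break even")
# If the summary is regenerated every session, turn, or tool call, you may never get there.
```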
Minimizing token usage is not the goal in and of itself, re: the ageless tradeoff of quantity vs. quality.
For some context, my system prompt is around 5k tokens at the start. I put file contents there (read/write/agents.md), which saves millions of tokens and seems to work better than making them message parts.
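Roughly what I mean, as a sketch (the helper name and the pinned file list are illustrative, not my actual setup):

```python
from pathlib import Path

# Hypothetical sketch: inline key files directly into the system prompt
# instead of re-attaching them as message parts on every turn.
PINNED_FILES = ["AGENTS.md", "README.md"]  # illustrative file list

def build_system_prompt(base_instructions: str, repo_root: str = ".") -> str:
    parts = [base_instructions]
    for name in PINNED_FILES:
        path = Path(repo_root) / name
        if path.is_file():
            parts.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(parts)
```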
> Just trimmed boilerplate
This is not what I see this tool doing. It's automatically manipulating, in the background, words that you should put far more care and attention towards. Referring to those words as "boilerplate" you can just throw into a slop machine is revealing.