My understanding of caching with most models/providers is that a prefix substring of the context has to be reused for a cache hit, but not necessarily the whole entire context window. So if you prune tool calls from the history, you're going to get one cache miss on the newly-pruned history, and then you're going to be getting cache hits on every subsequent turn, with a lower number of input tokens. If you prune subsequent tool calls after that, you would still get a cache hit for the already-pruned portion of the context, just not the full context.
The way you do this (and the way opencode does it) is you do most of your pruning in more recent history. Last I looked at opencode, they start pruning tool call results after 2 full agentic turns. So you probably dont get quite as good hits on cache for the most recent 1-5% of your turns, but after that everything else caches fine and those tool calls that likely aren't relavent to your session anymore are gone.