So the agent can freely explore, check logs, list files, inspect service status. It only blocks when it wants to change something (install a package, write a config, restart a service).
Also worth noting: Shannot operates on entire scripts, not individual commands. The agent writes a complete program, the sandbox captures everything it wants to do during a dry run, then you review the whole batch at once. Claude Code's built-in controls interrupt at each command whereas Shannot interrupts once per script with a full picture of intent.
That said, you're pointing at a real limitation: if the fix genuinely requires a write to test a hypothesis, you're back to blocking. The agent can't speculatively install a package, observe it didn't help, and roll back autonomously.
For that use case, the OP's VM approach is probably better. Shannot is more suited to cases where you want changes applied to the real system but reviewed first.
Definitely food for thought though. A combined approach might be the right answer. VM/scratch space where the agent can freely test hypotheses, then human-in-the-loop to apply those conclusions to production systems.