This is the right instinct. LLMs are great at proposing actions, but they're fundamentally unreliable at enforcing deterministic invariants.
Math is the obvious example, but schema validation falls into the same bucket. If an agent outputs structured JSON, the question "does this conform to the declared schema?" should have exactly one answer. Same schema + same payload → same result, every time, regardless of runtime, language, or retry.
Once you treat that layer as deterministic infrastructure instead of model behavior, a few things get easier:
• retries stop producing inconsistent side effects
• downstream systems can trust that validation actually ran
• you can audit what passed structural checks independently of the model
Semantic correctness is still a separate problem (models or domain rules are needed there) but offloading the structural layer removes a lot of accidental complexity.
One example of what that deterministic validation layer looks like as a standalone API: https://docs.rapidtools.dev/openapi.yaml