The answer is not more natural language guardrails, it is in (progressive) formal specification of workflows and acceptance criteria. The task cannot be marked as complete if it is only accessible through an API that rejects changes lacking proof that acceptance criteria were met.