My app builds and runs fine on Termux, so my CLAUDE.md says to always run unit tests after making changes. So I punch in a request, close my phone for a bit, then check back later and review the diff. Usually takes one or two follow-up asks to get right, but since it always builds and passes tests, I never get complete garbage back.
There are some tasks that I never give it. Most of that is just intuition. Anything I need to understand deeply or care about the implementation of I do myself. And the app was originally hand-built by me, which I think is important - I would not trust CC to design the entire thing from scratch. It's much easier to review changes when you understand the overall architecture deeply.