About 90% of the test suite was "end to end" unit tests: no real infrastructure was used, it was all faked. Only interactions with the outside world (web client, LLM, database) were tested, and all of those interactions were faked.
(This is not feasible on every project, but it was on this one because the database interactions were simple.)
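
To make the idea of an "end to end" test against fakes concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `FakeLLM`, `FakeDatabase`, and `handle_request` are names made up for illustration, not the code from the project described here. The point is only that the test drives the application through its outer entry point and asserts on what crossed the boundary, never on internals.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLLM:
    """Hypothetical stand-in for a real LLM client: returns canned replies."""
    replies: dict
    prompts: list = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)  # record the interaction for later assertions
        return self.replies[prompt]


@dataclass
class FakeDatabase:
    """Hypothetical in-memory replacement for the real database."""
    rows: list = field(default_factory=list)

    def insert(self, row: dict) -> None:
        self.rows.append(row)


def handle_request(body: dict, llm, db) -> dict:
    """Toy application entry point (an assumption, not the original code)."""
    answer = llm.complete(body["question"])
    db.insert({"question": body["question"], "answer": answer})
    return {"status": 200, "answer": answer}


def run_scenario():
    """Drive the app from the outside and return everything observable."""
    llm = FakeLLM(replies={"What is 2+2?": "4"})
    db = FakeDatabase()
    response = handle_request({"question": "What is 2+2?"}, llm, db)
    return response, llm, db


response, llm, db = run_scenario()
# Assert only on the boundaries: web response, LLM prompt, database write.
assert response == {"status": 200, "answer": "4"}
assert llm.prompts == ["What is 2+2?"]
assert db.rows == [{"question": "What is 2+2?", "answer": "4"}]
```

Because nothing in the test names an internal function other than the entry point, the code behind `handle_request` can be restructured freely without touching the test.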
There were a small number (~5%) of slow tests that used a real LLM, database, infrastructure, etc., and a small number (~5%) of very low-level unit tests covering only complex stateless functions with simple interfaces.
Refactoring could be done trivially without changing any test code 98% of the time.
Additionally, the (YAML) tests could rewrite their expected responses based on the actual outcome, e.g. when you added a new property to a REST API response you just reran the test in update mode and eyeballed the updated test.
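
A rough sketch of how such an update mode can work, under some assumptions: the scenario file is JSON rather than YAML here (only to stay within the Python standard library), and `check_scenario` and the `expected` key are hypothetical names, not the original tooling.

```python
import json
import tempfile
from pathlib import Path


def check_scenario(path: Path, actual: dict, update: bool = False) -> bool:
    """Compare `actual` to the stored expectation; rewrite the file in update mode."""
    scenario = json.loads(path.read_text())
    if scenario.get("expected") == actual:
        return True
    if update:
        # Overwrite the stored expectation with what the system really returned.
        scenario["expected"] = actual
        path.write_text(json.dumps(scenario, indent=2, sort_keys=True) + "\n")
        return True
    return False


def demo() -> list:
    """Simulate adding a new property to an API response."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "get_user.json"
        path.write_text(json.dumps({"request": "GET /users/1", "expected": {"id": 1}}))
        actual = {"id": 1, "name": "Ada"}  # the response gained a "name" property
        return [
            check_scenario(path, actual),               # stale expectation
            check_scenario(path, actual, update=True),  # file is rewritten
            check_scenario(path, actual),               # now matches
        ]


print(demo())
```

The rewritten file then shows up as an ordinary diff in version control, which is where the eyeballing happens.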
There was also a template used to generate how-to markdown docs from the YAML.
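
As an illustration of that last point, a scenario that already holds a request and its expected response has most of what a how-to page needs. The sketch below uses `string.Template` from the Python standard library; the template text, the scenario keys (`title`, `request`, `expected`), and `render_howto` are all assumptions for the example, not the original template.

```python
import json
from string import Template

# Hypothetical Markdown template; $-placeholders are filled from the scenario.
DOC_TEMPLATE = Template(
    "# How to $title\n\n"
    "Send this request:\n\n"
    "    $request\n\n"
    "Expect this response:\n\n"
    "    $response\n"
)


def render_howto(scenario: dict) -> str:
    """Render one test scenario as a Markdown how-to page."""
    return DOC_TEMPLATE.substitute(
        title=scenario["title"],
        request=scenario["request"],
        response=json.dumps(scenario["expected"], sort_keys=True),
    )


doc = render_howto(
    {"title": "fetch a user", "request": "GET /users/1", "expected": {"id": 1}}
)
print(doc)
```

Since the docs are generated from the same data the tests assert on, they cannot drift from the tested behaviour without a test failing first.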
Test coverage was probably 100%, but I never measured it. Writing all new features with TDD/documentation-driven development probably guaranteed it.