Not just when using tools, also when using humans. The frame of reference of what is considered 'production code' differs immensely between organizations, teams and people. The code I get from LLMs is usually much better than what I get from my peers. Maybe not in one shot, but after some steering it gets there.
An LLM also isn't lazy. When generating test cases for relatively simple pieces of code, it usually tests pretty much every path and doesn't stop right at the 80% code coverage quality gate.
I can imagine if you're at the level of Linus or something, you might conclude differently, but most people aren't there at all.