Case in point: I recently asked an LLM to write a pile of code to compile historical baseball stats to test betting success against the results of my hand-written code that evolves genetic algorithms. I marveled for a little while at the unbelievable improvement in EV/ROI that this script was showing could have been achieved from certain small tweaks. I only noticed after pushing a total bet that the push registered on the output as a win - and only because I was carefully staying on top of it. A single stupid recursively operating >= instead of > had caused completely nonsensical results that looked plausible.
Imagine, like, trusting a 10k loc script to give you data for something you were going to build in the physical world, and hoping an LLM hadn't made a mistake like that.