This is not a problem in my unreliable calculator use-cases; are you disputing that or dropping the analogy?
Because I'd love to drop the analogy. You mention IDEs- I routinely use IntelliJ's tab completion, despite it being wrong >>5% of the time. I have to manually verify every suggestion. Sometimes I use it and then edit the final term of a nested object access. Sometimes I use the completion by mistake, clean up with backspace instead of undo, and wind up submitting a PR that adds an unused dependency. I consider it indispensable to my flow anyway. Maybe others turn this off?
You mention hospitals. Hospitals run loads of expensive tests every day with a greater than 5% false positive and false negative rate. Sometimes these results mean a benign patient undergoes invasive further testing. Sometimes a patient with cancer gets told they're fine and sent home. Hospitals continue to run these tests, presumably because having a 20x increase in specificity is helpful to doctors, even if it's unreliable. Or maybe they're just trying to get more money out of us?
Since we're talking LLMs again, it's worth noting that 95% is an underestimate of my hit rate. 4o writes code that works more reliably than my coworker does, and it writes more readable code 100% of the time. My coworker is net positive for the team. His 2% mistake rate is not enough to counter the advantage of having someone there to do the work.
An LLM with a 100% hit rate would be phenomenal. It would save my company my entire salary. A 99% one is way worse; they still have to pay me to use it. But I find a use for the 99% LLM more-or-less every day.