I currently work in the HR-tech space, so suppose someone has a not-too-crazy proposal: use an LLM to reword cover letters to reduce potential bias in hiring. The issue is that the LLM imparts its own spin on things, even when a human would say two inputs are functionally identical. As a very hypothetical example, suppose one candidate always writes out the Latin, Juris Doctor, instead of the acronym JD, and that nudges the model into saying "extremely qualified at" instead of "very qualified at".
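To make that concrete, here's a rough sketch of how you might probe for that kind of drift. Everything here is a placeholder: the rewrite prompt, the model name, and the qualifier list are assumptions, and it presumes the OpenAI Python SDK with an API key in the environment.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical rewrite prompt; a real one would need far more care.
REWRITE_PROMPT = (
    "Reword the following cover letter to remove details that could introduce "
    "bias, while preserving the candidate's stated qualifications:\n\n{letter}"
)

def rewrite(letter: str, model: str = "gpt-4o-mini") -> str:
    """Ask the model for a bias-neutral rewording of one cover letter."""
    resp = client.chat.completions.create(
        model=model,
        temperature=0,  # suppress sampling noise so differences trace back to the input
        messages=[{"role": "user", "content": REWRITE_PROMPT.format(letter=letter)}],
    )
    return resp.choices[0].message.content

# Two "functionally identical" letters: only the degree wording differs.
base = "I hold a {degree} and have five years of employment-law experience."
out_a = rewrite(base.format(degree="Juris Doctor"))
out_b = rewrite(base.format(degree="JD"))

# Crude drift check: qualifier words that show up in one rewrite but not the other.
qualifiers = {"extremely", "very", "highly", "exceptionally"}
drift = (qualifiers & set(out_a.lower().split())) ^ (qualifiers & set(out_b.lower().split()))
print("qualifier drift:", drift or "none detected")
```

Even a toy harness like this, run over a batch of paired variants, would at least tell you whether the model is handing out stronger adjectives based on surface wording alone.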
The issue of deliberate attempts to corrupt the LLM with prompt injection or poisoned training data is a whole 'nother can of minefield whack-a-moles. (OK, yeah, too far there.)