The accuracy of the results might depend on the language you’re reading, the LLM you’re using, the nature of the text, and the amount of context you give to the LLM. When I’ve tested the best models with more or less standard texts, such as excerpts from novels or news articles, in English, Japanese, and Russian, the results have been extremely good. The latest versions of ChatGPT, Claude, and Gemini are able to explain the meanings of words quite well, and they also get the grammar correct. (I say this as a long-time language teacher and lexicographer. I have written and edited many textbooks and dictionaries for learners of English and Japanese; LLMs come close to my ability and maybe exceed it sometimes.)
They are not always so good, however, with more granular aspects of language, particularly the way words are written or pronounced—the problem the models have with the word “strawberry” is well known. I’ve also seen them struggle with the meanings of words and sentences in isolation, as a lack of context can confuse them (as it can confuse people).
In the case of emails or transcripts, the text itself may contain mistakes or non-standard language that can trip up the LLMs as well.
In any case, at least for major languages and non-critical applications, I think LLMs are a great way to understand what is written in another language.