Hacker News is not some strange place where the normal rules of discourse don't apply. I assume you are familiar with the function of quotation marks.
> If we're both grading a single student (LLM) in same field (programming), and you find it great and I find it disappointing, it means one of us is scoring it wrong.
No, it means we're applying different criteria, and perhaps differ in our ability to evaluate the LLM. There are plenty of standard benchmarks that LLMs are pitted against, and we have seen continued improvement since their inception.
> It can't consistently write good doc comments. I does not understand the code nor it's purpose, but roughly guesses the shape.
Writing good documentation is certainly a challenging task. Experience has taught me where current LLMs typically do and don't succeed at writing tests and documentation. Generally, the more organized and straightforward the code, the better. The smaller each module is, the higher the likelihood of a good first pass. Deficiencies can then be fixed in a second, manual pass. Done right, this is generally faster than skipping LLMs entirely for typical workflows. Accuracy also goes down for more niche subject matter. All tools have limitations, and understanding them is crucial to using them effectively.
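To make that concrete, here's the kind of small, self-contained function (a hypothetical example I made up, not from any real codebase) where an LLM's first-pass docstring tends to be accurate and takes seconds to verify by hand:

```python
def normalize_scores(scores: list[float]) -> list[float]:
    """Scale a list of raw scores so they sum to 1.0.

    Args:
        scores: Non-negative raw scores. Must contain at least one
            positive value.

    Returns:
        A new list of the same length whose elements sum to 1.0.

    Raises:
        ValueError: If the scores sum to zero or less.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    return [s / total for s in scores]
```

The function's name, types, and error path say almost everything already, so there's little room for the model to "roughly guess the shape" and get it wrong. A sprawling 500-line module with implicit invariants is where the first pass falls apart.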
> It can't read and understand specifications, and even generate something as simple as useful API for it.
Actually, I do this all the time and it works great. Keep practicing!
In general, the stochastic parrot argument is oft-repeated but fails to recognize the general capabilities of machine learning. We're not talking about basic Markov chains here. On academic benchmarks, transformers have blown past all initial expectations, and they continue to improve incrementally. Getting caught up criticizing the crudeness of a new, revolutionary tool strikes me as the truly unimaginative position.