The freaking article omits several issues in the "compiler". My bet is that they didn't actually challenge the output of the LLM, as usually happens.
If you go to the repository, you'll find fun things, like the fact that it cannot compile a bunch of popular projects, and that it compiles others but the resulting code doesn't pass their test suites. It's a bit surprising, especially since they don't explain why those failures exist (are they missing support for some extensions? some feature they lack?).
It gets less surprising, though, when you see that the compiler doesn't actually do any type checking. It allows dereferencing non-pointers. It allows calling functions with the wrong number of arguments.
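To be concrete about what "no type checking" means here: both of the following are constraint violations that any conforming C compiler must diagnose, and a mainstream compiler rejects them outright. A quick sketch (the file names, and using `cc` to stand in for gcc/clang, are my own illustrative choices, not from the article):

```shell
# 1. Dereferencing a non-pointer: `*x` where x is a plain int.
cat > deref.c <<'EOF'
int main(void) {
    int x = 42;
    return *x;   /* constraint violation: operand of unary * is not a pointer */
}
EOF
cc -c deref.c 2>/dev/null && echo "deref: accepted" || echo "deref: rejected"

# 2. Calling a prototyped function with too few arguments.
cat > arity.c <<'EOF'
int add(int a, int b) { return a + b; }
int main(void) {
    return add(1);   /* constraint violation: too few arguments for prototype */
}
EOF
cc -c arity.c 2>/dev/null && echo "arity: accepted" || echo "arity: rejected"
```

With gcc or clang both print "rejected"; the claim in the repository is that the LLM-built compiler happily accepts code like this.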
There's also this fantastic part of the article where they explain that the LLM got the code to a point where any change or bug fix breaks a lot of the existing tests, and that further progress is not possible.
Then there's the fact that the article itself points out that the kernel doesn't actually link. So how did they "boot it"? It may well be that it crashed soon after boot and wasn't actually usable.
So, as usual, the problem here is that a lot of people look at LLM outputs and trust whatever they claim to have achieved.