The C compiler Anthropic got excited about was not a “working” compiler in the sense that you could replace GCC with it and compile the Linux kernel for all of the target platforms it supports. Their definition of, “works,” was that it passed some very basic tests.
Same with SQLite translation from C to Rust. Gaping, poorly specified English prose is insufficient. Even with a human in the loop iterating on it. The Rust version is orders of magnitude slower and uses tons more memory. It’s not a drop in Rust-native replacement for SQLite. It’s something else if you want to try that.
What mechanism in these systems is responsible for guessing the requirements and constraints missing in the prompts? If we improve that mechanism will we get it to generate a slightly more plausible C compiler or will it tell us that our specifications are insufficient and that we should learn more about compilers first?
I’m sure its possible that there are cases where these tools can be useful. I’m not sure this is it though. AGI is purely hypothetical. We don’t simulate a black hole inside a computer and expect gravity to come out of it. We don’t simulate the weather systems on Earth and expect hurricanes to manifest from the computer. Whatever bar the people selling AI system have for AGI is a moving goalpost, a gimmick, a dream of potential to keep us hooked on what they’re selling right now.
It’s unfortunate that the author nearly hits on why but just misses it. The quotes they chose to use nail it. The blog post they reference nearly gets it too. But they both end up giving AI too much credit.
Generating a whole React application is probably a breath of fresh air. I don’t doubt anyone would enjoy that and marvel at it. Writing React code is very tedious. There’s just no reason to believe that it is anything more than it is or that we will see anything more than incremental and small improvements from here. If we see any more at all. It’s possible we’re near the limits of what we can do with LLMs.