Starting back in 2022/2023:
- (~2022) It can auto-complete one line, but it can't write a full function.
- (~2023) Ok, it can write a full function, but it can't write a full feature.
- (~2024) Ok, it can write a full feature, but it can't write a simple application.
- (~2025) Ok, it can write a simple application, but it can't create a full application that is actually a valuable product.
- (~2025+) Ok, it can write a full application that is actually a valuable product, but it can't create a long-lived complex codebase for a product that is extensible and scalable over the long term.
It's pretty clear to me where this is going. The only question is how long it takes to get there.
I don't think it's a guarantee. All of the things on that list are greenfield; they just have increasing complexity. The problem is that even in agentic mode, these models do not (and, I would argue, cannot) understand code or how it works; they just see patterns and generate a plausible-sounding explanation or solution. Agentic mode means they can try/fail/try/fail/try/fail until something works, but without understanding the code, especially in a large, complex, long-lived codebase, they can unwittingly break something without realising it - just like an intern or newbie on the project, which is the most common analogy for LLMs, with good reason.
What if we get to the point where all software is basically created 'on the fly' as greenfield projects as needed? And you never need to have a complex, large, long-lived codebase?
It is probably incredibly wasteful, but ignoring that, could it work?
Sure, create a one-off app to post things to your Facebook page. But a one-off app for the OS it's running on? Freshly generating the code for your bank transaction rules? Generating an authorization service that gates access to your email?
The only reason it's quick to create green-field projects is because of all these complex, large, long-lived codebases that it's gluing together. There's ample training data out there for how to use the Firebase API, the Facebook API, OS calls, etc. Without those long-lived abstraction layers, you can't vibe out anything that matters.
But then maybe this changes what a "codebase" is. If a codebase is just a structured set of specs that compile to code, à la TypeScript -> JavaScript, sure - but then it's still a long-lived <blank>
But maybe you would have to elaborate on what "creating software on the fly" looks like, because I'm sure there's a definition where the answer is yes.
Case in point: Self driving cars.
Also, consider that we needed to pirate the whole internet to be able to do this, so these models are not creative. They are just directed blenders.
Of course, the tech will be useful and ethical if these problems are solved - or if we decide to solve them the right way.
This is clear from the fact that you can distill the logic ability from a 700b parameter model into a 14b model and maintain almost all of it.
You just lose knowledge, which can be provided externally, and which is the actual "pirated" part.
The logic is _learned_
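For what it's worth, the mechanism behind that claim is standard knowledge distillation: the small "student" model is trained to match the big "teacher" model's output distribution rather than its weights. Here's a minimal numpy sketch of the classic temperature-softened objective - function names and the example logits are mine, purely illustrative:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                     # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Training the student to minimise this transfers the teacher's
    *behaviour* (its output distribution) without copying its weights,
    which is why reasoning ability can survive a 700b -> 14b shrink
    while most memorised knowledge does not.
    """
    p = softmax(teacher_logits, T)   # teacher's softened distribution
    q = softmax(student_logits, T)   # student's softened distribution
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

# Identical logits -> zero loss; diverging logits -> positive loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # ~0.0
print(distillation_loss([0.1, 1.0, 2.0], [2.0, 1.0, 0.1]))  # > 0
```

The temperature T softens both distributions so the student also learns the teacher's "dark knowledge" - the relative probabilities it assigns to wrong answers.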
You'd need to prove that this assertion applies here. I understand that you can't deduce the future gains rate from the past, but you also can't state this as universal truth.
Knowledge engineering has a notion called "covered/invisible knowledge", which points to the small things we do unknowingly but that change the whole outcome. None of the models (nor AI in general) can capture this. We could say it's the essence of being human, or the tribal knowledge that makes experienced workers who they are, or makes mom's rice taste that good.
Considering these are highly individualized and unique behaviors, a model based on averaging everything can't capture this essence easily, if it ever can, without extensive fine-tuning for/with that particular person.
I've heard this same thing repeated dozens of times, and for different domains/industries.
It's really just a variation of the 80/20 rule.
We've been seeing the same progression with self-driving cars, and they have been stuck on the last 10% for the last 5 years.
As long as it can do the thing on a faster overall timeline and with less human attention than a human doing it fully manually, it's going to win. And it will only continue to get better.
And I don't know why people always jump to self-driving cars as the analogy as a negative. We already have self-driving cars. Try a Waymo if you're in a city that has them. Yes, there are still long-tail problems being solved there, and limitations. But they basically work and they're amazing. I feel similarly about agentic development, plus in most cases the failure modes of SWE agents don't involve sudden life and death, so they can be more readily worked around.
Say I ask an image model for 50 variations and pick the best one. Does it matter that 49 of them "failed"? It cost me fractions of a cent, so not really.
If every one of the 50 variants was drawn by a human and iterated over days, there would've been a major cost attached to every image and I most likely wouldn't have asked for 50 variations anyway.
It's the same with code. The agent can iterate over dozens of possible solutions in minutes or a few hours. Codex Web even has a 4x mode that gives you 4 alternate solutions to the same issue. Complete waste of time and money with humans, but with LLMs you can just do it.
And this isn't a pessimistic take! I love this period of time where the models themselves are unbelievably useful, and people are also focusing on the user experience of using those amazing models to do useful things. It's an exciting time!
But I'm still pretty skeptical of "these things are about to not require human operators in the loop at all!".
The question in my mind is where we are on the s-curve. Are we just now entering hyper-growth? Or are we starting to level out toward maturity?
It seems like it must still be hyper-growth, but it feels less that way to me than it did a year ago. I think in large part my sense is that there are two curves happening simultaneously, but at different rates. There is the growth in capabilities, and then there is the growth in adoption. It's the first curve that seems to me to have slowed a bit. Model improvements seem both amazing and also less revolutionary to me than they did a year or two ago.
But the other curve is adoption, and I think that one is way further from maturity. The providers are focusing more on the tooling now that the models are good enough. I'm seeing "normies" (that is, non-programmers) starting to realize the power of Claude Code in their own workflows. I think that's gonna be huge and is just getting started.
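Part of why "where are we on the s-curve" is so hard to answer from the inside: on a logistic curve, the growth rate you *feel* looks the same shortly before and shortly after the knee. A toy sketch (parameters are mine, just for illustration):

```python
import math

def logistic(t, k=1.0, t0=0.0):
    # Classic s-curve: slow start, steep middle ("hyper-growth"),
    # then leveling out toward a ceiling of 1.0 ("maturity").
    return 1.0 / (1.0 + math.exp(-k * (t - t0)))

# The growth *rate* peaks at the midpoint t0 and is symmetric around it,
# so a given slowdown is consistent with being either early or late.
def rate(t, dt=0.01):
    return logistic(t + dt) - logistic(t)

print(rate(-3.0) < rate(0.0) > rate(3.0))  # True: rate peaks mid-curve
```

None of which settles the question, of course - it just says that "it feels slower than last year" underdetermines which side of the knee we're on.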
The trend is definitely here, but even today, heavily depends on the feature.
While extremely useful, it still requires intense iteration and human insight for >90% of our backlog. We develop a cybersecurity product.
> The only question is how long it takes to get there.
This is the question, and I would temper expectations with the fact that we are likely to hit diminishing returns from real gains in intelligence as task difficulty increases. Real-world tasks probably fit into a complexity hierarchy similar to computational complexity. One of the reasons the AI predictions made in the 1950s for the 1960s did not come to be was that we assumed problem difficulty scaled linearly: double the computing speed, get twice as good at chess, or twice as good at planning an economy. The separation of P and NP upended those predictions. It is likely that current predictions will run into similar separations.
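The chess example can be made concrete. Brute-force game-tree search to depth d costs roughly b^d nodes for branching factor b, so a k-times speedup buys only log_b(k) extra plies - a quick sketch (the branching factor of ~35 for chess is a commonly cited approximation):

```python
import math

def extra_depth_from_speedup(branching_factor, speedup):
    # Searching to depth d costs ~ branching_factor**d nodes, so a
    # k-times speedup only buys log_base_b(k) additional plies.
    return math.log(speedup, branching_factor)

# Chess has a branching factor of roughly 35 moves per position.
print(extra_depth_from_speedup(35, 2))     # ~0.19 plies for 2x compute
print(extra_depth_from_speedup(35, 1000))  # ~1.9 plies for 1000x compute
```

Doubling compute gets you about a fifth of one extra move of lookahead, not twice the playing strength - which is the shape of the diminishing returns being described above.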
It is probably the case that if you made a human 10x as smart, they would only be 1.25x more productive at software engineering. The reason we have 10x engineers is less about raw intelligence - they are not 10x more intelligent - rather, they have more knowledge and wisdom.
By your logic, does it mean that engineers will never get replaced?
- (~2022) "It's so over for developers". 2022 ends with more professional developers than 2021.
- (~2023) "Ok, now it's really over for developers". 2023 ends with more professional developers than 2022.
- (~2024) "Ok, now it's really, really over for developers". 2024 ends with more professional developers than 2023.
- (~2025) "Ok, now it's really, really, absolutely over for developers". 2025 ends with more professional developers than 2024.
- (~2025+) etc.
Sources: https://www.jetbrains.com/lp/devecosystem-data-playground/#g...
I suspect that the timeline from autocomplete-one-line to autocomplete-one-app, which was basically a matter of scaling and RL, may in retrospect turn out to have been a lot faster than the next step, from LLM to AGI, where it becomes capable of using human-level judgement and reasoning, etc., to become a developer, not just a coding tool.
They're definitely better now, but it's not like ChatGPT 3.5 couldn't write a full simple todo list app in 2023. There were a billion blog posts talking about that and how it meant the death of the software industry.
Plus I'd actually argue more of the improvements have come from tooling around the models rather than what's in the models themselves.
There's a lot of non-ergonomic copying and pasting, but it's definitely using an LLM to build a full application.