Write-only code (opens in new tab)

(heavybit.com)

35 pointsPretzelFisch1mo ago54 comments

54 comments

> The role of the human engineer […] has been to reduce risk in the face of ambiguity, constraints, and change. That responsibility not only endures in a world of Write-Only Code, if anything it expands.

> The next generation of software engineering excellence will be defined not by how well we review the code we ship, but by how well we design systems that remain correct, resilient, and accountable even when no human ever reads the code that runs in production.

As a mechanical engineer, I have learned how to design systems that meet your needs. Many tools are used in this process that you cannot audit by yourself. The industry has evolved to the point that there are many checks at every level, backed by standards, governing bodies, third parties, and so on. Trust is a major ingredient, but it is institutionalized. Our entire profession relies on the laws of physics and mathematics. In other words, we have a deterministic system where every step is understood and cast into trust in one way or another. The journey began with the Industrial Revolution and is never-ending; we are always learning and improving.

Given what I have learned and read about LLM-based technology, I don't think it's fit for the purpose you describe as a future goal. Technology breakthroughs will be evaluated retrospectively, and we are in the very early stages right now. Let's evaluate again in 20 years, but I doubt that "write-only code" without human understanding is the way forward for our civilization.

jopsen1mo ago

Would I use a write-only HTML sanitizer for untrusted HTML: No!

Would I care to review CSS, if my site "looks" good? No!

The challenge becomes: how can we enforce invariants/abstractions etc without inspecting the code.

Type systems, model checking, static analysis. Could become new power tools.

But sound design probably still goes far.

skznnz1mo ago

> Could become new power tools.

If this worked, it’d have worked on low cost devs already. We’ve had the ability to produce large amounts of cheap code (more than any dev can review) for a long time.

The root issue is it’s much faster to do something yourself if you can’t trust the author to do it right. Especially since you can use an LLM to speed up your understanding.

1 more reply

jeffreygoesto1mo ago

I am old. This immediately triggered "Perl!?" in me...

Joke aside: Programming languages and compilers are still being optimized until the assembly and execution match certain expectations. So prompts and whatever inputs to AI also will be optimized until some expectations are met. This includes looking at their output, obviously. So I think this is an overblown extrapolation like many we see these days.

aidos1mo ago

Ha! I had someone doing a task the other day and the llm they used wrote a regex in Perl. I joked that in 25 years all the seems to have changed is the number of layers between me and Perl.

mooreds1mo ago

> I am old. This immediately triggered "Perl!?" in me...

Same same.

cozzyd1mo ago

I thought it would be about APL

wolfi11mo ago

me too

okanat1mo ago

I am not old but started programming early. It also triggered "Perl??" and "PHP??".

teyopi1mo ago

I’ve started doing a quick background check on authors before I dive into their content. This piece starts with the assumption that the writer is closely involved in engineering, but a little research reveals they don't actually work in active software development.

I’ll pass on this.

p.s. I’m happy to read authors with opposing views. Issue is with people who make claims, without having recent direct experience.

skznnz1mo ago

A better test is to see if the author stands to financially (or other ways) benefit from the posts future predictions coming true. They also fail this one.

teyopi1mo ago

That is also an influence, although that has a risk if one over-indexes on it and getting into tunnel-vision territory.

If the opposing view is indeed correct and I dismiss them just because they voted with their feet or money, that would be unfair and damage building a diverse view of the debate landscape and opinions.

medstrom1mo ago

But you can be a software dev even if you do not work in software dev. Plenty of those individuals in open source, for example.

teyopi1mo ago

writing software for a hobby is different from hundreds of thousand that do it 40+ hours per week, go into planning into retros into milestone review meetings etc.

I am painting in my free time as a hobby. I do not think I am an authority or should be taken seriously when taking about impact of AI on artists.

1 more reply

MrEldritch1mo ago

> Write-Only Code is not a prediction about what we should want. It is a description of what happens when software production scales beyond human attention.

Have we considered whether it's even a good idea to produce software at scales beyond human attention? I'm beginning to suspect that, in terms of the net amount of economic effort and sheer quantity of software produced, we are already creating simply too much software relative to the amount of economic effort we put into hardware, construction, and human capital. Most human needs and desires can only be met through manipulation of atoms, and it seems as though we've largely refocused on those which can be met through manipulation of numbers and symbols - not because anyone really wants their life to revolve around them to the exclusion of everything else - but because they're the easiest markets to profitably scale for the least amount of capital input.

maybewhenthesun1mo ago

I helped updating ancient fortran to slightly less ancient C once. The company that depended on the ancient fortran no longer had any fortran programmers.

The resulting software upgrade was a nightmare that nearly killed that company. I shudder if someone needs to fix 20 year old AI write only code and I feel for the poor AI that has to do it. Because an AI 'intelligent' enough to do that deserves holidays and labor rights.

geysersam1mo ago

I imagine, only slightly extrapolating current AI trends, that in 20 years most codebases can be easily modified by AI. I'd even say they are especially well suited to such tasks that typically don't require extremely abstract and complex logic, or imagination, but rather "just" a huge attention span and a lot of work.

marginalia_nu1mo ago

Tricky bit with ancient codebases is that their only requirements, generally speaking, tend to be that they should keep working exactly like they have been since 1983, except for the bit that needs changing of course, that needs to change in a way that implements the change, but doesn't have any unintended side-effects in a system that is a fractal of undocumented interdependencies (systems that have been patched for a few decades under those types of constraints tend to become especially gnarly that way).

demaga1mo ago

Do you mind sharing how exactly it turned out to be a disaster? What went wrong?

maybewhenthesun21d ago

Nothing went 'wrong' per se. they just massively underestimated how much work it would be.

Stuff was badly documented. The documentation that existed was outdated compared to the existing code. Not all executable matched the source code they were supposedly built from. It was unclear if this was because of old compiler bugs, new compiler bugs or (probably) because the source code in the backups was not the actual source code used to compile the exes and that sourcecode was long since lost to time. etc etc

I can't remember the details, it's 20 years ago.

1 more reply

omoikane1mo ago

I remember when everything was "machine learning" as opposed to the current LLM stuff. Some of the machine learning techniques involve training and using models that are more or less opaque, and nobody looked at what was inside those models because you can't understand them anyway.

Once LLM generated code becomes large enough that it's infeasible to review, it will feel just like those machine learning models. But this time around, instead of trying to convince other people who were downstream of the machine learning output, we are trying to convince ourselves that "yes we don't fully understand it, but don't worry it's statistically correct most of the time".

chrysoprace1mo ago

We tried adoption of Claude Code at my last job as our PM was quite excited about its potential and I saw some horrors that were raised during code review (where I thankfully read the code):

- Using private undocumented APIs subject to breaking changes;

- Removing a call to produce a message to an external topic, with no internal references besides tests, declaring it as "redundant" (it was very much not redundant);

- Repeated duplication of formatting, calculation, and permission logic, causing inconsistencies and bugs.

They met whatever goal the agent created for itself, but they would've broken several features in ways that would've been difficult to test for, either today or in the future.

williamstein1mo ago

He is trying to use a different phrase “write-only code” to define exactly the same thing Karpathy defined last year as “vibe coding”.

For what it is worth, in my experience one of the most important skills one should strive to get much better at to be good at using coding agents is reading and understanding code.

resonious1mo ago

I think there's some value in pure vibe coding. To your point though, the best way to extract that value is to know intimately at which point the agents tend to break down in quality, which you can only do if you read a lot of their output. But once you reach that level, you can gain a lot by "one-shotting" tasks that you know are within their capacity, and isolating the result from the rest of your project.

paulryanrogers1mo ago

How often do the AIs devolve at some task? Or does switching models make those assumptions inaccurate?

jopsen1mo ago

> good at using coding agents is reading and understanding code.

You can understand the code using an agent; it's much faster than reading the code.

I think the argument the author is making is that: given this magic oracle that make code, how we so contain and control it.

This is about abstractions and invariants and those will remain important.

YokoZar1mo ago

I thought "vibe coding" had come to mean "I used an LLM to generate this code", but didn't really imply we'd given up trying to review and read the output. The author is taking it one-step further by suggesting we not bother with the latter.

vunderba1mo ago

It's true that the meaning of "vibe coding" has been somewhat diluted - but the original definition as set forth by Karpathy was to forget that the code even exists (no review, no reading the commits, nothing).

https://xcancel.com/karpathy/status/1886192184808149383?lang...

FartyMcFarter1mo ago

A lot of people doing vibe coding can barely (or not at all) understand how to read code.

1 more reply

medstrom1mo ago

I, too, appreciate the clarification in the term "write-only code".

pron1mo ago

This future may come someday, but it's not here yet, not even with the post-late-'25 models. It assumes that agents can write reasonable code reliably, which they're currently far from doing (even 95% isn't enough, and I don't think we're at 95%).

Anthropic's C compiler experiment showed that even in a situation where people give the agent every imaginable advantage (above and beyond what they can reasonably do in most projects), i.e provide not only a very precise specificaton but also thousands of tests, a reference implementation to use as an oracle, and have the model trained on the reference implementaition - years of "preparation" effort - and all the agent has to do is just code, it still fails on a task that's certainly not trivial but also by no means monumental.

A lot of writing about agentic coding seems to assume that today's agent have coding down whereas the experience of anyone using them across different kinds of software work as well as tests by the labs themselves show that this is not yet true.

avidiax1mo ago

We already have effectively write-only code. It's all the assembly language and eventually machine code that we produce from other languages.

The problem with "write-only code" as it relates to LLMs is that we don't have a formal definition of the input the the LLM, nor do people typically save the requirements, both implicit and explicit, that were given the LLM to generate the code. English language will never be a formal definition, of course, but that obviously doesn't prevent the creation of a formal definition from English nor reduce the value of the informal description.

This is very similar to the problem of documentation in software development. It is difficult to enumerate all the requirements, remember all the edge cases, recall why a certain thing was done in a certain way. So computer programs are almost never well documented.

If you knew that you currently have a bunch of junior developers, and next year you will replace all of them with senior developers who could rewrite everything the junior developers did, but taking only a day, how would that affect your strategy documenting the development work and customer/technical requirements? Because that's what you have with current LLMs and coding agents. They are currently the worst that they'll ever be.

So there are two compelling strategies:

1) business as usual, i.e. not documenting things rigorously, and planning to hack in whatever new features or bugfixes you need until that becomes unsustainable and you reboot the project.

2) trying to use the LLM to produce as much and as thorough documentation and tests as possible, such that you have a basis to rewrite the project from scratch. This won't be a cheap operation at first, (you will usually do strategy #1), but eventually the LLMs and the tooling around managing them will improve such that a large rewrite or rearchitecture costs <$10k and a weekend of passive time.

seanmcdirmid1mo ago

Is that really true though? Are people really not saving design documentation to the code repository along with the code? And is it really too much to ask in a prompt to make the LLM document aggressively? Do the LLMs you use complain about being asked to write lots of comments in their code? Is it a token cost thing? It seems ridiculous not to just start with #2 but I might be spoiled by not knowing how much my token usage costs.

avidiax1mo ago

It requires real work to create good documentation and review it.

I've never worked in a place that requires that every commit update some documentation, but if you want to rebuild software based on the documentation, that's what it would take.

The best you could say is that the typical development process today tends to scatter documentation across commit descriptions, feature docs, design reviews, meeting notes, training materials, and source code comments.

To have a hands-off rewrite of a codebase with LLMS, you would need a level of documentation that allows a skilled human to do the rewrite without needing to talk to anyone else. I doubt that any project would have this unless it was required all along.

1 more reply

pconstantine29d ago

It will never work. Code is the documentation. The difference between Code -> IR -> binary is that the transition is deterministic. Doc -> Code is non-deterministic asf. And will never be. The only reason it works and we have senior, middle, junior engineers is the fact that assumptions on that path that senior engineers make are better than the ones junior engineers make. LLMs cannot make any assumptions though, as they have zero real world knowledge and thus zero common sense.

c-fe1mo ago

> In the Write-Only Code story, that same engineer becomes a systems designer, a constraint writer, and a trade-off manager.

This is also what I see my job to be shifting towards, increasingly fast in recent weeks. I wonder how long we will stay in this paradigm, I dont know.

philipwhiuk1mo ago

> “AI writes the code” is already true inside many enterprise teams

I'm highly doubtful this is true. Adoption isn't even close to the level necessary for this to be the case.

1 more reply

lowsong1mo ago

> LLMs are clearly a massive productivity boost for software developers, and the value of humans manually translating intent into lines of code is rapidly depreciating.

This take is so divorced from reality it's hard to take any of this seriously. The evidence continues to show that LLMs for coding only make you feel more productive, while destroying productivity and eroding your ability to learn.

logicprog1mo ago

Re productivity: the METR study is seriously flawed overall, and:

1. if you disaggregate the highly aggregated data, it shows that the slowdown was highly dependent on task type, and tasks that required using documentation or novel tasks were possibly sped up, whereas ones the developers were very experienced with were slowed down, which actually matched the developers' own reports

2. developers were asked to estimate time beforehand per-task, but estimate whether they were sped up or slowed down only once, afterwards, so you're not really measuring the same thing

3. There were no rules about which AI to use, how to use it, or how much to use it, so it's hard to draw a clear conclusion

4. Most participants didn't have much experience with the AI tools they used (just prompting chatbots), and the one that did had a big productivity boost

5. It isn't an RCT.

See [1] for all.

The Anthropic study was using a task far too short to really measure productivity (30 mins), and furthermore the AI users were using chatbots, and spent the vast majority of their time manually retyping AI outputs, and if you ignore that time, AI users were 25% faster[2], so the study was not a good study to judge productivity, and the way people quote it is deeply misleading.

Re learning: the Anthropic study shows that how you use AI massively changes whether you learn and how well you learn; some of the best scoring subjects in that study were ones who had the AI do the work for them, but then explain it afterward[3].

[1]: https://www.fightforthehuman.com/are-developers-slowed-down-... [2]: https://www.seangoedecke.com/how-does-ai-impact-skill-format... [3]: https://www.anthropic.com/research/AI-assistance-coding-skil...

pconstantine1mo ago

> We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet were also more likely to rate their insecure answers as secure compared to those in our control group.

https://arxiv.org/pdf/2211.03622

EOD

petetnt1mo ago

That’s the conclusion you get when you sit in the board of 20 companies where all the CEOs are telling you the same thing but you don’t understand that you are all just selling the same golden shovel to each other. Obviously this can also be backed by their own experiences too: 100% of code is written by AI, because last time they actually wrote code was in 2010.

botusaurus1mo ago

Your comment is so divorced from reality...

p0w3n3d1mo ago

This article assumes that AI bubble will never break, and the prices will never go up. While the major LLM providers are located in USA and China, but mostly USA I can imagine the service becoming a hostage for some political interests, and being blocked for a country or more... This also can become a weapon as creation of invisibly malicious code might be advised by LLMs or even executed by LLMs running loose... So I'd say, copying some other author which I read here, that LLM is an exoskeleton and we're not going anywhere, just being strengthened by it and sped up, and constantly held accountable for the code we together with LLM generate

ddoottddoott1mo ago

Write-only blog posts.

dana3211mo ago

Only read by ai crawlers to be reused in training data.

svilen_dobrev1mo ago

write-once read-never?

something that was not perl ;)

in ~2005 i lead a team to build horse-betting terminals for Singapore, and there server could only understand CORBA. So.. i modelled the needed protocol in python, which generated a set of specific python files - one per domain - which then generated the needed C folders-of-files. Like 500 lines of models -> 5000 lines 2nd level -> 50000 lines C at bottom. Never read that (once the pattern was established and working).

But - but - it was 1000% controllable and repeatable. Unlike current fancy "generators"..

aatd861mo ago

There will be more of it where it does not matter. Maybe eventually with times. At the moment, in my experience, most systems rely on hyperlinear semantics. Especially scalable ones. Current llms cannot physically handle this at the moment. Maybe with biological or quantum (sic) computing.

But even then it is quite impressive.

Concretely in my use case, off of a manual base of code, having claude has the planner and code writer and GPT as the reviewer works very well. GPT is somehow better at minutiae and thinking in depth. But claude is a bit smarter and somehow has better coding style.

Before 4.5, GPT was just miles ahead.

j / k navigate · click thread line to collapse

54 comments

smartmic1mo ago

jopsen1mo ago

Would I use a write-only HTML sanitizer for untrusted HTML: No!

Would I care to review CSS, if my site "looks" good? No!

The challenge becomes: how can we enforce invariants/abstractions etc without inspecting the code.

Type systems, model checking, static analysis. Could become new power tools.

But sound design probably still goes far.

skznnz1mo ago

> Could become new power tools.

If this worked, it’d have worked on low cost devs already. We’ve had the ability to produce large amounts of cheap code (more than any dev can review) for a long time.

The root issue is it’s much faster to do something yourself if you can’t trust the author to do it right. Especially since you can use an LLM to speed up your understanding.

1 more reply

jeffreygoesto1mo ago

I am old. This immediately triggered "Perl!?" in me...

aidos1mo ago

Ha! I had someone doing a task the other day and the llm they used wrote a regex in Perl. I joked that in 25 years all the seems to have changed is the number of layers between me and Perl.

mooreds1mo ago

> I am old. This immediately triggered "Perl!?" in me...

Same same.

cozzyd1mo ago

I thought it would be about APL

wolfi11mo ago

me too

okanat1mo ago

I am not old but started programming early. It also triggered "Perl??" and "PHP??".

teyopi1mo ago

I’ll pass on this.

p.s. I’m happy to read authors with opposing views. Issue is with people who make claims, without having recent direct experience.

skznnz1mo ago

A better test is to see if the author stands to financially (or other ways) benefit from the posts future predictions coming true. They also fail this one.

teyopi1mo ago

That is also an influence, although that has a risk if one over-indexes on it and getting into tunnel-vision territory.

medstrom1mo ago

But you can be a software dev even if you do not work in software dev. Plenty of those individuals in open source, for example.

teyopi1mo ago

writing software for a hobby is different from hundreds of thousand that do it 40+ hours per week, go into planning into retros into milestone review meetings etc.

I am painting in my free time as a hobby. I do not think I am an authority or should be taken seriously when taking about impact of AI on artists.

1 more reply

MrEldritch1mo ago

> Write-Only Code is not a prediction about what we should want. It is a description of what happens when software production scales beyond human attention.

maybewhenthesun1mo ago

I helped updating ancient fortran to slightly less ancient C once. The company that depended on the ancient fortran no longer had any fortran programmers.

geysersam1mo ago

marginalia_nu1mo ago

demaga1mo ago

Do you mind sharing how exactly it turned out to be a disaster? What went wrong?

maybewhenthesun21d ago

Nothing went 'wrong' per se. they just massively underestimated how much work it would be.

I can't remember the details, it's 20 years ago.

1 more reply

omoikane1mo ago

chrysoprace1mo ago

We tried adoption of Claude Code at my last job as our PM was quite excited about its potential and I saw some horrors that were raised during code review (where I thankfully read the code):

- Using private undocumented APIs subject to breaking changes;

- Removing a call to produce a message to an external topic, with no internal references besides tests, declaring it as "redundant" (it was very much not redundant);

- Repeated duplication of formatting, calculation, and permission logic, causing inconsistencies and bugs.

They met whatever goal the agent created for itself, but they would've broken several features in ways that would've been difficult to test for, either today or in the future.

williamstein1mo ago

He is trying to use a different phrase “write-only code” to define exactly the same thing Karpathy defined last year as “vibe coding”.

For what it is worth, in my experience one of the most important skills one should strive to get much better at to be good at using coding agents is reading and understanding code.

resonious1mo ago

paulryanrogers1mo ago

How often do the AIs devolve at some task? Or does switching models make those assumptions inaccurate?

jopsen1mo ago

> good at using coding agents is reading and understanding code.

You can understand the code using an agent; it's much faster than reading the code.

I think the argument the author is making is that: given this magic oracle that make code, how we so contain and control it.

This is about abstractions and invariants and those will remain important.

YokoZar1mo ago

vunderba1mo ago

https://xcancel.com/karpathy/status/1886192184808149383?lang...

FartyMcFarter1mo ago

A lot of people doing vibe coding can barely (or not at all) understand how to read code.

1 more reply

medstrom1mo ago

I, too, appreciate the clarification in the term "write-only code".

pron1mo ago

avidiax1mo ago

We already have effectively write-only code. It's all the assembly language and eventually machine code that we produce from other languages.

So there are two compelling strategies:

1) business as usual, i.e. not documenting things rigorously, and planning to hack in whatever new features or bugfixes you need until that becomes unsustainable and you reboot the project.

seanmcdirmid1mo ago

avidiax1mo ago

It requires real work to create good documentation and review it.

I've never worked in a place that requires that every commit update some documentation, but if you want to rebuild software based on the documentation, that's what it would take.

1 more reply

pconstantine29d ago

c-fe1mo ago

> In the Write-Only Code story, that same engineer becomes a systems designer, a constraint writer, and a trade-off manager.

This is also what I see my job to be shifting towards, increasingly fast in recent weeks. I wonder how long we will stay in this paradigm, I dont know.

philipwhiuk1mo ago

> “AI writes the code” is already true inside many enterprise teams

I'm highly doubtful this is true. Adoption isn't even close to the level necessary for this to be the case.

1 more reply

lowsong1mo ago

> LLMs are clearly a massive productivity boost for software developers, and the value of humans manually translating intent into lines of code is rapidly depreciating.

logicprog1mo ago

Re productivity: the METR study is seriously flawed overall, and:

2. developers were asked to estimate time beforehand per-task, but estimate whether they were sped up or slowed down only once, afterwards, so you're not really measuring the same thing

3. There were no rules about which AI to use, how to use it, or how much to use it, so it's hard to draw a clear conclusion

4. Most participants didn't have much experience with the AI tools they used (just prompting chatbots), and the one that did had a big productivity boost

5. It isn't an RCT.

See [1] for all.

pconstantine1mo ago

https://arxiv.org/pdf/2211.03622

EOD

petetnt1mo ago

botusaurus1mo ago

Your comment is so divorced from reality...

p0w3n3d1mo ago

ddoottddoott1mo ago

Write-only blog posts.

dana3211mo ago