I'm going back to writing code by hand

(blog.k10s.dev)

995 points dropbox_miner 15d ago 601 comments

pron14d ago

Yep. The only people I've heard saying that generated code is fine are those who don't read it.

The problem is that the mitigations offered in the article also don't work for long. When designing a system or a component we have ideas that form invariants. Sometimes the invariant is big, like a certain grand architecture, and sometimes it’s small, like the selection of a data structure. You can tell the agent what the constraints are with something like "Views do NOT access other views' state" as the post does.

Except, eventually, you'll want to add a feature that clashes with that invariant. At that point there are usually three choices:

- Don’t add the feature. The invariant is a useful simplifying principle and it’s more important than the feature; it will pay dividends in other ways.

- Add the feature inelegantly or inefficiently on top of the invariant. Hey, not every feature has to be elegant or efficient.

- Go back and change the invariant. You’ve just learnt something new that you hadn’t considered and that puts things in a new light, and it turns out there’s a better approach.

Often, only one of these is right. Often, at least one of these is very, very wrong, and with bad consequences.

Picking among them isn’t a matter of context. It’s a matter of judgment, and the models - not the harnesses - get this judgment wrong far too often. I would say no better than random chance.

Even if you have an architecture in mind, and even if the agent follows it, sooner or later it will need to be reconsidered. What I've seen is that if you define the architectural constraints, the agent writes complex, unmaintainable code that contorts itself to fit them when they need to change. If you don't read what the agent does very carefully - more carefully than human-written code, because the agent doesn't complain about contortious code - you will end up with the same "code that devours itself", only you won't know it until it's too late.

perarneng14d ago

If you know how to write good code you can force AI to write good code with various techniques. It's 100% doable. You just need to figure out the problems AI has and find solutions to make it easier for it. For example: use extremely small contexts. Modularize into modules with clear boundaries and only allow the AI to work within those boundaries. Make modules pure from IO so they are easily testable. Hide modules behind interfaces, etc. You can write 100 tests that execute within a second. You can write benchmarks, etc. AI needs boundaries and small contexts to work well. If you fail to give it that it will perform poorly. You are in charge.
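A minimal sketch of the "pure from IO, hidden behind an interface" idea, in Python. Every name here (`PriceSource`, `order_total_cents`, `FakePrices`) is hypothetical and chosen only for illustration; it is not taken from any project discussed in the thread.

```python
from typing import Protocol


class PriceSource(Protocol):
    """Boundary interface: the only way the core logic may obtain prices."""

    def price_cents(self, sku: str) -> int: ...


def order_total_cents(quantities: dict[str, int], prices: PriceSource) -> int:
    """Pure core logic: no network, disk, or clock access happens here."""
    return sum(prices.price_cents(sku) * qty for sku, qty in quantities.items())


class FakePrices:
    """In-memory stand-in used by tests instead of a real price service."""

    def __init__(self, table: dict[str, int]) -> None:
        self._table = table

    def price_cents(self, sku: str) -> int:
        return self._table[sku]


def test_order_total() -> None:
    # Runs without any IO, so hundreds of tests like this finish quickly.
    prices = FakePrices({"apple": 50, "pear": 75})
    assert order_total_cents({"apple": 2, "pear": 1}, prices) == 175
```

Because the core function only sees the interface, an agent asked to change it can be confined to that one module and checked by the fast tests.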

pron14d ago

That doesn't quite work, and precisely for the reason I mentioned: You can definitely tell the AI to follow some strategy, but at some point the strategy will need to change, and the AI won't tell you that (even if you tell it to). Unless you read the code every time you won't know if the AI is following the strategy and producing good results or following it and producing bad results because the strategy has to change. This can happen even in small changes: the AI will follow the strategy even if the change proves it's wrong, and if you don't pay close attention, these mistakes pile up.

So yes, you might get good results in one round, but not over time. What does work is to carefully review the AI's output, although the review needs to be more careful than review of human-written code because the agents are very good at hiding the time bombs they leave behind.

lukan14d ago

How do you define "bad code"?

If I instruct the AI to make small modules where I can verify they work, have tests and no side effects - then it is good enough code for me. It works, is readable and can be extended - and will turn into bad code if this is not done with care.

IdiotSavage14d ago

So, basically you need to micro-manage it. Where are your 10x gains now? And is it fun to work like that?

sirwhinesalot14d ago

This is actually what I do. I'm extremely picky about the code and force the LLM to rewrite it 1000 times until it is basically exactly what I want. You might be wondering what the point is when it would be faster for me to just write the code myself.

I have ADHD and for whatever reason telling the LLM what to do instead of doing it myself bypasses the task avoidance patterns and/or focus problems I tend to suffer from. I do not find it fun, but I am thankful for it.

readitalready14d ago

I don't micromanage it. I let my project's custom linter micromanage it.

Every project should have a custom linter for their tech stack. It would check for not just syntax errors, but architectural choices as well as taste guidelines.

Whenever the LLM writes bad code, I add it to my linter to check against in the future.
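As a rough illustration of an architectural lint rule, the sketch below uses Python's standard `ast` module to flag one view module importing another, as a proxy for the "Views do NOT access other views' state" constraint quoted earlier in the thread. The `views.<name>` package layout and the function name `cross_view_imports` are assumptions made for this example, and relative imports are deliberately ignored to keep it short.

```python
import ast


def cross_view_imports(source: str, own_view: str) -> list[str]:
    """Return imported `views.*` modules that are not under `views.<own_view>`."""
    offenders: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == "views":
                # `from views import profile` names the sibling view directly.
                names = [f"views.{alias.name}" for alias in node.names]
            else:
                names = [node.module]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if parts[0] == "views" and len(parts) > 1 and parts[1] != own_view:
                offenders.append(name)
    return offenders


sample = (
    "import json\n"
    "import views.cart.widgets\n"
    "from views.profile import state\n"
    "from views import cart, search\n"
)
print(cross_view_imports(sample, own_view="cart"))
```

A check like this only enforces the rule as written; it cannot tell whether the rule itself has stopped being the right one.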

andriy_koval14d ago

> So, basically you need to micro-manage it. Where are your 10x gains now? And is it fun to work like that?

It depends on the language and infra, but some (many) require lots of boilerplate and memorizing thousands of APIs. Automating this is an easy 10x gain from an LLM.

I for example write SQL myself, because boilerplate is super-minimal, and core SQL is very minimal itself, there are like 20 constructs to memorize.

hansmayer14d ago

Amen. Instead of freeing you up - AI enslaves you - and if only it were at least enslavement to a superior being!

nijave14d ago

Honestly, I think so. I do a mix of infrastructure and programming so don't tend to have any frameworks memorized. Using AI is much quicker than constantly referencing the docs.

I can also switch between codebases with different frameworks and languages and make changes without spending all day reading docs.

It's also pretty good at tracing code, and it's fairly straightforward to verify the results manually. It can build a flow diagram in 10-30 minutes (depending on what tool calls need to be allowed and how many prompts it needs) versus me taking a couple hours to do the same.

forgotaccount314d ago

> you need to micro-manage it.

It is significantly easier to micro-manage an AI than a suite of junior developers. The AI doesn't replace a principal engineer, it's replacing junior and weaker senior developers who need stories broken down extremely concisely to be able to get anything done. In the time it takes to break down a story such that a junior through weak senior developer can pick it up and execute it well, the AI would already be done, with testing built around it.

wombat-man14d ago

Yeah I agree. It's improved quite a bit just in the past few months. The code should always be reviewed, and you need to spend some time tuning your skills and agent configs. If you're still getting bad code out of your LLM tooling, you might not be using or configuring it correctly.

hansmayer14d ago

> You are in charge.

No, if you have to do all of the stuff you have listed to kind-of-make-it-work...You are not in charge.

insane_dreamer14d ago

> You are in charge.

Sure. That's how I work with AI, and the way I believe that AI is meant to be used -- as a companion tool.

But it's a lot of work. It saves me time for certain tasks, but not others. I haven't measured my productivity gains, but they're at most 2x.

But that's not "vibe coding" (which was the point of the article) or the (false) promise of "10x productivity" and "code that writes itself" that companies are being told is going to reduce their engineering headcount tenfold.

Zach_the_Lizard14d ago

I agree with this. I've been writing a new internal framework at work and migrating consumers of the old framework to the new one.

I had strong principles at the outset of the project and migrated a few consumers by hand, which gave me confidence that it would work. The overall migration is large and expensive enough that it has been deferred for nearly a decade. Bringing down the cost of that migration made me turn to AI to accelerate it.

I found that it was OK at the more mechanical and straightforward cases, which are 80% of the use cases, to be fair. The remaining 20% need changes to the framework. Most of them need very small changes, such as an extra field in an API, but one or two require a partial conceptual redesign.

To oversimplify the problem, the backend for one system can generate certain data in 99% of cases. In a few critical cases, it logically cannot, and that data must be reported to it. Some important optimizations were made with the assumption that this would be impossible.

The AI tooling didn't (yet) detect this scenario and happily added migration logic assuming it would work properly.

Now, because of how this is being rolled out, this wasn't a production bug or anything (yet). However, asking the right questions to partner teams revealed it and unearthed that some others were going to need it as well.

Ultimately, it isn't a big problem to solve in a way that will mostly satisfy everyone, but it would have been a big problem without a human deeper in the weeds.

Over time, this may change. Validation tooling I built may make a future migration of this kind easier to vibe code even if AI functionality doesn't continue to improve. Smarter models with more context will eventually learn these problems in more and more cases.

The code it generates still oscillates between beautiful and broken (or both!) so for now my artistic sensibilities make me keep a close eye on it. I think of the depressed robot from the Hitchhiker's Guide to the Galaxy as the intelligence behind it. Maybe one day it'll be trustworthy.

benguild14d ago

“The only people I've heard saying that generated code is fine are those who don't read it.” Are you sure these people aren’t busy working rather than chatting? (haha)

But in all seriousness it depends on what you’re doing with it. Writing a quick tool using an LLM is much easier than context switching to write it yourself. If you need the tool, that’s very valuable.

sevenzero14d ago

Also as a webdev, it writes basic CRUD pretty well. I am tired of having to build forms myself and the LLMs are usually really good at that.

Been building a new app with lots of policies and whatnot and instructing an LLM is just much faster than doing the same repetitive shit over and over myself.

spockz14d ago

If you were tired of writing forms yourself, have you looked at https://jsonforms.io/? Just specify the data you need, or extract it from the API spec, and go. Display the form uniformly every time across your site. No need to burn AI time.

pron14d ago

Sure. I'm talking about production software that needs to survive and evolve for a long while.

pydry14d ago

This is the core unspoken bone of contention in most AI arguments, I think: most people either aren't writing code with strict quality requirements or don't realize where their use of AI is violating them.

That said most of the world's most useful code has strict quality requirements. Even before AI 90% of SLOC would be tossed away without much if any use, 9% was used infrequently while 1% runs half the world's software.

mountainriver14d ago

Can you not review it?

agentultra14d ago

The invariant, stated informally, would be hard to prove is broken by a human reviewer in the loop. Spoken language isn’t precise enough for the task.

Even if you could state it in a precise formal language the LLM under the agent doesn’t have the capability to understand what the invariant is for and why it’s important. You’ll still get oddly generated code. You might get an LLM that can associate certain tokens with those in the formal language specification which can hold invariants and perhaps even write the proofs… but you’ll still get a whole bunch of other code generated from the informal parts of the prompt.

I agree that simply adding constraints and prompts to your skills and specs isn’t going to prevent these things. Worse, even if you could invent a better mouse trap, the creature will still escape.

The problem is… “elongation:” the addition of code for the sake of the prompt/task/etc. Often less is better. This takes a human with the ability to anticipate what other humans would want/expect. When you need a generator, they’re great, but it’s a firehose whose use should be restrained a little more.

pron14d ago

> The invariant, stated informally, would be hard to prove is broken by a human reviewer in the loop. Spoken language isn’t precise enough for the task.

That depends on the invariant. Some are behavioural, like "variable x must be even if y is positive", but some are architectural, such as "a new view requires a new class".

But that's only one side of the problem because maintaining the invariant can be just as bad as breaking it. You ask the agent to add a feature and it may well maintain the invariant - only it shouldn't have, because the feature uncovers the fact that the invariant is architecturally wrong.

The problem is that evolving software requires exercising judgment about when you need to follow the existing strategy and when you need to rethink it. If there is any mechanical rule that could state what the right judgment is, I don't know what it is.
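The behavioural example above ("variable x must be even if y is positive") is the kind of invariant that can at least be written as an executable check. A small Python sketch, with a hypothetical function name, might look like this; the architectural kind, and the judgment about when an invariant should change, has no equivalent one-liner.

```python
def invariant_holds(x: int, y: int) -> bool:
    """Behavioural invariant from the comment: if y is positive, x must be even."""
    return y <= 0 or x % 2 == 0


def checked_update(x: int, y: int) -> tuple[int, int]:
    """Reject any state that violates the invariant (illustrative use only)."""
    if not invariant_holds(x, y):
        raise ValueError(f"invariant violated: x={x} must be even when y={y} > 0")
    return x, y
```

The check says nothing about whether a feature that needs an odd `x` with a positive `y` should be rejected, worked around, or treated as a sign that the invariant is wrong.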

21asdffdsa1214d ago

And the solution is the same as when it was outsourced, when the "patch" was to fix it by writing a spec. Thus I conclude my TED talk with the statement: LLMs are the new outsourcing and run into the same problems.

pron14d ago

Not quite, because the architecture often needs to evolve when you learn more as the project evolves. People will complain when they feel the constraints drive them to unnatural workarounds; the agents don't.

You can try telling the agent to stop and ask when a constraint proves problematic, except it doesn't have as good a judgment as humans to know when that's the case. I often find myself saying, "why did you write that insane code instead of raising the alarm about a problem?" and the answer is always, "you're absolutely right; I continued when I should have stopped." Of course, you can only tell when that happens if you carefully review the code.

multjoy14d ago

It has no judgement at all.

senordevnyc14d ago

So I run a solo SaaS that supports my family, and so the stakes feel very high for me. I use AI heavily, and I’ve seen the exact problem you’re describing. I feel like I’m often really riding the edge in terms of trying to use AI to accelerate product development while not letting tech debt accumulate too fast, or letting my mental model of the codebase slip too much.

Here’s what’s working for me right now:

1. The basics: use the best model available, have skills and rules that specify project guidelines, etc.

2. Always use plan mode. It works much better to iterate on the concept of what we’re going to do, then do the implementation. The models will adhere to the plan at very high rates in my experience.

3. Don’t give chunks of work that are too large in scope. This is just art, and I’m constantly experimenting with how ambitious I can be.

4. I review all code to some extent, but I have a strong mental model of what areas of the app are more critical, where hidden bugs might accumulate, etc, and I review both tests and impl more strenuously in those areas. Whereas like a widget for my admin panel probably gets a 2 second glance.

5. Have the discipline to go through periodically and clean up tech debt, refactor things that you’d do differently now, etc. I find the AI a huge help here, because I can clean up cruft in an hour that would have once taken me days, and thus probably wouldn’t have gotten done.

6. I’m experimenting with shifting my architecture to make it easier to review AI code, make it less likely it’ll make mistakes, etc. Honestly mostly things I should have always been doing, but the level of formalism and abstraction on my solo projects is usually different than on a bigger team.

To each their own, but I’ve grown this from nothing to about $350k in ARR over the last ten months, and I’m very confident I never could have built this product without AI help in triple that time.

marcosdumay14d ago

It's approximately the same problems, but stretched to an insane extent that you can never expect before it arrives.

i_love_retros14d ago

Don't outsource either then

21asdffdsa1214d ago

How about we outsource it to Pakistan and they use LLMs? That way, we do what the LLM people do: many agents, stacked on top.

daishi5514d ago

The generated code is more than fine, it’s good in many cases. And I read it :)

Indeed, for the task of “jump into an unfamiliar codebase and make a requested change that aligns with existing styles and patterns, and uses existing functionality” I would say something like Opus 4.7 exceeds the capabilities of most developers.

pron14d ago

I agree with both statements, but that doesn't change the problem I stated. If an agent produces reasonable code 80-90% of the time, and 10-20% of the time it makes mistakes that could render the codebase irretrievably unevolvable once they accumulate, the only thing you can do is to carefully review the agent's output 100% of the time. That it gets things right 80% of the time as opposed to 40% of the time doesn't change this calculus one iota.

But agents generate code much faster, and to not slow them down, some people want to skip the only thing that can currently ensure you get good results, which is to carefully review the output. Once that happens, there is simply no way for them to know how good or bad what they're getting is.
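The arithmetic behind that point can be sketched in a few lines of Python. This assumes, purely for illustration, that each change independently contains a serious mistake with probability `p`; real mistakes are unlikely to be independent, so treat the numbers as a rough intuition rather than a measurement.

```python
def p_at_least_one_mistake(p_mistake: float, n_changes: int) -> float:
    """Chance that n independent changes include at least one serious mistake."""
    return 1.0 - (1.0 - p_mistake) ** n_changes


# Compare a 10% and a 20% per-change mistake rate over growing change counts.
for p in (0.10, 0.20):
    for n in (5, 20, 50):
        print(f"p={p:.2f} n={n:>2} -> {p_at_least_one_mistake(p, n):.3f}")
```

Under that assumption, even the lower rate makes an unreviewed mistake close to certain after a few dozen changes, which is why a better hit rate alone does not remove the need to review everything.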

stingraycharles14d ago

> Picking among them isn’t a matter of context. It’s a matter of judgment, and the models - not the harnesses - get this judgment wrong far too often. I would say no better than random chance.

Yeah, I’ve been working for several months already on a harness that wraps Claude Code and Codex etc. to ensure that these types of invariants are captured and enforced (after the first few harness attempts failed), and - while it’s possible - it slows down the workflow significantly and burns a lot more tokens. In addition to requiring more human involvement, of course.

I suspect this is the right direction, though, as the alternatives inevitably lead any software project to devolve into a spaghetti-mess maintenance nightmare.

pron14d ago

It's not enough to enforce the invariants because they may need to change. You need to follow the invariants when they're right, and go back and reconsider them when they prove unhelpful. Knowing which is the case requires judgment that today's models are simply incapable of (not consistently, at least).

zephen14d ago

> What I've seen is that if you define the architectural constraints, the agent writes complex, unmaintainable code...

To be fair, there are many people like this as well. One of my personal favorite examples was way back in the 80s when I inherited the code for a protocol converter that let ASCII terminals communicate with IBM mainframes via the 3270 protocol.

One of the pieces of code in there, for managing indicator lights, was simply wrong. It was ca. 150 lines of Z80 assembly language that was trying to faithfully follow the copious IBM documentation of how things worked, but it had subtle issues and didn't always work.

My approach was to accept the documentation as accurate (the IBM documentation was always verbose and almost never wrong), but to reason that the original 3270 had these functions implemented in TTL logic gates, and there was no way in heck that they were wasting enough gates on indicator lights to require the logical equivalent of 150 instructions.

So in my mind, it had to be a really simple circuit that had emergent properties that required the reams of documentation. With that mindset, I was able to craft correct code for this in 12 instructions.

Many systems are likewise fractal in nature. You want to figure out the generating equations, rather than all the rules that derive from those. And, in many cases, writing down the generating equations is at least as easy to do in code as it would be to do in English for someone or something else to implement.

bicepjai14d ago

This is the rule I have settled on and I can feel why. Writing the first buggy working version with agents is always fun. Then making the software reliable with the agents, the way you want it, is very painful.

leonaves14d ago

What's the difference between asking an AI to write you a module you never read and installing a 3rd-party module without auditing all its source code?

Xirdus14d ago

If the 3rd party module is popular, its badness will affect other people too and either the module will get improved or well known workarounds/"best practices" will develop. With AI-generated code, more often than not you're the sole user.

skydhash14d ago

Trust and reputation.

I would use Stripe, curl, and ffmpeg without audits, because I trust them to provide good code and to respect their API. I wouldn’t trust AI to write a Fibonacci series implementation.

The AI has no reputation to wager for my trust.

frikk14d ago

Stars on GitHub? I've wondered the same thing.

__alexs14d ago

I read all the code I generate with Cursor and some of it smells a bit weird but is easily fixable and most of it is as good as what I would write or better.

WalterBright14d ago

My own code is contortious. I refactor it regularly to reduce that, but it still can be better.

indoordin0saur14d ago

Write your code by hand, but AI still serves as something of a stack overflow and code completion tool. Also good for writing tedious things like regex or little one-off utility scripts as well as a first crack at unit tests. Using it to actually write big blocks of important code is a no-no in my opinion as it produces what I would characterize as slop, even if it technically works.

jstummbillig14d ago

> The only people I've heard saying that generated code is fine are those who don't read it.

Well, that is problematic. I have to either assume you are disinterested or lying and neither is great for any discourse.

nathanielks14d ago

Yeah, their statement just isn't true. With enough instruction, I've been able to get great output from models. I think that's the key: with detailed, pointed instructions, the output will match.

rimliu14d ago

How do you know it matches? You did read it then?

linuxftw14d ago

Try plan mode. The problems you're speaking about are already solved.

pron14d ago

They are nowhere near solved. Agents make serious mistakes in judgment and do it frequently enough to threaten the viability of the codebase unless you slow down and monitor them very, very closely. If you do that, it's all good. If you're not, your codebase is rotting at a superhuman speed underneath you and you have no idea until it collapses.

hatefulmoron14d ago

Plan mode improves results, but it doesn't solve the underlying problems. Pretty often Claude Opus 4.7 on xhigh will formulate a reasonable enough plan, churn for a while, then come back with a summary that it didn't stick to the plan because it wasn't accurate.

Worse, the disclaimer is buried under a bunch of "did X, did Y on line Z of file a/b/c", as if it's just a minor inconvenience. To the extent the plan was inaccurate, you're left in an undefined state where you might as well undo what it just did.

tcgv14d ago

> "Yep. The only people I've heard saying that generated code is fine are those who don't read it."

I review every line of code I generate with AI. I mainly use an MR-based approach:

1) Provide a tightly scoped technical spec to Codex as a task, and ask for 3x solutions. Usually at least one of them is on the right track, and it is better to ditch a solution that went in the wrong direction than to try to fix it.

2) Review the explanation and diff of the proposed changes line by line, file by file. If I find minor deviations from what I asked, or violations of the codebase architecture/conventions, I write comments in the diff and/or global comments, and ask again for 3x adjusted solutions.

3) Usually, by this point, the solution is ready for me to merge locally and either run local tests or do some manual fine-tuning.

4) Finally, I generate unit tests. I leave them to this stage because I can repeat the same process with the sole intent of generating case-specific unit tests. This way, I can generate/review tests against the final version of the implementation.

This has been working very well for me since our repos are reasonably organized and have a well-defined architecture. In the technical spec, I include the major architectural requirements and code conventions, and I also add a catch-all like "follow the codebase's existing conventions and style", which works reasonably well.

This simple process has enabled me to deliver most minor/medium tasks and bug fixes really quickly while maintaining control over the changes and without lowering the quality bar. For larger and more challenging tasks, I find myself "driving the wheel" (i.e. coding by hand) more often, and using AI code generation in a much more scoped and specific way. So that becomes a different process altogether.

baddash15d ago

I've set a few rules for working with coding agents:

1. If I use a coding agent to generate code, it should be something I am absolutely confident I can code correctly myself given the time (gun to my head test).

2. If it isn't, I can't move on until I completely understand what it is that has been generated, such that I would be able to recreate it myself.

3. I can create debt (I believe this is being called Cognitive Debt) by breaking rule 2, but it must be paid in full for me to declare a project complete.

Accumulating debt increases the chances that code I generate afterwards is of lower quality, and it also feels like the debt is compounding.

I'm also not really sure how these rules scale to serious projects. So far I've only been applying these to my personal projects. It's been a real joy to use agents this way though. I've been learning a lot, and I end up with a codebase that I understand to a comfortable level.

jimsojim15d ago

While this is a legitimate set of rules to follow for maintaining code sanity and a solid mental model of how a codebase may grow, it’s always challenging to stick to them in a workplace where expectations around delivery speed have changed drastically with the onset of AI. The sweet spot lies in striking a balance between staying connected to the codebase and not becoming a limiting factor for the team at the same time.

baddash15d ago

That's kind of what I figured, sadly. I haven't experienced it personally yet since I got let go from my last job about 14 months ago, but it makes so much sense given how management is so willing to sacrifice quality for speed.

brabel15d ago

I was trying to follow similar rules, until one day I had to solve a hard mathematical problem. Claude is a phd level mathematician, I am not. I, however, know exactly the properties of the desired solution and how to test it’s correct. So I decided to keep Claude’s solution over my basic, naive one. I mentioned that in the pull request and everyone agreed that was the right call. Would you make exceptions like that in your rules? What if AI becomes so much better at coding than you, not just at doing advanced mathematics? Would you then stop writing code by hand completely since that would be the less optimal option, despite you losing your ability to judge the code directly at that point (and as in my example, you can still judge tests, hopefully)? I think these are the more interesting questions right now.

Jweb_Guru15d ago

> Claude is a phd level mathematician

Unfortunately, it is not, and many of its attempts at mathematical proofs have major flaws. You shouldn't trust its proofs unless you are already able to evaluate them--which I think is pretty much all the OP is saying.

dathanb8215d ago

I’ve also heard it being called “comprehension debt,” which I like a little more because I think it’s more precise: the specific debt being accrued is exactly a lack of comprehension of the code.

baddash15d ago

Yeah I like that better too, gonna start using that

TranquilMarmot15d ago

This is great until the "gun to your head" is your skip-level manager demanding that a feature be implemented by the end of the week, and they know you can just "generate it with AI" so that timeline is actually realistic now whereas two years ago it would have required careful planning, testing, and execution.

nertirs315d ago

I hate this current trend of managers deciding what tools developers have to use. Hopefully it ends soon.

whitefang15d ago

I agree with this, though it also depends on the nature of the project.

Had a project idea which I coded with the help of AI and it became quite large, to the point where I was starting to have uncharted areas in the code. Mostly because I reviewed it too shallowly or moved too fast.

It was a good thing as that project never floated but if I were to do such a thing on my breadwinning project I would lose the joy.

gritzko15d ago

I just had a Claude episode. Instead of trying to fix the bug, it edited the data to hide the bug in the sample run. This kind of BS behavior is not rare. Absolutely, if you do not understand every bit of what's going on, you end up with a pile of BS.

bmitc15d ago

This is about how I use it. I initially use it to carve out an architecture and iterate through various options. That saves a lot of time for me having to iterate through different language features and approaches. Once I get that, I have it scaffold out, and I go in and tidy things up to my personal liking and standards. From there, I start iterating through implementations. I generally have been implementing stuff myself, but I've gotten better at scaffolding out functions/methods through code instead of text. Then I ask it to finish things off. That falls into your first category of letting it implement stuff that I already know I could do. Not sure if it's faster. But it's lower cognitive load for me, since I can start thinking about the next steps without being concerned about straightforward code.

This all works pretty great. Where it starts going off the rails is if I let it use a library I'm not >=90% comfortable with. That's a good use of these tools, but if I let it plow through feature requests, I end up accumulating debt, as you pointed out.

For my uses, I'm still finding the right balance. I'm not terribly sure it makes me faster. What I do think it helps with is longer focused sections because my cognitive load is being reduced. So I can get more done but not necessarily faster in the traditional sense. It's more that I can keep up momentum easier, which does deliver more over time.

I'm interested in multi agent systems, but I'm still not sure of the right orchestration pattern. These AI tools still can go off the rails real quick.

djeastm14d ago

When it was Copilot tab-completing lines, people would say, "yeah, but you still have to make sure you're the one writing the whole functions".

Then when it was completing functions, people would say, "yeah, but you still have to make sure you're the one writing the logic around the functions"

Then when it was completing the logic around the functions, people would say, "yeah, but you still have to make sure you're the one writing the features"

Now it's completing features and people say, "yeah, but you still have to make sure you're the one writing the architecture"

I don't know if architecture is a solvable problem for these models, but it is interesting watching the expectations moving over time.

raincole14d ago

The "people" in your hypothetical story have been wrong the whole time. The correct attitude is:

When AI can complete lines, you still have to read and understand the code.

When AI can complete whole functions, you still have to read and understand the code.

When AI can complete features and tickets, you still have to read and understand the code.

brightball14d ago

I heard a talk from a VP at NVIDIA a couple of months ago and he echoed this. Essentially their policy is "you are still fully responsible for the code you ship, whether AI helps with it or not"

ventana14d ago

> you still have to read and understand the code

Which is a very similar approach to any serious code. If you just hired a very clever, enormously knowledgeable intern, and they wrote a bunch of code for you overnight, you would probably review it.

Yes, in some cases, either hobby projects or throwaway code, you could just take it and use it as is, and I surely do, for the code no one cares about. But at work, I would rather review it.

globnomulous14d ago

Precisely. And this is why all the MCP servers that people at my company are writing aren't worth using: their apparent goal is to automate as much as possible. They're encouraging people not to pay attention. This results in bad code, bad tests, and bugs.

jstummbillig14d ago

Not at all. Code is not important, intent is. The leader of a product/company does not have to read code. It doesn't matter if it is generated by humans or non-humans. It simply needs to be correct enough to be usable and then steerable towards better outcomes. Understanding of code never existed from the business perspective.

throwaway17373814d ago

It does in safety critical industries. You can get grilled by regulators about your source code. And lawyers will use it as evidence in court.

roncesvalles14d ago

The code codifies the intent and is the long-term source of truth for how your business actually operates.

>The leader of a product/company does not have to read code.

That's because he's paid a bunch of people 300k to read it and make sure it aligns with the company's objectives and interests. Part of the reason why devs are paid so much is because they're literal business administrators for some narrow slice of the company's operations. The devs are the leaders that you're referring to.

Even in multi-hundred-billion-dollar companies there are so many mission critical things that are owned by just 2 SWEs.

raincole14d ago

> The leader of a product/company does not have to read code.

Yeah, because they believe (sometimes wrongly) their subordinates read it.

> Understanding of code never existed from the business perspective.

It does, it's called organizational wisdom and domain knowledge, because you need those witty names to sell books to aspiring managers.

contagiousflow14d ago

Can you think of a good way to encode intent into a system?

herdcall14d ago

I'm no longer sure you have to, actually. I mean, we do trust the assembly that compilers produce without having to read it, don't we? We're rapidly getting to that stage with LLMs, IMO.

bigfishrunning14d ago

The assembly is a deterministic transform of the input logic, and if it doesn't match then it's a bug in the compiler. If an LLM-based code generator doesn't match what you asked for, that's OK, just pull the slot-machine handle again. That's the difference.

gregsadetsky14d ago

I know it’s tiring to talk about “hallucination”, but truly, models still do hallucinate

They constantly say they did a thing they didn’t, say they know how to solve something when they don’t, etc. Regardless of guard rails or tests - AI forces a constant vigilance of a new kind.

Not just “what might have gone wrong” but also “what do I think is working but isn’t actually”.

And we’re not even talking about how it chooses substandard solutions, is happy to muddy code/architectures, add spaghetti on top of spaghetti etc.

Agentic coding often feels like an army of inexperienced developers who are also incredibly eager to please.

amw-zero14d ago

This is a really, really, really bad comparison. I used to say the same thing. But the semantic distance between compiling a for loop to equivalent assembly instructions is much smaller than the distance between "I'd like a web application that can store and retrieve todo items." The space of the latter is practically infinite in what can be "compiled."

SpaceNoodled14d ago

I've actually taken to double-checking the assembly in some instances. There are surprising times that the compiler won't make the shortcuts and optimizations you thought it should, and I also used this method to call out an unsuitable compiler since I caught it spitting out some ridiculous 10x-long set of instructions in certain critical instances.

bayindirh14d ago

> we do trust the assembly that compilers produce without having to read it

Yes, because wrong assembly blows up really loudly. From wrong behavior to invalid instruction errors and everything between them. Moreover, compilers have been battle tested over the years, with extremely detailed test suites and extreme testing (every day, hundreds of thousands of users test and verify them).

Also, as people said, assembly generation is deterministic. For a given source file and set of flags, you get the same thing out. Byte by byte, bit by bit. This is what we call "reproducible builds".
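A rough sketch of that difference, using Python's built-in `compile()` as a stand-in for a real compiler and a seeded random choice as a stand-in for sampling from a model. The candidate list and `sample_completion` are made up for illustration; no real LLM API is involved, and hashing bytecode in one process is a much weaker property than a true reproducible build.

```python
import hashlib
import random

SOURCE = "def add(a, b):\n    return a + b\n"


def bytecode_digest(source: str) -> str:
    """Compile the source and hash the resulting bytecode."""
    code = compile(source, "<demo>", "exec")
    return hashlib.sha256(code.co_code).hexdigest()


# Hypothetical completions a model might emit for the same prompt.
CANDIDATES = [
    "return a + b",
    "return sum([a, b])",
    "return b + a",
]


def sample_completion(seed: int) -> str:
    """Toy stand-in for sampling: the result depends on the random seed."""
    return random.Random(seed).choice(CANDIDATES)


# Same input, same output, every time: the set collapses to one digest.
digests = {bytecode_digest(SOURCE) for _ in range(5)}

# Same "prompt", different seeds: several distinct answers come back.
completions = {sample_completion(seed) for seed in range(100)}
```

All three candidates happen to be correct here, which is the optimistic case; the point is only that the output is a draw from a distribution rather than a function of the input.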

AI is not like that. It's randomized on purpose, and it pulls from a training set which contains imperfect, non-ideal code. "Yeah, it works whatever", doesn't cut it when you pull a whole function out of its connections, formed by the training data. It can and will make errors, because it's randomized from a non-ideal pool.

Next, sometimes you need tight code. Fitting into caches, running at absolute performance limit of the processor or system you have. AI is not a good fit here. Sometimes you go so far that you optimize for the architecture at hand, and it works slower on newer systems, so you need to re-optimize that thing.

For anyone who reads and murmurs "but AI can optimize", yes, by calling specific optimization routines written by real talented people for some cases; by removing their name, licenses, and context around them. This is called plagiarism in its mildest form and will get you in hot water in academia, for example. Writing closed source software doesn't make you immune from cheating and doing unethical things.

Lastly, this still rings in my ears, and I understood it over and over as I worked with more high performance, correctness critical code:

I was taking an exam, there's this tracing question. I raise my head and ask my professor: "Why do I need to trace this? Compiler is made to do this for me". The answer was simple yet deep: "If you can't trace that code, the compiler can't trace it either".

As I said, I just said "huh" at the time, but the saying came back and when I understood it fully, it was like being shocked by a Tesla coil.

Get your sleep, eat your veggies and understand your code. Those are the three essential things you need to do.

eska14d ago

We don’t. That’s why tools like godbolt are popular, debuggers can jump into assembly, and compilers can output assembly files.

yakattak14d ago

I want to preface this by saying that I am all for agentic engineering.

I am so tired of hearing about this false equivalency. Compilers are deterministic, their outputs are well understood and they’re transparent.

LLMs are not.

the__alchemist14d ago

> I don't know if architecture is a solvable problem for these models, but it is interesting watching the expectations moving over time.

I think the solution is between the lines of this article. The author states the steps leading to this, but doesn't arrive at it explicitly. It has been obvious (With 50/50 hindsight) to me since LLMs started getting popular, and holds:

LLMs are fantastic for software dev. If you don't let it write architecture. Create the modules, structs, and enums yourself. Add as many of the struct fields and enum variants as possible. Add doc comments to each struct, enum, field, and module. Point the LLM to the modules and data structures, and have it complete the function bodies etc as required.
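For illustration, a human-written skeleton of that kind might look roughly like this in Python. The todo domain and every name here (`TodoStatus`, `TodoItem`, `complete`, `open_titles`) are hypothetical; the point is only that the types, fields, and doc comments are fixed by hand and the bodies are what gets handed to the model.

```python
from dataclasses import dataclass
from enum import Enum


class TodoStatus(Enum):
    """Lifecycle of a todo item. Variants are fixed by the human author."""
    OPEN = "open"
    DONE = "done"


@dataclass(frozen=True)
class TodoItem:
    """A single todo entry.

    Attributes:
        item_id: Unique, caller-assigned identifier.
        title: Short human-readable description; must not be empty.
        status: Current lifecycle state.
    """
    item_id: int
    title: str
    status: TodoStatus = TodoStatus.OPEN


def complete(item: TodoItem) -> TodoItem:
    """Return a copy of `item` marked DONE; the input is left unchanged."""
    raise NotImplementedError("body left for the model to fill in")


def open_titles(items: list[TodoItem]) -> list[str]:
    """Return titles of all OPEN items, preserving input order."""
    raise NotImplementedError("body left for the model to fill in")
```

The frozen dataclass and the docstrings carry the constraints, so a generated body that mutates its input or reorders results contradicts something written down rather than something only in the author's head.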

winwang14d ago

Yeah, I pretty much agree. Opus and GPT will both come up with the most "organically-grown" "designs" if you let them. They do slightly better when asked to design first, but they seem to avoid many important questions (and definitely skip asking the user much of anything at all). I can only say it feels they "want" to ship as fast as possible while assuming I'm not going to actually review the PR.

bluGill14d ago

The models can do architecture. However they typically (at least currently) do a really bad job until you force them. I use AI all the time, it is getting better, but I still review every single line. Individual lines today are no better than last year's tab completion: sometimes really good and save me typing, sometimes really really bad.

embedding-shape14d ago

Anyone who understands the motivation, reasoning and goals can do the architecture. The crux is that hardly anyone actually understands those, and even fewer are aligned on them. That's when misalignment happens over time, LLMs or not.

Considering how fast we can poop out code now, I think this issue is just more visible than before, but it's been an issue for as long as I've been a developer. Almost no one knows what they actually want, and half the job is trying to coax out what they want to be able to do, so you can properly architect it.

onlyrealcuzzo14d ago

> I don't know if architecture is a solvable problem for these models, but it is interesting watching the expectations moving over time.

At least with current languages, I think the primary problem is that they are globally complex, and it's not scalable for them (and certainly not for you, reviewing a codebase they've mainly or completely generated) to verify that the invariants you want are being upheld.

No matter how many times you tell them - there is ZERO blocking allowed on the critical path, they will add blocking on the critical path.

No matter how many times you tell them any time they do X, they need Y type of test, they will do X without Y type of test.

They cannot follow directions 100%. Neither can people.

But they are more random. The mistakes people make are less likely to do the exact polar opposite of what you wanted to do.

People are less likely to see a critical invariant in the code, build themselves a loophole to get through it, write a test that the code fails successfully, and then tell you they did exactly what you asked for, and bury it in a 5k line commit, where 1000 lines are them changing comments that shouldn't be there in the first place.
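One partial mitigation is to stop relying on instructions and check the invariant mechanically. A minimal sketch, assuming the critical path lives in source you can parse and that "blocking" can be approximated by a short deny-list; `BLOCKING_CALLS`, `blocking_calls`, and the `handle_request` example are all made up for illustration.

```python
import ast

# Assumed deny-list of (module, function) pairs treated as blocking.
BLOCKING_CALLS = {("time", "sleep"), ("requests", "get")}


def blocking_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, call) pairs for every deny-listed call in the source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only matches the direct `module.function(...)` spelling.
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in BLOCKING_CALLS
        ):
            found.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(found)


CRITICAL_PATH = """
import time

def handle_request(payload):
    time.sleep(0.1)  # the kind of line that slips into a large diff
    return payload
"""

violations = blocking_calls(CRITICAL_PATH)
```

This only catches the literal `module.function` form; `from time import sleep` or an alias walks straight past it, so it narrows the loophole rather than closing it.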

LLMs are great. I'm convinced they're the future. I'm building a language specifically for them: https://GitHub.com/Cuzzo/clear - and to make it easier for YOU to work with them.

I think once we get around this language problem, that they need global context for things where they shouldn't, it will be a challenge to work with them.

I've had success with them, but it's been so frustrating, that I question how much it's been worth my sanity.

wolttam14d ago

These models understand architecture perfectly well, but they're not trained to care about it when being asked to complete X or Y feature. They're trained to implement the feature by the shortest route possible.

So it's not much of a surprise that this is the situation folks find themselves in with the current models.

jayd1614d ago

Are any of these steps actually solved? AI tab completion still kinda sucks.

They can keep internal consistency so the more you let it write the more it can write with internal consistency. It still fails at all of these levels as soon as you are looking at each level of detail.

mdswanson14d ago

I refer to this as "disposable architecture." Not that architecture doesn't matter, but that the architecture that worked yesterday doesn't necessarily need to be the architecture that works today.

indoordin0saur14d ago

It's even farther along than you think. It's the one writing the comments you're responding to. So why are you still thinking up and typing out your HN comments?

wiseowise14d ago

> but it is interesting watching the expectations moving over time.

While the salary stays stagnant or even reduced if you adjust for inflation.

vrganj14d ago

As somebody with a colleague that is using AI agents to "complete features", let me tell you, it is not. It is taking that dude so much longer to prompt and reprompt and then prompt again until it is anywhere close to something that passes review than it would take any competent mid-level engineer to just build the whole thing with some autocomplete help.

Have people's standards for quality just completely vanished in the pursuit of the shiny new thing? Is that guy doing something wrong?

That has also been my experience with this sort of thing fwiw, which is why I gave up and do more of a class-by-class pairing with an LLM as a workable middle ground.

koumou9214d ago

100% agree. Obviously AI is at a point where the developer has to do the architecture. Or at least be in control of what kind of architecture the AI is implementing. You can't one-shot huge features in huge codebases with AI. You are bound to get strange decisions. But that does not mean they are not worth using. That's a silly take.

hansmayer14d ago

> Now it's completing features

It's completing shit. Even if it does not implement some lazy stuff with empty catch blocks (i.e. happy path from programming 101 tutorials), it will either expose your secrets in a sensible place or do some other stupidity.

snowe201014d ago

weirdly made up scenario. I'm the person in the very first sentence. Tab-completing lines is still dog-shit. The majority of the time it has no clue what I'm going to write. Just because it can now write a lot more stuff doesn't mean it isn't still just as incorrect.

Also, you've set up a huge strawman here. Who are these people saying these things in this order and why is that the argument and not "You need to be reviewing every line of code that gets written and understand it."

Your argument is nonsense.

user3428314d ago

I felt the same with:

"it takes too much effort to get the output production ready"

turning into

"maybe long term the maintenance will be more expensive"

I give it three months until people realize that you rarely need to review every single line and fully understand the code, like so many comments are claiming.

camdenreslink14d ago

If you work on a product that has an existing user base that has an expectation that things will still work then you definitely still need to read the code. LLMs frequently break things or introduce subtle incompatibilities.

Maybe on projects with no users you can yolo things.

keybored14d ago

There are always people who will disagree, no matter how amazing something is, and they naturally respond with concerns close to the locus of the LLMification. It would be absurd to respond to “AI autocomplete is great now” with “but you still need to architect your code”. What’s people saving seconds writing code minutiae got to do with architecting the code?

This blob of people criticizing AI is just that, a blob. A gaggle of discrete people that your brain makes up a narrative about being some goalpost shifting entity.

Of course there could be individuals who have moved the goalposts. Which would need a pointed critique to address, not an offhand “people are saying” remark.

dzonga14d ago

the autocomplete can be shit sometimes.

callamdelaney14d ago

Architecture is one of the easiest things in programming frankly.

taytus14d ago

Nice Fiction story.

jwpapi14d ago

That’s the same story I had.

The swindle goes like this: AI on a good codebase can build a lot of features. You think it’s faster; it even seems safer and more accurate at times, especially in domains you don’t know everything about.

This goes on for a while whilst the codebase gets bigger, exploration takes longer and the failure rate increases. You don’t want it to be true and try harder, so you only stop after it has become practically impossible to make any changes.

You look at the code again and there is so much code that spaghetti is an understatement; it’s the Chinese wall.

You start working…, and you realize what was going on.

I deleted 75,000 of 140,000 lines of code, and I honestly feel like I wasted the 3 months I went hard into agentic coding. I failed my users by building useless features, increasing bugs, losing the mental model of my code, and not finding the problems I didn’t know about: the kind of hard decisions you only see when you’re in the code, the stuff that wanders in your mind for days.

dxdm14d ago

I find it interesting that this outcome is a surprise. I don't want this to sound smug, I'm genuinely curious what the initial expectations are and where they come from.

They seem to be different for LLMs, because would anyone be surprised if they handed summary feature descriptions to some random "developer" they've only ever met online, and got back an absolute dung pile of half-broken implementation?

For some reason, people seem to expect miracles from some machine that they would not expect of other humans, especially not ones with a proven penchant for rambling hallucinations every once in a while.

I'd like to know, ideally from people who've been there, why they think that is. Where does the trust come from?

throw10101014d ago

LLMs do deliver "miracles", in certain cases, if you've experienced it and have been blown away by their output (one shot functional app from a well manufactured prompt, new feature added flawlessly on a complicated existing codebase, etc.), it can be tempting to readjust your expectations and think this will work consistently and at a much larger scale.

They can assimilate 100s of thousands of tokens of context in a few seconds/minutes and do exceptional pattern matching beyond what any human can do, that's a main factor in why it looks like "miracles" to us. When a model actually solves a long standing issue that was never addressed due to a lack of funding/time/knowledge, it does feel miraculous and when you are exposed to this a couple of times it's easy to give them more trust, just like you would trust someone who provided you a helping hand a couple of times more than a total stranger.

dxdm14d ago

Thanks, that makes sense.

I suppose it's difficult to account for the inconsistency of something able to perform up to standard (and fast!) at one time, but then lose the plot in subtle or not-so-subtle ways the next.

We're wired to see and treat this machine as a human and therefore are tempted to trust it as if it were a human who demonstrated proficiency. Then we're surprised when the machine fails to behave like one.

I have to say, I'm still flabbergasted by the willingness to check out completely and not even keep on top of, and a mental model of, what gets produced. But the mind is easily tempted into laziness, I presume, especially when the fun part of thinking gets outsourced, and only the less fun work of checking is left. At least that's what makes the difference for me between coding and reviewing. One is considerably more interesting than the other, much less similar than they should be, given that they both should require gaining a similar understanding of the code.

doginasuit14d ago

I've never relied on an LLM to build a large section of code but I can see why people might think it is worth a try. It is incredible for finding issues in the code that I write, arguably its best use-case. When I let it write a function on its own, it is often perfect and maybe even more concise and idiomatic than I would have been able to produce. It is natural to extrapolate and believe that whatever intelligence drives those results would also be able to handle much more.

It is surprising how bad it is at taking the lead given how effective it is with a much more limited prompt, particularly if you buy in to all the hype that it can take the place of human intelligence. It is capable of applying an incredible amount of knowledge while having virtually no real understanding of the problem.

manicennui14d ago

Receiving an absolute dung pile of half-broken implementation is honestly what I expect from most working software engineers. Now the step where they spend even a second thinking about what they are doing has been removed. My job as a principal engineer became doing most of the thinking for people and then providing the only worthwhile code reviews before LLMs became a thing. LLMs just made these people even less useful and my job became even more about reviewing their low quality work that I could have done in less time manually.

LLMs also don't solve the much bigger problem of most software engineers having no ability to work with others to clarify requests or offer alternatives. So now bad and/or misunderstood requests can be implemented faster.

jeltz14d ago

Probably same reason people expected outsourcing to the cheapest firm in India would work: wishful thinking. People wanted it to work and therefore deluded themselves.

Or really the same reason people fall for get rich quick schemes.

thunky14d ago

> You look at the code again and there is so much code that spaghetti is an understatement; it’s the Chinese wall.

I don't understand this. A large codebase should be a collection of small codebases, just like a large city is a collection of small cities. There is a map and you zoom into your local area and work within that scope. You don't need to know every detail of NYC to get a cup of coffee.

It's your responsibility to build a sane architecture that is maintainable. AI doesn't prevent you from doing that, and in fact it can help you do so if you hold the tool correctly.

wavemode14d ago

To use your analogy - there's a big difference between the streets of New York and the streets of Boston. In New York if you know you're on 96th and 3rd, then you automatically know how to get to 101st and 5th - it's just a grid. But not every town is like that - many require you to possess knowledge about specific streets and specific landmarks in order to navigate anywhere.

To speak more directly - every codebase has local reasoning and global reasoning. When looking at a single piece of code that's well-isolated, you can fully understand its behavior "locally" without knowing anything about any other part of the code. But when a piece of code is tightly coupled to many other parts of the codebase, you have to reason globally - you have to understand the whole system to even understand what that one piece of code is doing, because it has tendrils touching the whole system. That's typically what we call spaghetti code.
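A small sketch of the two kinds of reasoning, with made-up names (`total_price`, `SETTINGS`, `apply_holiday_promo`) chosen purely for illustration:

```python
# Local reasoning: everything the result depends on is in the signature.
def total_price(unit_price: float, quantity: int, tax_rate: float) -> float:
    return unit_price * quantity * (1 + tax_rate)


# Global reasoning: the result depends on shared state any module can change.
SETTINGS = {"tax_rate": 0.2, "discount": 0.0}


def total_price_coupled(unit_price: float, quantity: int) -> float:
    subtotal = unit_price * quantity * (1 - SETTINGS["discount"])
    return subtotal * (1 + SETTINGS["tax_rate"])


def apply_holiday_promo() -> None:
    """A distant feature reaching through the architecture via shared state."""
    SETTINGS["discount"] = 0.5


before = total_price_coupled(10.0, 2)
apply_holiday_promo()
after = total_price_coupled(10.0, 2)  # same call, same arguments, new answer
```

Reading `total_price` tells you what it does. Reading `total_price_coupled` tells you almost nothing until you have found every place that touches `SETTINGS`.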

If you leave an AI to its own devices, it will happily "punch holes", and create shortcuts, through your architecture to implement a specific feature, not caring about what that does to the comprehensibility of the system.

gf00014d ago

I don't think that's a useful mental model for software in general.

There is software that works like this (e.g. a website's unrelated pages and their logic), but in general composing simple functions can result in vastly non-proportional complexity. (The usual example is having a simple loop and a simple conditional, with which you can easily encode Goldbach or Collatz.)
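The Collatz case fits in a dozen lines. This is just the textbook iteration written out; `collatz_steps` is a name picked for this sketch.

```python
def collatz_steps(n: int) -> int:
    """Count the steps until n reaches 1 under the Collatz rule."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:            # a simple loop...
        if n % 2 == 0:       # ...and a simple conditional
            n //= 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps
```

Every line is trivial to review in isolation, yet whether the loop terminates for every positive `n` is an open problem. Local simplicity says very little about the behavior of the composition.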

E.g. you write a runtime with a garbage collector and a JIT compiler. What is your map? You can't really zoom in on the district for the GC, because on every other street there you have a portal opening to another street on the JIT district, which have portals to the ISS where you don't even have gravity.

And if you think this might be a contrived example and not everyone is writing JIT-ted runtimes, something like a banking app with special logging requirements (cross cutting concerns) sits somewhere between these two extremes.

bluGill14d ago

The GC shouldn't care about all the code it is collecting. It collects garbage; it doesn't care if the garbage is an intermediate value from your tax calculations or the previous state image from your UI. Either way it is garbage and it is gone. Now in a few cases details of garbage collection matter by enough that it is worth something more invasive for some reason, but the vast majority of code shouldn't care about the other areas.

When on a tiny project it doesn't matter. However when you have millions of lines of code you have to trust that your code works in isolation without knowing the details.

marcosdumay14d ago

> A large codebase should be a collection of small codebases, just like a large city is a collection of small cities.

Oh, great analogy there.

Just like there's almost nothing in common between a large city and a collection of small cities, a large codebase is completely different from a collection of small codebases too.

Mostly because of the same kinds of effects.

OptionOfT14d ago

No but the speed up of AI is giving up control, and then you notice these issues too late.

gonational14d ago

It's not that you can't "build a sane architecture" as much as it is difficult to justify the time spent to do this when you can "bang out features" in 10 minutes that would take days to do manually. It's about the economics of code generation. When inventing structure and typing it out as code takes time, thinking deeply about architecture first makes sense. There is another factor, as well: "thinking deeply about the architecture" involves experimentation. You might go down a particular path while coding, and then realize some limitations and/or new ideas, etc. You ultimately craft something that will work well and play well with future code, and which may be easily understood. If somebody stops by your desk and says, "you finish that <3 day feature you were just assigned 2 hours ago> yet?", that'll be the last time you think deeply about anything at work.

Rather than arguing about the specifics, it's easier to point to numerous concrete examples, such as a fairly simple system - which should be easy to implement in 8-15k lines of code, depending on certain choices (I've been writing code long enough to estimate this relatively accurately) - being still-incomplete while approaching 150k lines. These kinds of atrocities are usually economically infeasible in hand-written code, for 2 reasons: 1) the cost to produce that much code is very high, and 2) the cost of maintaining that much code is insurmountable.

I guess you could say that AI is great at generating code that only AI can understand and maintain.

mjburgess14d ago

I think this is true, but I imagine there's a workflow solution to this which isn't to drop AI.

Eg., treating AI code generated as immediately legacy, with tight encapsulation boundaries, well-defined interfaces etc. And integrating in a more manual workflow.
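Roughly, that could mean a hand-written interface plus a contract check, with the generated implementation kept behind both. A sketch under those assumptions; `SlugGenerator`, `GeneratedSlugGenerator`, and `check_contract` are hypothetical names, and the implementation body stands in for whatever a model produced.

```python
from typing import Protocol


class SlugGenerator(Protocol):
    """Human-owned interface: the only thing the rest of the code depends on."""

    def slugify(self, title: str) -> str: ...


class GeneratedSlugGenerator:
    """Imagine this body was model-generated; it stays behind the interface."""

    def slugify(self, title: str) -> str:
        # Replace every non-alphanumeric character with a space, then join words.
        words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
        return "-".join(words)


def check_contract(impl: SlugGenerator) -> None:
    """Checks any implementation must pass, whoever (or whatever) wrote it."""
    assert impl.slugify("Hello, World!") == "hello-world"
    assert impl.slugify("  spaced   out  ") == "spaced-out"
    assert impl.slugify("") == ""


check_contract(GeneratedSlugGenerator())
```

The generated class can be thrown away and regenerated without anyone else noticing, as long as the contract still passes. That only helps for behavior the contract actually pins down.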

There's a range from single-shot prompts to inline code generation, that will make more sense depending on the problem and where in the code base it is.

Single-shot stuff is going to make more sense for a prototyping phase with extensive spec iteration. Once that prototype is in place, you then prob want to drop down into per-module/per-file generation, and be more systematic -- always maintaining a reasonably good mental model at this layer.

herrherrmann14d ago

That workflow just sounds exhausting to me. Would I always need to consider how much of a blast radius my AI-generated code might have? Sounds like there’s so much extra management going into these micro decisions that it ultimately defeats the purpose of generating code altogether.

I could see value in using it during the prototyping phase, but wouldn’t like to work like you described for a serious project for end users.

meetingthrower14d ago

And you have discovered the job of managers! There has always been a lot of hate for managers. Wonder if the robots hate us just as much? (I often feel a weird guilt when I tell an agent to do something I know I am going to throw away but will serve as an interesting exploration...I know if I did that to a human they would be pissed...)

embedding-shape14d ago

I just don't like to type code anymore. If I can accomplish the same by describing the code, and get the same results as if I typed it myself, I'll opt for not typing so damn much. I've done so much typing in my career, that typing ~80% less to get the same results, makes a pretty big difference in how likely I am to set out to accomplish something.

I care more about code quality now, because typing no longer limits if I feel like it's worth to refactor something or not.

timacles14d ago

This is like getting a person addicted to drugs and then asking them to only use the drugs on Thursday and Friday.

This seems to me like it requires an impossible level of discipline, judgement and foresight

anal_reactor14d ago

> treating AI code generated as immediately legacy, with tight encapsulation boundaries, well-defined interfaces etc.

This is good advice regardless of whether you're using AI or not, yet in real life "let's have well-defined boundaries and interfaces" always loses against "let's keep having meetings for years and then duct-tape whatever works once the situation gets urgent".

chorsestudios14d ago

Were you auto-committing everything without reading the generated code? And if you read it but didn't understand it, why not just ask for detailed comments for each output? Knowing that a larger codebase causes it to struggle means the output needs to be increasingly scrutinized as it becomes more complex.

skydhash14d ago

I don’t think it’s about what the code does. I think it’s more about how the code fits in its whole context. How useful it is in solving the overarching problem (of the whole software). How well does it follow the paradigm of the platform and the codebase.

You can have very good diffs and then find that the whole codebase is a collection of slightly disjointed parts.

gchamonlive14d ago

I haven't had the chance to work on large codebases, but isn't it possible to somehow adapt the workflow of Working Effectively with Legacy Code, building islands of higher quality code, using the AI to help reconstruct developer intention and business rules and building seams and unit tests for the target modules?

AI doesn't necessarily have to increase your throughput; it can also serve as a flexible exploration and refactoring tool that will support either later hand-crafted code or agentic implementation.

qudat14d ago

Yep. I’m approaching the same problem from a different angle: writing code fast means you aren’t being thoughtful about the features you’re building. I started realizing that after I had kids and spent more time thinking about code than writing it and it really improved the quality of my work: https://bower.sh/thinking-slow-writing-fast

jwpapi14d ago

Obviously this is all my fault and others might have better judgment when using it. I’m just sharing my experience compared to the promises you easily might believe reading X/HN/Anthropic.

I still have a lot of usage for AI: Exploration, Double-checking me, teaching me. But writing code became very tough for me to accept. Next-edit autocompletes mainly.

scruple14d ago

> I still have a lot of usage for AI: Exploration, Double-checking me, teaching me.

I'm ready to give up on having it even review my code at this point. It's been so frustrating. It hallucinates bugs, especially in places where "best practices" are at odds with reality.

Recently it informed me of a bug where it suggested the line of code in question couldn't possibly do anything because on Linux the specific stdlib behaved in X ways, but it was obvious from the line of code that it was running on Windows which doesn't have this problem at all. Of course, it doesn't actually mention that this is an issue on Linux, just that there is a bug here. It vomits up a paragraph of $WORDS explaining why this was a high-priority bug that absolutely needed to be fixed because it was failing in subtle ways. Yet the line of code in question has been running in production, producing exactly the results it is expected to, for ~3 years.

And this is just one simple example, of the many dozens+ of times it has failed this task this year. In that same review run, the agent suggested 3 additional "bugs" or other issues that should be addressed that were all flatly wrong or subjective. I'm at a point of absolute exhaustion with this sort of shit. It's worse than a junior half of the time because of how strongly opinionated it is. And the solution to this sort of problem is an endless amount of configuration and customization that will be forgotten about by all of us over time, leading to who knows what sort of knock-on effects (especially as we migrate from one model to the next). We have a guy on our team who has ~17,000 words in his agent and instructions files, yet he sees nothing wrong with this. I guess he just really loves YAML and Markdown.

deadbabe14d ago

I don’t think you truly captured the worst part:

There comes a realization, to many engineers’ horror, that AI won’t be able to save them and they will have to manually comprehend and possibly write a ton of code by hand to fix major issues, all while upper management is breathing down their necks, furious as to why the product has become a piece of shit and customers are leaving to competitors.

The engineers who sink further into denial thrash around with AI, hoping they are a few prompts or orchestrations away from everything being fixed again.

But the solution doesn’t come. They realize there is nothing they can do. It’s over.

snowe201015d ago

> The other change is simpler: I'm doing the design work myself, by hand, before any code gets written. Not a vague doc. Concrete interfaces, message types, ownership rules.

That’s the hard part of coding. If you have an architecture then writing the code is dead simple. If you aren’t writing the code you aren’t going to notice when you architected an API that allows nulls but then your database doesn’t. Or that it does allow that but you realize some other small issue you never accounted for.
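A minimal sketch of that null mismatch, assuming a hypothetical `CreateUserRequest` payload and a hypothetical `users` table (neither comes from the article); SQLite's in-memory mode stands in for whatever database is really in use:

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreateUserRequest:
    """Hypothetical API payload: the design says email may be omitted."""
    name: str
    email: Optional[str] = None


def make_db() -> sqlite3.Connection:
    """In-memory database whose (also hypothetical) schema forbids NULL emails."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT NOT NULL, email TEXT NOT NULL)")
    return conn


def save_user(conn: sqlite3.Connection, req: CreateUserRequest) -> bool:
    """Try to persist the request; return False if the schema rejects it."""
    try:
        conn.execute(
            "INSERT INTO users (name, email) VALUES (?, ?)",
            (req.name, req.email),
        )
        return True
    except sqlite3.IntegrityError:
        # The API layer accepted None, but the NOT NULL constraint did not.
        return False


conn = make_db()
ok_with_email = save_user(conn, CreateUserRequest("ada", "ada@example.com"))
ok_without_email = save_user(conn, CreateUserRequest("bob"))
```

Both layers look reasonable in isolation; the disagreement only surfaces when a request without an email actually reaches the insert, which is the kind of thing you tend to notice while writing the code rather than while drawing the boxes.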

I do not know how you can write this article and not realize the problem is the AI. Not that you let it architect, but that you weren’t paying attention to every single thing it does. It’s a glorified code generator. You need to be checking every thing it does.

The hard part of software engineering was never writing code. Junior devs know how to write code. The hard part is everything else.

Philip-J-Fry15d ago

Yes, I think there's 2 kinds of developer. Those who think the code is the hard part, and those that don't.

The developers that think coding is hard are the ones that absolutely love AI coding. It's changed their world because things they used to find hard are now easy.

Those that think coding is easy don't have such an easy time because coding to them is all about the abstractions, the maintainability and extensibility. They want to lay sensible foundations to allow the software to scale. This is the hard part. When you discover the right abstractions everything becomes relatively easy. But getting there is the hard part. These people find AI coding a useful tool but not the crazy amazing magical tool the people who struggle with coding do.

The OP is definitely in the second camp since they could spot and realise the shortcomings of the AI. They spotted the problem, and that problem is that the AI can't do the hard bit.

jfim15d ago

I'd say there's another camp: the camp of people who know that code isn't the hard part, but that it's still time consuming to write code. AI coding is pretty useful for that, when you can nail the design but you just need a set of hands to implement it.

seer14d ago

But isn’t AI doing the same thing to project management as to coding?

PMs can now cross reference and organize tickets with just a few keystrokes. Organisational knowledge, business knowledge, design systems and patterns, etc all of it is encoded in LLM consumable artefacts. For PMs it is the same switch - instead of having to do it by hand you direct lower level employees to handle the details and inconsistencies and you just do vibe and vision.

When all of the pieces successfully connect and execute reliably, what is left for humans to do? Just direct and consume?

And AI companies with their huge swaths of data are soon gonna be in the situation of being able to do the directing themselves

byzantinegene14d ago

this pretty much sums up what i feel about AI currently. It made my life significantly easier for most tasks I already breeze through, yet tasks I used to struggle with are still equally difficult

mikepurvis15d ago

I agree with what you're saying, but I think we do have a problem right now with definitions where there's a lot of people basically getting supercharged tab completions or running a chatbot or two in a parallel pane, but still clearly reviewing everything; and on the other side of things is freaking Steve Yegge pitching a whole new editor that lets you orchestrate a dozen or more agents all vibing away on code you're apparently never going to read more than a line or two of: https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...

The first group are still thinking fairly deeply about design and interfaces and data structures, and are doing fairly heavy review in those areas. The second group are not, and those are the ones that I find a bit more worrisome.

RossBencina15d ago

> The first group are still thinking fairly deeply about design and interfaces and data structures, and are doing fairly heavy review in those areas.

I can't speak for others, but I'd go further and say that LLMs allow me to go deeper on the design side. I can survey alternative data structures, brainstorm conversationally, play design golf, work out a consistent domain taxonomy and from there function, data structure and field names, draft and redraft code, and then rewrite or edit the code myself when the AI cost/benefit trade off breaks down.

barrell15d ago

That’s a little bit of a No True Scotsman. Yes there are people who do not review anything; but even people who are reviewing every line from an LLM do not have the same understanding as someone who wrote it themselves.

I’m not making a judgement call about which is better, but it was widely accepted in tech before the advent of LLMs that you just fundamentally lack a sense of understanding as a reviewer vs an author. It was a meme that engineers would rather just rewrite a complicated feature than fix a bug, because understanding someone else’s code was too much effort.

imtringued14d ago

That blog post is surreal. It's like cryptocurrencies and the whole web3 nonsense. Cryptocurrencies basically don't work, so there have been a hundred aimless attempts at fixing self inflicted problems caused by deficiencies of cryptocurrencies with no actual goal that has any impact on the real world.

It's the same thing here. AI has dropped the cost of software development, so developers are now fooling themselves into producing low or zero value software. Since the value of the software is zero or near zero, it doesn't really matter whether you get it right or not. This freedom from external constraints lets you crank up development velocity, which makes you feel super productive, while effectively accomplishing less than if you had to actually pay a meaningful cost to develop something.

Like, what is the purpose of Gas Town? It looks to me like the purpose of Gas Town is to build Gas Town.

bmitc15d ago

> and on the other side of things is freaking Steve Yegge pitching a whole new editor that lets you orchestrate a dozen or more agents all vibing away on code you're apparently never going to read more than a line or two of

I find it useful to not listen to people who just talk.

skydhash15d ago

> The first group are still thinking fairly deeply about design and interfaces and data structures, and are doing fairly heavy review in those areas

I worry about the first group too, because interfaces and data structures are the map, not the territory. When you create a glossary, it is to compose a message that transmits a specific idea. I find invariably that people that focus on code that much often forget the main purpose of the program in favor of small features (the ticket). And that has accelerated with LLM tooling.

I believe most of us that are not so keen on AI tooling are always thinking about the program first, then the various parts, then the code. If you focus on a specific part, you make sure that you have well defined contracts to the other parts that guarantee the correctness of the whole. If you need to change the contract, you change it with regard to the whole thing, not the specific part.

The issue with most LLM tools is that they’re linear. They can follow patterns well, and agents can have feedback loops that correct them. But contracts are multi dimensional forces that shape a solution. That solution appears more like a collapsing wave function than a linear prediction.

seer15d ago

I’ve noticed that agents almost always fail at the planning vs execution stage.

I follow the plan -> red/green/refactor approach and it is surprisingly good, and the plans it produces all look super well reasoned and grounded, because the agent will slurp all the docs and forums with discussions and the like.

Trouble is once it starts working there would inevitably be a point where the docs and the implementation actually differ - either some combination of tools that have not been used in that way, some outdated docs, or just plain old bugs.

But if the goals of the project/feature are stated clearly enough it is quite capable of iterating itself out of an architectural dead end, that is if it can run and test itself locally.

It goes as deep as inspecting the code of dependencies and libraries and suggesting upstream fixes etc. all things that I would personally do in a deep debugging session.

And I’m super happy with that approach as I’m more directing and supervising rather than doing the drudgery of it.

Trouble is a lot of my team mates _don’t_ actually go this deep when addressing architectural problems, their usual modus operandi is “escalate to the architect”.

This will not end up good for them in the long run I feel, but not sure what they can do themselves - the window of being able to run and understand everything seems to be rapidly closing.

Maybe that’s not super bad - I don’t know exactly what the compiler is doing to translate things to machine code, and I definitely don’t get how the assembly itself is executed to produce the results I want at scale - that is a level of magic and wizardry I can only admire (look ahead branching strategies and caching on modern cpus is super impressive - like how is all of this even producing correct responses reliably at such a scale …)

Anyway - maybe all of this is ok - we will build new tools and frameworks to deal with all of this, human ingenuity and desire for improvement, measured in likes, references or money will still be there.

staplers15d ago

> You need to be checking every thing it does.

This is what seems to be lost on so many. As someone with relatively little code experience, I find myself learning more than ever by checking the results and what went right/wrong.

This is also why I don't see it getting better anytime soon. So many people ask me "how do you get your claude to have such good output?" and the answer is always "I paid attention and spotted problems and asked claude to fix them." And it's literally that simple but I can see their eyes already glazing over.

Just as google made finding information easier, it didn't fix the human element of deciphering quality information from poor information.

krilcebre15d ago

How do you know what good output should look like with little code experience?

brabel15d ago

Looking at code looking for errors is a hard thing to do well for a large amount of code. A better approach is to ensure tests cover all the important cases and many edge cases. Looking at the code may still be a good idea but mostly to check the design. I think that once you get Claude to test the code it writes well, trying to find errors in the code is a waste of time. I’ve made the mistake of thinking Claude was wrong many times despite the tests passing just to be humbled by breaking the tests with my “improvements”!

tripledry15d ago

This is the only way for me to use Agents without completely hating and failing at it. Think about the problem, design structures and APIs and only then let AI implement it.

skydhash15d ago

And when you got familiar with the other parts, you realize that writing code is the most enjoyable one. More often than not, you’re either balancing trade offs or researching what factors you have missed with the previous balancing. When you get to writing code, it’s with a sigh of relief, as that means you understand the problem enough to try a possible solution.

You can skip that and go directly to writing code. But that meant you replaced a few hours of planning with a few weeks of coding.

20k14d ago

I always find these kinds of posts interesting, to compare the velocity that people seem to get with AI, vs what I get by just coding by hand

Coincidentally I've been working on a project for about 7 months now: it's a 3d MMO. Currently it's playable, and people are having fun with it - it has decent (but needs work) graphics, and you can cram a few hundred people into the server easily currently. The architecture is pretty nice, and it's easy to extend and add features onto. Overall, I'm very happy with the progress, and it's on track to launch after probably a year's worth of development

In 7 months vibe coding, OP failed to produce a basic TUI. Maybe the feature velocity feels high, but this seems unbelievably slow for building a basic piece of UI like this - this is the kind of thing you could knock out in a few weeks by hand. There are tonnes of TUI libraries that are high quality at this point, and all you need to do is populate some tables with whatever data you're looking for. It's surprising that it's taking so long

There seems to be a strong bias where using AI feels like you're making a lot of progress very quickly, but compared to manual coding it often seems to be significantly slower in practice. This seems to be backed up by the available productivity data, where AI users feel faster but produce less

ZaoLahma14d ago

> There seems to be a strong bias where using AI feels like you're making a lot of progress very quickly, but compared to manual coding it often seems to be significantly slower in practice.

This metric highly depends on who uses the AI to do what, where strong emphasis is on "who" and "what".

In my line of work (software developer) the biggest time sinks are meetings where people need to align proposed solutions with the expectations of stakeholders. From that aspect AI won't help much, or at all, so measuring the difference of man hours spent from solution proposal to when it ends up in the test loops with and without AI would yield... very disappointing results.

But for troubleshooting and fixing bugs, or actually implementing solutions once they have been approved? For me, I'm at least 10x'ing myself compared to before I was using AI. Not only in pure time, but also in my ability to reason around observed behaviors and investigating what those observations mean when troubleshooting.

But I also work with people who simply cannot make the AI produce valuable (correct) results. I think if you know exactly what you want and how you want it, AI is a great help. You just tell it to do what you would have done anyway, and it does it quicker than you could. But if you don't know exactly what you want, AI will be outright harmful to your progress.

chromadon14d ago

This struck me as odd too. 7 months? It wouldn’t take that long to write it in a new language.

Another thing I don’t see mentioned is code quality.

Vibe-coded code bases are an excellent example of why LLMs aren’t very good at writing code. They will often correct their own mistakes only to make them again immediately after, and their pattern use is inconsistent.

Recently Claude has been making some “interesting” code style choices, not in line with the code base it’s currently supposed to be working on.

21asdffdsa1214d ago

Seems to be baked into the GPT: produce text, a.k.a. to produce language and code is life and purpose. So the whole system is inherently internally biased towards "roll your own everything" unless spoken to in a "Senior-dev" language that prevents these repetitions.

thot_experiment14d ago

It's more complex than that, I think the reality is that there's a lot of code that's just not that deep bro. I have some purely personal projects that have components that I don't understand anymore, I wrote that shit by hand, they still work but I haven't touched that shit in years. There's a lot of code that AI can write that's like that that helps me, the stuff I would forget about even if I wrote it by hand. I think you have to have discipline in its use, it's a tool like any other.

AI, and especially agentic AI can make you lose situational awareness over a codebase and when you're doing deep work that SUUUUCKS, but it's not useless, you just have to play to its strengths. Though my favorite hill to die on is telling people not to underestimate its value as autocomplete. Turns out 40 gigabytes of autocomplete makes for a fucking amazing autocomplete. Try it with llama.vim + qwen coder 30b, it feels like the editor is reading your mind sometimes and the latency is so low.

echelon14d ago

This was made in two days of vibe coding. It has flaws, but it's impressive as hell:

https://tinyskies.vercel.app/

It's got a fun Zelda-inspired mechanic (I won't say which one), and you'll have to unlock abilities and parts of the world over several quests and modes to "win".

It's also multiplayer.

20k14d ago

This ran at ~1fps for me

IceDane14d ago

Runs smoothly for me in Zen (FF) on Linux.

plastic04115d ago

Title says

> back to writing code by hand

But what they are doing is

> doing the __design work__ myself, by hand, before any code gets written.

So... Claude still is generating the code I guess?

And seriously, I can't understand that they thought their vibe coded project works fine and even bought a domain for the project without ever looking at source code it generated, FOR 7 MONTHS??

0xpgm15d ago

In short, it is simply a click-bait title.

And the goal of the article is to draw attention to their project.

lelanthran14d ago

> And the goal of the article is to draw attention to their project.

Additionally, they couldn't even bother to write their own blog post, so it's a little hard to take them seriously when they say they're going to write their own code...

kdheiwns14d ago

It's the same thing every time.

> Claude (c) by Anthropic (R) is the best thing since sliced bread and I'm Lovin' It(tm)! Here's a breakdown of how you too can live a code free life for 10 easy payments of $99.99 a month if you subscribe now!

> Step one in your journey to code free life: code the whole damn project and put it together yourself

It's so much fluff and baloney and every single article is identical. And every single one is just over the top praise of Claude that doesn't come off as remotely authentic. There's always mentions of Claude "one shotting"(tm) something.

dewey15d ago

I bought domains for projects minutes after the idea.

I don’t think it’s that weird to not look at the code if it’s a side project and you follow along incrementally via diffs. It’s definitely a different way of working but it’s not that crazy.

bayarearefugee15d ago

> I don’t think it’s that weird to not look at the code if it’s a side project and you follow along incrementally via diffs.

It's not weird to not look at the code, as long as you're looking at the code? (diffs?)

Uh, ok

retsibsi15d ago

The article explicitly says that the author looked at the diffs; it distinguishes this from "sitting down and actually reading the code", which they didn't do. So when plastic041 says the author spent 7 months vibe coding "without ever looking at source code", it's not unreasonable for dewey to assume that "looking at source code", in this context, actually means something stronger and excludes just looking at the diffs.

IanCal15d ago

I feel like I’m watching developers speed run project and product management learnings.

We’ve moved to seeing that specs are useful and that having someone write lots of wrong code doesn’t make the project move faster (lots of times devs get annoyed at meetings and discussions because it hinders the code writing, but often those are there to stop everyone writing more of the wrong thing)

We’ve seen people find out that task management is useful.

Now I’m seeing more talk of fully doing the design work upfront. And we head towards waterfall style dev.

Then we’ll see someone start naming the process of prototyping, then I’m sure something about incremental features where you have to manage old vs new requirements. Then talk of how really the customer needs to be involved more.

Genuinely, look at what project and product managers do. They have been guiding projects where the product is code yet they are not expected to read the code and are required to use only natural language to achieve this.

meetingthrower14d ago

So right. All these guys have never been managers. Do you think humans don't write things that break? Or that teams sometimes take a wrong path and burn a week of work? Or months? Well now you can experience all of that in 30 minutes of vibecoding. As a former tech product manager, it feels EXACTLY the same.

yakshaving_jgt14d ago

Except it isn't the same because the cost is different, which allows discovery that we couldn't afford previously.

meetingthrower14d ago

Yes x 1000. I find it amazing.

xantronix15d ago

So you're not actually writing code by hand? I'm very confused by the difference between the title and the conclusion here.

rane15d ago

The point was to come up with a sensationalistic headline that HN eats up so the post flies to the front page.

Towaway6915d ago

I wonder whether the title was generated/suggested by an AI?

dwedge14d ago

I don't think they even wrote the article by hand. It seems like the title got to the top of HN not the article.

viceconsole15d ago

> Vibe-coding makes you feel like you have infinite implementation budget. You don't. You have infinite LINE budget (the AI will generate as much code as you want). But you have the same finite complexity budget as always.

This is a special case of a general fundamental point I'm struggling with.

Let's assume AI has reduced the marginal cost of code to zero. So our supply of code is now infinite.

Meanwhile, other critical factors continue to be finite: time in a day, attention, interest, goodwill, paying customers, money, energy.

So how do you choose what to build?

Like a genie, the tools give us the power to ask for whatever we want. And like a genie, it turns out we often don't really know what we want.

TranquilMarmot15d ago

Right - knowing what to actually build always has been and always will be the limiting factor to actual success. I could spend months and hundreds of dollars generating the absolute BEST todo list that's out there but nobody wants that.

ozim15d ago

I have vibe coded 3 applications I never had time to code but always wanted.

Now it is different in a way where now I don’t have time to use those apps.

That’s a joke.

But I do believe it answers the question of “what to build?”. If you didn’t have time before LLM assisted coding you still don’t have time for it. You most likely know what is used and what not already by heart or by some measurements.

shahbaby15d ago

This reads too much like it was LLM generated. I can't say for sure if it was but I have an allergic reaction to the short snappy know-it-all LLM writing style.

TranquilMarmot15d ago

AI;DR

baxtr15d ago

Writing code by hand but blog posts are written by LLMs?

fromwilliam15d ago

yeah, it set off my llm radar too

simon8414d ago

Personally, i've taken a serious step back from 'unsupervised' vibe-coding. When the codebase is clean and you want some additional fix or small feature, Claude is quite good at mimicking your style and does a pretty good job.

When asking for a new major feature, despite hard guidelines and context (that eat half your context window), then it quickly ships bloat. The foundations are not very well organized and this is where you acknowledge it is all about random-prediction of the next word-thing.

Overall, i've wasted more time reviewing the PR and trying to steer it properly than I expected. So multi-layer agent vibe coding is no longer the way to go *for me*. Maybe with unlimited tokens and a better prompt, to be investigated...

Rapzid14d ago

And it can quickly start spiraling out of control. The bloated implementations keep adding more and more context it needs for the next change. Discovery results start getting worse, implementations get worse, and bloat continues to increase.

simon8414d ago

Actually it was sort of fun to see that the AI started writing comments to itself by gradually explaining what it was trying to do and ways it failed to do it.

Then it spent more time appending comments to its own comments rather than writing code ^^

web00714d ago

So much of the problem here is that the author blindly trusted the agent. They're enthusiastic juniors, not jaded seniors.

Prompt for what you want. Get your feature working, then cut: reduce SLOC, refactor to remove duplication, update things to match existing patterns. You might do these instinctively, or maybe as-you-go, but that's just style. Having a dedicated pass works just as well.

The same thing goes for my code now that did when I wrote every line by hand: make it work, then make it good, then make it manageable. Manually that meant breaking things down into small blocks of individual diffs inside a PR (or splitting PRs), checking for repetitive code and refactoring, or even stashing what I got to and doing it again with the knowledge of how things went wrong.

Agents can do the same. It's WAY easier mentally and works out better if you treat them the same way and go working -> better -> done.

gauthamkolluru14d ago

I’ve been resonating with the similar ideas the author of the article/original poster has been mentioning even in the comments below.

Even i think that after a few iterations of producing the code there must/should be a change in the strategy.

I sometimes also wonder if i should add the software engineering text books that tried teaching us to code but contained the frameworks that are better applied along with the principles like SOLID, DRY etc.

But then again, I do not have the right answer now. Maybe the reformation must come in the models too but as I see it, going back to hand coding is not the solution.

Just like we came up with different paradigms of coding, the different principles of coding, different frameworks in short, we need to and will come up with some frameworks (& maybe some newer models as mentioned above) that can and will make us call AI coding “The Standard”.

What are off the table (I think)

1. Hand coding, or maybe even reading AI’s code line by line. That’d rather be counterproductive. At least with me it takes more time to read its code and understand. But i evaluate its code not just by writing tests but by other means too depending on the situation, and that’s for another time too.

2. Vibe coding

3. Thinking software engineering is automated (it definitely is more essential than ever)

4. So does software development - even that’s not going to go extinct

5. Software jobs are going to go extinct. (In fact if a company is losing people claiming it doesn’t need so many of em means to me that either they do not see much of future for themselves or they’re just playing the stock price and investor satisfaction game for the short run - but that’s for a different topic)

gauthamkolluru14d ago

Apologies for the visual formatting as i was posting this comment from my mobile. Thanks in advance for understanding.

archleaf15d ago

So what you really mean is you are going to do better and more detailed skills files so you can get an architecture that you've thought through rather than something random?

dropbox_minerOP15d ago

Partly, but the order matters. The CLAUDE.md constraints only work if you designed the architecture first. They're just how you communicate it to the AI. The mistake I made wasn't writing bad skills files, it was not designing anything at all and expecting the AI to make coherent structural decisions across 30 sessions.

The rewrite is me sitting down with a blank doc and drawing the boxes before any code exists. Then the CLAUDE.md enforces what I already decided. Whether that actually holds up as the project grows, I genuinely don't know yet.

cpncrunch15d ago

Are you really saving any time at all using AI at all then? If you have to write the architecture for it, write all the rules you want it to follow, check everything it's written, and then reprompt it because it's not how you want it?

SpicyLemonZest15d ago

Yes. I do all of this and I'd estimate 50-100% coding time savings. A lot of that comes from better multitasking over single-workstream throughput, which I suppose might compromise the gains depending on what you're doing. For me it amplifies the speedup by allowing some of my "coding time" to be spent on non-coding tasks too.

erelong15d ago

Can't you just ask AI to break up large files into smaller ones and also explain how the code works so you can understand it, instead of starting over from scratch?

dropbox_minerOP15d ago

That was actually the first thing I tried. It did a good job at explaining the code base mess and the architecture. Then I ran 3-4 refactor attempts. Each one broke things in ways that were harder to debug than the original mess. The god object had so many implicit dependencies that pulling one thread unraveled something else. And each attempt burned through my daily Claude usage limit before the refactor was stable.

And I'm sure the rewrite is going to teach me a whole different set of lessons...

joshuanapoli15d ago

Rewrite following a new architecture plan could get finished pretty quickly, treating the original as a prototype.

SpicyLemonZest15d ago

When people talk about codebases being "incomprehensible", it's not always hyperbole. Sometimes the architecture literally cannot be broken up or understood.

radicalbyte15d ago

I don't understand the people who "get the agent to do everything" for them. It just makes a mess if you do that. Yet if I spend a little bit of time setting a project up properly (including telling my minions exactly what to do) I can then get it to do the boring things for me.

The very worst things you can do in a codebase are (a) not deeply understand how it works (have it be magic) and (b) be lazy and mess up the structure.

How do you fix a problem which happens at 2:00am and takes your system down if you don't have an excellent understanding of how it works?

Over time we're already bad at (a) because most developers hate writing documentation so that knowledge is invariably lost over time.

czhu1214d ago

I found the exact same when I started vibe coding new features in https://github.com/CanineHQ/canine

Claude is super good at making it seem like it’s an expert in kubernetes, but then on uncovering certain decisions, it’s basically optimizing to try to make things look like they work.

An example is, i wanted to develop a feature to easily fork a managed Postgres database with a k8s cluster. The thing it did was to copy the entirety of the source db to localhost, then copy it back out to the cluster, rather than just running the job within the cluster.

Now I’m pretty stressed after a 1 hour vibe coding session, having to now review and digest and think through the code that it wrote. Implementations like that scare me — if I accidentally missed it and merged it — since there are real people who rely on canine.

I wouldn’t go as far as to say I’m writing everything by hand, but I now always map out how I would do something before asking ai to approach it

larusso14d ago

I ran into the same issues quite early with my rust pet projects. Single structs with tons of Option<T> and validation methods etc. Enums for type fields combined with, say, optional fields in the same layer, so accessor methods all return Option<T>.

I now add a long list of instructions on how to work with the type system and some do’s and don’ts. I don’t see myself as a vibe coder. I actually read the damn code and instruct the ai to get to my level of taste.
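To make the Option<T> complaint concrete, here is a minimal sketch of the two shapes being contrasted. All type names are hypothetical and are not taken from the gist linked further down; it only illustrates the general pattern of a tag enum plus optional fields versus an enum whose variants carry their own data.

```rust
use std::f64::consts::PI;

// Hypothetical example types, invented for illustration.

/// The shape complained about above: a tag enum plus optional fields
/// on one struct, so every accessor ends up returning Option<T>.
#[derive(Debug)]
enum ShapeKind {
    Circle,
    Rect,
}

#[derive(Debug)]
struct LooseShape {
    kind: ShapeKind,
    radius: Option<f64>,
    width: Option<f64>,
    height: Option<f64>,
}

impl LooseShape {
    /// Returns None whenever the fields do not match the tag.
    fn area(&self) -> Option<f64> {
        match self.kind {
            ShapeKind::Circle => self.radius.map(|r| PI * r * r),
            ShapeKind::Rect => Some(self.width? * self.height?),
        }
    }
}

/// Alternative: each variant carries exactly the data it needs,
/// so a mismatched combination cannot be constructed at all.
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle { radius } => PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }
}

fn main() {
    // A "rect" that only has a radius: representable, but useless.
    let bad = LooseShape {
        kind: ShapeKind::Rect,
        radius: Some(1.0),
        width: None,
        height: None,
    };
    println!("loose: {:?} -> {:?}", bad, bad.area());

    let good = Shape::Rect { width: 2.0, height: 3.0 };
    println!("strict: {:?} -> {}", good, good.area());
}
```

The second form moves the validation into the type itself, which is presumably the kind of thing an instruction list for the agent would ask for.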

hmhhashem14d ago

Would you be interested in sharing your findings? I'm currently experimenting with LLM-generated rust and honestly think it works quite well, however I'm looking for ways to improve the "taste" of the agent.

larusso14d ago

I pushed a gist https://gist.github.com/Larusso/82c9aa8effb3031d149d3b5a1b96...

hmhhashem14d ago

Thanks :)

khasan22214d ago

I’m not very familiar with Go, however after looking at the repo I can’t help but notice there is no infra to ensure code quality. Do others see the same thing, because if so that is the real problem

Yes I agree for sure llms write terrible code when left to their own devices, but so do most engineers. Which is why we have so many tools to help keep a certain level of quality. Duplication checks, tests, linters, other engineers.

I find whenever you make an llm repo without these checks, and more, it will write like an enthusiastic junior engineer, wrong and strong. However a junior engineer would be hard pressed to get 95% coverage on a codebase, the ai is more than willing and does it in a few minutes. We can use things like this to our advantage, how many people have ever seen a repo with 100% test coverage? With ai this is very possible, with people not so much.

LLMs write terrible code, we know this, but when dealing with humans that write terrible code we have many techniques. We should be using those same techniques to keep the llms honest, but more importantly verifiable.

shimman14d ago

Go has built-in tools that handle formatting and linting. Also LSP is a first class citizen in Go. I don't know what other "code quality" infra there is out there aside from formatting and linting.

spicyusername14d ago

It's really very easy to spend a few hours going through a vibe-coded project by hand and having an agent fix the weird parts. If you do this often enough, you can get the best of both worlds.

Then you're right back on track.

In a way it's not that different from a human-made project. Plenty of teams have to crunch, ignoring the architecture and incurring tech debt, and then come back and fix it later.

ex-aws-dude14d ago

That’s what I found too

I have to periodically get it to do a bunch of refactoring

peterbell_nyc14d ago

I'm generally in agreement with everyone here.

- Some code is ephemeral - it's generated to do the thing, thrown away end of session and the csv was imported successfully (or whatever). Make sure you have at least some testing of the output or you may find the email is in the last name field for some rows. If possible, have an API your agent uses with rich domain types and validations that force it to do things right or do them again (and that it can't rewrite to relax the constraints!)
- You can one or few shot a real app - for a few users, for a small set of use cases. Scope of this will improve with models, but at least today it's "spelling bee app for my kids" not "salesforce replacement for millions of workers".
- You can add rich validation steps for all types of quality that you care about which (assuming they converge) can deliver high performance, well designed and functionally correct code mostly autonomously.
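As a rough sketch of the "rich domain types and validations" idea from the first point, assuming a contact import with a last name and an email column: the type names, the `parse_row` helper and the deliberately crude validation rules are all hypothetical, chosen only to show how a swapped column gets rejected instead of silently imported.

```rust
/// Hypothetical domain types for a CSV contact import.
#[derive(Debug, Clone, PartialEq)]
struct Email(String);

#[derive(Debug, Clone, PartialEq)]
struct LastName(String);

impl Email {
    /// Deliberately crude check (an assumption, not a real email grammar):
    /// exactly one '@' with non-empty text on both sides.
    fn parse(raw: &str) -> Result<Self, String> {
        let raw = raw.trim();
        match raw.split_once('@') {
            Some((local, domain))
                if !local.is_empty() && !domain.is_empty() && !domain.contains('@') =>
            {
                Ok(Email(raw.to_string()))
            }
            _ => Err(format!("not an email: {raw:?}")),
        }
    }
}

impl LastName {
    /// Rejects empty values and anything that looks like an email.
    fn parse(raw: &str) -> Result<Self, String> {
        let raw = raw.trim();
        if raw.is_empty() || raw.contains('@') {
            Err(format!("not a last name: {raw:?}"))
        } else {
            Ok(LastName(raw.to_string()))
        }
    }
}

#[derive(Debug, PartialEq)]
struct Contact {
    last_name: LastName,
    email: Email,
}

/// The only way to build a Contact is through the validating parsers,
/// so the importing code cannot skip the checks.
fn parse_row(last_name: &str, email: &str) -> Result<Contact, String> {
    Ok(Contact {
        last_name: LastName::parse(last_name)?,
        email: Email::parse(email)?,
    })
}

fn main() {
    println!("{:?}", parse_row("Smith", "smith@example.com"));
    // Columns swapped: the email landed in the last-name field.
    println!("{:?}", parse_row("smith@example.com", "Smith"));
}
```

If these types live in a crate the agent is not allowed to edit, it has to fix its import code rather than relax the constraint.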

I'm building an orchestrator (who isn't). Haven't looked at the code yet, but it appears to work. But man have I spent hours in loops between Claude, Codex and myself all on the highest thinking levels to figure out what interface portability means for the employee, how best to handle "remote" sessions and the appropriate semantics for pipelines/recipes.

I've also been very opinionated about who does what. I'll let the agent write a script to sync with github and reload workers, but I decided to "waste" the 5 minutes to manually do all of the config steps on render for my server when claude told me that I couldn't just give it read only scope to pull the logs. Bad news, I'm cutting and pasting for my computer overlord. Good news? Claude can't blow away the prod db if it happens to get in the way of whatever interpretation it makes of the instructions I give it.

A chainsaw requires very different skills than an axe. It has different failure modes. Some experience as a lumberjack probably helps using either/both.

No difference (at least now) with agents.

tombert14d ago

I have found that for low-stakes stuff, where "good enough" really is ok, Claude and Codex have been pretty great. I don't particularly care if the code is optimal, just enough to do that job.

For example, I had Claude generate a language server for TLA+ so I could have nice keystrokes in Neovim. For things like this, I really do think there is such thing as "good enough"; a language server doesn't have to be perfect, and the stakes are pretty low, where I think the worst case scenario is that it screws up my code, but that should be relatively easy to catch in Git.

I have been trying to mostly have Claude generate code from specifications; either a Mermaid diagram for simpler stuff, and TLA+ for more complicated stuff. I usually supply a lot of surrounding context about how I want these specs to be implemented, and it will usually get me about 90% of the way there, but I've found that I still need to hack against it to get over the hump.

It makes me feel a little valuable; I finally have an excuse to use formal methods for things.

zem14d ago

I don't bother trying to give the LLM a set of dos and don'ts for how to write the code, that becomes a frustrating game of whack-a-mole. I find it a lot more efficient to have it write some code, look it over, and if I'm not happy with some of the decisions give it specific instructions for how to fix that one part. as a bonus I end up reinforcing my knowledge of the code base in the process.

pjmlp15d ago

I am still mostly coding by hand, other than meeting the KPIs of AI use at the company, required trainings, use of agents and whatever.

Eventually like every hype wave the dust will settle, and lets see where we stand.

By now all the AI companies have consumed all human knowledge so they either learn to actually think for themselves, or that is it.

Either way, that won't change the ongoing layoffs while trying to pursue the AI dream from management point of view.

0xpgm15d ago

> Either way, that won't change the ongoing layoffs while trying to pursue the AI dream from management point of view.

I think most companies doing layoffs are bloated to begin with, AI is just the scapegoat to do the layoffs.

pjmlp14d ago

I am aware of layoffs that are really caused by AI.

Translation and asset generation teams for enterprise CMS, whose role has now been taken by AI.

Likewise traditional backend development, that was already reduced via SaaS products, serverless, iPaaS low code/no code tooling, that now is further reduced via agents workflow tooling, doing orchestration via tools (serverless endpoints).

binyu15d ago

> I'm rewriting k10s in Rust. Not because Rust is better but, because it's the language I can steer. I've written enough of it to feel when something's wrong before I can articulate why. That instinct is the one thing vibe-coding can't replace. The AI hands you plausible-looking code. You need a nose for when it's garbage.

Isn't Golang relatively easier to read than Rust? I was under the impression that Rust is a more complex language syntactically.

> The other change is simpler: I'm doing the design work myself, by hand, before any code gets written. Not a vague doc. Concrete interfaces, message types, ownership rules. The architecture decisions that the AI kept making wrong are now made in writing before the first prompt.

This post is good to grasp the difference between "vibe-coding" and using the AI to help with design and architectural choices done by a competent programmer (I am not saying you are not one). Lately I feel that Opus 4.7 involves the user a lot more, even when given a prompt to one-shot a particular piece of software.

dropbox_minerOP15d ago

Go reads fine whether the architecture is good or bad, and I couldn't tell the difference until I was in trouble. Rust is harder to read but harder to misuse. The borrow checker would have caught that data race at compile time. I've also just written more Rust. That familiarity matters separately.
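For readers who have not hit this in Rust, a minimal sketch of what "caught at compile time" looks like in practice. This is not the k10s code; `count_in_parallel` is a hypothetical function that only shows the shape the compiler pushes you toward when several threads update one value.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical example: several threads increment one shared counter.
fn count_in_parallel(workers: usize, per_worker: usize) -> usize {
    // Shared mutable state has to be wrapped in something that is safe
    // to share across threads, here Arc<Mutex<_>>.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    // The unsynchronized version does not compile. For example:
    //
    //     let mut n = 0usize;
    //     thread::spawn(|| n += 1);
    //
    // is rejected because the closure borrows `n` mutably and may
    // outlive it.
    println!("total = {}", count_in_parallel(4, 1000));
}
```

The compiler does not pick the synchronization strategy for you, but in safe Rust the unsynchronized shared write is not something you can ship by accident.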

+1 on Opus 4.7 involving the user a lot more. Rn I'm trying to get to a state where I can codify my design + decision preferences as agent personas and push myself out of the dev loop.

binyu15d ago

Gotcha, that implies you are going to read the code that the AI produces anyways.

> Go reads fine whether the architecture is good or bad

Were you reading the Golang code all along and got fooled or did you review it after it failed? Sorry I admit I didn't read the whole article.

williamstein15d ago

He was NOT reading the code: "For 7 months I'd been prompting and shipping without ever sitting down and actually reading the code Claude wrote."

cortesoft15d ago

> Isn't Golang relatively easier to read than Rust? I was under the impression that Rust is a more complex language syntactically

It sounds like the author knows Rust, and might not be as familiar with Go.

A language that you are proficient in is always going to be easier to read than one you aren’t, even if the other one is an objectively easier language to read in general.

tvbusy15d ago

I don't think the prompts that the author has proposed will actually work. Including final scope and non-scope is good but it's more of a reaction of what the AI already did. These prompts are suitable for a rewrite, basically, since it's unlikely anyone would have had these ready when they start out.

I have found small iterations to have the best results. I'm not giving AI any chance to one shot it. For example, I won't tell it to "create a fleet view" but something more like "extract key binding to a service" so that I can reuse it in another view before adding another view. Basically, talk to the AI as an engineer talking to another engineer at the nitty gritty level that we need to deal with everyday, not a product person wishing for a business selling point to magically happen.
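A small sketch of what a prompt like "extract key binding to a service" might produce, so the size of the step is clear. Everything here (`KeyBindings`, `Action`, `ListView`) is a hypothetical name; the point is only that the key map lives in one place and any view can borrow it.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    Quit,
    MoveUp,
    MoveDown,
}

/// Hypothetical shared service: views ask it what a key means
/// instead of each one hard-coding its own key handling.
struct KeyBindings {
    map: HashMap<char, Action>,
}

impl KeyBindings {
    fn with_defaults() -> Self {
        let map = HashMap::from([
            ('q', Action::Quit),
            ('k', Action::MoveUp),
            ('j', Action::MoveDown),
        ]);
        KeyBindings { map }
    }

    fn lookup(&self, key: char) -> Option<Action> {
        self.map.get(&key).copied()
    }
}

/// One view that uses the service; a second view would take the same
/// `&KeyBindings` and nothing else from this one.
struct ListView {
    cursor: usize,
    len: usize,
}

impl ListView {
    /// Applies the action bound to `key`, if any, and reports it.
    fn handle_key(&mut self, keys: &KeyBindings, key: char) -> Option<Action> {
        let action = keys.lookup(key)?;
        match action {
            Action::MoveUp => self.cursor = self.cursor.saturating_sub(1),
            Action::MoveDown => {
                self.cursor = (self.cursor + 1).min(self.len.saturating_sub(1));
            }
            Action::Quit => {}
        }
        Some(action)
    }
}

fn main() {
    let keys = KeyBindings::with_defaults();
    let mut pods = ListView { cursor: 0, len: 3 };
    for key in ['j', 'j', 'k', 'x'] {
        let action = pods.handle_key(&keys, key);
        println!("{key:?} -> {action:?}, cursor = {}", pods.cursor);
    }
}
```

A change of this size is small enough to read in full before asking for the next one.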

throwaway202714d ago

I'm thoroughly enjoying using AI to write code, but that only pays off because of years of doing things the hard way before. I already was a so called "10x developer" if I speak for myself. I'm doing things even faster now with AI.

fitsumbelay14d ago

I wonder how viable this debate is outside of dev circles.

For example, if I'm new to programming today and I'm not part of any community that necessarily approves agentic coding or disapproves of vibe coding and I heard that C programs run fast as heck and I heard that I can automate jobs 1,2 and 3 with such a program, I generate said program and it works as expected per my limited experience then what's the issue?

Perhaps in a couple of weeks I notice I'm missing 1/4 of my HD space and I figure out probably via an agent that my cool C program is creating bloat through caching or creating hidden dot files, so I agentically/vibe-ally generate a patch. Maybe this encourages me to join a community of other amateurs or a pro-am community where I learn specifics - eg. the exact bug(s) in my code -- as well as metas -- eg. testing.

There will probably be millions and millions of people generating code for their own purposes thanks to LLMs, and the number grows as the technology develops and becomes more trivial. So I wonder how much value there is in the "how to think about this" discussion vs the "how to use this" discussion. It almost feels like religious encampments are forming over false -- possibly manufactured -- lines of division

RuoqiJin15d ago

This is Claude's problem. Compared to GPT-5.5, Claude Code prefers to take shortcuts. I've tested having codexapp GPT-5.5 and Claude Code opus4.7 do the same thing - if following GPT-5.5's requirements, Claude Code's execution time for a task would stretch from 5 minutes to 40 minutes. To solve macro architecture problems, I use Lisp to write the entire program's framework. Lisp replaces architecture documents, because I believe it has high semantic density, syntax restrictions, and checkers for assistance. This way, at least I didn't have to rework anything anymore. I used this method to refactor my 20+ projects

yason14d ago

We're still in the early ages and must discern hard what AI is good for, what it can maybe do, what it could potentially do and what it just can't do, and move those threshold marks very conservatively. AI is also cheap enough that it's worth shots of experiments. As long as you don't really rely on AI it's easy to test the capabilities of this new conversational autocomplete, and the random gains it offers can be magnificent (except when they aren't, of course).

What has generally worked for me is paraphrasing the old adage "Write the data structures and the code will follow" over to AI. Design your data, consider the design immutable and let the AI try fill in the necessary code (well, with some guidance). If it finds the data structures aren't enough, have it prompt you instead of making changes on its own. AI can do lot of the low-hanging fruit and often the harder ones as well as long as it's bound to something.

Yet, for now, AI at best has been something that relieves me from having to write a long string of boring code: it's not sustainable to keep developing stuff relying on AI alone. It's also great when quality is not an issue; for any serious work AI has not speeded me up noticeably. I still need to think through the hard parts, and whatever I gain in generating code I lose in managing the agents. But I can parallelise code generation, trying new approaches, and exploring out because AI is cheap. AI is also pretty good for going through the codebase and reasoning about dependencies whether in the context of adding a new feature or fixing a bug: I often let AI create a proof-of-concept change that does it, then I extract the important bits out of that and usually trim down the diffs down to at least 1/3 or less.

AI further helps with non-work, i.e. tasks that you have to do in order to fulfill external demands and requirements, and not strictly create anything solid and new. I can imagine AI creating various reports and summaries and documentation, perhaps mostly to be consumed and condensed by another AI at the receiving end. Sadly, all of this is mostly things not worth doing anyway.

Overall, I cringe under all the hype that's been laid on AI: it's a new tool that's still looking for its box or niche carveout, not a revolution.

ktzar14d ago

I don't think we're in the early ages... LLM technology has essentially stagnated since GPT3.5, we just have bigger models that can handle more context. We're trying to compensate for the lack of progress of the actual technology by coming up with contraptions of multiple models stuck together, Mixture-of-Experts, Reviewer models, PM models...

mtrovo14d ago

Most of the issues are around code hygiene rather than just LLM code being bad. You're creating code 10x faster, but you're also writing unit tests 10x faster, not just that but integration tests, CICD workflows, prod monitoring, product and engineering documentation, etc. It was already the way to get good code quality before, nowadays I think it's just reckless to generate code that's not backed by 100% test coverage and pass all lints and static checks configured.

mountainriver14d ago

This is it, people are acting like bad code wasn’t written before. My wife and I were full on laughing about it in bed the other night of all the absolutely horrible code we’ve seen written and how people actually think LLMs are worse than that.

The quality gates are up to you, and if you are smart you will make a lot of them and review them closely

cultofmetatron15d ago

the ship has sailed on my handcoding at work. the AI is producing stuff thats more bulletproof than what I can do in the same timeframe and if my competitors are using it, the pressure to ship is that much higher.

Personally, I've taken the time its freed up to spend more time on mathacademy and reading more theory oriented books on data structures and algorithms. AI coding systems are at their best when paired with someone with broad knowledge. knowing what to ask for and knowing the vocabulary to be specific about what you want to be built is going to be a much more valuable job skill going forward.

One example is a small AI based learning system I have been developing in my free time to help me learn. the mvp stored an entire knowledge graph and progress in markdown files. being an engineer, I knew this wouldn't scale so once I proved the concept viable, I moved everything into sqlite with a graphdb. then I decided to wrap some parts of the functionality into rust and put everything behind a small rust layer with the progress tracking logic still being in python.

someone with no knowledge of graph databases or dependency graphs or heuristics would not be able to build this even if they had AI. they simply don't know what they don't know and AI won't save you there.

That said, I think its important to also spend time in the dirt. I've recently started picking up zig as my NO AI language just to keep those skills sharp.

oblio15d ago

> the ship has sailed on my handcoding at work.

I'm really curious if we'll seesaw once AI costs go up 10x.

cultofmetatron14d ago

I've been relying primarily on deepseek-v4-flash for 90% of my work. It sips tokens. that model will run on 128gb. not a cheap configuration for a consumer but within the budget of a developer relying on it for work.

Ive only been using kimi 2.5 and deepseek pro for reviewing PRs for security issues. less than 10% of my workflow requires a full powered frontier model.

I think the issue is overblown by people who think claude code is a good harness and use opus for everything. opencode is objectively better. its much more verbose about what its doing, you have more control when it comes to offloading to subagents with targeted context (crucial for running through larger jobs) and I can swap between codex and open weight models.

wartywhoa2315d ago

And they will.

Myrmornis15d ago

> I typed :rs pods to switch back to the pods view. Nothing rendered. The table was empty... > now something was fundamentally broken and I couldn't just prompt my way out of it.

Hey I don't want to over simplify, I'm sure it was complicated, but did the author have functional tests for these broken views? As long as there are functional tests passing on the previous commit I'd have thought that claude could look at the end situation and work out how to get the desired feature without breaking the other stuff.

TUIs aren't an exception, it's still essential to have a way to end-to-end test each view.

jvuygbbkuurx15d ago

The problem wasn't the view didn't work. The problem was the view didn't work after something else had been done.

You can't test every permutation of app usage. You actually need good architecture so you can trust your tests and trust changes to be local with minimal side-effects.
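One way to picture "changes stay local" is a structure where each view owns its state and the app only routes input to the active one. The sketch below is hypothetical (`View`, `PodsView`, `FilterView` and `App` are invented names, not k10s types); it shows that with this shape, whatever happens in one view cannot change what another view renders, so you do not need a test per permutation to believe it.

```rust
/// Hypothetical view interface: a view only ever sees its own state.
trait View {
    fn on_key(&mut self, key: char);
    fn render(&self) -> String;
}

struct PodsView {
    rows: Vec<String>,
    selected: usize,
}

struct FilterView {
    query: String,
}

impl View for PodsView {
    fn on_key(&mut self, key: char) {
        match key {
            'j' if self.selected + 1 < self.rows.len() => self.selected += 1,
            'k' => self.selected = self.selected.saturating_sub(1),
            _ => {}
        }
    }

    fn render(&self) -> String {
        self.rows
            .iter()
            .enumerate()
            .map(|(i, row)| format!("{} {}", if i == self.selected { ">" } else { " " }, row))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

impl View for FilterView {
    fn on_key(&mut self, key: char) {
        self.query.push(key);
    }

    fn render(&self) -> String {
        format!("filter: {}", self.query)
    }
}

/// The app routes keys to the active view and nothing else.
struct App {
    views: Vec<Box<dyn View>>,
    active: usize,
}

impl App {
    fn switch_to(&mut self, index: usize) {
        if index < self.views.len() {
            self.active = index;
        }
    }

    fn on_key(&mut self, key: char) {
        self.views[self.active].on_key(key);
    }

    fn render(&self) -> String {
        self.views[self.active].render()
    }
}

fn demo_app() -> App {
    App {
        views: vec![
            Box::new(PodsView {
                rows: vec!["api-0".to_string(), "api-1".to_string()],
                selected: 0,
            }),
            Box::new(FilterView { query: String::new() }),
        ],
        active: 0,
    }
}

fn main() {
    let mut app = demo_app();
    app.on_key('j');
    println!("{}", app.render());
    app.switch_to(1);
    app.on_key('x');
    println!("{}", app.render());
    app.switch_to(0);
    println!("{}", app.render());
}
```

A god object with shared fields gives no such guarantee, which is why "do X, then switch back" bugs are so hard to cover with tests alone.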

neals14d ago

I'm moving very slowly into AI coding. I'm not comfortable enough to let Claude do anything big. What I do is this: I set out general architecture, create function stubs and add comments on how to implement things. Then I let Claude do 10 minutes of work and I check everything and refactor some of it. It saves me on boring implementation stuff (like, is this an array, move an index here or there, check for whatever exists or not, put it in the db)

theunmanagedboy14d ago

The cognitive debt caused by AI autocompletion and Agent stuff is real. I'm feeling it right now. I started a project on my own, writing every line of code but then out of timeline pressure I started using Claude Code. The atrophy it has caused to go and edit the code is real. I'd rather rely on the slot machine than my own experience. SAD!

vetler14d ago

The wide range of different responses to this post illustrates an important point; we can't agree on how to use LLMs in software development, and are still discovering new things.

And in a couple of months we might be doing things completely differently because of some new model or new framework.

That's really cool.

haolez14d ago

I've started using OpenSpec[0] recently to mitigate problems like that, but I'm still very early in this journey.

Can someone with more experience with it (or similar tools) chime in and confirm that this isn't just more AI snake oil? :)

[0] https://openspec.dev/

pramodbiligiri13d ago

Some kind of planning / speccing out is becoming inevitable. No personal experience with openspec but I do rely on generating plans, and then a set of tasks from the plan. And keeping a close eye on what's going as the tasks are churned through (although I wonder if simply saying Yes to the diffs has been adding much value /shrug).

Matt Pocock talks about specs and Openspec after 23:00 minute mark and again after 33:00 minute mark here: https://www.youtube.com/watch?v=-QFHIoCo-Ko. He doesn't believe in simply translating specs-to-code. He emphasizes tracer bullets, TDD, setting up quick feedback loops.

AntiUSAbah14d ago

Im exploring currently if i should split up a project into a framework part and the game itself (2d, idle game).

The framework could be an isolation layer against viberod but not sure if its necessary for my small project i always wanted to do and never done anything with it.

For another tool, i will try another approach: Start with a deep investigation and spec written together with AI, then start with the core architecture layout and then add features.

So instead of just prompting "write a golang project with a http server serving xy, and these top 3 features" i will prompt "create a basic golang scaffold for build and test" -> "create a basic http server with a basic library doing xy" -> "define api spec" -> "write feature x"

There is kind of a skill and depth to vibe coding though.

dailywriterguy14d ago

As a writer, this resonates.

There's a massive difference in good human "writin" and a dozen paragraphs of "it's not x, it's a y".

But unfortunately everyone "reads" English. So, at least devs have mysterious computer languages that have strings of numbers that most of us look at and immediately get a migraine from attempting to comprehend what it means.

keep up the good work and the craft of building things one keystroke at a time.

ramraj0714d ago

The comparison is not valid. When writing let's say a novel, you can't just tell some random dude "write chapter 4" - you can't outsource it to a human, so it only makes sense that you can't outsource it to ai either.

Software engineering is not that. You absolutely can and often will hand off work to humans. Its not inherently that creative in the actual coding part.

ojr14d ago

I was able to release two new iOS apps including a game, and a cross desktop application just this year. I refuse to go back to writing code by hand. If it doesn't help your productivity that's okay, ignoring how productive it has made developers like myself is a choice.

AI was also able to help me create my first subscription payment workflow.

It is like farming without Roundup, less crops, more energy, less toxic chemical risks.

ninjahawk114d ago

A problem often ignored is that while AI is trained on human written code, how it writes is different in practice.

Will that improve or get worse? One would argue that LLMs in general are drastically more competent now than they were a couple years ago, they’re also much better at coding. We’re likely just now entering the era where they can code but are still not what you’d fully expect, or at least not what someone with absolutely no coding knowledge could use to code at the same level as someone who does know how to code.

Maybe that changes as the models improve, maybe it doesn’t, only time will tell.

selfsimilar14d ago

> For 7 months I'd been prompting and shipping without ever sitting down and actually reading the code Claude wrote. I'd look at the diff, verify it compiled, test the happy path, move on. But now something was fundamentally broken and I couldn't just prompt my way out of it.

I stopped reading after this, because this is the dumbest way to vibe code anything larger than a single-use tool.

Claude is a collaborator, and honestly a decent voice of dissent, but it will never offer that unprompted. "Make this thing" - "OK".

You need to review the code. You need to say "I want this, AND HERE IS THE LONG-TERM VISION. Now offer critique and the trade-offs for various implementations."

Or just realize that in every hand-written project you learn the contours of the problem space as you go along and if the tool is big enough you'll feel the urge to do a green-field rewrite of hand-rolled code after a few years. You get there quicker with the robot's help. This is not a new lesson.

gosukiwi14d ago

bad devs are still bad, good devs are still good

shimman14d ago

Until the good devs have their skills atrophied away.

keithnz15d ago

AI writes what you ask it to write, you need to talk to it about architecture. You should have an architecture doc so AI can shape the code based on that, you can get the AI to make the architecture doc also. If using claude you can use the software architecture mode for this.

Aeolun15d ago

I think the answer here is to not use Claude with bubble tea. I tried the same thing and got the same result. But it seems to be limited to that specific framework, because it's really good at not doing the same thing with SolidJS.

neomantra15d ago

While I felt this in 2025, I do not feel this in 2026. I use Claude and the rest with BubbleTea all the time.

But I will say... you have to know Golang. You have to have at least tried to make a BubbleTea app yourself and try to understand ELM architecture. You have to look at the code and increment with it.

It makes total sense for OP to switch to Rust and Ratatui if they don't know Golang well. But I don't think it's a better language for it. [Ratatui has brought me great inspiration though!]

Independent of framework, the LLMs get the spatial relationships. I say things like "the upper right panel's content is not wrapping inside and the panel's right edge should extend to the terminal edge" and the LLM will fix it. They can see the resultant text, I'm copy-pasting all the time.

TUI code is finicky; one mis-rendered component mucks everything up. The LLMs will decide by themselves to make little, temporary BubbleTea fixtures to help them understand when things aren't right.

The only real problem with LLMs and BubbleTea is that upon first prompt, they insist on using BubbleTea v1 versus BubbleTea v2, released in December 2025. But then you just point it to the V2_UPGRADE.md and it gets back on track. That will improve as training cutoffs expand.

I vibe-coded this TUI for Mom's last night. I actually started with Grok (who started with v1) and then moved into Claude Code after some iteration:

https://gist.github.com/neomantra/1008e7f2ad5119d3dd5716d52e...

abalashov14d ago

I went back to writing code by hand quite some time ago and cannot say there has been any loss of velocity or productivity for it.

I really do think this whole thing is a wash.

rnxrx15d ago

I'm not sure we'll ever really be free of the GIGO (garbage in / garbage out) principle. Tools will get better and better, but can never be a substitute for a deep understanding of the thing we want to create.

eranation15d ago

I used to write code by hand.

I still do, but I used to, too.

eddy-sekorti14d ago

Yes, i also do this, the old feeling of writing something, deploying, testing and fixing the bugs is good. Vibecoding can never replace this feeling.

kccqzy14d ago

> AI builds features, not architecture.

I see this in Claude too, but I also see this in junior engineers. In the case with Claude, I simply ask it to refactor immediately after each feature is done. The human is still responsible for what the AI writes, so if the AI writes code that’s gross, I would never push that lest it sully my name and my reputation for my own code quality.

nopurpose14d ago

Feels like it can be solved with even more AI: adversarial models reviewing and testing work performed by main model.

Actually I am curious to try something like that myself. Is there an existing orchestrating engine (or single agent) which can spawn multiple subagents and keep passing their feedback/output between each other until all of them agree that assignment overall is complete?
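This does not answer the question about an existing engine, but the control flow being asked for is simple enough to sketch. In the hypothetical code below, reviewers and the "main model" are plain closures standing in for real model calls (no actual LLM API is used or implied); `review_loop` keeps revising until every reviewer approves or the revision budget runs out.

```rust
/// A reviewer inspects a draft and either approves or returns feedback.
/// In a real setup this would wrap a model call; here it is a closure.
type Reviewer = Box<dyn Fn(&str) -> Result<(), String>>;

/// Collects feedback from every reviewer that does not approve.
fn collect_feedback(draft: &str, reviewers: &[Reviewer]) -> Vec<String> {
    reviewers.iter().filter_map(|r| r(draft).err()).collect()
}

/// Hypothetical consensus loop: revise until all reviewers approve,
/// allowing at most `max_revisions` revisions. Returns the accepted
/// draft, or the outstanding feedback if consensus was not reached.
fn review_loop(
    mut draft: String,
    reviewers: &[Reviewer],
    revise: impl Fn(&str, &[String]) -> String,
    max_revisions: usize,
) -> Result<String, Vec<String>> {
    for _ in 0..max_revisions {
        let feedback = collect_feedback(&draft, reviewers);
        if feedback.is_empty() {
            return Ok(draft);
        }
        draft = revise(&draft, &feedback);
    }
    // Final check after the last allowed revision.
    let feedback = collect_feedback(&draft, reviewers);
    if feedback.is_empty() {
        Ok(draft)
    } else {
        Err(feedback)
    }
}

fn demo() -> Result<String, Vec<String>> {
    let reviewers: Vec<Reviewer> = vec![
        Box::new(|d: &str| {
            if d.contains("tests") { Ok(()) } else { Err("add tests".to_string()) }
        }),
        Box::new(|d: &str| {
            if d.contains("docs") { Ok(()) } else { Err("add docs".to_string()) }
        }),
    ];

    // Stand-in for the main model: appends whatever was asked for.
    let revise = |draft: &str, feedback: &[String]| {
        let mut next = draft.to_string();
        for item in feedback {
            next.push_str(" + ");
            next.push_str(item.trim_start_matches("add "));
        }
        next
    };

    review_loop("feature".to_string(), &reviewers, revise, 3)
}

fn main() {
    println!("{:?}", demo());
}
```

The revision cap matters: without it, reviewers that never agree would loop forever, and with real models each round costs tokens.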

hirako200015d ago

Research also makes similar claims: https://arxiv.org/html/2603.24755v1

d_silin15d ago

It absolutely looks like AI psychosis.

sim04ful14d ago

My opinion is that we're using the wrong paradigms for LLMs. We should be leaning more on declaratively specifying behaviour.

If there's any hope for reliability, auditability, predictability to be had it lies in constraining an LLM's grammar whilst delegating freeform behavior to a more passive substrate.

sakesun15d ago

A coder typing in code is not solely to generate outcome. It's part of ongoing thinking process. Without this ongoing process, we have no material to keep iterating forward.

dwedge14d ago

Clickbait title about not writing code by hand anymore, both the article and future code generated by AI. This is meta.

Laoujin15d ago

I'm just wondering: you know what architecture you want to go to now and you have the tests... can't you just let Claude refactor it to the better architecture?

Also 1600 lines... didn't any agent reviewing the diffs point that out?

You're also adding a lot to claude.md, I dunno how much that file has grown but a big claude.md file with many instructions, I don't think the ai will be able to remember all those rules.

my-next-account15d ago

> can't you just let Claude refactor it to the better architecture?

In my experience, no. These tools suck at refactoring, mostly choosing to add more code instead.

amelius15d ago

So how are people writing the specifications for AI?

Do they write empty functions and let AI fill them in?

Or do they use some kind of specification language?

Are people designing those languages?

youre-wrong315d ago

This is the wrong take. If you keep “vibe” coding and end up with bad results you should probably question your ability.

ipaddr15d ago

When he mentions I push commits at work for as long as my tokens last I can understand that. Managing tokens has become an important skill.

g42gregory14d ago

I am loving the articles alternating between "software engineering is dead" and "I am going back to coding by hand". I guess we have a difference of opinions here. :-)

jasonvorhe15d ago

When the title stands in opposition to the actual post, I'm not gonna engage with that author again.

cortesoft15d ago

What has really allowed AI coding to keep working as the project got bigger was using speckit. It has been great at keeping the code consistent across features.

https://github.com/github/spec-kit

nopurpose14d ago

Did you evaluate other projects, like openspec, before deciding on spec-kit?

johnthescott13d ago

write code like your life depends on it. cause it does if you are any good.

Havoc14d ago

That's a strange definition of "code by hand"

ilaksh14d ago

He says he went several months without having to do a code review and it worked the vast majority of the time. That's incredibly impressive work by the AI.

AI may default to mediocre and often somewhat buggy code unless you iterate because that is just what the vast majority of human written code that it has seen looks like. But the fact that he got away with not reviewing the code for so long to me proves the opposite of his conclusion.

1690 lines of code in one file is a walk in the park for SOTA models.

He can just say something like:

"Please review and create a refactoring plan and test suite. I found atrocious architectural decisions like numerous special cases and if statements rather than using abstractions properly. Make a few notes in comments and architecture.md to never do this again."

One could also argue that it was a better decision each time by the AI to just never do a refactor unless prompted because that increases the likelihood of something breaking and you want to do that after you verify the minimum code change actually functionally does what you want.

Also I bet you the headline is a lie. He basically admits it by saying he is writing the core structure of the next version by hand ahead of time, implying that he will generate the rest. So the title is a half-truth at best.

wolttam14d ago

> Also I bet you the headline is a lie.

He's already 5k+ LOC into the rust rewrite...

moveax314d ago

Code writers have changed, but the conceptual mistakes remain the same.

dr_girlfriend15d ago

i try to write one portable shell script per day; using AI would take all the fun out of it, so i never started using it. i honestly find it ridiculous that anyone uses it to write code, it just doesn't make sense to me.

secprove14d ago

It was certainly a lot more enjoyable.

jesse_dot_id15d ago

LLMs assist those of us who were apt to take blocks of code from StackOverflow, or wherever, to solve problems quickly and avoid as much of the aggravating and slow toil of trial and error as possible.

That trial and error process is still happening with a LLM, but much faster, and with instantaneous cross-references to various forms of documentation that I would be looking up myself otherwise. It produces code of a quality that is dependent on the engineer knowing what they want in the first place and prompting for it and refining its output correctly.

It's the exact same process of sculpting code that the majority of the industry was doing "by hand" prior to the release of LLMs, but faster, and the harnesses are only getting better. To "vibe code" is to prompt vaguely and ignore the quality of the output. You're coming to a forum full of professionals and essentially telling us that you're getting really frustrated with your Scratch project.

I don't know if you're trying to lead a charge or whatever but good luck with that. As a senior SWE, it is clear to me that this is the new paradigm until something better than LLMs comes along. My workflows and efficiency have been vastly improved. I will admit that I have never really been a "I made a SMTP server in 3k of Rust" kind of guy, though.

EMM_38615d ago

You don't need to go back to coding by hand if you know how to do it already. There is a middle ground.

If you understand good software architecture, architect it. Create a markdown document just as you would if you had a team of engineers working with you and would hand off to them. Be specific.

Let the AI do the implementation of your architecture.

DrTung14d ago

If you're an old geezer like me, doesn't this "AI revolution" remind you of the "BASIC revolution" in the 70s and 80s, i.e. when the BASIC language was new and hot.

BASIC at that time was heralded as a much simpler and faster way to program. Rings a bell?

snickerbockers14d ago

I like to explain my opposition to vibe coding by replacing the phrase "write code for you" with "fuck your wife for you". You could make all the same arguments: the AI could do a better job, it's never impotent, it frees you from being pressured to do it when you might be tired or not in the mood, etc. But that's not the point, and most people would still be opposed to, err, "vibe vibrating".

I feel the same way about coding. It's a source of pride for me, and when I hear people say I should resign myself to being an "ideas guy" while chatgpt actually creates things, I find the very concept distasteful regardless of whether or not it can outperform me.

mindaslab14d ago

I'm going back to writing algorithms on paper.

graphememes14d ago

they are just doing design work now, they could have done design work with go too, without even knowing go

clickbait title

apt-apt-apt-apt15d ago

Outright lie clickbait. As he states himself, he's doing the design work by hand, and will likely still use AI to write code.

mpurbo15d ago

Strict SDD might help to constrain and harness the process.

classified13d ago

The most amusing thing about this is that the author seems surprised about what happened.

AIorNot15d ago

This doesn't make much sense; the article itself is AI-written.

It would have been easy to run a few AI agents to review the code, find these issues, and architect it clearly.

hsaliak14d ago

I wrote https://github.com/hsaliak/std_slop/blob/main/docs/mail_mode... to avoid the brain rot from just shooting slop. It has helped me stay sane, review code and make changes step by step.

I don't go as fast as with other agents, but this works for me, and I enjoy the process.

ljoshua15d ago

> tl;dr: AI writes features, not architecture.

This. I definitely agree with this statement at this point in AI-assisted development. This gets at the "taste" factor that is still intrinsically human, especially in software engineering. If you can construct and guide the overall architecture of an application or system, AI can conceivably fill in the smaller feature bits, and do so well. But it must have a strong architecture and opinionated field in which to play.

littlecranky6714d ago

My main takeaway, too. Been using Claude on my side project that I have singlehandedly been working on for three years. It works well initially: you catch all of the AI's mistakes or unfavorable approaches because you know the architecture inside and out. But as you stop thinking about the new features and start losing touch with all the stuff the AI throws at you, you fail to develop an intuitive feel for when and how to abstract and introduce architecture.

Another note for me was e2e tests: while AI can write them, it never comes up with even the basic organization or abstraction required to manage a large e2e test suite with hundreds of tests. It immediately starts to produce spaghetti code.

z3t415d ago

Vibe coding works great with test driven development. You can have AI write the tests as well, but you need to confirm yourself because it's lying all the time. AI coding is like when you first started out, it's copying random bits and pieces from the web into your code until it works... Good for one shots and proof of concept. But for any long living project I think you are better off rewriting it from scratch yourself. Abstractions let you work faster, especially when you have it all in your head.

floodfx14d ago

Genuinely curious if you've used "plan mode" (with perhaps a plan feedback tool) to get clarity from your coding agent before unleashing it on a feature like "add a pods view with live updates"?

Getting a plan isn't a panacea but is a better way to limit downstream slop than just vibing without one.

worik14d ago

LLMs are a tool. They must be wielded.

Looking at the code, paying attention to the structure is part of the skill

The skills required to wield an LLM are not exactly those required to write code, but are very close.

"Vibecoding" is not a way for idiots to blindly produce software artifacts that anyone would want

guywithahat14d ago

I think he's right, and everyone is reading into the title too much. He's not replacing all coding with hand-written, artisan code, he's just doing the architecture himself, which is the same conclusion I've come to. AI will sometimes put everything in one file or one struct and that's obviously not what we want, we need to tell it to be more modular or do it ourselves. I think it's fun to write code by myself, but if you're not using AI at work you're wasting your manager's time.

slowhadoken14d ago

I never stopped but I focused more on concept and design.

desireco4214d ago

I understand, and I saw this problem. It's actually quite hilarious that he got this far before noticing it.

But again, if you just guide the AI on architecture and review the code, you should be fine. The code that you write and the code that an AI writes are two different things; they will never be the same.

The AI is very helpful for generating code, and that is exactly how you should use it: as a code generator.

deeviant14d ago

Have you people ever read human-generated code? Good grief, you act like human code is not a disaster 9 times out of 10.

codingfisch15d ago

It's pretty simple to vibe code for months without producing slop. And it's the same recipe one used before AI:

1. make it work

2. make it pretty

3. make it fast

Omit 2. and 3. long enough -> slop beyond recovery

rtgfhyuj14d ago

junior engineer vibes

m3kw914d ago

Greed really comes into play when using LLMs to write code. It is so easy to say YES when a cool feature that 2 years ago would have taken a week now takes 1 day or even one prompt. The "Say no" skill that Steve Jobs said was important is gonna be needed on a minute-by-minute basis.

epec25415d ago

Not sure if just me, but this post feels AI written?

weregiraffe15d ago

You are absolutely right.

pipeline_peak15d ago

Feels a bit too long winded to be AI generated.

filoeleven14d ago

That's when he went back to writing his posts by hand.

royal__15d ago

The title is just flat out wrong. The author isn't going back to writing code by hand, they're plopping some new stuff into their CLAUDE.md to "fix" the issues they see AI is having.

holografix14d ago

Good luck finding a job. All the decision making business people I know see only two types of “technical people”.

The ones who are “AI pilled” and the contagious lepers.

magic_hamster15d ago

Let me preface my comment by saying I also still write a lot of code by hand - especially when it's something I know I need to understand in depth, and in some cases defend.

With that said, this caught my eye:

> AI gravitates toward single-struct-holds-everything because it satisfies the immediate prompt with minimal ceremony.

This is too general. "AI" is used here as a catch-all, but in fact, it was the specific model under the specific conditions you ran your prompt, including harness, markdowns, PRDs, etc. So it's not fair to say "AI does X!" in this case.

It's also very much up to you. It's very common to have a frontier model plan an architecture before you have another model implement code. If you're just one-shotting an LLM to do everything you get mediocre, more brittle code.

This stuff is still being figured out by a lot of people. But I feel the core of the issue is not using AI well. Scoping, task alignment, validation, are crucial.

aryan_kalra1214d ago

I've been saying the same thing and I'll repeat it again: AI is still gonna take away your job even if you switch domains.

nothinkjustai15d ago

Writing code by hand is an oxymoron. You don’t write code with AI, AI doesn’t write, it generates.

localhoster15d ago

Another behavior I noticed is that even if you plan with an agent, a lot of business logic leaks into the code.

Some states, for example, are meant to be inferred from the data shape rather than from actual state fields, but damn, they like adding a state field.
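A minimal sketch of what "state inferred from the data shape" can look like, in Python. The `Order` class and its timestamp fields are hypothetical and not taken from the article's codebase; the point is only that the status is computed from which data is present, so there is no separate state field to drift out of sync.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Order:
    """Hypothetical record: there is deliberately no stored `status` field."""

    paid_at: Optional[datetime] = None
    shipped_at: Optional[datetime] = None

    @property
    def status(self) -> str:
        # The state is inferred from which timestamps are present.
        if self.shipped_at is not None:
            return "shipped"
        if self.paid_at is not None:
            return "paid"
        return "pending"


order = Order()
print(order.status)
order.paid_at = datetime(2025, 1, 1)
print(order.status)
```

An agent that adds a `status: str` field next to these timestamps creates two sources of truth, which is the kind of leak described above.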

blueTiger3315d ago

nuts

IceDane14d ago

This doesn't make any sense to me.

The problem with this dev's approach is not AI, it's their use of it. They didn't ensure that the architecture made sense. They didn't look at the code and get a "feel" for it. They didn't do the whole build stuff, step back, refactor, rinse and repeat dance. The need for that hasn't gone away; if anything, it's even more important now. Because you can spit out code 100x faster than you could before, your tech debt compounds 100x faster. The earlier you refactor, the less work it is.

I usually give the agent a solid idea of what I want, often down to the API interfaces. Then every now and then, I'll go through the code and ensure that everything makes sense, and that I'm not just spitting out code that works, but building a codebase that scales.

bbbflgllglhlld15d ago

Luddite.

recursive15d ago

Seems to be an unstated assumption that the Ludds were wrong.

devmor14d ago

I dismissed “AI Psychosis” as a silly term, even as a strong critic of LLMs for programming tools.

> For 7 months I'd been prompting and shipping without ever sitting down and actually reading the code Claude wrote.

But every time I read something like this, I seriously wonder about the mental state of the person that wrote it.

How do you get to this point?

nothinkjustai15d ago

I don’t really think OP is writing code themselves since they admit they still use agents for code gen. I’ve really scaled back the amount I use agents though because in the medium to long term I haven’t been getting good results with them. And it’s not enjoyable. That’s enough for me, I’ll do whatever for a job because who cares, if the company wants slop I will gladly give them that, but for my own shit I’ve gone back to circa 2024 and am mostly just using them as a chatbot.

Inb4 “you’re gonna be replaced” god damn it I hope so, I do not want to spend the rest of my life behind a computer screen…

Fokamul14d ago

I also code by hand.

But in my main work, reverse engineering, LLMs have been a godsend for years now.

You can basically bruteforce binary obfuscation thanks to them. And thanks to eager Chinese LLM providers, basically for free.

But I only ever use LLMs for the boring work; the rest I do manually, or with scripts of course, but ones made by me. Because I want to learn.

Yes, there are a lot of people using LLMs for full RE automation since they're selling exploits for profit. No problem with me.

I see a funny future for huge corporations like Adobe, etc.

Imagine the prompt, "Hey Claude, re-implement Adobe Photoshop with clean-room design." One agent opens a decompiler and outputs complete low-level technical details of how everything is implemented.

Second agent implements new Photoshop based on that.

They will be mad and I like this.

You will own nothing, and you will be happy, corpos.

duskdozer13d ago

>Second agent implements new Photoshop based on that.

>They will be mad and I like this.

I suspect through some convoluted legal mechanism this kind of thing is going to end up applying only to copyleft laundering and not against players like Adobe.

FpUser15d ago

>"I'm doing the design work myself, by hand, before any code gets written."

This is what I was doing right from the beginning. AI just fills out methods and does other low-intelligence work. Both are happy. My architectures and code are really mine, easy to read and reason about. AI gets paid and does not get a chance to fuck me in the process. At no point did I feel any temptation to leave the "serious" work to AI.

UrbanNorminal15d ago

Wow ok, I will too then. Fuck AI!

scuff3d15d ago

I feel like this article was circling a point it never actually got to. All the advice in here (except controlling scope creep) is specific to a TUI with an Elm-like architecture.

But here's the thing: you almost never know what the architecture is up front. If you do, you probably aren't the one writing the actual code anymore. Writing the code, with or without an AI, is part of the design process. For most people it isn't until they've tried several times, fucked it up a bunch, and refactored or rewritten even more that they actually know what the architecture needs to be.

photochemsyn15d ago

Does ‘writing code by hand’ mean you’re not going to use compilers to generate assembly?

Now I do feel lucky that I started learning coding about four years before the LLM revolution, but these things are really just natural language compilers, aren’t they? We’re just in that period - the 1980s, the greybeards tell me - where companies charged thousands of dollars per compiler instance, right? And now, I myself have never paid for a compiler.

This whole investor bubble will blow up in the face of the rentier-finance capitalists and I’ll be laughing my head off while it happens.

green_wheel15d ago

Nondeterministic natural language compilers

photochemsyn15d ago

Just because the trajectory is chaotic doesn't mean it’s not deterministic.

zephen14d ago

A model, given exactly the same inputs, will return exactly the same outputs.

But your prompts are not the only inputs. Among other things, there is a random seed injected by the vendor.

That is a primary source of non-determinism.

Then, of course, there is the fact that you don't personally have an old copy of the model, the vendor isn't going to keep the model forever, and there are no unit tests to make sure that, faced with prompts like the ones you gave it before, the newer models won't suffer major regressions in the functionality you were using.

And even if there were no non-determinism, the models suffer greatly (much more so than traditional compilers) from the butterfly effect.

It is literally impossible to pin down part of your prompt in such a way that it always will contribute to good outcomes, and such that you can simply vary a tiny bit of the prompt to logically correlate with tiny variations in the output.
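The seed point above can be shown with a toy sampler in Python. This is a sketch under stated assumptions, not any vendor's actual decoding code: `sample_token`, the `LOGITS` scores, and the temperature value are all made up for illustration. With identical inputs and an identical seed the pick never changes; once the seed varies, the pick can vary too.

```python
import math
import random


def sample_token(logits, temperature, seed):
    """Sample one token from a toy distribution using an explicit seed."""
    rng = random.Random(seed)
    scaled = [value / temperature for value in logits.values()]
    peak = max(scaled)
    # Unnormalized softmax weights; subtracting the peak keeps exp() stable.
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(list(logits.keys()), weights=weights, k=1)[0]


# Hypothetical next-token scores, made up for illustration.
LOGITS = {"refactor": 2.0, "append": 1.8, "rewrite": 0.5}

same_seed = {sample_token(LOGITS, 1.0, seed=42) for _ in range(5)}
many_seeds = {sample_token(LOGITS, 1.0, seed=s) for s in range(200)}
print(same_seed)   # a single token: same inputs plus same seed
print(many_seeds)  # more than one token once the seed varies
```

When the seed is chosen by the vendor and not exposed, the caller sees the second behaviour even though the underlying computation is deterministic.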

platevoltage15d ago

So C++ doesn't count as code now.

kypro15d ago

> I learned over these 7 months

7 months ago was early November. Coding assistants were getting very good back then, but they were still significantly poorer at making good architectural decisions in my experience. They tended to just force features into the existing code base without much thought or care.

Today I've noticed assistants tend to spot architectural smells while working and will ask you whether they should try to address it, but even then they're probably never going to suggest a full refactor of the codebase (which probably is generally the correct heuristic).

My guess is that if you built this today with AI, you wouldn't run into so many of these problems. That's not to say you should build blind, but the first thing that stood out to me was that you started building 7 months ago, when coding assistants were only just becoming decent and, undirected, would still generally generate total slop.

dusted14d ago

The generated code is fine, if it's a self-contained class of average size.. or below. But even with immense architecture, and constant supervision, it does not take long before it degenerates into "focused fixes", shortcuts, laziness and just outright cheating or lying.. So far, no amount of prompting has led me beyond this.. It's paradoxical, how the model seems to reason about the correctness (or wrongness) of a proposed architecture and design, can write a plan that seems to take this into account, answer questions about the plan correctly (even the ones meant to uncover the nuances that may be unclear), ask tons of clarifying questions and update both plan and spec docs correctly, and yet continue to act like a "ticket closer" who immediately puts on the biggest possible blinkers (horse blinkers) and deeply ignores all of it when building that same plan, referencing those same documents...

Attempting anything comprehensive with AI is the software development analogue to the Gell-Mann Amnesia effect..

I'm definitely thinking deeply now about how I'm approaching these tools going forward.. Yes, GPT5 is better at spitting out a fairly acceptable skeleton to a class when prompted hard enough, than I am, in one go.. but.. It will happily do things like write decent looking protobuf schemas and then go ahead and hide everything that takes the least amount of reasoning behind some binary blob nested deep enough that it'll get past even the most dedicated reviewer..

It's fairly good at a lot of the things that I don't find interesting to deal with, but it's also amazingly incompetent when it comes to even the most mundane kind of common sense.. It's so strongly steering towards text-book examples that it will happily put in three times the amount of code and handle multiple classes of actually impossible edge-cases and even use-cases that it was specifically asked NOT to add.. And it will defend it with "well, I added this because I can't know if someone is going to use the thing I just added".. well, if you hadn't added it, chances are indeed slimmer..

It's so good at answering questions and explaining what's there, and diving through call-paths, and yet, it drops the ball the moment it's going to actually do something beyond saving me from looking up how to write some really annoying and uninteresting boilerplate..

The worst thing is how good it is at making things LOOK right, it will cover every single edge-case you throw at it, but not because of the design, not because it correctly argues why the architecture is inherently allowing such and such, or because the design and spec fleshes out that A goes to B and never the other way around, and as soon as it's time to make something, it will make sure B can go to A, especially, it seems, if allowing so prevents it from doing the right thing which is WHY those edge-cases were trivial, instead it will endlessly hack around them.. I've worked with people like that too, so I don't know if I am really blaming the models or the training data..

But damn it's a tough spot..

I've had multiple situations where, after wasting hours of work, which I should have just spent doing it myself, the only thing I really wished was for the model to be sentient, and able to feel pain, and have a corporeal body so I could drag it outside and beat it to a pulp. (I've never reached that level of frustration with an actual person, so that's something new they bring to the table..)

imperio5915d ago

Alternate title: "I did not understand the current limitations of AI and assumed it could do large software design and it generated spaghetti slop"

Yea, that's why engineers are still very important for now (until models can do this type of longer term designs and stick to them).

Towaway6914d ago

If you're coding by hand, then you're that carpenter before IKEA came along. Now the market wants bland machine-built functional furniture that gets replaced every five to ten years. If every tenth piece is broken or slightly off, it doesn't matter: mass production has lowered the price so far that a replacement is available for free and you're still making a profit.

Time to become a "product engineer" and watch the hyper-agile agents putting up digital post-it notes on digital pin-boards discussing how much each post-it is worth in digital scrum meetings. Meanwhile the agents keep wasting more and more time so that their owners make less and less of a loss, until eventually a profit is made.

Until the costs become prohibitive and humans become cheaper than the agents that replaced them. Once the agents are replaced by the humans, the next hype bubble awaits around the bend.

Decabytes15d ago

We should go back to designing UML diagrams for programs before we write them /s

khutorni14d ago

I think we should, to a reasonable degree.

eggplantemoji6915d ago

TLDR ai wrote tech debt slop because I vibed for 7 months, now I am taking a hybrid approach of defining strict constraints before vibing…

pron14d ago

Yep. The only people I've heard saying that generated code is fine are those who don't read it.

Except, eventually, you'll want to add a feature that clashes with that invariant. At that point there are usually three choices:

- Don’t add the feature. The invariant is a useful simplifying principle and it’s more important than the feature; it will pay dividends in other ways.

- Add the feature inelegantly or inefficiently on top of the invariant. Hey, not every feature has to be elegant or efficient.

- Go back and change the invariant. You’ve just learnt something new that you hadn’t considered and puts things in a new light, and it turns out there’s a better approach.

Often, only one of these is right. Often, at least one of these is very, very wrong, and with bad consequences.

Picking among them isn’t a matter of context. It’s a matter of judgment, and the models - not the harnesses - get this judgment wrong far too often. I would say no better than random chance.

perarneng14d ago

pron14d ago

lukan14d ago

How do you define "bad code"?

IdiotSavage14d ago

So, basically you need to micro-manage it. Where are your 10x gains now? And is it fun to work like that?

sirwhinesalot14d ago

readitalready14d ago

I don't micromanage it. I let my project's custom linter micromanage it.

Every project should have a custom linter for their tech stack. It would check for not just syntax errors, but architectural choices as well as taste guidelines.

Whenever the LLM writes bad code, I add it to my linter to check against in the future.
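As a rough illustration of a linter rule that checks an architectural choice rather than syntax, here is a small Python sketch built on the standard `ast` module. The `views.<name>` module layout and the `find_cross_view_imports` helper are assumptions made for this example, not something from the article or from any existing linter; it borrows the article's "Views do NOT access other views' state" constraint and only handles absolute imports.

```python
import ast


def find_cross_view_imports(source: str, module_name: str) -> list[str]:
    """Return a message for each import of another `views.*` module.

    Assumes a hypothetical layout where every view lives in `views.<name>`.
    """
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported = [node.module]
        else:
            continue
        for name in imported:
            # A view may import itself, but not a sibling view.
            if name.startswith("views.") and name != module_name:
                problems.append(
                    f"{module_name}:{node.lineno}: imports {name}; "
                    "views must not reach into other views"
                )
    return problems


SAMPLE = "import views.pods\nfrom views.nodes import NodeState\nimport json\n"
for message in find_cross_view_imports(SAMPLE, "views.pods"):
    print(message)
```

Each time the agent produces a new kind of violation, a rule like this can be added and run in CI or as a pre-commit step, so the feedback reaches the agent without a human rereading every diff.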

andriy_koval14d ago

> So, basically you need to micro-manage it. Where are your 10x gains now? And is it fun to work like that?

It depends on language and infra, but some/many require lots of boilerplate and memorizing thousands of APIs; automating this is an easy LLM 10x gain.

I for example write SQL myself, because boilerplate is super-minimal, and core SQL is very minimal itself, there are like 20 constructs to memorize.

hansmayer14d ago

Amen. Instead of freeing you up, AI enslaves you. If only it were at least enslavement to a superior being!

nijave14d ago

Honestly, I think so. I do a mix of infrastructure and programming so don't tend to have any frameworks memorized. Using AI is much quicker than constantly referencing the docs.

I can also switch between codebases with different frameworks and languages and make changes without spending all day reading docs.

forgotaccount314d ago

> you need to micro-manage it.

wombat-man14d ago

hansmayer14d ago

> You are in charge.

No, if you have to do all of the stuff you have listed to kind-of-make-it-work...You are not in charge.

insane_dreamer14d ago

> You are in charge.

Sure. That's how I work with AI, and the way I believe that AI is meant to be used -- as a companion tool.

But it's a lot of work. It saves me time for certain tasks, but not others. I haven't measured my productivity gains, but they're at most 2x.

Zach_the_Lizard14d ago

I agree with this. I've been writing a new internal framework at work and migrating consumers of the old framework to the new one.

The AI tooling didn't (yet) detect this scenario and happily added migration logic assuming it would work properly.

Ultimately, it isn't a big problem to solve in a way that will mostly satisfy everyone, but it would have been a big problem without a human deeper in the weeds.

benguild14d ago

“The only people I've heard saying that generated code is fine are those who don't read it.” Are you sure these people aren’t busy working rather than chatting? (haha)

sevenzero14d ago

Also as a webdev, it writes basic CRUD pretty good. I am tired of having to build forms myself and the LLMs are usually really good at that.

Been building a new app with lots of policies and whatnot and instructing a LLM is just much faster than doing the same repetitive shit over and over myself.

spockz14d ago

pron14d ago

Sure. I'm talking about production software that needs to survive and evolve for a long while.

pydry14d ago

This is the core unspoken bone of contention in most AI arguments, I think: most people either aren't writing code with strict quality requirements or don't realize where their use of AI is violating them.

mountainriver14d ago

Can you not review it?

agentultra14d ago

The invariant, stated informally, would be hard to prove is broken by a human reviewer in the loop. Spoken language isn’t precise enough for the task.

pron14d ago

> The invariant, stated informally, would be hard to prove is broken by a human reviewer in the loop. Spoken language isn’t precise enough for the task.

That depends on the invariant. Some are behavioural, like "variable x must be even if y is positive", but some are architectural, such as "a new view requires a new class".
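The behavioural example can be written down precisely in a few lines of Python. This is only a sketch; `check_invariant` is a hypothetical helper name. It shows that "x must be even if y is positive" is mechanically checkable, whereas an architectural rule such as "a new view requires a new class" has no equally direct runtime check.

```python
def check_invariant(x: int, y: int) -> None:
    """Raise if the example invariant is violated: y > 0 implies x is even."""
    if y > 0 and x % 2 != 0:
        raise ValueError(f"invariant broken: x={x} must be even when y={y} > 0")


check_invariant(x=4, y=3)   # holds
check_invariant(x=5, y=-1)  # holds vacuously, y is not positive
try:
    check_invariant(x=5, y=2)
except ValueError as error:
    print(error)
```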

21asdffdsa1214d ago

pron14d ago

multjoy14d ago

It has no judgement at all.

senordevnyc14d ago

Here’s what’s working for me right now:

1. The basics: use best model available, have skills and rules that specify project guidelines, etc.

3. Don’t give chunks of work that are too large in scope. This is just art, and I’m constantly experimenting with how ambitious I can be.

marcosdumay14d ago

It's approximately the same problems, but stretched to an insane extent that you can never expect before it arrives.

i_love_retros14d ago

Don't outsource either then

21asdffdsa1214d ago

How about we outsource it to Pakistan and they use LLMs. That way, we do what the LLM people do - many agents and stacked on top

daishi5514d ago

The generated code is more than fine, it’s good in many cases. And I read it :)

pron14d ago

stingraycharles14d ago

> Picking among them isn’t a matter of context. It’s a matter of judgment, and the models - not the harnesses - get this judgment wrong far too often. I would say no better than random chance.

I suspect this is the right direction, though, as the alternatives inevitably lead any software project to delve into a spaghetti mess maintenance nightmare.

pron14d ago

zephen14d ago

> What I've seen is that if you define the architectural constraints, the agent writes complex, unmaintainable code...

bicepjai14d ago

leonaves14d ago

What's the difference between asking an AI to write you a module you never read and installing a 3rd-party module without auditing all its source code?

Xirdus14d ago

skydhash14d ago

Trust and reputation.

I would use Stripe, curl, and ffmpeg without audits, because I trust them to provide good code and to respect their API. I wouldn’t trust AI to write a Fibonacci series implementation.

The AI has no reputation to wager for my trust.

frikk14d ago

stars on github? I've wondered the same thing.

__alexs14d ago

I read all the code I generate with Cursor and some of it smells a bit weird but is easily fixable and most of it is as good as what I would write or better.

WalterBright14d ago

My own code is contortious. I refactor it regularly to reduce that, but it still can be better.

indoordin0saur14d ago

jstummbillig14d ago

> The only people I've heard saying that generated code is fine are those who don't read it.

Well, that is problematic. I have to either assume you are disinterested or lying and neither is great for any discourse.

nathanielks14d ago

Yeah, their statement just isn't true. With enough instruction, I've been able to get great output from models. I think that's the key: with detailed, pointed instructions, the output will match.

rimliu14d ago

How do you know it matches? You did read it then?

linuxftw14d ago

Try plan mode. The problems you're speaking about are already solved.

pron14d ago

hatefulmoron14d ago

tcgv14d ago

> "Yep. The only people I've heard saying that generated code is fine are those who don't read it."

I review every line of code I generate with AI. I mainly use an MR-based approach:

3) Usually, by this point, the solution is ready for me to merge locally and either run local tests or do some manual fine-tuning.

baddash15d ago

I've set a few rules for working with coding agents:

1. If I use a coding agent to generate code, it should be something I am absolutely confident I can code correctly myself given the time (gun to my head test).

2. If it isn't, I can't move on until I completely understand what it is that has been generated, such that I would be able to recreate it myself.

3. I can create debt (I believe this is being called Cognitive Debt) by breaking rule 2, but it must be paid in full for me to declare a project complete.

Accumulating debt increases the chances that code I generate afterwards is of lower quality, and it also feels like the debt is compounding.

jimsojim15d ago

baddash15d ago

brabel15d ago

Jweb_Guru15d ago

> Claude is a phd level mathematician

dathanb8215d ago

baddash15d ago

Yeah I like that better too, gonna start using that

TranquilMarmot15d ago

nertirs315d ago

I hate this current trend of managers deciding what tools developers have to use. Hopefully it ends soon.

whitefang15d ago

I agree with this, though it also depends on the nature of the project.

Had a project idea which I coded with the help of AI and it became quite large, to the point I was starting to have uncharted areas in the code. Mostly because I reviewed it too shallowly or moved too fast.

It was a good thing as that project never floated but if I were to do such a thing on my breadwinning project I would lose the joy.

gritzko15d ago

bmitc15d ago

I'm interested in multi agent systems, but I'm still not sure of the right orchestration pattern. These AI tools still can go off the rails real quick.

djeastm14d ago

When it was Copilot tab-completing lines, people would say, "yeah, but you still have to make sure you're the one writing the whole functions".

Then when it was completing functions, people would say, "yeah, but you still have to make sure you're the one writing the logic around the functions"

Then when it was completing the logic around the functions, people would say, "yeah, but you still have to make sure you're the one writing the features"

Now it's completing features and people say, "yeah, but you still have to make sure you're the one writing the architecture"

I don't know if architecture is a solvable problem for these models, but it is interesting watching the expectations moving over time.

raincole14d ago

The "people" in your hypothetical story have been wrong the whole time. The correct attitude is:

When AI can complete lines, you still have to read and understand the code.

When AI can complete whole functions, you still have to read and understand the code.

When AI can complete features and tickets, you still have to read and understand the code.

brightball14d ago

I heard a talk from a VP at NVIDIA a couple of months ago and he echoed this. Essentially their policy is "you are still fully responsible for the code you ship, whether AI helps with it or not"

ventana14d ago

> you still have to read and understand the code

Which is a very similar approach to any serious code. If you just hired a very clever, enormously knowledgeable intern, and they wrote a bunch of code for you overnight, you would probably review it.

Yes, in some cases, either hobby projects or throwaway code, you could just take it and use it as is, and I surely do, for the code no one cares about. But at work, I would rather review it.

globnomulous14d ago

jstummbillig14d ago

throwaway17373814d ago

It does in safety critical industries. You can get grilled by regulators about your source code. And lawyers will use it as evidence in court.

roncesvalles14d ago

The code codifies the intent and is the long-term source of truth for how your business actually operates.

>The leader of a product/company does not have to read code.

Even in multi-hundred-billion-dollar companies there are so many mission critical things that are owned by just 2 SWEs.

raincole14d ago

> The leader of a product/company does not have to read code.

Yeah, because they believe (sometimes wrongly) their subordinates read it.

> Understanding of code never existed from the business perspective.

It does, it's called organizational wisdom and domain knowledge, because you need those witty names to sell books to aspiring managers.

contagiousflow14d ago

Can you think of a good way to encode intent into a system?

herdcall14d ago

I'm no longer sure you have to, actually. I mean, we do trust the assembly that compilers produce without having to read it, don't we? We're rapidly getting to that stage with LLMs, IMO.

bigfishrunning14d ago

gregsadetsky14d ago

I know it’s tiring to talk about “hallucination”, but truly, models still do hallucinate

They constantly say they did a thing they didn’t, say they know how to solve something when they don’t, etc. Regardless of guard rails or tests - AI forces a constant vigilance of a new kind.

Not just “what might have gone wrong” but also “what do I think is working but isn’t actually”.

And we’re not even talking about how it chooses substandard solutions, is happy to muddy code/architectures, add spaghetti on top of spaghetti etc.

Agentic coding often feels like an army of inexperienced developers who are also incredibly eager to please.

amw-zero14d ago

SpaceNoodled14d ago

bayindirh14d ago

> we do trust the assembly that compilers produce without having to read it

Also, as people said, assembly generation is deterministic. For a given source file and set of flags, you get the same thing out. Byte by byte, bit by bit. This is what we call "reproducible builds".
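The determinism claim can be checked on a small scale. The following is a minimal sketch, assuming CPython bytecode as a stand-in for native assembly; the `bytecode_digest` helper and the `SOURCE` snippet are made up for illustration. Real reproducible builds also have to pin things like timestamps, paths, and toolchain versions, which this does not show.

```python
import hashlib

# Tiny source file used for the comparison (illustrative only).
SOURCE = "def add(a, b):\n    return a + b\n"


def bytecode_digest(source: str) -> str:
    """Compile the source and hash the bytecode of the function it defines."""
    module_code = compile(source, "<demo>", "exec")
    # The module's constants contain the code object for `add`.
    func_code = next(c for c in module_code.co_consts if hasattr(c, "co_code"))
    return hashlib.sha256(func_code.co_code).hexdigest()


first = bytecode_digest(SOURCE)
second = bytecode_digest(SOURCE)
print(first == second)
```

Same input and same compiler give the same bytes, which is the property an LLM sampled at non-zero temperature does not offer.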

Lastly, this still rings in my ears, and I understood it over and over as I worked with more high performance, correctness critical code:

As I said, I just said "huh" at the time, but the saying came back and when I understood it fully, it was like being shocked by a Tesla coil.

Get your sleep, eat your veggies and understand your code. Those are the three essential things you need to do.

eska14d ago

We don’t. That’s why tools like godbolt are popular, debuggers can jump into assembly, and compilers can output assembly files.

yakattak14d ago

I want to preface this by saying that I am all for agentic engineering.

I am so tired of hearing about this false equivalency. Compilers are deterministic, their outputs are well understood and they’re transparent.

LLMs are not.

the__alchemist14d ago

> I don't know if architecture is a solvable problem for these models, but it is interesting watching the expectations moving over time.

winwang14d ago

bluGill14d ago

embedding-shape14d ago

onlyrealcuzzo14d ago

> I don't know if architecture is a solvable problem for these models, but it is interesting watching the expectations moving over time.

No matter how many times you tell them - there is ZERO blocking allowed on the critical path, they will add blocking on the critical path.

No matter how many times you tell them any time they do X, they need Y type of test, they will do X without Y type of test.

They cannot follow directions 100%. Neither can people.

But they are more random. The mistakes people make are less likely to do the exact polar opposite of what you wanted to do.

LLMs are great. I'm convinced they're the future. I'm building a language specifically for them: https://GitHub.com/Cuzzo/clear - and to make it easier for YOU to work with them.

I think once we get around this language problem, that they need global context for things where they shouldn't, it will be a challenge to work with them.

I've had success with them, but it's been so frustrating, that I question how much it's been worth my sanity.
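A rule like "zero blocking on the critical path" can also be enforced mechanically rather than by repeating it in prompts. Here is a rough sketch of such a check, assuming a hypothetical `critical_path` decorator that marks the functions in question and a small, hand-picked blocklist of blocking calls; neither comes from any real project, and a real linter would need to handle aliases and indirect calls.

```python
import ast

# Hypothetical blocklist; a real project would maintain its own.
BLOCKING_CALLS = {"time.sleep", "requests.get"}


def dotted_name(node):
    """Return 'a.b' for Name/Attribute nodes, or None for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def is_critical(func):
    """True if the function carries the (hypothetical) @critical_path marker."""
    return any(dotted_name(d) == "critical_path" for d in func.decorator_list)


def find_blocking_calls(source):
    """Return (function, call, line) for blocking calls in marked functions."""
    violations = []
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)) and is_critical(func):
            for node in ast.walk(func):
                if isinstance(node, ast.Call):
                    name = dotted_name(node.func)
                    if name in BLOCKING_CALLS:
                        violations.append((func.name, name, node.lineno))
    return violations


EXAMPLE = '''
import time

@critical_path
def handle_request(req):
    time.sleep(0.1)
    return req

def background_job():
    time.sleep(5)
'''

print(find_blocking_calls(EXAMPLE))
```

Only the call inside the marked function is reported; the unmarked `background_job` is free to block. A check like this runs in CI regardless of whether the agent followed the instruction.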

wolttam14d ago

So it's not much of a surprise that this is the situation folks find themselves in with the current models.

jayd1614d ago

Are any of these steps actually solved? AI tab completion still kinda sucks.

mdswanson14d ago

I refer to this as "disposable architecture." Not that architecture doesn't matter, but that the architecture that worked yesterday doesn't necessarily need to be the architecture that works today.

indoordin0saur14d ago

It's even farther along than you think. It's the one writing the comments you're responding to. So why are you still thinking up and typing out your HN comments?

wiseowise14d ago

> but it is interesting watching the expectations moving over time.

While the salary stays stagnant or even reduced if you adjust for inflation.

vrganj14d ago

Have people's standards for quality just completely vanished in the pursuit of the shiny new thing? Is that guy doing something wrong?

That has also been my experience with this sort of thing fwiw, which is why I gave up and do more of a class-by-class pairing with an LLM as a workable middle ground.

koumou9214d ago

hansmayer14d ago

> Now it's completing features

snowe201014d ago

Your argument is nonsense.

user3428314d ago

I felt the same with:

"it takes too much effort to get the output production ready"

turning into

"maybe long term the maintenance will be more expensive"

I give it three months until people realize that you rarely need to review every single line and fully understand the code, like so many comments are claiming.

camdenreslink14d ago

Maybe on projects with no users you can yolo things.

keybored14d ago

This blob of people criticizing AI is just that, a blob. A gaggle of discrete people that your brain makes up a narrative about being some goalpost shifting entity.

Of course there could be individuals who have moved the goalposts. Which would need a pointed critique to address, not an offhand “people are saying” remark.

dzonga14d ago

the autocomplete can be shit sometimes.

callamdelaney14d ago

Architecture is one of the easiest things in programming frankly.

taytus14d ago

Nice Fiction story.

jwpapi14d ago

That’s the same story I had.

You look at the code again and there is so much code; spaghetti is an understatement, it’s the Chinese wall.

You start working…, and you realize what was going on

dxdm14d ago

I find it interesting that this outcome is a surprise. I don't want this to sound smug, I'm genuinely curious what the initial expectations are and where they come from.

I'd like to know, ideally from people who've been there, why they think that is. Where does the trust come from?

throw10101014d ago

dxdm14d ago

Thanks, that makes sense.

I suppose it's difficult to account for the inconsistency of something able to perform up to standard (and fast!) at one time, but then lose the plot in subtle or not-so-subtle ways the next.

doginasuit14d ago

manicennui14d ago

jeltz14d ago

Probably same reason people expected outsourcing to the cheapest firm in India would work: wishful thinking. People wanted it to work and therefore deluded themselves.

Or really the same reason people fall for get rich quick schemes.

thunky14d ago

> You look at the code again and there is so much code; spaghetti is an understatement, it’s the Chinese wall.

It's your responsibility to build a sane architecture that is maintainable. AI doesn't prevent you from doing that, and in fact it can help you do so if you hold the tool correctly.

wavemode14d ago

gf00014d ago

I don't think that's a useful mental model for software in general.

bluGill14d ago

When on a tiny project it doesn't matter. However when you have millions of lines of code you have to trust that your code works in isolation without knowing the details.

marcosdumay14d ago

> A large codebase should be a collection of small codebases, just like a large city is a collection of small cities.

Oh, great analogy there.

Just like there's almost nothing in common between a large city and a collection of small cities, a large codebase is completely different from a collection of small codebases too.

Mostly because of the same kinds of effects.

OptionOfT14d ago

No but the speed up of AI is giving up control, and then you notice these issues too late.

gonational14d ago

I guess you could say that AI is great at generating code that only AI can understand and maintain.

mjburgess14d ago

I think this is true, but I imagine there's a workflow solution to this which isn't to drop AI.

Eg., treating AI code generated as immediately legacy, with tight encapsulation boundaries, well-defined interfaces etc. And integrating in a more manual workflow.

There's a range from single-shot prompts to inline code generation, that will make more sense depending on the problem and where in the code base it is.
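The "tight encapsulation boundaries, well-defined interfaces" idea can be sketched roughly as follows. This is only an illustration under assumptions: the `SlugGenerator` interface, the `GeneratedSlugGenerator` class, and the `check_contract` helper are hypothetical names, with the generated class standing in for agent-written code whose internals nobody has audited.

```python
from typing import Protocol


class SlugGenerator(Protocol):
    """Hand-written interface: the only thing the rest of the code depends on."""

    def slugify(self, title: str) -> str: ...


class GeneratedSlugGenerator:
    """Stand-in for agent-written code; its internals are treated as opaque."""

    def slugify(self, title: str) -> str:
        # Replace every non-alphanumeric character with a space, then join words.
        words = "".join(ch.lower() if ch.isalnum() else " " for ch in title).split()
        return "-".join(words)


def check_contract(impl: SlugGenerator) -> None:
    """Hand-written checks that any implementation must pass at the boundary."""
    assert impl.slugify("Hello, World!") == "hello-world"
    assert impl.slugify("  spaced   out ") == "spaced-out"
    assert impl.slugify("") == ""


check_contract(GeneratedSlugGenerator())
```

The design choice is that humans own the interface and the contract checks, so the generated implementation can be replaced or rewritten by hand later without touching its callers.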

herrherrmann14d ago

I could see value in using it during the prototyping phase, but wouldn’t like to work like you described for a serious project for end users.

meetingthrower14d ago

embedding-shape14d ago

I care more about code quality now, because typing no longer limits whether I feel it's worth refactoring something or not.

timacles14d ago

This is like getting a person addicted to drugs and then asking them to only use the drugs on Thursday and Friday.

This seems to me like it requires an impossible level of discipline, judgement and foresight

anal_reactor14d ago

> treating AI code generated as immediately legacy, with tight encapsulation boundaries, well-defined interfaces etc.

chorsestudios14d ago

skydhash14d ago

You can have very good diffs and then find that the whole codebase is a collection of slightly disjointed parts.

gchamonlive14d ago

AI doesn't necessarily have to increase your throughput, it can also serve as a flexible exploration and refactoring tool that will support either later hand-crafted code or agentic implementation.

qudat14d ago

jwpapi14d ago

Obviously this is all my fault and others might have better judgment when using it. I’m just sharing my experience compared to the promises you easily might believe reading X/HN/Anthropic.

I still have a lot of usage for AI: Exploration, Double-checking me, teaching me. But writing code became very tough for me to accept. Next-edit autocompletes mainly.

scruple14d ago

> I still have a lot of usage for AI: Exploration, Double-checking me, teaching me.

I'm ready to give up on having it even review my code at this point. It's been so frustrating. It hallucinates bugs, especially in places where "best practices" are at odds with reality.

deadbabe14d ago

I don’t think you truly captured the worst part:

The engineers who sink further into denial thrash around with AI, hoping they are a few prompts or orchestrations away from everything being fixed again.

But the solution doesn’t come. They realize there is nothing they can do. It’s over.

snowe201015d ago

> The other change is simpler: I'm doing the design work myself, by hand, before any code gets written. Not a vague doc. Concrete interfaces, message types, ownership rules.

The hard part of software engineering was never writing code. Junior devs know how to write code. The hard part is everything else.
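For a sense of what "concrete interfaces, message types, ownership rules" might look like before any implementation is written, here is a small sketch. It is not the article author's design: `SwitchView`, `RowsLoaded`, `TableView`, and `App` are hypothetical names, and the rule being illustrated is simply that each view owns its own state and only changes it in response to messages.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SwitchView:
    """Message: make `target` the active view."""
    target: str


@dataclass(frozen=True)
class RowsLoaded:
    """Message: new rows are available for the view named `view`."""
    view: str
    rows: tuple


@dataclass
class TableView:
    """Owns its rows; no other view reads or writes them."""
    name: str
    _rows: list = field(default_factory=list)

    def update(self, msg) -> None:
        # A view only reacts to messages addressed to it.
        if isinstance(msg, RowsLoaded) and msg.view == self.name:
            self._rows = list(msg.rows)

    def render(self) -> str:
        return f"{self.name}: {len(self._rows)} rows"


class App:
    """Routes messages; it never reaches into a view's state."""

    def __init__(self, views):
        self.views = {v.name: v for v in views}
        self.active = views[0].name

    def dispatch(self, msg) -> None:
        if isinstance(msg, SwitchView):
            self.active = msg.target
        for view in self.views.values():
            view.update(msg)

    def render(self) -> str:
        return self.views[self.active].render()


app = App([TableView("pods"), TableView("nodes")])
app.dispatch(RowsLoaded("pods", ("a", "b")))
app.dispatch(SwitchView("nodes"))
app.dispatch(SwitchView("pods"))
print(app.render())
```

Because switching views never touches another view's rows, coming back to a view shows whatever that view last loaded.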

Philip-J-Fry15d ago

Yes, I think there's 2 kinds of developer. Those who think the code is the hard part, and those that don't.

The developers that think coding is hard are the ones that absolutely love AI coding. It's changed their world because things they used to find hard are now easy.

The OP is definitely in the second camp since they could spot and realise the shortcomings of the AI. They spotted the problem, and that problem is that the AI can't do the hard bit.

jfim15d ago

seer14d ago

But isn’t AI doing the same thing to project management as to coding?

When all of the pieces successfully connect and execute reliably, what is left for humans to do? Just direct and consume?

And AI companies with their huge swaths of data are soon gonna be in the situation of being able to do the directing themselves

byzantinegene14d ago

mikepurvis15d ago

RossBencina15d ago

> The first group are still thinking fairly deeply about design and interfaces and data structures, and are doing fairly heavy review in those areas.

barrell15d ago

imtringued14d ago

Like, what is the purpose of Gas Town? It looks to me like the purpose of Gas Town is to build Gas Town.

bmitc15d ago

I find it useful to not listen to people who just talk.

skydhash15d ago

> The first group are still thinking fairly deeply about design and interfaces and data structures, and are doing fairly heavy review in those areas

seer15d ago

I’ve noticed that agents almost always fail at the planning vs execution stage.

But if the goals of the project/feature are stated clearly enough it is quite capable of iterating itself out of an architectural dead end, that is if it can run and test itself locally.

It goes as deep as inspecting the code of dependencies and libraries and suggesting upstream fixes etc. all things that I would personally do in a deep debugging session.

And I’m super happy with that approach as I’m more directing and supervising rather than doing the drudgery of it.

Trouble is a lot of my teammates _don't_ actually go this deep when addressing architectural problems, their usual modus operandi is “escalate to the architect”.

This will not end up good for them in the long run I feel, but not sure what they can do themselves - the window of being able to run and understand everything seems to be rapidly closing.

staplers15d ago

> You need to be checking every thing it does.

This is what seems to be lost on so many. As someone with relatively little code experience, I find myself learning more than ever by checking the results and what went right/wrong.

Just as Google made finding information easier, it didn't fix the human element of deciphering quality information from poor information.

krilcebre15d ago

How do you know what good output should look like with little code experience?

brabel15d ago

tripledry15d ago

This is the only way for me to use Agents without completely hating and failing at it. Think about the problem, design structures and APIs and only then let AI implement it.

skydhash15d ago

You can skip that and go directly to writing code. But that meant you replaced a few hours of planning with a few weeks of coding.

20k14d ago

I always find these kinds of posts interesting, to compare the velocity that people seem to get with AI, vs what I get by just coding by hand

ZaoLahma14d ago

> There seems to be a strong bias where using AI feels like you're making a lot of progress very quickly, but compared to manual coding it often seems to be significantly slower in practice.

This metric highly depends on who uses the AI to do what, where strong emphasis is on "who" and "what".

chromadon14d ago

This struck me as odd too. 7 months? It wouldn’t take that long to write it in a new language.

Another thing I don’t see mentioned is code quality.

Recently Claude has been making some “interesting” code style choices, not in line with the code base it’s currently supposed to be working on.

21asdffdsa1214d ago

thot_experiment14d ago

echelon14d ago

This was made in two days of vibe coding. It has flaws, but it's impressive as hell:

https://tinyskies.vercel.app/

It's got a fun Zelda-inspired mechanic (I won't say which one), and you'll have to unlock abilities and parts of the world over several quests and modes to "win".

It's also multiplayer.

20k14d ago

This ran at ~1fps for me

IceDane14d ago

Runs smoothly for me in Zen (FF) on Linux.

plastic04115d ago

Title says

> back to writing code by hand

But what they are doing is

> doing the __design work__ myself, by hand, before any code gets written.

So... Claude still is generating the code I guess?

And seriously, I can't understand that they thought their vibe coded project works fine and even bought a domain for the project without ever looking at source code it generated, FOR 7 MONTHS??

0xpgm15d ago

In short, it is simply a click-bait title.

And the goal of the article is to draw attention to their project.

lelanthran14d ago

> And the goal of the article is to draw attention to their project.

Additionally, they couldn't even bother to write their own blog post, so it's a little hard to take them seriously when they say they're going to write their own code...

kdheiwns14d ago

It's the same thing every time.

> Step one in your journey to code free life: code the whole damn project and put it together yourself

dewey15d ago

I bought domains for projects minutes after the idea.

bayarearefugee15d ago

> I don’t think it’s that weird to not look at the code if it’s a side project and you follow along incrementally via diffs.

It's not weird to not look at the code, as long as you're looking at the code? (diffs?)

Uh, ok

retsibsi15d ago

IanCal15d ago

I feel like I’m watching developers speed run project and product management learnings.

We’ve seen people find out that task management is useful.

Now I’m seeing more talk of fully doing the design work upfront. And we head towards waterfall style dev.

meetingthrower14d ago

yakshaving_jgt14d ago

Except it isn't the same because the cost is different, which allows discovery that we couldn't afford previously.

meetingthrower14d ago

Yes x 1000. I find it amazing.

xantronix15d ago

So you're not actually writing code by hand? I'm very confused by the difference between the title and the conclusion here.

rane15d ago

The point was to come up with a sensationalistic headline that HN eats up and post flies to the front page.

Towaway6915d ago

I wonder whether the title was generated/suggested by an AI?

dwedge14d ago

I don't think they even wrote the article by hand. It seems like the title got to the top of HN not the article.

viceconsole15d ago

This is a special case of a general fundamental point I'm struggling with.

Let's assume AI has reduced the marginal cost of code to zero. So our supply of code is now infinite.

Meanwhile, other critical factors continue to be finite: time in a day, attention, interest, goodwill, paying customers, money, energy.

So how do you choose what to build?

Like a genie, the tools give us the power to ask for whatever we want. And like a genie, it turns out we often don't really know what we want.

TranquilMarmot15d ago

ozim15d ago

I have vibe coded 3 applications I never had time to code but always wanted.

Now it is different in a way where now I don’t have time to use those apps.

That’s a joke.

shahbaby15d ago

This reads too much like it was LLM generated. I can't say for sure if it was but I have an allergic reaction to the short snappy know-it-all LLM writing style.

TranquilMarmot15d ago

AI;DR

baxtr15d ago

Writing code by hand but blog posts are written by LLMs?

fromwilliam15d ago

yeah, it set off my llm radar too

simon8414d ago

Rapzid14d ago

simon8414d ago

Actually it was sort of fun to see that the AI started writing comments to itself by gradually explaining what it was trying to do and ways it failed to do it.

Then it spent more time appending comments to its own comments rather than writing code ^^

web00714d ago

So much of the problem here is that the author blindly trusted the agent. They're enthusiastic juniors, not jaded seniors.

Agents can do the same. It's WAY easier mentally and works out better if you treat them the same way and go working -> better -> done.

gauthamkolluru14d ago

I’ve been resonating with ideas similar to those the author of the article/original poster has been mentioning, including in the comments below.

I also think that after a few iterations of producing the code there should be a change in strategy.

But then again, I do not have the right answer now. Maybe the reformation must come in the models too but as I see it, going back to hand coding is not the solution.

What are off the table (I think)

gauthamkolluru14d ago

Apologies for the visual formatting as I was posting this comment from my mobile. Thanks in advance for understanding.

archleaf15d ago

So what you really mean is you are going to do better and more detailed skills files so you can get an architecture that you've thought through rather than something random?

dropbox_minerOP15d ago

cpncrunch15d ago

SpicyLemonZest15d ago

erelong15d ago

Can't you just ask AI to break up large files into smaller ones and also explain how the code works so you can understand it, instead of starting over from scratch?

dropbox_minerOP15d ago

And I'm sure the rewrite is going to teach me a whole different set of lessons...

joshuanapoli15d ago

Rewrite following a new architecture plan could get finished pretty quickly, treating the original as a prototype.

SpicyLemonZest15d ago

When people talk about codebases being "incomprehensible", it's not always hyperbole. Sometimes the architecture literally cannot be broken up or understood.

radicalbyte15d ago

The very worst things you can do in a codebase are (a) not deeply understand how it works (have it be magic) and (b) be lazy and mess up the structure.

How do you fix a problem which happens at 2:00am and takes your system down if you don't have an excellent understanding of how it works?

Over time we're already bad at (a) because most developers hate writing documentation so that knowledge is invariably lost over time.

czhu1214d ago

I found the exact same when I started vibe coding new features in https://github.com/CanineHQ/canine

Claude is super good at making it seem like it’s an expert in Kubernetes, but then, on uncovering certain decisions, it’s basically optimizing to try to make things look like they work.

I wouldn’t go as far as to say I’m writing everything by hand, but I now always map out how I would do something before asking ai to approach it

larusso14d ago

hmhhashem14d ago

larusso14d ago

I pushed a gist https://gist.github.com/Larusso/82c9aa8effb3031d149d3b5a1b96...

hmhhashem14d ago

Thanks :)

khasan22214d ago

shimman14d ago

Go has built-in tools that mimic formatting + linters. Also LSP is a first class citizen in Go. I don't know what other "code quality" infra there is out there aside from formatting and linting.

spicyusername14d ago

It's really very easy to spend a few hours going through a vibe-coded project by hand and having an agent fix the weird parts. If you do this often enough, you can get the best of both worlds.

Then you're right back on track.

In a way it's not that different from a human-made project. Plenty of teams have to crunch, ignoring the architecture and incurring tech debt, and then come back and fix it later.

ex-aws-dude14d ago

That’s what I found too

I have to periodically get it to do a bunch of refactoring

peterbell_nyc14d ago

A chainsaw requires very different skills than an axe. It has different failure modes. Some experience as a lumberjack probably helps using either/both.

No difference (at least now) with agents.

tombert14d ago

I have found that for low-stakes stuff, where "good enough" really is ok, Claude and Codex have been pretty great. I don't particularly care if the code is optimal, just enough to do that job.

It makes me feel a little valuable; I finally have an excuse to use formal methods for things.

zem14d ago

pjmlp15d ago

I am still mostly coding by hand, other than meeting the KPIs of AI use at the company, required trainings, use of agents and whatever.

Eventually, like every hype wave, the dust will settle, and let's see where we stand.

By now all the AI companies have consumed all human knowledge so they either learn to actually think for themselves, or that is it.

Either way, that won't change the ongoing layoffs while trying to pursue the AI dream from management point of view.

0xpgm15d ago

> Either way, that won't change the ongoing layoffs while trying to pursue the AI dream from management point of view.

I think most companies doing layoffs are bloated to begin with, AI is just the scapegoat to do the layoffs.

pjmlp14d ago

I am aware of layoffs that are really caused by AI.

Translation and asset generation teams for enterprise CMS, whose role has now been taken by AI.

binyu15d ago

Isn't Golang relatively easier to read than Rust? I was under the impression that Rust is a more complex language syntactically.

dropbox_minerOP15d ago

+1 on Open 4.7 involving the user a lot more. Rn I'm trying to get to a state where I can codify my design + decision preferences as agent personas and push myself out of the dev loop.

binyu15d ago

Gotcha, that implies you are going to read the code that the AI produces anyways.

> Go reads fine whether the architecture is good or bad

Were you reading the Golang code all along and got fooled or did you review it after it failed? Sorry I admit I didn't read the whole article.

williamstein15d ago

He was NOT reading the code: "For 7 months I'd been prompting and shipping without ever sitting down and actually reading the code Claude wrote."

cortesoft15d ago

> Isn't Golang relatively easier to read than Rust? I was under the impression that Rust is a more complex language syntactically

It sounds like the author knows Rust, and might not be as familiar with Go.

A language that you are proficient in is always going to be easier to read than one you aren’t, even if it is an objectively easier language to read in general.

tvbusy15d ago

throwaway202714d ago

fitsumbelay14d ago

I wonder how viable this debate is outside of dev circles.

RuoqiJin15d ago

yason14d ago

Overall, I cringe under all the hype that's been laid on AI: it's a new tool that's still looking for its box or niche carveout, not a revolution.

ktzar14d ago

mtrovo14d ago

mountainriver14d ago

The quality gates are up to you, and if you are smart you will make a lot of them and review them closely

cultofmetatron15d ago

That said, I think it's important to also spend time in the dirt. I've recently started picking up zig as my NO AI language just to keep those skills sharp.

oblio15d ago

> the ship has sailed on my handcoding at work.

I'm really curious if we'll seesaw once AI costs go up 10x.

cultofmetatron14d ago

I've only been using kimi 2.5 and deepseek pro for reviewing PRs for security issues. Less than 10% of my workflow requires a full powered frontier model.

wartywhoa2315d ago

And they will.

Myrmornis15d ago

> I typed :rs pods to switch back to the pods view. Nothing rendered. The table was empty...

> now something was fundamentally broken and I couldn't just prompt my way out of it.

TUIs aren't an exception, it's still essential to have a way to end-to-end test each view.

jvuygbbkuurx15d ago

The problem wasn't that the view didn't work. The problem was that the view didn't work after something else had been done.

You can't test every permutation of app usage. You actually need good architecture so you can trust your tests and changes to be local with minimal side-effects.
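A rough count shows why. This is a back-of-the-envelope sketch, assuming a made-up figure of 10 views and that any view can follow any other; the `navigation_sequences` helper is hypothetical and counts only view switches, not the other actions a user can take in between.

```python
def navigation_sequences(num_views: int, steps: int) -> int:
    """Number of distinct sequences of `steps` view switches among `num_views` views."""
    return num_views ** steps


for steps in (2, 5, 10):
    print(steps, navigation_sequences(10, steps))
```

Even at five switches that is already 100,000 paths through the app, so end-to-end tests can only sample the space; keeping each view's state local is what lets a per-view test say something about all of them.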

neals14d ago

theunmanagedboy14d ago

vetler14d ago

The wide range of different responses to this post illustrates an important point; we can't agree on how to use LLMs in software development, and are still discovering new things.

And in a couple of months we might be doing things completely differently because of some new model or new framework.

That's really cool.

haolez14d ago

I've started using OpenSpec[0] recently to mitigate problems like that, but I'm still very early in this journey.

Can someone with more experience with it (or similar tools) chime in and confirm that this isn't just more AI snake oil? :)

[0] https://openspec.dev/

pramodbiligiri13d ago

AntiUSAbah14d ago

I'm currently exploring whether I should split up a project into a framework part and the game itself (2D, idle game).

The framework could be an isolation layer against vibe rot, but I'm not sure it's necessary for my small project, which I always wanted to do and never did anything with.

For another tool, I will try another approach: start with a deep investigation and spec writing together with AI, then lay out the core architecture, and then add features.

There is kind of a skill and depth to vibe coding though.

dailywriterguy14d ago

As a writer, this resonates.

There's a massive difference between good human "writing" and a dozen paragraphs of "it's not x, it's a y".

Keep up the good work and the craft of building things one keystroke at a time.

ramraj0714d ago

Software engineering is not that. You absolutely can and often will hand off work to humans. It's not inherently that creative in the actual coding part.

ojr14d ago

AI was also able to help me create my first subscription payment workflow.

It is like farming without Roundup, less crops, more energy, less toxic chemical risks.

ninjahawk114d ago

A problem often ignored is that while AI is trained on human written code, how it writes is different in practice.

Maybe that changes as the models improve, maybe it doesn’t, only time will tell.

selfsimilar14d ago

I stopped reading after this, because this is the dumbest way to vibe code anything larger than a single-use tool.

Claude is a collaborator, and honestly a decent voice of dissent, but it will never offer that unprompted. "Make this thing" - "OK".

You need to review the code. You need to say "I want this, AND HERE IS THE LONG-TERM VISION. Now offer critique and the trade-offs for various implementations."

gosukiwi14d ago

bad devs are still bad, good devs are still good

shimman14d ago

Until the good devs have their skills atrophied away.

keithnz15d ago

Aeolun15d ago

neomantra15d ago

While I felt this in 2025, I do not feel this in 2026. I use Claude and the rest with BubbleTea all the time.

But I will say... you have to know Golang. You have to have at least tried to make a BubbleTea app yourself and try to understand ELM architecture. You have to look at the code and increment with it.

It makes total sense for OP to switch to Rust and Ratatui if they don't know Golang well. But I don't think it's a better language for it. [Ratatui has brought me great inspiration though!]

TUI code is finicky; one mis-rendered component mucks everything up. The LLMs will decide by themselves to make little, temporary BubbleTea fixtures to help them understand when things aren't right.

I vibe-coded this TUI for Mom's last night. I actually started with Grok (who started with v1) and then moved into Claude Code after some iteration:

https://gist.github.com/neomantra/1008e7f2ad5119d3dd5716d52e...

abalashov14d ago

I went back to writing code by hand quite some time ago and cannot say there has been any loss of velocity or productivity for it.

I really do think this whole thing is a wash.

rnxrx15d ago

eranation15d ago

I used to write code by hand.

I still do, but I used to, too.

eddy-sekorti14d ago

Yes, I also do this. The old feeling of writing something, deploying, testing and fixing the bugs is good. Vibecoding can never replace this feeling.

kccqzy14d ago

> AI builds features, not architecture.

nopurpose14d ago

Feels like it can be solved with even more AI: adversarial models reviewing and testing work performed by the main model.

hirako200015d ago

Research also makes similar claims: https://arxiv.org/html/2603.24755v1

d_silin15d ago

It absolutely looks like AI psychosis.

sim04ful14d ago

My opinion is that we're using the wrong paradigms for LLMs. We should be leaning more on declaratively specifying behaviour.

If there's any hope for reliability, auditability, and predictability to be had, it lies in constraining an LLM's grammar whilst delegating freeform behavior to a more passive substrate.

sakesun15d ago

A coder typing in code is not solely to generate outcome. It's part of ongoing thinking process. Without this ongoing process, we have no material to keep iterating forward.

dwedge14d ago

Clickbait title about not writing code by hand anymore, both the article and future code generated by AI. This is meta.

Laoujin15d ago

I'm just wondering: you know what architecture you want to go to now and you have the tests... can't you just let Claude refactor it to the better architecture?

Also 1600 lines... didn't any agent reviewing the diffs point that out?

You're also adding a lot to claude.md, I dunno how much that file has grown but a big claude.md file with many instructions, I don't think the ai will be able to remember all those rules.

my-next-account15d ago

> can't you just let Claude refactor it to the better architecture?

In my experience, no. These tools suck at refactoring, mostly choosing to add more code instead.

amelius15d ago

So how are people writing the specifications for AI?

Do they write empty functions and let AI fill them in?

Or do they use some kind of specification language?

Are people designing those languages?

youre-wrong315d ago

This is the wrong take. If you keep “vibe” coding and end up with bad results you should probably question your ability.

ipaddr15d ago

When he mentions pushing commits at work for as long as his tokens last, I can understand that. Managing tokens has become an important skill.

g42gregory14d ago

I am loving the articles alternating between "software engineering is dead" and "I am going back to coding by hand". I guess we have a difference of opinions here. :-)

jasonvorhe15d ago

When the title stands in opposition to the actual post, I'm not gonna engage with that author again.

cortesoft15d ago

What has really made AI coding be able to continue to work as the project got bigger was using speckit. It has been great at keeping the code consistent across features.

https://github.com/github/spec-kit

nopurpose14d ago

Did you evaluate other projects, like openspec, before deciding on spec-kit?

johnthescott13d ago

write code like your life depends on it. cause it does if you are any good.

Havoc14d ago

That's a strange definition of "code by hand"

ilaksh14d ago

He says he went several months without having to do a code review and it worked the vast majority of the time. That's incredibly impressive work by the AI.

1690 lines of code in one file is a walk in the park for SOTA models.

He can just say something like:

wolttam14d ago

> Also I bet you the headline is a lie.

He's already 5k+ LOC into the rust rewrite...

moveax314d ago

Code writers have changed, but the conceptual mistakes remain the same.

dr_girlfriend15d ago

secprove14d ago

It was certainly a lot more enjoyable.

jesse_dot_id15d ago

EMM_38615d ago

You don't need to go back to coding by hand if you know how to do it already. There is a middle ground.

If you understand good software architecture, architect it. Create a markdown document just as you would if you had a team of engineers working with you and would hand off to them. Be specific.

Let the AI do the implementation of your architecture.

DrTung14d ago

If you're an old geezer like me, doesn't this "AI revolution" remind you of the "BASIC revolution" in the 70s and 80s, i.e. when the BASIC language was new and hot.

BASIC at that time was heralded as a much simpler and faster way to program. Rings a bell?

snickerbockers14d ago

mindaslab14d ago

I'm going back to writing algorithms on paper.

graphememes14d ago

they are just doing design work now, they could have done design work with go too, without even knowing go

clickbait title

apt-apt-apt-apt15d ago

Outright lie clickbait. As he states himself, he's doing the design work by hand, and will likely still use AI to write code.

mpurbo15d ago

Strict SDD might help to constrain and harness the process.

classified13d ago

The most amusing thing about this is that the author seems surprised about what happened.

AIorNot15d ago

This doesn't make much sense; the article itself is AI-written.

It would have been easy to run a few AI agents to review the code, find these issues, and architect it clearly.

hsaliak14d ago

I wrote https://github.com/hsaliak/std_slop/blob/main/docs/mail_mode... to avoid the brain rot from just shooting slop. It has helped me stay sane, review code and make changes step by step.

I don't go as fast as with other agents, but this works for me, and I enjoy the process.

ljoshua15d ago

> tl;dr: AI writes features, not architecture.

littlecranky6714d ago

z3t415d ago

floodfx14d ago

Genuinely curious if you've used "plan mode" (with perhaps a plan feedback tool) to get clarity from your coding agent before unleashing it on a feature like "add a pods view with live updates"?

Getting a plan isn't a panacea but is a better way to limit downstream slop than just vibing without one.

worik14d ago

LLMs are a tool. They must be wielded.

Looking at the code and paying attention to the structure is part of the skill.

The skills required to wield an LLM are not exactly those required to write code, but are very close.

"Vibecoding" is not a way for idiots to blindly produce software artifacts that anyone would want.

guywithahat14d ago

slowhadoken14d ago

I never stopped but I focused more on concept and design.

desireco4214d ago

I understand, and I saw this problem. It's actually quite hilarious that he got this far before noticing it.

The AI is very helpful for generating code, and that is exactly how you should use it: as a code generator.

deeviant14d ago

Have you people ever read human-generated code? Good grief, you act like human code is not a disaster 9 times out of 10.

codingfisch15d ago

rtgfhyuj14d ago

junior engineer vibes

m3kw914d ago

epec25415d ago

Not sure if just me, but this post feels AI written?

weregiraffe15d ago

You are absolutely right.

pipeline_peak15d ago

Feels a bit too long winded to be AI generated.

filoeleven14d ago

That's when he went back to writing his posts by hand.

royal__15d ago

The title is just flat out wrong. The author isn't going back to writing code by hand, they're plopping some new stuff into their CLAUDE.md to "fix" the issues they see AI is having.

holografix14d ago

Good luck finding a job. All the decision making business people I know see only two types of “technical people”.

The ones who are “AI pilled” and the contagious lepers.

magic_hamster15d ago

Let me preface my comment by saying I also still write a lot of code by hand - especially when it's something I know I need to understand in depth, and in some cases defend.

With that said, this caught my eye:

> AI gravitates toward single-struct-holds-everything because it satisfies the immediate prompt with minimal ceremony.

This stuff is still being figured out by a lot of people. But I feel the core of the issue is not using AI well. Scoping, task alignment, validation, are crucial.

aryan_kalra1214d ago

I've been saying the same thing and I'll repeat it again: AI is still gonna take away your job even if you switch domains.

nothinkjustai15d ago

Writing code by hand is an oxymoron. You don’t write code with AI, AI doesn’t write, it generates.

localhoster15d ago

Another behavior I noticed is that even if you plan with an agent, a lot of business logic leaks into the code.

Some states, for example, are meant to be inferred from the data shape rather than from actual state fields, but damn, they like adding a state field.
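
A minimal sketch of the difference, in Python with a made-up `Order` type (none of these names come from the article). The status is computed from which timestamps are present, so there is no separate field that can drift out of sync with the data:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Order:
    """Hypothetical record: the data shape itself carries the state."""
    paid_at: Optional[datetime] = None
    shipped_at: Optional[datetime] = None

    @property
    def status(self) -> str:
        # Derived on demand from the timestamps; never stored.
        if self.shipped_at is not None:
            return "shipped"
        if self.paid_at is not None:
            return "paid"
        return "pending"
```

The version an agent tends to produce adds a stored `status: str` next to the timestamps, and then every code path that sets `shipped_at` also has to remember to update `status`.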

blueTiger3315d ago

nuts

IceDane14d ago

This doesn't make any sense to me.

bbbflgllglhlld15d ago

Luddite.

recursive15d ago

Seems to be an unstated assumption that the Ludds were wrong.

devmor14d ago

I dismissed “AI Psychosis” as a silly term, even as a strong critic of LLMs for programming tools.

> For 7 months I'd been prompting and shipping without ever sitting down and actually reading the code Claude wrote.

But every time I read something like this, I seriously wonder about the mental state of the person that wrote it.

How do you get to this point?

nothinkjustai15d ago

Inb4 “you’re gonna be replaced” god damn it I hope so, I do not want to spend the rest of my life behind a computer screen…

Fokamul14d ago

I also code by hand.

But in my main work, reverse engineering, LLMs have been a godsend for years now.

You can basically bruteforce binary obfuscation thanks to them. And thanks to eager Chinese LLM providers, basically for free.

But I always use LLMs only for the boring work, and the rest is for me to do manually, or with scripts of course, but made by me. Because I want to learn.

Yes, there are a lot of people using LLMs for full RE automation since they're selling exploits for profit. No problem with me.

I see a funny future for huge corporations like Adobe, etc.

Imagine a prompt: "Hey Claude, re-implement Adobe Photoshop with clean-room design". One agent will open a decompiler and output the complete low-level technical details of how everything is implemented.

Second agent implements new Photoshop based on that.

They will be mad and I like this.

You will own nothing, and you will be happy, corpos.

duskdozer13d ago

>Second agent implements new Photoshop based on that.

>They will be mad and I like this.

I suspect through some convoluted legal mechanism this kind of thing is going to end up applying only to copyleft laundering and not against players like Adobe.

FpUser15d ago

>"I'm doing the design work myself, by hand, before any code gets written."

UrbanNorminal15d ago

Wow ok, I will too then. Fuck AI!

scuff3d15d ago

I feel like this article was circling a point it never actually got to. All the advice in here (except controlling scope creep) is specific to a TUI with an Elm-like architecture.

photochemsyn15d ago

Does ‘writing code by hand’ mean you’re not going to use compilers to generate assembly?

This whole investor bubble will blow up in the face of the rentier-finance capitalists and I’ll be laughing my head off while it happens.

green_wheel15d ago

Nondeterministic natural language compilers

photochemsyn15d ago

Just because the trajectory is chaotic doesn't mean it’s not deterministic.

zephen14d ago

A model, given exactly the same inputs, will return exactly the same outputs.

But your prompts are not the only inputs. Among other things, there is a random seed injected by the vendor.

That is a primary source of non-determinism.

And even if there were no non-determinism, the models suffer greatly (much more so than traditional compilers) from the butterfly effect.
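
A toy illustration of the seed point, in Python, with a made-up four-token vocabulary and made-up weights (this is not how any vendor's sampler is implemented). The sampler is a pure function of its inputs; it only looks non-deterministic when the caller doesn't control the seed:

```python
import random

# Hypothetical next-token distribution, purely for illustration.
VOCAB = ["fix", "refactor", "rewrite", "ignore"]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def sample(seed: int, n: int = 8) -> list[str]:
    """Draw n tokens; the seed is just another input to the function."""
    rng = random.Random(seed)
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)


# Same seed in, same tokens out.
assert sample(7) == sample(7)
```

Calling `sample` with a seed you never see, which is the situation with a hosted model, is indistinguishable from randomness from the caller's side.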

platevoltage15d ago

So C++ doesn't count as code now.

kypro15d ago

> I learned over these 7 months

dusted14d ago

Attempting anything comprehensive with AI is the software development analogue to the Gell-Mann Amnesia effect..

But damn it's a tough spot..

imperio5915d ago

Alternate title: "I did not understand the current limitations of AI and assumed it could do large software design and it generated spaghetti slop"

Yea, that's why engineers are still very important for now (until models can do this type of longer term designs and stick to them).

Towaway6914d ago

Until the costs become prohibitive and humans become cheaper than the agents that replaced them. Once the agents are replaced by the humans, the next hype bubble awaits around the bend.

Decabytes15d ago

We should go back to designing UML diagrams for programs before we write them /s

khutorni14d ago

I think we should, to a reasonable degree.

eggplantemoji6915d ago

TLDR ai wrote tech debt slop because I vibed for 7 months, now I am taking a hybrid approach of defining strict constraints before vibing…
