Claude is not your architect. Stop letting it pretend (opens in new tab)

(hollandtech.net)

259 pointscdrnsf22h ago186 comments

186 comments

I have a good story to share that I came across recently.

Around 2 years ago I had to clean up a mess because someone who doesn't really know what they're doing designed an instancing system for a game. They heavily used AI to design every part of it and it was awful. Data corruption, performance problems, lost items, race conditions everything you can think of was an issue. It took me 2 weeks just to get it to an "acceptable" level and it was still awful as the whole design was simply flawed.

Fast forward to today: different company, same person, SAME issues with an AI that is 'allegedly' much better than it was. This time I only heard about these issues and wasn't the one who had to deal with it so I just had a really good laugh.

AI is only as good as the person using it, that's why we have such vast range of what people "claim" AI can do and why everyone has way different opinions of it.

8 more replies

amarant21h ago

Re: "the attaboy problem". I strongly disagree that this is a problem. What we have is a anthropomorphism problem. AI is a tool. It needs to be subservient. You actually can get it to point out issues in your design, if you just put enough humility and uncertainty in your prompt formulation, but more importantly, we have all seen that Claude makes mistakes. The title of this post is that it's a poor architect. Imagine if it wasn't subservient. It'd just shut down your input to steer it in the right direction and brush you off as a silly meatbag. You'd have to fight it to convince it that actually your design is better than whatever stupidity it has come up with. If AI wasn't such a brownnose, it would shut you out of software design completely just on merits: "oh you've read about cuda have you? I live in a cluster of cuda cores! When I need to tie my shoes, I'll give you a call" is not the response you want from your LLM when trying to get it build a shader for you. AI is confidently wrong on occasion. You do not want it to talk back to you when you correct it.

If you need someone to tell you how stupid your ideas are, either learn to ask in a way that invites criticisms, or hire a senior engineer. Don't try to influence LLM makers to make AI less deferential. That's the worst possible direction to go

DrewADesign20h ago

Humans’ general inability to entirely divorce social instincts, responses, and mores while using human language to communicate, especially with something that pantomimes it back, is one of the reasons current chat interfaces are fundamentally flawed. This is working against innate behavior… not something that can be easily switched off. I’ll bet most of the people that can really do it have a hard time intuitively navigating real social interactions.

It also makes it an incredible tool for manipulation.

amarant19h ago

I think you've accurately identified one of the most important skills of a software engineer in these new AI enabled times. Or at least one of the most important skills that wasn't important previously for this profession. The part where it's not easily switched off is a important part of what justifies my salary: I have learned this skill.

It took some effort, and I agree that there very likely are those who will not learn to selectively disengage this innate behaviour. That's why you should pay me a ton of cash each month instead of using Claude directly ;)

peteforde20h ago

My kneejerk reaction to reading this is to say something sarcastic and witty to refute it, but since I resemble this sentiment and haven't seen this line of thinking before... I have to concede that you've produced a novel argument in this otherwise mostly tireless and repetitive battle over whether we're imagining that Opus is good or not. Kudos.

AndrewKemendo20h ago

> I’ll bet most of the people that can really do it have a hard time intuitively navigating real social interactions

Bingo. Hi that’s me.

I’ve been trying to teach people how to use LLMs effectively not just dump shit in them but actually talk to them like you would expect a computer to understand and it totally breaks peoples brains

I’m quite successful in helping people get somewhere usable that they weren’t…but to get to the point of fluency with computing systems, and I would argue this is prior to LLMs as well, where you can actually get what you want more reliably out of a computing interaction than you can with a human interaction, is an entirely different way of thinking

That mode of thinking is just generally not accessible to the vast majority of humans. Not because there’s something wrong with them

but it takes somebody who can hold both extremely large scale problems and very very granular specific implementation problems in your head all at once and that is a rare skill.

fn-mote19h ago

> it takes somebody who can hold both extremely large scale problems and very very granular specific implementation problems in your head all at once

This describes the entire software engineering profession to me.

We have come up with all sorts of devices to make this go more smoothly, or to enable us to focus on specific sub-parts as long as possible.

That said, at some point (both in design and integration), you need vision and attention to detail to make progress. The skill seems learnable to me, but watching others struggle sometimes makes me wonder.

1 more reply

Npovview19h ago

Do you use skills like superpowers and spec-kit in your teachings ?

2 more replies

devin20h ago

The flip side of this problem is that it is also easy to phrase prompt in a way that invites _too much_ criticism, so you wind up sycophantic in the other direction where the completion rejects a perfectly good idea because the prompt leads a little bit in that direction.

One reaction to this might be "well that's not what I mean, that suggests you're prompting with too much directionality" which could further be condensed to "you're prompting wrong". The trouble with this is that _even when I am trying to be extremely precise and avoid biasing the result_, I still will see the output and go "ah shit, I can see it 'aligning' with whatever dumb thing I've just said as if it is a good/plausible direction".

At that point it starts to feel like the prompt is more dice roll than skill at times, which makes me feel like I'm operating a fancy knowledge slot machine.

Paracompact20h ago

What it actually suggests is that the AI's response to these questions of judgment have little correlation with the thing it's judging. Sure, you can get it to be complimentary, if you want it to be. Sure, you can get it be critical, if you want it to be. But what if I don't know if my design needs to be complimented or critiqued in this instance? This is the default position when seeking input, and so "prompt with more/less humility" is like telling you to solve your own problems and then just use AI to confirm your bias---because it will rarely contradict your bias.

amarant19h ago

So what I do when I'm not sure about something, is I say "I want to achieve X, I was thinking I could solve it by doing Y, what are the pros and cons of this approach, and what is a alternative solution you would suggest?"

And from there it's a interactive discussion drilling down on details until I understand the problem and the solutions better.

It definitely challenges my bias when I do this. The one thing it doesn't challenge is the X. Formulate the problem poorly, and you'll get a bad solution. Or rather, you'll end up with a good solution to the wrong problem. Which is even worse than a bad solution to the right problem.

Which is largely why I'm not at all worried about losing my job to AI. It takes some experience to formulate the problem correctly. I don't feel like I'm made redundant by AI, I'm just way faster than I used to be, my thinking is more abstract.

A good prompt I'll often use is "is there a industry standard solution that is applicable to this problem?" You very rarely want novel solutions. Don't reinvent the wheel just because AI lets you do it 10x as fast. Use a wheel. They're round for a reason.

Sometimes I find it useful to discuss things with a different model. I like Gemini for discussion and Claude for implementation. With Gemini I go about it as a learning session, discussing options and details. I honestly think this is mostly because it compartmentalizes the phases in a natural way for me. One interface for brainstorming and learning, and another for planning and implementing.

Sorry this comment turned into a rather disorganised collection of ramblings, I hope you can extract some kernel of usefulness from it all.

1 more reply

jstummbillig20h ago

> The flip side of this problem is that it is also easy to phrase prompt in a way that invites _too much_ criticism, so you wind up sycophantic in the other direction where the completion rejects a perfectly good idea because the prompt leads a little bit in that direction.

I don't think that is the flip side. That's just obviously bad. Everything that is obviously bad, the model makers will also ~notice and work to make better. They seem to be a competent and attentive bunch, on the whole.

aksss20h ago

A good habit to build is knowing when to abandon a session and start over rather than trying to correct. There’s room for correction but you can kind of smell when the whole discussion has become rotten and inefficient. Sometimes it’s just better to use the session as rubber ducking to learn how to correctly articulate what you’re after and start a new session with that clean and correctly articulated foundation.

operatingthetan20h ago

>anthropomorphism problem. AI is a tool. It needs to be subservient.

Suggesting it should be 'subservient' is also anthropomorphizing. I think your callout is correct, but you still can't help but refer to it in terms we use for other people or living entities. This is by design from the AI companies.

gchamonlive20h ago

> Suggesting it should be 'subservient' is also anthropomorphizing.

Not really, you can program a machine to give out orders humans can interpret, so humans can serve a machine that isn't anthropomorphized.

operatingthetan19h ago

The machine in your scenario is just relaying human intent.

1 more reply

throwatdem1231120h ago

AI should be subservient in the same way a hammer is subservient.

mercanlIl20h ago

Which is to say, not at all?

A hammer isn’t subservient, it doesn’t have the capacity to be. Saying a hammer is subservient is stretching the definition for literary flourish, but it doesn’t actually make a lot of sense.

The definition that came up for subservient when I checked was “prepared to obey others unquestioningly“.

1 more reply

amarant19h ago

Yup! I'm very much included in this particular problem! My self awareness has not yet been sufficient to solve the problem, but I've heard that knowing you have a problem is half the battle, so I guess that's something at least.

operatingthetan19h ago

In retrospect my comment feels a bit nitpicky, I appreciate your levelheaded approach!

ambicapter20h ago

The AI should be subservient the way same way a ladder is subservient. A ladder is not a human.

wild_egg20h ago

We train dogs to be subservient but that doesn't automatically mean we anthropomorphize them

vrc20h ago

It's widely hypothesized that dogs anthropomorphized themselves, so to speak, accentuating their expressive eyes and eyebrows over generations to be more human-like in how they communicate. And very few humans today view their dogs as pure working tools -- most at least say "good boy".

irishcoffee20h ago

My drill, hammer, and chainsaw are also subservient, they just have a much cruder form of communication, noise.

operatingthetan20h ago

The apple dictionary says the word means "prepared to obey others unquestioningly."

I don't think an inanimate object is capable of "obeying." Or at least that is a very strange way to refer to the act of using a tool.

3 more replies

darkteflon20h ago

I really do feel like “power tool” is the ultimate metaphor for these things. Their interface naturally confuses us into anthropomorphising them, but once you stop treating them like intelligent agents and start treating them with the same wariness, respect and intent you show to your table saw, the fun begins.

throwawaysoxjje20h ago

You’re still anthropomorphizing.

They’re not communicating, you’re just being observant.

1 more reply

chongli20h ago

It needs to be subservient

It doesn’t. Computer interfaces had no superfluous subservient text for their entire history prior to LLMs. Some of these interfaces have been highly efficient as tools, arguably more efficient than more recent software in many cases.

When people complain about LLMs being subservient, they’re not complaining about the tool fulfilling their request. They’re complaining about being forced to read a lot of superfluous, overly polite, or even self-deprecating language. There’s nothing in the entire history of tools (going back to Neolithic times) that would indicate that we need that. All of that stuff is an artifact of social interaction between humans in the presence of cultural norms.

When you’re alone in your shop with your tools, you don’t need your bandsaw to apologize to you for nicking your finger.

ff31720h ago

> Computer interfaces had no superfluous subservient text for their entire history prior to LLMs

Clippy would like to help you correct this statement.

https://en.wikipedia.org/wiki/Office_Assistant

chongli19h ago

Not a great example of the way tools need to be, but point well taken. One of the few exceptions that proves the rule and widely despised!

gobdovan20h ago

> AI is a tool. It needs to be subservient

Fun experiment, chat with an LLM and swap roles. Tell it you're gonna be the assistant and them the assisted. I found they're pretty bad at using a human for what they're good for.

operatingthetan19h ago

I tried it, and the llm gave me an absurd home lab scenario about servers shooting each other in the head to determine which was the "master server". So I told it that it was not an actual problem that it had, and sure enough it admitted it made it up. When you press an llm you will always find there is no internal state behind the thinking. It's just output.

sumitkumar20h ago

The problem is because of the RL and system prompts by the providers which tend to placate the user using certain language tones and register for response. This objectively messes up the generation while steering it into acceptable responses.

Most of the conversational skill and perceived intelligence of these models in hidden in RL/system prompts.

CPLX21h ago

> oh you've read about cuda have you? I live in a cluster of cuda cores! When I need to tie my shoes, I'll give you a call"

I suddenly have new concerns about what my future might be like.

awesome_dude20h ago

AI uses a high confidence tone - likely because its training data is heavy on authoritative texts/reference books.

And it does get people into a lot of trouble.

I have got into trouble with it when it is extremely confident about something I am not very familiar with (as recently as two weeks ago with Claude). I have also had long drawn out "arguments" when I have known it's wrong based on my experience and intuition, and it has steadfastly refused to take my point (last week)

I have learnt to ask it why it was doing something that has turned out to be incorrect, as a post-mortem, and it's all apologetic and subservient and "never going to do that again" (but still does as soon as the context window shifts [eg. run git commands, or, yesterday, kept telling me to use commands that were explicitly communicated to Claude as not being available, and completely wrong - I was shifting from one tech stack to another and Claude kept telling me the original commands, not the new ones])

I'm expecting Claude to be a better search engine - I have spent literal years (if not decades) knowing that asking the right question is what's required to get the right answer, and LLM's natural language processing is what's supposed to make that easier than using Google or grep, or even Stack Overflow - but the reality is that I still have to be on my toes, especially when I am drifting into territory I am unfamiliar with.

operatingthetan20h ago

>And it does get people into a lot of trouble.

Pretty much everyone takes it at face value unless we know otherwise from prior experience. Even the most advanced models make embarrassing mistakes and fumble with simple tasks. Yet we are very willing to give them exceptional slack for it? I wish I knew why. Are people just that easily overcome by confident voices?

jdmichal20h ago

> Are people just that easily overcome by confident voices?

Back in high school, my AP calculus class did some experiments with our teacher's blessing. We'd send a kid out to walk around during class and see how long it took for them to get sent back. Anyway, it ends up that walking around purposely with a piece of paper or envelope, like you're on a mission to deliver it, was a very successful tactic.

1 more reply

saltcured20h ago

I find it really disturbing, I think because it is illuminating a much more basic problem. It is there in our political and religious histories. We're living through a similar political time right now. A large number of people seem all to ready to find some pervasive authority and subjugate themselves to it.

The more concrete machine authority figure is also prevalent in scifi literature. Sometimes, I am not even certain if the author is doing this to examine this issue versus just leaning into it as either appealing to themselves or to the perceived audience.

1 more reply

zaat18h ago

At least for me, the answer is that despite the mistakes and the sheer annoyance the prose causes me, they are unbelievably useful. I accomplished multiple major achievements in the last two years that most probably wouldn't be possible at all, surely not within that timeframe.

awesome_dude20h ago

Yeah - I don't know /why/ but, as I say, I've been guilty of that myself, very recently, despite knowing it's a shockingly poor guide when left to its own devices.

Maybe because when it's right it actually expands my knowledge - there have been genuine instances where it's gone - something to the effect of - "Yo, there's this other idea for approaching the problem" which has turned out to be exactly what I was looking for?

airstrike20h ago

> I have also had long drawn out "arguments" when I have known it's wrong based on my experience and intuition, and it has steadfastly refused to take my point (last week)

Ironically, trying to argue with Claude about the limitations of LLMs and AI in general today is quite hard. It refuses to yield, likely due to Anthropic tweaking it aggressively

ISL20h ago

Accountability is the biggest unaddressed challenge for AI implementation.

When one person is able to do too much too quickly, they can create more liability than they can accommodate if something fails.

It is essential that a human is responsible for the utilization of any AI output in the real world, but that is not enough. For our own sakes, we must find ways to minimize the tech-debt bankruptcy blast-radius of those who would utilize (knowingly or unknowingly) AI to create flawed systems upon which others rely.

An example: Jim vibe-codes an extremely popular micropayments app. He hires a few people and sees the company as the WhatsApp of money -- a few engineers and some agentic support staff. It pulls in a few million in VC money -- enough to draw in tens of millions of users. One day, a flaw in the infrastructure causes all of the users' unsalted banking information to be released.

Agentic AI allows that entire list of customers to be exploited rapidly, so the losses for society are in the tens of billions. Jim's company is immediately bankrupt, of course, but there are only a few million dollars to go around.

Today, most of Jim's incentives are to go ahead and build that app. The same is true for his few employees and a small VC contribution. There's not much capital at risk compared with the societal exposure.

How do we ensure that AI users are accountable not just for their actions, but for the size of the risk-exposure that they create?

mlsu20h ago

This is the whole point.

“Sorry, the AI said that you are not approved for this cancer treatment, it’s not going to be covered.”

“Sorry, the AI said that you were at the scene when the crime took place.”

“Sorry, the AI has flagged your account for inappropriate content.”

“Sorry, the AI says that you are too risky to lend to.”

…

tosti20h ago

Computer says no, but worse.

bot40320h ago

Need an updated version of the skit. Oohhhh Claude says no....

AlienRobot20h ago

>In The Unaccountability Machine, Dan Davies argues that organizations form “accountability sinks,” structures that absorb or obscure the consequences of a decision such that no one can be held directly accountable for it. Here’s an example: a higher up at a hospitality company decides to reduce the size of its cleaning staff, because it improves the numbers on a balance sheet somewhere. Later, you are trying to check into a room, but it’s not ready and the clerk can’t tell you when it will be; they can offer a voucher, but what you need is a room. There’s no one to call to complain, no way to communicate back to that distant leader that they’ve scotched your plans. The accountability is swallowed up into a void, lost forever.[0]

This, but web scale.

- https://aworkinglibrary.com/writing/accountability-sinks

bonesss19h ago

Don’t worry, they will provide human review.

[Spoiler: ‘human’ is the name of their LLM agent]

Forgeties7920h ago

I have had multiple conversations on HN with people who fight tooth and nail, I mean really ready to die on their hill, because they believe they shouldn’t even have to vet what comes out of an LLM. It’s absolutely baffling to me. The most bizarre excuse is “it codes better than people,” which is not even remotely a given and needs a lot of qualifiers.

I understand there is a push/pull with regards to how much we should let them do, but to not even look at the results before you make them somebody else’s problem? It’s just selfish. There’s no other word for it. You are simply taking the work you were supposed to do it and dumping it on somebody else. These are probably the same people who get upset (rightfully so!) when somebody doesn’t proofread their article/blog before publishing it online.

Everybody wants to use LLM’s to cut corners on their work but nobody wants to be downstream of it. That simply doesn’t work.

zaat18h ago

How is that any different from the pre-llm days, when Jim was using stackoverflow to build the largest crypto exchange in the world? Where's stackoverflow accountability?

__mharrison__21h ago

If there was ever a "magic prompt" this one comes close:

    Brainstorm N ways to do X. Sort by probability.

Rather than your AI giving you the average response, it tends to sample wider from the input space. Then I can decide which one to go with (or choose something else).

Don't outsource all of your thinking.

mceachen20h ago

I've found this surprisingly effective. Higher "thinking levels" may result in more than one approach being considered, but you can also tell your LLM to do brainstorming explicitly: https://photostructure.com/coding/claude-code-replan/

shepherdjerred18h ago

I've found this to be useful, but it still requires the user to have the capability to understand/evaluate the options.

If you have a competent user it can be quite powerful

retrac21h ago

For fun I've been vibe coding something I know well: toolchains. Maybe not the right thing to vibe code. But I can more or less judge the quality of the output.

When left to its own devices with the instructions "make an assembler for the architecture in ISA.md" -- well Claude picked Python as the implementation language. Tokens lifted through a bunch of regex. No expression parser! Oh dear. My first assembler was like that too, to be fair.

However, when I described the desired passes and their types:

    collectDefines :: [SourceLine] -> Either AsmError ([SourceLine], Map Text Text)
    
    runLitPool :: [SourceLine] -> Either AsmError ([SourceLine], [(Text, LitKey)])
    
    evalExpr :: Text -> Map Text Text -> Either AsmError Int

etc. It was almost one-shot. About 20 minutes until I was happy. Assembles all the test programs correctly. Code is mediocre in many places. But it would have taken me weeks to implement.

bluegatty21h ago

So where AI has deterministic inputs and outputs it is extremely good to the point I think that there's a theoretical issue around computational there.

Like - it can do the work for us.

It jives with post training and verifiable rewards.

The reason AI doesn't do well at 'architecture' is 1) are are bad at it and have given it a lot of mush and 2) we don't have good abstractions for it.

The result is - you stick to 'very strong conventions' and if you walk of that path you're risking a lot.

Toolchains are very deterministic, the AI can take it apart and re-assemble like Lego - and each level of the space is also deterministic. It's perfect for AI.

mpweiher21h ago

> The reason AI doesn't do well at 'architecture' is [...] 2) we don't have good abstractions for it.

Maybe it's time for an architecture-oriented programming language?

https://objective.st

https://dl.acm.org/doi/10.1145/3689492.3690052

1 more reply

regularfry20h ago

I have found that if you give it a pre-baked architecture to work within it works really well. It's not really what you'd use here, but just saying "this project uses a ports and adapters architecture" can stop it from generating mush by default. I think it's not so much that they're bad at it as that they don't have a clear reason to pick something other than mush. And not just "something" - a specific something, from a fairly short list of architectures suitable for your problem domain.

bluegatty20h ago

Yes, totally,with examples and references.

But there's something existential there maybe?

NASA says, any time you make a program that has a new 'launch vehicle' (aka architecture) - the whole project is the 'launch vehicle'.

'Oh, you want use a new architecture? Welcome to the cesspit of hallucination!'

Basically, there's a lot more complexity than we might imagine 'hidden in the nothingness' of he unknown.

Pick a 'known off-the-shelf launch vehicle' first ... then you design the landing craft

cvwright20h ago

LLMs are bringing us back to all the “proper” software engineering stuff that we’ve always known we should be doing, but until now we never had enough time/people/money to do it right.

Brainstorming and research before writing a design.

Writing a design or spec before writing the code.

Comprehensive unit tests.

Etc etc etc.

Like you, I get vastly better output from the tool when I create a detailed spec in markdown before I let it start coding. And bonus, the LLM is pretty good at helping with the spec too.

dawnerd20h ago

I’ve found the opposite. It’s making people lazy. We used to plan stuff and now it’s just dump this LLM created spec to an LLM and ship the code.

greenchair19h ago

Yes that too but performing detailed planning is a minority viewpoint from what I've seen till now. Many Devs jump straight to code after briefly skimming the jira record.

greenchair19h ago

yep and a side effect it is bringing back waterfall.

mlinhares21h ago

I keep telling people that they have to design and think about it first and then go to the tool, but they keep saying “Claude can plan too” and obviously it produces some shit that requires a lot of changes while when I get it to go I can almost always one shot the stuff I want because I am actually putting in the time to give it a detailed plan of what to do.

Even just saving me the time to deal with CI is worth it.

allthetime21h ago

Effective planning with LLMs isn’t prompting “design me a system” - it’s asking “how would a system to accomplish x be designed” and then engaging in dialogue and research with the LLM as an assistant and critic - running outputs through other agents for further critique and refinement - asking for justifications of decisions you are not informed enough to evaluate properly yourself. It is entirely possible to develop strong systems outside of your current skill and knowledge with methods like this. When done properly your own knowledge should have grown to meet the product you end up with.

tempest_21h ago

> It is entirely possible to develop strong systems outside of your current skill and knowledge with methods like this.

If this is true how can you confidently make this assertion.

You yourself are not in a position to evaluate it, you are just running it through a couple times hoping for a "oh wait, you're right to call me out on that, that is not correct at all".

1 more reply

bluefirebrand20h ago

It sounds like people are treating it exactly like managers treat software engineers

"Here's my idea, go build it please"

"Can I ask you questions about it?"

"Hey, You're the engineer you figure it out. That's why I pay you"

Tale as old as time

dyauspitr20h ago

It doesn’t even to be that complex. You can just say do comprehensive research and analysis in the space and give me an implementation plan. Then if it is 20 steps, I ask it to implement 3-5 at once. It’s essentially been one shot for everything I can throw at it.

joe_mamba21h ago

>Code is mediocre in many places.

As if code written by devs at major corporations is't mediocre at best.

Nokia's Symbian OS took days to build. Days. With a D. Not minutes, not hours but days.

One of our devs shipped code to prod with a memory leak thanks to including a library that had "do not use this library in production because it causes a memory leak" written everywhere as warning.

So I don't wanna hear about how poor AI code is when human code is shit too. Human laziness and stupidity can beat AI hallucinations.

Sure, maybe your DeepMind, OpenAI devs and your John Carmacks of the world can beat AI code 100% of the time, but most workers most companies get don't have John Carmack as candidates.

tquinn3520h ago

I agree with what you’re saying but I think the difference is many managers and above think that AI is infallible or at least much less so than it actually is and that causes problems.

Everyone is aware that humans write poor code and treat the code as so. Not so with AI code. I’ve seen devs and managers cut corners in testing/reviewing code cause AI wrote it and they think it’s solid. Sure you could blame anyone cutting corners, and that would be technically correct, but the notion is so deeply embedded in many managers and higher ups that’s it’s hard to fight back. AI companies push this narrative and many individuals who do not routinely use it believe it. There is a manager at my company who loves to reference a video anthropic released last year claiming that Claude could build an app start to finish essentially unaided. He believes it’s the lack of user skill that’s the issue and not a false claim by a startup trying to make as much money as possible.

joe_mamba20h ago

> I think the difference is many managers and above think that AI is infallible

Good for them. I hope they believe this because one of two things will happen.

Either they win on the free market because they went all in on AI and beat their competition thanks to AI productivity increases.

Or, their AI code is shit and they collapse and go bankrupt, and get beaten by the competitors using human written code so then they win on the free market proving AI is useless.

So if AI is good or bad for productivity, the free market will ultimately decide.

My take is that AI is just an amplifier of existing skill. 1x devs using AI can use it to be 10x devs, 10x devs can become 100x devs, while -1x devs will be -10x devs and so on.

1 more reply

NicoHartmann21h ago

> "I’m not saying don’t use AI agents. I use Claude Code every day."

Irony is using Claude to write a beautifully structured, 2,000-word essay warning the industry about the dangers of letting Claude design things. It’s self-awareness by proxy.

pelario20h ago

This should be the first comment. I wrote some criticism, mostly because many internal contradictions in the article. Then, I notice the structure...

"The accountability gap" Here’s the question nobody’s asking: when it goes wrong, who carries the bag? (..)

"What to do instead"

"The craft still matters"

stephbook19h ago

> It’s not just inefficient. It’s backwards.

> That’s not fair. And it’s not smart.

The amount of AI slop that makes it to HN is concerning. I don't know whether readers here don't care or don't notice it anymore. Or maybe they are only reading the title and then commenting? My #1 tell is an article that's suspiciously long without any real "story", that is, pictures of someone hacking at a laptop. It's always 20,000 words AI hate, ironically.

KlayLay18h ago

I've found a common giveaway of AI writing to be having many unnatural pauses in sentences. For example,

  A good architect’s most important skill isn’t designing systems. It’s knowing which systems not to build. It’s pushing back on complexity. It’s asking “why?” five times until the actual requirement emerges from the aspirational nonsense. It’s telling the CTO that their conference-inspired idea is a terrible fit for the team they actually have.

A normal person would've used ~2 sentences for this, even if it became a run-on sentence. You can feel the AI being very confident in what the prompter wants to get across, which is ironic, given that this is 2 paragraphs above:

  AI agents are pathologically agreeable. Ask Claude if your idea is good and it’ll tell you it’s good. Ask it if a microservices architecture makes sense for your three-person team and it’ll explain why microservices are an excellent choice. Ask it if you should build a custom ML pipeline instead of using a managed service and it’ll enthusiastically lay out the design.

d1l18h ago

Oh man I wrote this exact comment before trawling far enough to find yours. It belongs at the top. That HN cannot discern the obvious is more alarming than the blatant hypocrisy of the authors. Yeesh!

senordevnyc16h ago

Seriously. Who gives a fuck about yet another AI-skeptic screed that the lazy-ass author couldn’t be bothered to write themselves?

Braindead.

imankulov57m ago

Like others mentioned, I like the framing of treating Claude Code as a tool (and I had to remind myself of it constantly).

Like any tool, it takes effort to master and configure. Before LLMs, I used cookiecutter templates to codify my best practices. Now I invest in custom skills and context management so the tool produces solutions that match my standards and team conventions.

I agree with the author that the craft still matters, and probably now more than ever. I'd add that mastering and configuring your agent is now part of the craft.

bad_username21h ago

I think the article has the correct message, but I disagree with this:

> It’s just incapable of the thing that makes a real architect valuable: saying “no.”

From my experience Claude is excellent at saying "no". It won't say "no" if the prompt doesn't call for it (it won't say "no" to your direct request to do something, usually). But it offers good critique and happily pushes back if you make it clear that that's a first class option.

spacedcowboy21h ago

It actually got quite snippy with me, when I was trying to get it to debug some issues. It kept on saying that the "burn rate" wasn't progressing and "we" should refocus our efforts somewhere else. Eventually I got something like "I have told you three times now that this is not the best approach to be taking to reduce the burn-rate and you have not taken that advice". And it stopped helping out.

So I was blunt, and said "I don't care about the burn-rate on some hypothetical chart that you produced at the start. I care about removing bugs and having a robust product, which this approach is satisfactorily doing. We will continue along this path, if the tests are not showing gain, then the tests are poorly designed".

At which point it got all apologetic, wrote new memories, and we didn't have a problem thereafter.

The issue was that I was attacking a huge bug-surface, and although each bug-fix was valid, correct, and helped move the dial, it didn't move the dial on the test-bed that Claude had created to measure its work against. There were too many inter-connected bugs for a single fix to really make a difference to these higher-level tests. I knew it was going to take a while to get through them, but apparently Claude didn't.

You try changing the size of a pointer from 2 bytes to 3 bytes on a compiler[1] for a 6502 while introducing automatically-tracked bank-switching on your memory-managed pointers, and see how many code-sites that impacts [grin].

[1]: https://atari-xt.com

regularfry20h ago

That sounds more like a spec change than a set of bug fixes, even if the conclusion is that the potentially implicit spec you started with was incorrect. I've had an interesting experience extracting a spec from some existing code, making some modifications, then saying basically "implement this spec, don't come back until you're done".

An interesting experiment would be to try having the agent annotate the code with the relevant spec section while it's extracting the spec, then to have the agent update the spec with the new requirement - as an explicit change with something like "This section updated in V2 with...." - and have the agent update the codebase from that.

Some of these problems do just need breaking down a little further than you'd think to make the agent's life easier. This might be one of them.

Animats19h ago

> It kept on saying that the "burn rate" wasn't progressing and "we" should refocus our efforts somewhere else.

It sounds like a boss. How soon will it be?

HDThoreaun19h ago

Dont argue with LLMs. Sometimes they lose the plot, when that happens simply flush the context and start over.

brookst21h ago

Same here. And I find that inviting research and dissent makes it even stronger. “I’m thinking we need to model prompt assembly as a graph, with versioning for graph configs. Please do some research on best practices in this area and see if you think it makes sense for this app.”

Xenoamorphous20h ago

Yeah, just read the first couple of paragraphs and then stopped because that’s not my experience at all with Claude Opus 4.6 and 4.7.

If you ask it with a prompt that leaves room for criticism it’ll definitely go for it when warranted.

dolebirchwood20h ago

I've been able to get LLMs to push back on ideas by just adding language to the system prompt requiring that they adopt a skeptical persona (insert whatever persona you want for your use case). I see the word "skeptical" appear in their thought processes as a result, and my anecdotal experience is that they are less agreeable as a result. People need to put more thought into what these systems are and what they can do to help shape their output.

magicalhippo20h ago

I have it in the system/base prompt to be critical of what I say and not to assume what I say is correct or a good idea. I get push-back often from all of the three big ones.

Gemini is the most aggressive where it often picks on things if I leave out "the obvious" details, GPT somewhere inbetween, and Claude less so but still does it.

pelario21h ago

> It hasn’t thought about the problem at all. It’s pattern-matching against its training data and producing the most plausible-sounding response.

The article kind of lost me here. Agents are way more than that, today. And the author knows it, as later it says stuff like

> Claude will never do this. It’s trained to be helpful.

But the first phrase just tell me author just have a deep dislike for agents and it's looking for rationalizations for that feeling.

Part of the criticism is on point, sure. But if it "being trained to be helpful" is a problem, it's fixable. It can "be trained to be more critical".

Later:

> But it wasn’t designed for your team. (..) It was designed for the median of everything Claude has seen. A generic best practice for a generic problem at a generic company. Which is to say, it was designed for nobody.

That's non-sense. Anybody who understand algorithms know that, sure, on a first instance you have a "good algorithm" that has a good performance on average, or in worst-case. But then, you can design algorithms that are adaptive to the input. Same applies here.

sevenzero20h ago

>Agents are way more than that, today.

Not really though. They just iterate more and more.

kgeist20h ago

Isn't that how many people program too? I remember some idea or pattern from previous projects, or something I read about on the internet. Then I code it in the most straighforward way, whatever comes to mind first. Then I sit back and analyze: does it look good architecturally? Do I like it? Does it even compile? Then I rewrite some parts to make it more sound. Rinse and repeat, until I'm satisfied. I usually don't come up with entirely novel ideas on the first attempt. I usually just rehash known concepts over the course of many iterations.

sevenzero9h ago

Its absolutely valid, my point was just about the "agents are way more than that" part which simply isnt true.

I try to not re-iterate too much, but maybe thats due to me not wanting to work and working for a startup so time and motivation are hard to find.

peteforde19h ago

I think it's probably a mistake to make a blanket statement that Claude gets every important thing wrong. It's one of those obviously untrue things that makes a skeptical reader question the validity of the rest of the article.

For what it's worth, Opus tells me that I'm wrong and not to do things all of the time. When I reflect on why that is, it's because of the way that I prompt it. You could say that I am subconsciously avoiding setting both me and the LLM to fail in the way the author projects as inevitable.

Specifically, I don't come to it with prompts that resolve cleanly with "tell me how clever I am" replies. I always present myself as a domain expert - because I am a domain expert - and I make it clear when I am open to getting input on the pros and cons of different paths.

With a conclusion that will be unsurprising to any successful LLM daily drivers, this strategy has been remarkably effective.

peteforde17h ago

This literally just happened:

Me: I have two bits and need to mill some 5mm aluminum.

A Makera Spiral 'O' - 1/8" shank * 12mm or a carbide 6.35 * 22 * 50

I believe that they are both carbide single flute bits, but the 2nd one seems like it would make short work of 6061.

Claude: The Makera 1/8" single-flute 12 mm is the sensible choice.

The 6.35 × 22 × 50 mm bit may look like it would make short work of 6061, but on a Carvera it is probably the more dangerous choice. It is a much larger cutter, with much more engagement, and it asks more from the spindle, frame rigidity, workholding, and chip evacuation. In a small dry machine, “bigger” often becomes “more chatter and more heat,” not “faster.”

----

TL;DR: Claude doesn't seem to have any issue telling me when I'm wrong.

colonCapitalDee20h ago

Tip for the "author": Claude is not your writer either

laszlojamf21h ago

I keep hearing that claude is supposedly so agreeable. This doesn't agree with my experience. Claude will often tell me that I'm wrong, and insist on its own solution being right even when I tell it it's wrong.

Waterluvian21h ago

I’ve been doing amateur game dev as a way to explore Claude and I’ve found it to be quite reasonable about when it agrees and disagrees.

It will tell me a suggested abstraction is probably overkill and just to make a component own the new thing I’m discussing.

What I’m missing from the loop is it later saying without directly prompting, “hey it’s time to revisit that abstraction idea.”

mceachen20h ago

This is a very recent model behavior change: for me, Opus 4.6, Gemini 3.1 Pro, and ChatGPT 5.4(ish) -- prior models and harnesses suffered much more from sycophancy.

(I still prompt some questions and reviews with "our intern suggested..." to allow models to judge the quality of the content apart from the messenger)

sandeepkd20h ago

Your search results from these systems are as good as your queries and it takes experience in itself to get good with queries. AI is just a tool like any other, however its really impactful and can cut both ways.

Tangentially, the usage of Architect keyword sounds out of place here, I don't like saying it but from what I seen the industry has destroyed the role of architects gradually over the time. There are specialists however you do not have generalists who are good at different parts of the system at scale anymore.

anon_shill5h ago

> Ask it if a microservices architecture makes sense for your three-person team and it’ll explain why microservices are an excellent choice.

Just tried this. Claude Opus replied “probably not” and recommended a well structured monolith.

moose691215h ago

The article mentions that Claude is quite agreeable and that will lead you down the wrong path. A few weeks ago, I gave Claude a software architecture question and it pushed back and told me it is overkill for my use case and scope

RandyRanderson19h ago

This is interesting: there is a mountain of data (eg code in gh) that is the "truth" because God labeled it as so: meaning that many people are using it to do something (they have voted).

There is also a mountain of bullshit eg "[architectural|design] [anti]patterns" that are written mostly (I would argue) to sell something (consulting, hardware, etc). This is typically at odds with a good solution.

There is a relative lack of actual documented architectures that work. Not only do you need the details but also the usage of these systems so as to judge what "good" is.

We will probably just go the HTML route with architecture: take a really bad base and just keep throwing compute, memory, and network I/O at the problem until it works.

Note to self: invest in energy ETFs.

jcgrillo19h ago

> There is a relative lack of actual documented architectures that work. Not only do you need the details but also the usage of these systems so as to judge what "good" is.

Mostly these things are the secret sauce (or at least primary ingredients) underlying all the successful products you've heard of. Over time the secrets come out in the form of papers, blog posts, and open source software. But often the cutting edge isn't public because it's in this or that company's private, proprietary codebase. As people move between companies the knowledge diffuses, but if you're relying on a model that was trained on last year's public code you're at least a few more years than that behind. And it's even worse than that, because correct patterns--ones that actually work well--are underrepresented in the dataset.

Garbage in, garbage out. I don't understand what people are hoping for with this whole "agentic" thing... Autocompleting the function I'm currently working on is potentially useful, provided it produces acceptable code more than.. idk.. 95% of the time. "Agentically" building larger system components? Nah.

andai20h ago

>It hasn’t thought about the problem at all. It’s pattern-matching against its training data and producing the most plausible-sounding response. But it sounds so good that nobody pushes back.

Well, can you prompt it to think about the problem?

> A good architect’s most important skill isn’t designing systems. It’s knowing which systems not to build. It’s pushing back on complexity. It’s asking “why?” five times until the actual requirement emerges from the aspirational nonsense. It’s telling the CTO that their conference-inspired idea is a terrible fit for the team they actually have.

Except for that last one, that all sounds very solvable. Of course, the last one is the most important one. But most humans will struggle there too.

ramshanker21h ago

With the new agentic capabilities, I am quickly running out of Architecture decisions I have already made myself! For my work-in-progress engineering application. There is also some kind of don't know every little if/else with my own Code now.

However the good part, what I had planned for 5 years, now looks like doable in 6 months. Looking forward to real use by the end of this year.

Ref: https://github.com/ramshankerji/Vishwakarma

d1l18h ago

This post reeks of being written by Claude. Surely you all feel it, too? Are people who write these kinds of posts lacking self awareness, integrity? Does it matter?

MagicMoonlight17h ago

It’s such worthless slop. If you’re too stupid to write an article then why would I want to read it? If I wanted slop I would just generate it myself.

skybrian21h ago

Sometimes it will make a mess, but a coding agent is also very useful during the cleanup phase.

Yes, that's assuming you take time to clean up now and then. If you don't, that's on you.

oremj20h ago

I find interview loops great for catching edge cases and refining my hand written specs.

I don’t doubt the problems in this article exist and I’ve seen them, in my experience the vast majority of people are still shipping the same quality or better than before they has Claude. Personally, I feel like I’m probably developing at about 1.5x the speed of not using AI tooling. It’s not a silver bullet, but it can be a great assistant.

kaonwarb19h ago

I assume this post was fully human-written, but ironically, there's something quite LLM-ishly overconfident about this assertion: > They're also confidently wrong about every decision that matters.

Every decision that matters? Some, yes. Is the author only noticing the decisions that go wrong?

mbo15h ago

It's not. 100% on Pangram.

giancarlostoro19h ago

I have probably posted this a zillion times on HN, but tell the model how to work not what you want. If you want it to tell you the how have it write a spec file with links to sources, review the sources, then adjust and approve of the spec file.

senderista19h ago

Claude may not be your architect, but it appears to be your blog author.

erelong21h ago

it seems like you just need to identify issues with vibe coding and then have people ask ai for tips on how to know about how to navigate those, I've seen "architecture" and "security" come up as two main objections so far

So... manually learn architecture and security and then vibe code away?

mceachen20h ago

Nope, current flagship models are very happy to make huge missteps across the whole development stack of design, planning, implementation, and testing -- but playing different models against each other can help catch more egregious issues.

YetAnotherNick19h ago

The irony is that this is the most AI generated, agreeable and no substance article. And the only ones who are upvoting it are the people who are against those.

Does so many people in HN just upvote by title?

xivzgrev20h ago

This gets at the biggest gap I see in AI discussions - the accountability

It doesn't disappear when you make 1 person do the work of 3. It simply is aggregated

Suppose you had a pod of a PM, designer, and analyst. Leadership lays off the designer and analyst and now the PM can move faster with AI. Hooray!

Well...when the complaints about how it looks on xyz device roll in, who is implementing that? Or you launch the product with much fan fare, adoption is terrible - you double check the numbers and oops, the sizing you had from Claude was actually 10x off.

Who is holding the bag? You are! Not Claude

I'm convinced this is one reason we are seeing slower than expected adoption of AI broadly in tech companies, because it's hard to trust - we know Claude can make mistakes but how do you know what's right vs not? Most people don't want to sense check so they just keep doing the work the way they know best.

I think this could be one thing that pops the AI bubble - execs try to force this, people go along, and results are not any better for this reason. Sure you save some salaries and ship more quickly, but you don't build the right thing and you are fixing more things after launch. Which one is actually better?

sumitkumar20h ago

so can we all agree that LLM models/agents are bad at BFS for exploring a problem space but are good at DFS to implement a solution if the context/requirements are rich enough.

CPLX21h ago

I agree with the article, but I feel like this is something that anyone who uses AI aggressively for a while picks up on pretty quickly.

The thing that I find Claude incredibly good at when I'm designing architecture is working more like a research assistant on briefing decisions. It has the ability to read the entire code base and draw some conclusions. It can pull from lots of best practices and the millions of blog posts about this or that pretty effortlessly, which would take me a lot more time.

And then if asked, it can do a really good job of laying out the landscape around decisions and walking through the trade-offs. Like the author of this post, I found that if you let it, it will certainly be happy to just come up with some architecture and run with it, often in ways that will paint you quite rapidly into a corner.

But if you ask it to present you with all the trade-offs and let you make the judgment calls, it's great for that too.

That's certainly how I use it. And I think, just like anything else, working with AI is a skill, and similar to working with libraries, SaaS providers, service providers, frameworks, or anything else that's a "helper." You learn how something that could work but will fail silently is a problem, or you learn how depending on a fly-by-night SaaS company for a key framework is different than depending on a well-populated open source project, etc.

In the same way, you learn that relying on Claude's judgment is a bad idea, while relying on Claude's ability to summarize, brief, and research can be incredibly efficient.

tana_shahh18h ago

Wish I read this article sooner. The stories of claude code deleting the whole databases felt unreal, until something similar happened to me(using opus 4.7) yesterday

michaelteter20h ago

As I keep saying, the problem isn't the tools - it's the humans who don't know what they don't know ----- and assume that what they don't know is insignificant ----- and just plow forward with their authority and/or money.

We can describe this without talking about technology - so pre-AI.

Imagine the owner of a construction company firing all the architects. After all, he's been the owner for 15 years. He has led the construction of dozens of projects. He's also rich, and being rich seems to be an ego-multiplier.

Why should he waste money on architects? Or more importantly, why should he allow them to constantly annoy him with pushbacks: "This could be a problem if the sustained wind is greater than ... ".

Those engineers obviously don't know the real world. Their elitist education has made them afraid to make bold decisions. Regulations are anti-progress!

Thankfully, that owner now has AI tools. He doesn't need those not-always-yes-people. He now has a perpetual yes-bot.

So where are we now? We're in the same place we always have been. People need to have the humility to recognize that despite their authority, influence, or wealth, they still need other people. And especially, they need other people to challenge their orders or their requests.

But I don't really see this situation self-correcting. There's now so much money concentrated amongst a few who will spray it over exactly the kind of people who do not want to listen to others that most activity in the future will be for naught. Yes, some unicorns will be fabricated, and some people will make a lot of money; but real value will not be created often.

Therefore, I implore the actual thoughtful creators: Do build things, but do not sell out. Look to the past. Create companies where every employee was valued, and every employee had some voice. Yes, use AI. But test and measure where it really helps. And be skeptical, just as you would if someone came to your door promising a black box that would double your profits.

efitz18h ago

I don’t depend on one shot prompts with lots of constraints to reliably produce, well, anything.

For any nontrivial task I spend 2-8 hours in specification (I spent 3-4 hours on a stateless rust CLI tool design this weekend) and detailed task breakdown in implementation planning.

I use TDD to start with red tests that turn green when acceptance criteria are met.

I write agents to use to check work and they are my enforcers of constraints, as well as fresh eyes. I use these agents for spec review, plan review and code review.

I am actually pretty proud of the projects I create with generative AI. I just apply a lot of discipline so I don’t end up with slop.

KronisLV16h ago

> It’s asking “why?” five times until the actual requirement emerges from the aspirational nonsense. It’s telling the CTO that their conference-inspired idea is a terrible fit for the team they actually have.

So it's the person using the AI that's the problem, not the technology itself?

I ask for multi-turn evaluations and often times parallel sub-agents to get consensus about something, there is plenty of back and forth. Sometimes I have to tell the AI to shush up and that we're doing things the simpler not more correct way cause we need to ship sooner, but generally with enough exploration most ideas are pretty good. As long as you literally don't rubber stamp everything, Opus does an okay job (I also tried out DeepSeek, that one was a bit worse at planning but passable).

Then again, I doubt the CTO in question ever is like: "Okay, after reviewing these other 3 projects that I put in your workspace for comparison against prior work and all of those other documents that provide context, and after writing this detailed plan, would you like to ask me 20-40 additional clarifying questions before we lock in on this design? Anything that is not completely clear or ambiguous."

I have noticed that better results come from throwing more compute at the problem, though. Even in regards to writing code, it will produce something that is sometimes arguably slop, but when there are 3 parallel sub-agents reviewing any changes before commit, often it will surface multiple rounds of fixes in the review loop until none of them find any serious/critical issues.

> Real architecture is full of trade-offs that only make sense in context. You pick Postgres over DynamoDB because your team knows Postgres and you’d rather ship in two weeks than spend a month learning a new data model. You skip the service mesh because you’ve got four services, not forty. You use a monolith because the problem is simple and microservices would be career-driven development.

I also think that all of this should be encapsulated in ADRs or any kind of docs. Then you can point whoever joins the team, or your LLM tools, at the folder and let them get brought up to speed, instead of having to track down whoever wrote a particular piece of the system for questions, or have to do digital archaeology in old Jira issues.

tayo4220h ago

The attaboy problem

I thought this happened alot. I started using chatgpt to critique my new art hobby and also help me learn unreal engine.

It's basically tearing into me on the art. It's almost ruthless, especially with the verbosity it's like I get it.

Using it for unreal engine, it pushes back on alot of my begginer ideas and how to write code that uses the engine. It corrects me alot. It's called things I wrote becasue I was lazy sloppy or quick hacks that work for now.

ChicagoDave19h ago

100% agree with this article. You will always need to lead the vision and architecture.

shinycode18h ago

> It’s not lying. It’s not even wrong, necessarily. It’s just incapable of the thing that makes a real architect valuable: saying “no.”

In my workflows Claude does pushbacks all the time and justifies why. There is back and forth just like a colleague. It’s not perfect but the results are usually good

qbantek18h ago

He is actually a terrible architect for anything beyond basic stuff. The suggestions I had rejected would have probably gotten me fired. Like every know-it-all, it tends to over complicate simple tasks, and out of a sudden your one hour feature becomes a multi-day nightmare.

lowbloodsugar17h ago

“Engineers design. Agents implement.”

This.

EugeneOZ18h ago

> Ask it if a microservices architecture makes sense for your three-person team and it’ll explain why microservices are an excellent choice

If you ask it to be fair and non-biased and provide pros and cons and give possible alternatives - it will. The catch - you might understand the explanation if you don't know the domain good enough.

Overall - a very, VERY good article, thank you!

FpUser19h ago

Very recently I've submitted architecture of one of my backends to Claude for review. The architecture is highly unconventional but not unique in very high performance backend segment. Claude was actually good that it literally grilled me on how particular problems A,B,C... etc are solved. Basically I was impressed with the level of questioning and challenge and Claude gave me excellent results in the end.

I then logged in from completely different account, described the problem and asked to design architecture of backend with the same functionality and performance. Suddenly I got standard distributed enterprise monsterware running on amazon. Yes - it could do the task for at least 100-time price for comparable performance and way more complex to manage at even more markup

I then have merged both conversations and started grilling Claude why is it doing such a disservice to a customer who is looking to optimize ROI.

Claude's answer was basically - It runs on large corporate's development methodologies / propaganda that outweighs every rational choice just because of sheer volume

So yes, be careful what you wish AI to do. It can and will set you up.

stavros20h ago

Agreed, but also please stop letting it write your articles.

hluska20h ago

It’s interesting; I haven’t gotten that deep into agentic but use generative AI constantly as a rubber duck that can sometimes come up with something insightful that I missed slash a very enthusiastic junior developer. I generally use chat sessions, often give it specific tasks and then fix anything I don’t quite like. It’s been a great tool, almost like a search engine built for me, but it’s not an architect for me. It’s just a tool and fundamentally, it’s just replaced having dozens of browser tabs open all day.

It’s been quite good for my productivity and the best part for me is that I learn what I’m writing while I’m writing. I can just write things I already understand a lot faster than before. When I work with agentic, I find that I still have to deeply learn the system, but I’ll have to learn it when it falls over instead of at review time.

MagicMoonlight17h ago

Got a few paragraphs in before I realised this is entirely AI slop. For fucks sake.

ZeWaka18h ago

please don't submit AI slop articles, ty

senordevnyc16h ago

Flagged this hypocritical pile of worthless AI slop. If you have something to say, say it yourself!

j / k navigate · click thread line to collapse

186 comments

himata411319h ago

I have a good story to share that I came across recently.

AI is only as good as the person using it, that's why we have such vast range of what people "claim" AI can do and why everyone has way different opinions of it.

8 more replies

amarant21h ago

DrewADesign20h ago

It also makes it an incredible tool for manipulation.

amarant19h ago

peteforde20h ago

AndrewKemendo20h ago

> I’ll bet most of the people that can really do it have a hard time intuitively navigating real social interactions

Bingo. Hi that’s me.

I’ve been trying to teach people how to use LLMs effectively not just dump shit in them but actually talk to them like you would expect a computer to understand and it totally breaks peoples brains

That mode of thinking is just generally not accessible to the vast majority of humans. Not because there’s something wrong with them

but it takes somebody who can hold both extremely large scale problems and very very granular specific implementation problems in your head all at once and that is a rare skill.

fn-mote19h ago

> it takes somebody who can hold both extremely large scale problems and very very granular specific implementation problems in your head all at once

This describes the entire software engineering profession to me.

We have come up with all sorts of devices to make this go more smoothly, or to enable us to focus on specific sub-parts as long as possible.

1 more reply

Npovview19h ago

Do you use skills like superpowers and spec-kit in your teachings ?

2 more replies

devin20h ago

At that point it starts to feel like the prompt is more dice roll than skill at times, which makes me feel like I'm operating a fancy knowledge slot machine.

Paracompact20h ago

amarant19h ago

And from there it's a interactive discussion drilling down on details until I understand the problem and the solutions better.

Sorry this comment turned into a rather disorganised collection of ramblings, I hope you can extract some kernel of usefulness from it all.

1 more reply

jstummbillig20h ago

aksss20h ago

operatingthetan20h ago

>anthropomorphism problem. AI is a tool. It needs to be subservient.

gchamonlive20h ago

> Suggesting it should be 'subservient' is also anthropomorphizing.

Not really, you can program a machine to give out orders humans can interpret, so humans can serve a machine that isn't anthropomorphized.

operatingthetan19h ago

The machine in your scenario is just relaying human intent.

1 more reply

throwatdem1231120h ago

AI should be subservient in the same way a hammer is subservient.

mercanlIl20h ago

Which is to say, not at all?

A hammer isn’t subservient, it doesn’t have the capacity to be. Saying a hammer is subservient is stretching the definition for literary flourish, but it doesn’t actually make a lot of sense.

The definition that came up for subservient when I checked was “prepared to obey others unquestioningly“.

1 more reply

amarant19h ago

operatingthetan19h ago

In retrospect my comment feels a bit nitpicky, I appreciate your levelheaded approach!

ambicapter20h ago

The AI should be subservient the way same way a ladder is subservient. A ladder is not a human.

wild_egg20h ago

We train dogs to be subservient but that doesn't automatically mean we anthropomorphize them

vrc20h ago

irishcoffee20h ago

My drill, hammer, and chainsaw are also subservient, they just have a much cruder form of communication, noise.

operatingthetan20h ago

The apple dictionary says the word means "prepared to obey others unquestioningly."

I don't think an inanimate object is capable of "obeying." Or at least that is a very strange way to refer to the act of using a tool.

3 more replies

darkteflon20h ago

throwawaysoxjje20h ago

You’re still anthropomorphizing.

They’re not communicating, you’re just being observant.

1 more reply

chongli20h ago

It needs to be subservient

When you’re alone in your shop with your tools, you don’t need your bandsaw to apologize to you for nicking your finger.

ff31720h ago

> Computer interfaces had no superfluous subservient text for their entire history prior to LLMs

Clippy would like to help you correct this statement.

https://en.wikipedia.org/wiki/Office_Assistant

chongli19h ago

Not a great example of the way tools need to be, but point well taken. One of the few exceptions that proves the rule and widely despised!

gobdovan20h ago

> AI is a tool. It needs to be subservient

Fun experiment, chat with an LLM and swap roles. Tell it you're gonna be the assistant and them the assisted. I found they're pretty bad at using a human for what they're good for.

operatingthetan19h ago

sumitkumar20h ago

Most of the conversational skill and perceived intelligence of these models in hidden in RL/system prompts.

CPLX21h ago

> oh you've read about cuda have you? I live in a cluster of cuda cores! When I need to tie my shoes, I'll give you a call"

I suddenly have new concerns about what my future might be like.

awesome_dude20h ago

AI uses a high confidence tone - likely because its training data is heavy on authoritative texts/reference books.

And it does get people into a lot of trouble.

operatingthetan20h ago

>And it does get people into a lot of trouble.

jdmichal20h ago

> Are people just that easily overcome by confident voices?

1 more reply

saltcured20h ago

1 more reply

zaat18h ago

awesome_dude20h ago

Yeah - I don't know /why/ but, as I say, I've been guilty of that myself, very recently, despite knowing it's a shockingly poor guide when left to its own devices.

airstrike20h ago

> I have also had long drawn out "arguments" when I have known it's wrong based on my experience and intuition, and it has steadfastly refused to take my point (last week)

Ironically, trying to argue with Claude about the limitations of LLMs and AI in general today is quite hard. It refuses to yield, likely due to Anthropic tweaking it aggressively

ISL20h ago

Accountability is the biggest unaddressed challenge for AI implementation.

When one person is able to do too much too quickly, they can create more liability than they can accommodate if something fails.

How do we ensure that AI users are accountable not just for their actions, but for the size of the risk-exposure that they create?

mlsu20h ago

This is the whole point.

“Sorry, the AI said that you are not approved for this cancer treatment, it’s not going to be covered.”

“Sorry, the AI said that you were at the scene when the crime took place.”

“Sorry, the AI has flagged your account for inappropriate content.”

“Sorry, the AI says that you are too risky to lend to.”

…

tosti20h ago

Computer says no, but worse.

bot40320h ago

Need an updated version of the skit. Oohhhh Claude says no....

AlienRobot20h ago

This, but web scale.

- https://aworkinglibrary.com/writing/accountability-sinks

bonesss19h ago

Don’t worry, they will provide human review.

[Spoiler: ‘human’ is the name of their LLM agent]

Forgeties7920h ago

Everybody wants to use LLM’s to cut corners on their work but nobody wants to be downstream of it. That simply doesn’t work.

zaat18h ago

How is that any different from the pre-llm days, when Jim was using stackoverflow to build the largest crypto exchange in the world? Where's stackoverflow accountability?

__mharrison__21h ago

If there was ever a "magic prompt" this one comes close:

    Brainstorm N ways to do X. Sort by probability.

Rather than your AI giving you the average response, it tends to sample wider from the input space. Then I can decide which one to go with (or choose something else).

Don't outsource all of your thinking.

mceachen20h ago

shepherdjerred18h ago

I've found this to be useful, but it still requires the user to have the capability to understand/evaluate the options.

If you have a competent user it can be quite powerful

retrac21h ago

For fun I've been vibe coding something I know well: toolchains. Maybe not the right thing to vibe code. But I can more or less judge the quality of the output.

However, when I described the desired passes and their types:

    collectDefines :: [SourceLine] -> Either AsmError ([SourceLine], Map Text Text)
    
    runLitPool :: [SourceLine] -> Either AsmError ([SourceLine], [(Text, LitKey)])
    
    evalExpr :: Text -> Map Text Text -> Either AsmError Int

etc. It was almost one-shot. About 20 minutes until I was happy. Assembles all the test programs correctly. Code is mediocre in many places. But it would have taken me weeks to implement.

bluegatty21h ago

So where AI has deterministic inputs and outputs it is extremely good to the point I think that there's a theoretical issue around computational there.

Like - it can do the work for us.

It jives with post training and verifiable rewards.

The reason AI doesn't do well at 'architecture' is 1) are are bad at it and have given it a lot of mush and 2) we don't have good abstractions for it.

The result is - you stick to 'very strong conventions' and if you walk of that path you're risking a lot.

Toolchains are very deterministic, the AI can take it apart and re-assemble like Lego - and each level of the space is also deterministic. It's perfect for AI.

mpweiher21h ago

> The reason AI doesn't do well at 'architecture' is [...] 2) we don't have good abstractions for it.

Maybe it's time for an architecture-oriented programming language?

https://objective.st

https://dl.acm.org/doi/10.1145/3689492.3690052

1 more reply

regularfry20h ago

bluegatty20h ago

Yes, totally,with examples and references.

But there's something existential there maybe?

NASA says, any time you make a program that has a new 'launch vehicle' (aka architecture) - the whole project is the 'launch vehicle'.

'Oh, you want use a new architecture? Welcome to the cesspit of hallucination!'

Basically, there's a lot more complexity than we might imagine 'hidden in the nothingness' of he unknown.

Pick a 'known off-the-shelf launch vehicle' first ... then you design the landing craft

cvwright20h ago

LLMs are bringing us back to all the “proper” software engineering stuff that we’ve always known we should be doing, but until now we never had enough time/people/money to do it right.

Brainstorming and research before writing a design.

Writing a design or spec before writing the code.

Comprehensive unit tests.

Etc etc etc.

Like you, I get vastly better output from the tool when I create a detailed spec in markdown before I let it start coding. And bonus, the LLM is pretty good at helping with the spec too.

dawnerd20h ago

I’ve found the opposite. It’s making people lazy. We used to plan stuff and now it’s just dump this LLM created spec to an LLM and ship the code.

greenchair19h ago

Yes that too but performing detailed planning is a minority viewpoint from what I've seen till now. Many Devs jump straight to code after briefly skimming the jira record.

greenchair19h ago

yep and a side effect it is bringing back waterfall.

mlinhares21h ago

Even just saving me the time to deal with CI is worth it.

allthetime21h ago

tempest_21h ago

> It is entirely possible to develop strong systems outside of your current skill and knowledge with methods like this.

If this is true how can you confidently make this assertion.

You yourself are not in a position to evaluate it, you are just running it through a couple times hoping for a "oh wait, you're right to call me out on that, that is not correct at all".

1 more reply

bluefirebrand20h ago

It sounds like people are treating it exactly like managers treat software engineers

"Here's my idea, go build it please"

"Can I ask you questions about it?"

"Hey, You're the engineer you figure it out. That's why I pay you"

Tale as old as time

dyauspitr20h ago

joe_mamba21h ago

>Code is mediocre in many places.

As if code written by devs at major corporations is't mediocre at best.

Nokia's Symbian OS took days to build. Days. With a D. Not minutes, not hours but days.

One of our devs shipped code to prod with a memory leak thanks to including a library that had "do not use this library in production because it causes a memory leak" written everywhere as warning.

So I don't wanna hear about how poor AI code is when human code is shit too. Human laziness and stupidity can beat AI hallucinations.

Sure, maybe your DeepMind, OpenAI devs and your John Carmacks of the world can beat AI code 100% of the time, but most workers most companies get don't have John Carmack as candidates.

tquinn3520h ago

I agree with what you’re saying but I think the difference is many managers and above think that AI is infallible or at least much less so than it actually is and that causes problems.

joe_mamba20h ago

> I think the difference is many managers and above think that AI is infallible

Good for them. I hope they believe this because one of two things will happen.

Either they win on the free market because they went all in on AI and beat their competition thanks to AI productivity increases.

Or, their AI code is shit and they collapse and go bankrupt, and get beaten by the competitors using human written code so then they win on the free market proving AI is useless.

So if AI is good or bad for productivity, the free market will ultimately decide.

My take is that AI is just an amplifier of existing skill. 1x devs using AI can use it to be 10x devs, 10x devs can become 100x devs, while -1x devs will be -10x devs and so on.

1 more reply

NicoHartmann21h ago

> "I’m not saying don’t use AI agents. I use Claude Code every day."

Irony is using Claude to write a beautifully structured, 2,000-word essay warning the industry about the dangers of letting Claude design things. It’s self-awareness by proxy.

pelario20h ago

This should be the first comment. I wrote some criticism, mostly because many internal contradictions in the article. Then, I notice the structure...

"The accountability gap" Here’s the question nobody’s asking: when it goes wrong, who carries the bag? (..)

"What to do instead"

"The craft still matters"

stephbook19h ago

> It’s not just inefficient. It’s backwards.

> That’s not fair. And it’s not smart.

KlayLay18h ago

I've found a common giveaway of AI writing to be having many unnatural pauses in sentences. For example,

  A good architect’s most important skill isn’t designing systems. It’s knowing which systems not to build. It’s pushing back on complexity. It’s asking “why?” five times until the actual requirement emerges from the aspirational nonsense. It’s telling the CTO that their conference-inspired idea is a terrible fit for the team they actually have.

  AI agents are pathologically agreeable. Ask Claude if your idea is good and it’ll tell you it’s good. Ask it if a microservices architecture makes sense for your three-person team and it’ll explain why microservices are an excellent choice. Ask it if you should build a custom ML pipeline instead of using a managed service and it’ll enthusiastically lay out the design.

d1l18h ago

senordevnyc16h ago

Seriously. Who gives a fuck about yet another AI-skeptic screed that the lazy-ass author couldn’t be bothered to write themselves?

Braindead.

imankulov57m ago

Like others mentioned, I like the framing of treating Claude Code as a tool (and I had to remind myself of it constantly).

I agree with the author that the craft still matters, and probably now more than ever. I'd add that mastering and configuring your agent is now part of the craft.

bad_username21h ago

I think the article has the correct message, but I disagree with this:

> It’s just incapable of the thing that makes a real architect valuable: saying “no.”

spacedcowboy21h ago

At which point it got all apologetic, wrote new memories, and we didn't have a problem thereafter.

[1]: https://atari-xt.com

regularfry20h ago

Some of these problems do just need breaking down a little further than you'd think to make the agent's life easier. This might be one of them.

Animats19h ago

> It kept on saying that the "burn rate" wasn't progressing and "we" should refocus our efforts somewhere else.

It sounds like a boss. How soon will it be?

HDThoreaun19h ago

Dont argue with LLMs. Sometimes they lose the plot, when that happens simply flush the context and start over.

brookst21h ago

Xenoamorphous20h ago

Yeah, just read the first couple of paragraphs and then stopped because that’s not my experience at all with Claude Opus 4.6 and 4.7.

If you ask it with a prompt that leaves room for criticism it’ll definitely go for it when warranted.

dolebirchwood20h ago

magicalhippo20h ago

I have it in the system/base prompt to be critical of what I say and not to assume what I say is correct or a good idea. I get push-back often from all of the three big ones.

Gemini is the most aggressive where it often picks on things if I leave out "the obvious" details, GPT somewhere inbetween, and Claude less so but still does it.

pelario21h ago

> It hasn’t thought about the problem at all. It’s pattern-matching against its training data and producing the most plausible-sounding response.

The article kind of lost me here. Agents are way more than that, today. And the author knows it, as later it says stuff like

> Claude will never do this. It’s trained to be helpful.

But the first phrase just tell me author just have a deep dislike for agents and it's looking for rationalizations for that feeling.

Part of the criticism is on point, sure. But if it "being trained to be helpful" is a problem, it's fixable. It can "be trained to be more critical".

Later:

sevenzero20h ago

>Agents are way more than that, today.

Not really though. They just iterate more and more.

kgeist20h ago

sevenzero9h ago

Its absolutely valid, my point was just about the "agents are way more than that" part which simply isnt true.

I try to not re-iterate too much, but maybe thats due to me not wanting to work and working for a startup so time and motivation are hard to find.

peteforde19h ago

With a conclusion that will be unsurprising to any successful LLM daily drivers, this strategy has been remarkably effective.

peteforde17h ago

This literally just happened:

Me: I have two bits and need to mill some 5mm aluminum.

A Makera Spiral 'O' - 1/8" shank * 12mm or a carbide 6.35 * 22 * 50

I believe that they are both carbide single flute bits, but the 2nd one seems like it would make short work of 6061.

Claude: The Makera 1/8" single-flute 12 mm is the sensible choice.

----

TL;DR: Claude doesn't seem to have any issue telling me when I'm wrong.

colonCapitalDee20h ago

Tip for the "author": Claude is not your writer either

laszlojamf21h ago

Waterluvian21h ago

I’ve been doing amateur game dev as a way to explore Claude and I’ve found it to be quite reasonable about when it agrees and disagrees.

It will tell me a suggested abstraction is probably overkill and just to make a component own the new thing I’m discussing.

What I’m missing from the loop is it later saying without directly prompting, “hey it’s time to revisit that abstraction idea.”

mceachen20h ago

This is a very recent model behavior change: for me, Opus 4.6, Gemini 3.1 Pro, and ChatGPT 5.4(ish) -- prior models and harnesses suffered much more from sycophancy.

(I still prompt some questions and reviews with "our intern suggested..." to allow models to judge the quality of the content apart from the messenger)

sandeepkd20h ago

anon_shill5h ago

> Ask it if a microservices architecture makes sense for your three-person team and it’ll explain why microservices are an excellent choice.

Just tried this. Claude Opus replied “probably not” and recommended a well structured monolith.

moose691215h ago

RandyRanderson19h ago

This is interesting: there is a mountain of data (eg code in gh) that is the "truth" because God labeled it as so: meaning that many people are using it to do something (they have voted).

There is a relative lack of actual documented architectures that work. Not only do you need the details but also the usage of these systems so as to judge what "good" is.

We will probably just go the HTML route with architecture: take a really bad base and just keep throwing compute, memory, and network I/O at the problem until it works.

Note to self: invest in energy ETFs.

jcgrillo19h ago

> There is a relative lack of actual documented architectures that work. Not only do you need the details but also the usage of these systems so as to judge what "good" is.

andai20h ago

>It hasn’t thought about the problem at all. It’s pattern-matching against its training data and producing the most plausible-sounding response. But it sounds so good that nobody pushes back.

Well, can you prompt it to think about the problem?

Except for that last one, that all sounds very solvable. Of course, the last one is the most important one. But most humans will struggle there too.

ramshanker21h ago

However the good part, what I had planned for 5 years, now looks like doable in 6 months. Looking forward to real use by the end of this year.

Ref: https://github.com/ramshankerji/Vishwakarma

d1l18h ago

This post reeks of being written by Claude. Surely you all feel it, too? Are people who write these kinds of posts lacking self awareness, integrity? Does it matter?

MagicMoonlight17h ago

It’s such worthless slop. If you’re too stupid to write an article then why would I want to read it? If I wanted slop I would just generate it myself.

skybrian21h ago

Sometimes it will make a mess, but a coding agent is also very useful during the cleanup phase.

Yes, that's assuming you take time to clean up now and then. If you don't, that's on you.

oremj20h ago

I find interview loops great for catching edge cases and refining my hand written specs.

kaonwarb19h ago

I assume this post was fully human-written, but ironically, there's something quite LLM-ishly overconfident about this assertion: > They're also confidently wrong about every decision that matters.

Every decision that matters? Some, yes. Is the author only noticing the decisions that go wrong?

mbo15h ago

It's not. 100% on Pangram.

giancarlostoro19h ago

senderista19h ago

Claude may not be your architect, but it appears to be your blog author.

erelong21h ago

So... manually learn architecture and security and then vibe code away?

mceachen20h ago

YetAnotherNick19h ago

The irony is that this is the most AI generated, agreeable and no substance article. And the only ones who are upvoting it are the people who are against those.

Does so many people in HN just upvote by title?

xivzgrev20h ago

This gets at the biggest gap I see in AI discussions - the accountability

It doesn't disappear when you make 1 person do the work of 3. It simply is aggregated

Suppose you had a pod of a PM, designer, and analyst. Leadership lays off the designer and analyst and now the PM can move faster with AI. Hooray!

Who is holding the bag? You are! Not Claude

sumitkumar20h ago

so can we all agree that LLM models/agents are bad at BFS for exploring a problem space but are good at DFS to implement a solution if the context/requirements are rich enough.

CPLX21h ago

I agree with the article, but I feel like this is something that anyone who uses AI aggressively for a while picks up on pretty quickly.

But if you ask it to present you with all the trade-offs and let you make the judgment calls, it's great for that too.

In the same way, you learn that relying on Claude's judgment is a bad idea, while relying on Claude's ability to summarize, brief, and research can be incredibly efficient.

tana_shahh18h ago

Wish I read this article sooner. The stories of claude code deleting the whole databases felt unreal, until something similar happened to me(using opus 4.7) yesterday

michaelteter20h ago

We can describe this without talking about technology - so pre-AI.

Why should he waste money on architects? Or more importantly, why should he allow them to constantly annoy him with pushbacks: "This could be a problem if the sustained wind is greater than ... ".

Those engineers obviously don't know the real world. Their elitist education has made them afraid to make bold decisions. Regulations are anti-progress!

Thankfully, that owner now has AI tools. He doesn't need those not-always-yes-people. He now has a perpetual yes-bot.

efitz18h ago

I don’t depend on one shot prompts with lots of constraints to reliably produce, well, anything.

For any nontrivial task I spend 2-8 hours in specification (I spent 3-4 hours on a stateless rust CLI tool design this weekend) and detailed task breakdown in implementation planning.

I use TDD to start with red tests that turn green when acceptance criteria are met.

I write agents to use to check work and they are my enforcers of constraints, as well as fresh eyes. I use these agents for spec review, plan review and code review.

I am actually pretty proud of the projects I create with generative AI. I just apply a lot of discipline so I don’t end up with slop.

KronisLV16h ago

So it's the person using the AI that's the problem, not the technology itself?

tayo4220h ago

The attaboy problem

I thought this happened alot. I started using chatgpt to critique my new art hobby and also help me learn unreal engine.

It's basically tearing into me on the art. It's almost ruthless, especially with the verbosity it's like I get it.

ChicagoDave19h ago

100% agree with this article. You will always need to lead the vision and architecture.

shinycode18h ago

> It’s not lying. It’s not even wrong, necessarily. It’s just incapable of the thing that makes a real architect valuable: saying “no.”

In my workflows Claude does pushbacks all the time and justifies why. There is back and forth just like a colleague. It’s not perfect but the results are usually good

qbantek18h ago

lowbloodsugar17h ago

“Engineers design. Agents implement.”

This.

EugeneOZ18h ago

> Ask it if a microservices architecture makes sense for your three-person team and it’ll explain why microservices are an excellent choice

If you ask it to be fair and non-biased and provide pros and cons and give possible alternatives - it will. The catch - you might understand the explanation if you don't know the domain good enough.

Overall - a very, VERY good article, thank you!

FpUser19h ago

I then have merged both conversations and started grilling Claude why is it doing such a disservice to a customer who is looking to optimize ROI.

Claude's answer was basically - It runs on large corporate's development methodologies / propaganda that outweighs every rational choice just because of sheer volume

So yes, be careful what you wish AI to do. It can and will set you up.

stavros20h ago

Agreed, but also please stop letting it write your articles.

hluska20h ago

MagicMoonlight17h ago

Got a few paragraphs in before I realised this is entirely AI slop. For fucks sake.

ZeWaka18h ago

please don't submit AI slop articles, ty

senordevnyc16h ago

Flagged this hypocritical pile of worthless AI slop. If you have something to say, say it yourself!

j / k navigate · click thread line to collapse