bonesss 2mo ago

Past the sea change: half the reason those prompt and harness solutions seem to work is LLM lies. The testing is gassing you up about how it works and how effective it is, defaulting to ‘yes’.

If you test specific features of those solutions over time you see very inconsistent results, lots of lies, and seemingly stable solutions that one-shot well but suddenly experience behaviour changes due to tweaks on the backend. Tuesday's awesome agent stack that finally works is loading totally differently on Thursday, and debugging is “oh, sorry, it’s better now” even when it isn’t. Compression, lies, and external hosting are a bad combo.

Sometimes I imagine a world where computers executed programs the same way each time. You could write some code once and run it a whole calendar month later with a predictable outcome. What a dream, we can hope I guess.
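
For anyone who wants to check the "test specific features over time" point themselves, here is a rough Python sketch, not anyone's actual harness. It assumes you can wrap your agent stack in a plain function that takes a prompt and returns text; `consistency_report`, `drifted`, and `make_flaky_runner` are hypothetical names made up for illustration, and the flaky runner is a stand-in, not a real model call.

```python
import hashlib
import itertools
from collections import Counter
from typing import Callable


def consistency_report(run: Callable[[str], str], prompt: str, trials: int = 5) -> dict:
    """Call `run` with the same prompt `trials` times and summarize agreement."""
    outputs = [run(prompt) for _ in range(trials)]
    counts = Counter(outputs)
    top_output, hits = counts.most_common(1)[0]
    return {
        "trials": trials,
        "distinct_outputs": len(counts),  # 1 means every run agreed
        "agreement": hits / trials,       # share of runs matching the top output
        # Short hash of the most common output, handy for comparing across days.
        "fingerprint": hashlib.sha256(top_output.encode("utf-8")).hexdigest()[:12],
    }


def drifted(baseline: dict, current: dict) -> bool:
    """True if the most common output changed between two reports."""
    return baseline["fingerprint"] != current["fingerprint"]


def make_flaky_runner() -> Callable[[str], str]:
    """Hypothetical stand-in for a hosted model: answers yes, yes, no, repeating."""
    answers = itertools.cycle(["yes", "yes", "no"])
    return lambda prompt: next(answers)
```

Save Tuesday's report, rerun on Thursday, and compare with `drifted`. Exact string matching is deliberately strict; for free-form output you would probably compare some extracted field instead, which is an assumption about your setup rather than something this sketch handles.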

skydhash 2mo ago

People are doing toy projects and praising them, while some are testing them in real-world situations and not finding them that useful. But the former are labelling the latter as luddites and telling them they will be left behind.

abustamam 2mo ago

As someone at the intersection of both (I've built a lot of vibe-coded toy projects and lead a vibe coding initiative at work), they're both right and both wrong.

For a single-dev team, vibe coding is great. Write specs, write plans, write code. I know what the project wants and needs because I'm the target market.

At work, I haven't written more than a few lines of code since December. But I work with other people vibe coding this same project. Lots of changing requirements and rapid iteration. Lots of mistakes were made by everyone involved. Lots of tech debt. Sure, we built something in 2 mos that would have otherwise taken us 6 mos, but now I'm fixing the mess that we caused.

I think the critical difference is the attitude towards our situation. My boss said to fix the AI harness so we can vibe code more confidently and freely. But other bosses might cut their losses and ban vibe coding. Who's right? I dunno. In both cases I'd just do what my boss wants me to do. But it's not that I don't want to be left behind. I don't want to lose my job. There's a difference.

patrick451 2mo ago

> Sure, we built something in 2 mos that would have otherwise taken us 6 mos, but now I'm fixing the mess that we caused.

You didn't actually build it in 2 months.

abustamam 2mo ago

Even if it takes me a month to get it fixed (likely a week tbh), then it took us 3 months to build.
