But it's consistently, wildly slower and falls short every time I've tried.
If it falls short every time you've tried, it's likely that one or more of these is true:

A. You're working on some really deep thing that only world-class experts can do, like optimizing graphics engines for AAA games.
B. You're using a language that isn't in the top ~10 most popular in AI models' training sets.
C. You have an opportunity to improve your ability to use the tools effectively.
How many hours have you spent using Claude Code?
Not exactly world-class software.
Using these tools takes quite a bit of effort, but even after doing all those steps to use the tool well, I still got this project done in a few days when it otherwise would have taken me 1-2 months and likely would never have happened at all.
It also matters whether you have a decent PRD or spec. Are you prompting the harness one bit at a time, or did you give it a complete spec and ask it to analyze it and break it down into individual issues with dependencies (e.g. using beads and beads_viewer)?
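For example, rather than feeding it one request at a time, the kickoff prompt might look something like this (hypothetical wording; "spec.md" is a made-up file name):

    Read spec.md in full. Break it down into individual issues with
    dependencies between them, file each issue in the tracker, and
    then work through them in dependency order, running the test
    suite after each one.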
I'm not looking for reasons to criticize your approach or question your experience, but your answers may point to opportunities for you to get more out of these tools.
If you're using Claude Code and you have a friend who has had more success with these tools, consider exporting your transcripts and letting them have a look: https://simonwillison.net/2025/Dec/25/claude-code-transcript...
This is a relatively common skill. One thing I always notice about the video game industry is that it's much more globally distributed than the rest of the software industry.
Being bad at writing software is Japan's whole thing, but they still make optimized video games.
The issues I ran into are primarily “tail-chasing” ones - it gets into some attractor that doesn’t suit the test case and fails to find its way out. I re-benchmark every few months, but so far none of the frontier models have been able to make changes that have solved the issue without bloating the codebase and failing the perf tests.
It’s fine for some boilerplate dedup or spinning up some web API or whatever, but it’s still not suitable for serious work.
It's insulting that criticism is so often met with superficial excuses and insinuations that the user lacks the required skills.
https://mitchellh.com/writing/my-ai-adoption-journey
My experience mirrors that of Mitchell. It absolutely is at the level now where AI can free up time to do the really interesting stuff.
GP said 'falls short every time I’ve tried'. Note the word 'every'.
Claude would be worse than an expert at this, but this is a benchmarkable task, and Claude can run experiments a lot quicker than a human can. The hard part would be ensuring that the results aren't just gaming your benchmark.
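A minimal sketch of that guard in Python (every name here is made up; the point is that correctness is checked against a slow-but-trusted reference on freshly generated inputs, so a change that hard-codes answers to a fixed test set fails instead of looking fast):

    # Sketch of a benchmark harness that is hard to game (assumed names:
    # optimize_me is the function the AI may change, reference_impl is a
    # slow but trusted implementation).
    import random
    import time

    def reference_impl(xs):
        # Trusted but slow: repeated minimum extraction.
        xs, out = list(xs), []
        while xs:
            m = min(xs)
            xs.remove(m)
            out.append(m)
        return out

    def optimize_me(xs):
        # The implementation the AI is allowed to change.
        return sorted(xs)

    def mean_runtime(trials=5, n=2000):
        rng = random.Random()  # fresh inputs every run, no fixed test set
        total = 0.0
        for _ in range(trials):
            data = [rng.randint(0, 10**6) for _ in range(n)]
            start = time.perf_counter()
            got = optimize_me(data)
            total += time.perf_counter() - start
            # Correctness comes first: fast-but-wrong fails immediately.
            assert got == reference_impl(data), "correctness regression"
        return total / trials

    if __name__ == "__main__":
        print(f"mean runtime over fresh inputs: {mean_runtime():.6f}s")

You'd still want to pin perf thresholds to a reference machine, but the oracle check on fresh inputs is what stops the benchmark from being gamed.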
I feel like comparing it to a junior developer is also becoming fairly outdated. Yes, it is worse in some ways, but also VASTLY superior in others.