undefined | Better HN

0 pointsjstummbillig1mo ago0 comments

I don't understand how people measure how much more or less work they need to do. It's not that gpt-4o was incapable of exuding enormous amounts of code quickly, it's that the tokens were relativ garbage.

How do you have an opinion on 4.6/4.7 here? It's less clear but I could totally see that 4.7 or beyond leads to project completion 20% faster, by removing dead ends, foot guns, less backtracking, etc.

How to tell / measure effectively? No clue.

0 comments

1 comments · 1 top-level

uberman1mo ago

My personal opinion here based on observations not empirical tested. 4.5 could generate code, but I often ran out of context and the results were regularly incomplete. The result was that I had to spend as much time proofing and debugging as I did making direct progress.

4.6 has what in practice seems to an almost unlimited context window and rarely produces incomplete or flat out wrong results. That is a big step forward though i do burn through quota much faster.

I have not formed an option yet how what 4.7 does for me other than to say I have observed my quota being consumed faster. To be fair, I have not put 4.7 to a challenging task yet.

It honestly surprises me that someone who regularly uses Claude would not have an opion about 4.6 or even Opus vs Sonnet at this point. The lift at least for me was obvious.

j / k navigate · click thread line to collapse