Socials:
- cal.com/mengzhang
- linkedin.com/in/mzh0
- github.com/wsxiaoys
---
Many would argue this feels restrictive. Isn’t it just easier to edit an earlier message, tweak the prompt, and move on? That’s what most coding agents allow, and on the surface it looks convenient. But in practice, editing conversation history throws away the most valuable signal you have.
Editing a past prompt erases the record of how the agent actually reasoned. You lose the failed attempts, the wrong assumptions, the dead ends, and the recovery steps. That history is exactly what you need to understand why something worked or didn’t.
So instead of letting you edit a past prompt and erase everything after it, we built the ability to fork from that point. With our append-only model, every attempt becomes a checkpoint: you can fork from different moments, try alternative approaches, and compare outcomes side by side.
That way, you don’t have to pretend the first attempt was correct; you acknowledge that AI coding is inherently trial-and-error.
If you give an agent one shot, you get one outcome. If you give it five attempts from the same starting point, your chances of getting what you want go up dramatically. And append-only history makes that exploration cheap and safe, allowing you to treat failure as data and not something to hide or overwrite.
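To make the append-only model concrete, here is a minimal sketch of how such a history might be structured. The names (`Thread`, `append`, `fork`) are illustrative, not the actual API of any agent:

```python
from dataclasses import dataclass, field

@dataclass
class Thread:
    """An append-only conversation log. Illustrative sketch only."""
    messages: list = field(default_factory=list)

    def append(self, msg: str) -> int:
        """Append a message; earlier entries are never mutated or deleted.
        Returns the index of the new checkpoint."""
        self.messages.append(msg)
        return len(self.messages) - 1

    def fork(self, checkpoint: int) -> "Thread":
        """Start a new attempt sharing history up to `checkpoint`.
        The original thread, failed attempts included, is untouched."""
        return Thread(messages=self.messages[: checkpoint + 1])

# Try a second approach from the same starting point.
main = Thread()
start = main.append("user: add retry logic to the HTTP client")
main.append("agent: attempt 1 - wrap every call in a loop (tests failed)")

alt = main.fork(start)  # branch from before the failed attempt
alt.append("agent: attempt 2 - use an exponential-backoff helper")

assert len(main.messages) == 2      # the failure is preserved as data
assert alt.messages[0] == main.messages[0]  # shared starting point
```

The key property is that `fork` copies a prefix rather than truncating the original, so every attempt remains inspectable later.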
Come to think of it, isn’t that how we already work as engineers? We don’t rewrite Git history every time an experiment fails; we branch. That lets us explore while keeping a record of what happened, so we can reason about it later.
Thoughts?
So the idea behind plan mode is to approve intent before letting the agent touch the codebase. That sounds reasonable, but in practice it’s not where the real learning happens.
Plan mode takes place before the agent has paid the cost of reality: before it has navigated the repo, run the tests, or hit the weird edge cases and dependency issues. The output is speculative by design, and it usually looks far more confident than it should.
What actually turns out to be more useful is reviewing the walkthrough: a summary of what the agent did after it tried to solve the problem.
Yet in most coding agents today, the default still treats the plan as the primary checkpoint, with the walkthrough coming later. That puts the center of gravity in the wrong place.
In my experience as a software engineer, we don’t review intent and then trust execution. We review outcomes: the diff, the test changes, what broke, what was fixed, and why. That’s effectively a walkthrough.
So when we give feedback on a walkthrough, we’re reacting to concrete decisions and consequences, not hypotheticals. That feedback is clearer, more actionable, and closer to how we, as engineers, already review work today. Curious if others feel the same when using plan-first coding agents. I ask because I’m working on an open source coding agent, and we’ve decided to put less emphasis on approving plans upfront and more on reviewing what the agent actually experienced while doing the work.
But this is something we’re still heavily debating within our team, and we’d love your thoughts to help us implement it as well as possible.