> why is it sending a retry?
may be two clients tries to do it? Or there's a bug with the client in how they do it?
Isn't the point of idempotency meant to enable clients to retry again, without fear that a 2nd request somehow breaking things?
Not necessarily - there are different transaction isolation and conflict resolution methods provided by every database built for this purpose. You just have to ensure that only one request actually commits to the database, and that one sends a success response while the other sends a 409. The database or another lock provider can either help enforce serialization up-front - or the app can use optimistic locks based on data in the request that will only block if there is actually a conflict, and this won't delay the first transaction at all.
Solving these kinds of issues are exactly the purposes of idempotency keys and database transactions and using them in the intended way is really the only sound way to build a distributed system. Making things more complicated to "improve DevX" is just going to make them unsound. That is what Stripe chose to do. Their 24-hour replay idea is fine but why not send 409s after that rather than accept those transactions? If "that will never happen" then the 409s will never happen. It would have cost approximately nothing (if designed that way upfront) and inconvenienced their clients not at all.
You absolutely must wait for one request to finish before any other request can return a 409. 409 is a signal to the client that they can stop retrying, the job is done. If some request returns 409 early and the "original" request fails, you will not get further retries and the message will be lost.
Stripe's approach requires serialization as well. Only one request can succeed. If you send multiple conflicting requests in simultaneously, some of those have to block.
The good news is that we have been solving this problem for decades and we have incredibly well refined tools - database transactions and isolation levels - for solving this problem.
And you haven't considered multiple servers in your scenario - what if two requests meant to be idempotent with each other arrived at different servers?
And at the sake of repeating the above commenter, you solve the multiple server by serializing somewhere, because you ultimately need a lock on something. You can also perform the operation in both places and then reconcile the state later but that’s a lot more complex.
When you are using TCP, and you send the same data twice because of a delayed ack, you likewise don't care if the ACK is for the first time or the second time you sent the data. You just know the other side got the data, and that's all you care about.
By sending a third request and getting a response that reveals the state of the system.