403, 404—log error, maybe retry later. I don’t think there’s a reasonable way for the server to distinguish transient vs permanent failures here.
429—client should back off, throttle requests.
5xx—client should back off, throttle requests.
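A rough sketch of what "back off" could look like on the client side, in Python. The function name `backoff_delay`, the defaults (0.5 s base, 60 s cap), and the choice of exponential backoff with full jitter are my own assumptions, not anything the status codes themselves prescribe. It honors a `Retry-After` header only in its delay-in-seconds form and falls back to jitter otherwise.

```python
import random


def backoff_delay(attempt, retry_after=None, base=0.5, cap=60.0, rng=random.random):
    """Return how many seconds to wait before retry number `attempt` (0-based).

    `retry_after` is the raw Retry-After header value, if the server sent one.
    Only the delay-in-seconds form is handled here; an HTTP-date value is
    ignored and the jittered backoff is used instead (a simplification).
    """
    if retry_after is not None:
        try:
            # Trust the server's hint, clamped to the range [0, cap].
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # Not a plain number of seconds; fall through to jitter.

    # Exponential backoff with full jitter: pick a random delay between
    # 0 and min(cap, base * 2**attempt) so clients do not retry in lockstep.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

A client would sleep for `backoff_delay(attempt, response_headers.get("Retry-After"))` after each 429 or 5xx and then try again, where `response_headers` stands in for whatever header mapping your HTTP library exposes.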
The big problem here is that from the server side, if you try to figure out whether an error is transient or permanent, you often get the wrong answer. It’s a diabolical problem. The distinction between “failed” and “overloaded” is something that you might figure out in a post-mortem once humans take a look, but while it is happening, I would not expect the two to be distinguished.
What I do want to transmit from server to client are things like:
- Try again at a different URL (307).
- Try again at a different URL, and update your config to point at this other URL (301, 308).
- The request should be authenticated (401). Try again if you can authenticate.
- I understood the request and had the resources to process it, but the request failed (403, 404, 405, 412, etc.) due to the system state. Retry if the user asks for it.
- There is something wrong with the request itself (400, 422, etc.). This is a bug, get a programmer to look at it.
- Slow down (429). Retry automatically, but not too quickly.
- As above, but take a look at the server or proxy to see if it’s working correctly (503, 504).
- Go look at the server logs (500).
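To make the list above concrete, here is a minimal sketch in Python of how a client might turn a status code into one of those messages. The `Action` enum, the `classify` function, and the fallbacks for codes not named in the list (other 5xx treated like 503/504, everything else `UNKNOWN`) are hypothetical names and assumptions of mine.

```python
from enum import Enum


class Action(Enum):
    RETRY_AT_NEW_URL = "retry at the URL given by the server"
    RETRY_AT_NEW_URL_AND_UPDATE_CONFIG = "retry at the new URL and remember it"
    AUTHENTICATE_THEN_RETRY = "retry if you can authenticate"
    RETRY_ON_USER_REQUEST = "failed due to system state; retry if the user asks"
    REPORT_BUG = "the request itself is wrong; get a programmer"
    BACK_OFF_AND_RETRY = "slow down, then retry automatically"
    BACK_OFF_AND_CHECK_SERVER = "slow down, retry, and check the server or proxy"
    CHECK_SERVER_LOGS = "go look at the server logs"
    UNKNOWN = "not covered by the list"


# Codes named explicitly in the list above.
_ACTIONS = {
    307: Action.RETRY_AT_NEW_URL,
    301: Action.RETRY_AT_NEW_URL_AND_UPDATE_CONFIG,
    308: Action.RETRY_AT_NEW_URL_AND_UPDATE_CONFIG,
    401: Action.AUTHENTICATE_THEN_RETRY,
    403: Action.RETRY_ON_USER_REQUEST,
    404: Action.RETRY_ON_USER_REQUEST,
    405: Action.RETRY_ON_USER_REQUEST,
    412: Action.RETRY_ON_USER_REQUEST,
    400: Action.REPORT_BUG,
    422: Action.REPORT_BUG,
    429: Action.BACK_OFF_AND_RETRY,
    503: Action.BACK_OFF_AND_CHECK_SERVER,
    504: Action.BACK_OFF_AND_CHECK_SERVER,
    500: Action.CHECK_SERVER_LOGS,
}


def classify(status: int) -> Action:
    """Map an HTTP status code to what the client should do next."""
    if status in _ACTIONS:
        return _ACTIONS[status]
    if 500 <= status <= 599:
        # Assumption: unlisted 5xx codes are handled like 503/504.
        return Action.BACK_OFF_AND_CHECK_SERVER
    # Assumption: anything else is left for the caller to decide.
    return Action.UNKNOWN
```

None of these actions says "never retry", which is deliberate: the table only decides who triggers the retry (the client automatically, the user, or a programmer after a fix) and how quickly.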
As a rule, I would say that any error can be transient, and I would tend to write clients with the ability to retry any request. Not as a hard rule, just as a tendency. A “permanent” status code isn’t necessarily a statement of fact; it is just the server’s belief at the time.