Don't test it.
Only do unit tests with the connection mocked out.
Test against production.
Retry it a few times with a delay; if it passes, you know your code is good and you can move on with your deployment. This is what flaky and pytest-retry do.
Maybe I'm missing something, but out of those 4 options, retrying the test seems like the best one, with the big caveat that it is only viable if the test does indeed pass after a few tries. I really don't see any downside.
edit:
Maybe another option is to put the retry functionality directly in the client code, which would make your code more robust overall. But that is definitely more complex than using one of these libraries just for testing.
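A minimal sketch of what retrying inside the client could look like, assuming the connection failure shows up as a `ConnectionError`. The `RetryingClient` class, its parameters, and the `connect` callable are all hypothetical, not taken from any particular library, and the doubling delay is just one possible choice.

```python
import time


class RetryingClient:
    """Wraps a connect callable and retries it on ConnectionError."""

    def __init__(self, connect, max_attempts=3, base_delay=0.5):
        self._connect = connect            # hypothetical: any zero-arg callable
        self._max_attempts = max_attempts
        self._base_delay = base_delay

    def call(self):
        delay = self._base_delay
        for attempt in range(1, self._max_attempts + 1):
            try:
                return self._connect()
            except ConnectionError:
                if attempt == self._max_attempts:
                    raise  # give up and surface the error to the caller
                time.sleep(delay)
                delay *= 2  # back off: wait twice as long before the next try
```

With something like this in place the test itself would not need a retry plugin, but the retry limits, delays, and the choice of which errors are safe to retry become part of the production code that you have to maintain.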