In our case, Python (because of the GIL) required us to have a single python process per core in order to take advantage of multiple cores, and so we needed Redis to maintain a unified memory state across all the cores, but Go can automatically span across multiple cores.
We also saw about a 10x speedup by moving all caching into the server process, and since it was all in the same process, we no longer had to compress and encrypt data before sending to Redis. We still checkpoint the moving server state, encrypted and compressed, to disk every sixty seconds, just like Redis would do with BGSAVE, so we can start back up within a few seconds (actually faster than the old Redis after a restart.)