These are problems that can be solved in an afternoon and then never have to be solved again.
You need to be using automation.
We have something like 16-17 servers, of which 4-5 are actually production servers. Every single one of them is provisioned, configured, and deployed to by my code.
Everything from haproxy, to the frontend web servers, to the backend web servers, to the frontend assets, the CDN, the image server, the backups, our staging cluster, experimental cluster, databases...all of it is automated.
Do it once, do it right, do it in code, never do it manually again.
The idempotence is just so you can rerun the same job over and over if it fails halfway through, without causing any breakage.
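
A rough sketch of what idempotent steps can look like, in Python. This is not the tooling described above, just an illustration: `ensure_dir`, `ensure_file`, `provision`, and the `conf/app.cfg` path are all made-up names. Each step checks the current state first and only acts if something is actually different, so running the job a second time (say, after it died halfway) changes nothing.

```python
import os
import tempfile
from pathlib import Path


def ensure_dir(path: Path) -> bool:
    """Create the directory if it is missing. Returns True only if it was created."""
    if path.is_dir():
        return False
    path.mkdir(parents=True)
    return True


def ensure_file(path: Path, content: str) -> bool:
    """Write content only if the file is missing or differs. Returns True on change."""
    if path.is_file() and path.read_text() == content:
        return False
    # Write to a temp file and rename it into place, so a crash mid-write
    # never leaves a half-written config behind.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(content)
    os.replace(tmp, path)
    return True


def provision(root: Path) -> list:
    """Hypothetical job: every step is safe to repeat any number of times."""
    return [
        ensure_dir(root / "conf"),
        ensure_file(root / "conf" / "app.cfg", "workers = 4\n"),
    ]


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    print(provision(root))  # first run does the work
    print(provision(root))  # second run finds nothing to do
```

Returning a changed/unchanged flag from each step is a design choice, not a requirement: it makes it easy to see from the output whether a rerun actually touched anything.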