You have to adapt parts of your app to handle the fact that two machines might be handling the service (either at the same time or in succession).
This affects how you use memory, how you persist state, and so on.
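To make the memory point concrete, here's a rough Python sketch (not anyone's real setup). `LocalCounter`, `SharedCounter` and `demo` are names I made up for illustration, and the SQLite file is only a stand-in for whatever shared store you'd actually use (a database, a cache server, etc.). Two objects in one process play the role of two machines.

```python
import os
import sqlite3
import tempfile


class LocalCounter:
    """State lives in process memory, so each instance only sees its own hits."""

    def __init__(self):
        self.hits = 0

    def handle(self):
        self.hits += 1
        return self.hits


class SharedCounter:
    """State lives in a store every instance can reach (SQLite file as a stand-in)."""

    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path)
        with self.conn:  # commits on success
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS counter (id INTEGER PRIMARY KEY, hits INTEGER)"
            )
            self.conn.execute("INSERT OR IGNORE INTO counter (id, hits) VALUES (1, 0)")

    def handle(self):
        with self.conn:  # increment and read back in one transaction
            self.conn.execute("UPDATE counter SET hits = hits + 1 WHERE id = 1")
            (hits,) = self.conn.execute(
                "SELECT hits FROM counter WHERE id = 1"
            ).fetchone()
        return hits

    def close(self):
        self.conn.close()


def demo():
    # Two "machines" with in-memory state: requests alternate between them.
    a, b = LocalCounter(), LocalCounter()
    local = [a.handle(), b.handle(), a.handle()]

    # Two "machines" pointing at the same store.
    db_path = os.path.join(tempfile.mkdtemp(), "shared.db")
    c, d = SharedCounter(db_path), SharedCounter(db_path)
    shared = [c.handle(), d.handle(), c.handle()]
    c.close()
    d.close()
    return local, shared


if __name__ == "__main__":
    print(demo())
```

With in-memory state the two instances disagree about the count as soon as requests are spread across them; with the shared store they agree, at the cost of a round trip to the store on every request.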
None of which is rocket science, probably - but even things that look "obvious" to lots of people get their O'Reilly books, so...
But you're right that a part of the "distribution" of a system is in the hands of ops more than devs.
Edit: Thinking about this more, how I'd scale something depends on the specific problem I'm having, so the generic advice would be to start with some sort of debugging process like https://netflixtechblog.com/linux-performance-analysis-in-60... to find the root cause. Then you can decide whether to scale vertically, scale horizontally, or refactor to solve the problem, and move on.