The installer could pull the images, create the stack, run migrations, then shut everything down. The app could then bring the stack up, show a loading screen that would likely be shorter than any Adobe program's, then open a webview. When the last window is closed, it brings the stack down.
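To make that concrete, here is a rough sketch of the lifecycle in Python. It is not how any real product does it. It assumes the Docker Compose v2 CLI is on the PATH, and the compose file path, the `backend` service name and its `migrate` command are all made up for illustration. The webview itself is left out; only the window counting that decides when to start and stop the stack is shown.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical compose file location; a real app would ship its own.
COMPOSE = ["docker", "compose", "-f", "docker-compose.yml"]

Runner = Callable[[Sequence[str]], None]


def run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def install(runner: Runner = run) -> None:
    """Installer step: pull images, create the stack, migrate, shut down."""
    runner([*COMPOSE, "pull"])
    runner([*COMPOSE, "up", "--no-start"])  # create containers, do not start them
    # "backend" and "migrate" are assumed names, not from any real project.
    runner([*COMPOSE, "run", "--rm", "backend", "migrate"])
    runner([*COMPOSE, "stop"])  # stop anything the migration run started


class App:
    """Starts the stack with the first window, stops it with the last."""

    def __init__(self, runner: Runner = run) -> None:
        self.runner = runner
        self.windows = 0

    def open_window(self) -> None:
        if self.windows == 0:
            # First window: bring the stack up in the background.
            # A real app would show its loading screen while this runs.
            self.runner([*COMPOSE, "up", "-d"])
        self.windows += 1

    def close_window(self) -> None:
        if self.windows == 0:
            return  # nothing open, nothing to do
        self.windows -= 1
        if self.windows == 0:
            # Last window closed: tear the stack down (named volumes are kept).
            self.runner([*COMPOSE, "down"])
```

The runner is passed in so the logic can be exercised without Docker installed, e.g. `App(runner=print)` just prints the commands it would run.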
You wouldn't be able to tell the difference. And what is the difference, exactly? Most big apps are composed of multiple components doing IPC; here it just happens to go over TCP/IP.
All in all, the overhead would probably be equal to two tabs in Chrome.
And calling it a "small graphics tool" is just nonsense. If you're doing UI/UX design, this is THE thing you spend your time in. It's like Premiere for a video editor or AutoCAD for an engineer. If it takes 30 seconds to load and eats half your system memory, who cares? It's what you bought the computer for.
Aside from a specific subset of programmers who only use Vim or Emacs, every other professional is used to giving whatever tool they use for their job all the resources their computer has. You don't see video editors complaining that DaVinci Resolve runs a whole PostgreSQL instance in the background.