I think that's a limitation of our implementations. In principle, it's just bytes that we're shoving down the pipe to the browser, so it shouldn't matter for performance whether those bytes are 'inline' or in 'external resources'.
In principle, you could imagine the server bundling all the external resources that the browser will definitely ask for and sending them along with the original page. But I'm not sure how much re-engineering that would take.
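
To make the idea concrete, here is a rough sketch of the crudest version of that packing: the server rewrites the page so the stylesheet and script bytes travel inline with the HTML. Everything here is made up for illustration (`inline_resources`, the `resources` dict, the file names), and the regexes only match tags written exactly in this form, so treat it as a toy rather than how any real server does it.

```python
import re


def inline_resources(html: str, resources: dict[str, str]) -> str:
    """Replace references to known external CSS/JS with their contents.

    `resources` is a hypothetical mapping from URL to file contents that
    the server already has on hand.
    """

    def inline_css(match: re.Match) -> str:
        body = resources.get(match.group(1))
        # Leave the tag untouched if we don't have the file.
        return f"<style>{body}</style>" if body is not None else match.group(0)

    def inline_js(match: re.Match) -> str:
        body = resources.get(match.group(1))
        return f"<script>{body}</script>" if body is not None else match.group(0)

    # Only handles tags written exactly like this; real HTML needs a parser.
    html = re.sub(r'<link rel="stylesheet" href="([^"]+)">', inline_css, html)
    html = re.sub(r'<script src="([^"]+)"></script>', inline_js, html)
    return html


page = '<head><link rel="stylesheet" href="a.css"><script src="b.js"></script></head>'
files = {"a.css": "body{margin:0}", "b.js": "console.log(1)"}
packed = inline_resources(page, files)
print(packed)
```

The obvious cost is that inlined bytes can't be cached separately from the page, and a script containing a literal `</script>` would break this naive substitution.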
Simple models are still useful: understanding exactly how and why they fail is instructive. There's a reason spherical cows in a vacuum come up again and again.