M a
Then add a way to put things in the container. a -> M a
Then add a way to use the thing in the container. M a -> (a -> M b) -> M b
(A nearly exact parallel can be seen in the Iterator interface. You can describe it as "a thing that walks through a container presenting the items in order"... and yeah, that's the majority use case and where the idea came from... but it's also wrong. What it really is is just "a thing that presents items in some order". It doesn't have to be from "a container". You can have an iterator that produces integers in order, or strings in lexicographic order, or yields bytes from a socket as they come in, or other things that have no "container" anywhere to be found. If you have "from a container" in your mental model, then those things are confusing; if you understand it simply as "presenting items in order", then having an iterator that just yields integers makes perfect sense. A lot of the Monad confusion comes from adding extra clauses to what it is. Though by no means all of it.)
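To make the iterator point concrete, here is a minimal Python sketch of an iterator with no container behind it. The name `naturals` is made up for illustration; it just counts upward forever.

```python
from itertools import islice
from typing import Iterator


def naturals(start: int = 0) -> Iterator[int]:
    """Yield start, start + 1, ... forever; no list or other container backs this."""
    n = start
    while True:
        yield n
        n += 1


# Take the first five items; nothing is ever stored "in" the iterator.
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Anything that consumes an iterator works with it unchanged, which is the argument for dropping "from a container" from the definition.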
The "aha" realization that the "container" can be an ephemeral concept and not resident at run time can come later.
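For what it's worth, the three steps above can be sketched in Python with a Maybe-like box. This is only an illustration under assumptions: `Maybe`, `unit`, `bind`, `NOTHING` and `safe_div` are hypothetical names chosen here, not anything from a library.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Maybe(Generic[A]):
    """M a: a box holding either one value or nothing."""
    value: Optional[A] = None
    present: bool = False


def unit(x: A) -> Maybe[A]:
    """a -> M a: put a value in the container."""
    return Maybe(x, True)


NOTHING: Maybe = Maybe()  # the empty box


def bind(m: Maybe[A], f: Callable[[A], Maybe[B]]) -> Maybe[B]:
    """M a -> (a -> M b) -> M b: use the value, if there is one."""
    return f(m.value) if m.present else NOTHING


def safe_div(n: float) -> Callable[[float], Maybe[float]]:
    """Hypothetical helper: divide n by d, giving NOTHING when d is zero."""
    return lambda d: unit(n / d) if d != 0 else NOTHING


ok = bind(unit(4.0), safe_div(12.0))      # a box containing 3.0
failed = bind(unit(0.0), safe_div(12.0))  # the empty box
```

Once a step yields `NOTHING`, every later `bind` skips its function, so the caller never has to check for the empty case by hand.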
FWIW, I think of IO as a container: it contains the risk of side-effects within. All the examples you gave are containers in their own way.
f(x)
is the same as
a = x; f(a)
and the same as
g = f; g(x)
That's the gist of the monad laws. Whatever craziness you want to put in the semantics, those are properties you would probably like to preserve in your language.
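As a rough check of that intuition, here is a Python sketch that tests the three laws as they are usually stated (left identity, right identity, associativity) on plain lists. The `unit` and `bind` helpers and the sample functions `f` and `g` are assumptions made up for this example.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def unit(x: A) -> List[A]:
    """a -> M a for lists: wrap one value in a list."""
    return [x]


def bind(m: List[A], f: Callable[[A], List[B]]) -> List[B]:
    """M a -> (a -> M b) -> M b for lists: apply f to each item and flatten."""
    return [y for x in m for y in f(x)]


def f(n: int) -> List[int]:
    return [n, n + 1]


def g(n: int) -> List[int]:
    return [n * 10]


m = [1, 2]

# Left identity: wrapping a value and then binding f is the same as f(x).
left_identity = bind(unit(3), f) == f(3)
# Right identity: binding unit changes nothing.
right_identity = bind(m, unit) == m
# Associativity: how the steps are grouped does not matter.
associativity = bind(bind(m, f), g) == bind(m, lambda x: bind(f(x), g))

print(left_identity, right_identity, associativity)  # True True True
```

The first law is the "a = x; f(a)" case from above; the other two say that a do-nothing step and regrouping of steps leave the result alone.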
Functional languages are really weird. For instance, you can reorder definitions and the compiler will still figure out how to stitch them together. I think even JS has, or at least had, that behaviour in parts. (That's actually useful when you have interdependent mathematical formulas and are too lazy to sort them topologically by dependency.)
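A loose analogue, sketched in Python rather than a functional language: Python looks names up when a function is called, not when it is defined, so formulas written out of dependency order still work as long as everything is defined before the first call. The formulas below are made up for illustration.

```python
def area() -> float:
    # width and height are looked up when area() is called, not when it is defined.
    return width() * height()


def height() -> float:
    return 2.0 * width()


def width() -> float:
    return 3.0


total = area()
print(total)  # 18.0
```

This is not the same mechanism a Haskell compiler uses, only a way to see the same effect: the order the formulas are written in and the order they are evaluated in are separate things.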
On the other hand, just executing a sequence of commands in order to do I/O has only become a normal thing to do there fairly recently, as far as I understand. The sweet spot for FP is IMHO something like React, where state is strictly separated from the functions. (Imagine writing Hello World using Normal Maths)
(Please correct me if I'm wrong, which is probably quite likely ;))