Therefore, when two processes (let's name them A and B) want to output something at the same time, the result is going to be AAAABBBB or BBBBAAAA (depending on the order in which the messages arrive in the mailbox), but never BBAABBAA or anything similar.
IMHO this is perfectly acceptable behaviour for a `map` function, since that name has gained the connotation that its purpose is to transform one 'collection' (Functor; whatever) into another, by pointwise, independent applications of the given function. Providing a function/action which breaks this independence (by writing to the same handle) breaks this implicit meaning. Heck, I'd consider it a code smell to combine interfering actions like this using a non-concurrent `map` function; I would prefer to define a separate function to make this distinction explicit, e.g.
-- Like 'map', but function invocations may interfere with each other (you've been warned!)
runAtOnce = map
When using `map` functions (which is a lot!) I subconsciously treat them as if they will be executed concurrently, in parallel, in any order. Consider that even an imperative language like JavaScript provides a separate `forEach` function, to prevent "abuses" of `map`. Even Emacs Lisp, not the most highly regarded language, provides separate `mapcar` and `mapc` functions for this reason.

With that said, I recognise that there's a problem here; but the problem seems to be 'mapping a self-interfering function'. If we try to make it non-interfering, we see that the interference is due to the use of a shared global value (`stdout`); another code smell! Whilst stdout is append-only, it's still mutable, so I'd try to remove this shared mutable state. Message passing is one alternative, where we can have each call/action explicitly take in the handle, then pass it along (either directly, or via some sort of "trampoline", like an MVar). This way we get the "concurrent from the outside, single-threaded on the inside" behaviour of actor systems like Erlang. In particular, it's easy to make sure the handle only gets passed along when we're 'finished' with it (i.e. we've written a complete "block" of output).
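A minimal sketch of that "trampoline" idea, using an MVar as the mailbox holding the handle (the names `withHandle` and `box` are mine, not from any library):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newMVar, newEmptyMVar, takeMVar, putMVar)
import System.IO (Handle, hPutStr, stdout)

-- Take the handle, write a complete block, then pass it along.
-- No other thread can write while we hold it.
withHandle :: MVar Handle -> String -> IO ()
withHandle box block = do
  h <- takeMVar box   -- become the (temporary) owner
  hPutStr h block     -- write a whole block, uninterrupted
  putMVar box h       -- pass the handle to the next caller

main :: IO ()
main = do
  box   <- newMVar stdout
  dones <- mapM (\s -> do
             d <- newEmptyMVar
             _ <- forkIO (withHandle box (concat (replicate 4 s)) >> putMVar d ())
             pure d)
           ["A", "B"]
  mapM_ takeMVar dones  -- wait for both writers to finish
```

Since each writer holds the handle for a whole block, this prints `AAAABBBB` or `BBBBAAAA` (depending on scheduling), never an interleaving.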
Surely if you can pass your IO handles to all functions that need them, you can decide on a mutexing/buffering strategy at the top of your program, wrap the standard IO interface with a delegate that does so, and pass it on. Then, for all libraries called thereafter to use it consistently isn't just a no-brainer, it's an outright given, isn't it? There's no global (impure, non-functional) handle to stdout, is there?
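A sketch of that "decide at the top, pass it down" approach, representing the delegate as a record of actions (the `Logger` type and `lockedLogger` name are mine, for illustration):

```haskell
import Control.Concurrent.MVar (newMVar, withMVar)
import System.IO (Handle, hPutStrLn, stdout)

-- The interface we pass to everything else: no direct stdout access.
newtype Logger = Logger { logLine :: String -> IO () }

-- Wrap a handle so each line is written while holding a lock,
-- implementing the mutexing strategy once, at the top of the program.
lockedLogger :: Handle -> IO Logger
lockedLogger h = do
  lock <- newMVar ()
  pure (Logger (\s -> withMVar lock (\_ -> hPutStrLn h s)))

main :: IO ()
main = do
  logger <- lockedLogger stdout
  logLine logger "whole lines, never interleaved"
```

Everything downstream only ever sees the `Logger`, so using it consistently is indeed a given.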
This distinction is often inconsequential, but this case seems to rely on how we combine those "actions" together (which, due to laziness, may happen far away from where/when the functions are called; see 'lazy IO'). For sequential IO we can combine things with Applicative and Monad, which gives us a definite order, but using these in a concurrent setting would impose too much synchronisation and determinism to be useful. I've not done enough concurrent Haskell to know how the various alternatives stack up; although I did play with Arrow many years ago, before it fell out of favour (seemingly in favour of Profunctors?).
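The sequential ordering guarantee is just the usual left-to-right sequencing of effects, e.g. via `traverse_`; the concurrent combinators (such as `mapConcurrently_` from the `async` package) deliberately drop that guarantee:

```haskell
import Data.Foldable (traverse_)

main :: IO ()
main = do
  -- Applicative/Monad sequencing runs effects left to right,
  -- so this always prints AAAABBBB.
  traverse_ putStr ["AAAA", "BBBB"]
  putStrLn ""
  -- By contrast, `mapConcurrently_ putStr ["AAAA", "BBBB"]` (from the
  -- async package) makes no ordering promise at all.
```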
Even if you do as you say, you can still bypass it by writing to stdout directly, outside of your stable buffering mechanism. But at that point the language isn't to blame.
EDIT: this is an honest question, I was shocked to read what was in the article.
Haskell's built-in string type is a list of characters. This is mostly for historical reasons, but it's also handy in education (installing extra packages is a barrier for learners; list processing is common in introductory courses, but lists are polymorphic/generic in their element type; lists of characters are a nice concrete type, which follows on easily from "hello world"); there are also arguments about the theoretical elegance of linked lists, KISS for the builtins, whether there's consensus on what the best alternative is, etc.
Anyone who cares about Haskell performance will have hit this early on, and be using a different string implementation, as mentioned in the article. In particular there's ByteString for C-like arrays of bytes, and there's Text, which stores Unicode text in a packed array with a known encoding. A (strict) ByteString is a pointer to an array plus an offset and a length, so taking substrings is just a matter of adjusting the offset and length; and a lazy ByteString is a list of such "chunks", so we can append ByteStrings cheaply by adding chunks to the list (pointing at the existing arrays). This is all perfectly safe and predictable since the data is immutable, where other languages which allow mutation might prefer to make copies of the data to reduce aliasing.
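For example, slicing a strict ByteString shares the underlying array rather than copying, and lazy ByteString append just adds chunks; a rough sketch:

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.ByteString.Lazy.Char8 as BL

main :: IO ()
main = do
  let s = BS.pack "hello world"
  -- take/drop are O(1): they tweak the offset/length, no copying,
  -- which is safe because the bytes are immutable.
  print (BS.take 5 s)   -- "hello"
  print (BS.drop 6 s)   -- "world"
  -- Lazy ByteStrings are lists of such chunks, so append just
  -- conses chunks, pointing at the existing arrays.
  print (BL.append (BL.fromStrict s) (BL.pack "!"))
```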
The other aspect is the buffering mode of the handle, which is discussed a little in the article and its comments (e.g. line-based buffering, etc.).
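The buffering mode is orthogonal to all of the above, and can be set per handle, e.g.:

```haskell
import System.IO

main :: IO ()
main = do
  hSetBuffering stdout LineBuffering  -- flush on each newline
  -- Other modes: NoBuffering (flush every write) and
  -- BlockBuffering (flush when an internal buffer fills).
  putStrLn "flushed at the newline"
  hGetBuffering stdout >>= print      -- inspect the current mode
```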