the coroutines are run one at a time by an "event loop"/"coroutine runner" that wakes them up and lets them run for a bit when appropriate, kind of like an OS's scheduler (on a single-core machine). if the runner is the OS, `await foo(...)` is kind of like a syscall: the coroutine is suspended and control is handed back to the runner, which does whatever was requested and resumes the coroutine later.
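to make the "runner as OS, `await` as syscall" analogy concrete, here's a rough sketch of a toy runner. everything in it (`Request`, `worker`, `run`) is a made-up name for illustration, not how asyncio is actually implemented, and the "syscall" handling is just echoing the request back:

```python
from collections import deque


class Request:
    """Hypothetical 'syscall': awaiting it suspends the coroutine."""

    def __init__(self, what):
        self.what = what

    def __await__(self):
        # The bare yield suspends the coroutine and hands `self` to the runner.
        # Whatever the runner passes to .send() becomes the result of `await`.
        reply = yield self
        return reply


async def worker(name, log):
    for i in range(2):
        reply = await Request(f"{name}-{i}")  # control goes back to the runner
        log.append(reply)
    return name


def run(coros):
    """Toy runner: round-robin over coroutines until all have finished."""
    ready = deque((coro, None) for coro in coros)
    finished = []
    while ready:
        coro, value = ready.popleft()
        try:
            request = coro.send(value)  # resume until the next await (or the end)
        except StopIteration as stop:
            finished.append(stop.value)  # the coroutine returned
            continue
        # "Handle" the request (here: just echo it) and requeue the coroutine.
        ready.append((coro, f"done:{request.what}"))
    return finished


log = []
finished = run([worker("a", log), worker("b", log)])
print(log)
print(finished)
```

the two workers take turns: each one runs until its next `await`, then the runner picks the next coroutine in the queue. a real event loop would wait on sockets/timers instead of answering immediately, but the suspend/resume mechanics are the same idea.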
i guess the difference from normal threads (preemptive multitasking) is that you explicitly mark your "yield points" – places where your code may give control back to the runner – with `await` (cooperative multitasking). some believe that this makes async stuff easier to reason about, since in theory you can see the points where other code might run in between.
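a small sketch of what "explicit yield points" means in practice, using plain asyncio (`step` and `main` are just names i picked for the example). within one coroutine, the code between two `await`s runs without any other coroutine sneaking in; switching can only happen at the `await`:

```python
import asyncio


async def step(name, events):
    events.append(f"{name}: start")
    # No await between these two appends, so no other coroutine can run here.
    events.append(f"{name}: still running")
    await asyncio.sleep(0)  # yield point: the event loop may run something else
    events.append(f"{name}: resumed")


async def main():
    events = []
    await asyncio.gather(step("a", events), step("b", events))
    return events


events = asyncio.run(main())
for line in events:
    print(line)
```

with threads, the scheduler could in principle switch between the first two appends; here "b" only gets a chance to run once "a" reaches its `await`.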
honestly i'm out of my depth re: async IO, haven't used it all that much :/ but if you're comfortable with python and want to dig into the mechanism of async/await, i really recommend this article: https://snarky.ca/how-the-heck-does-async-await-work-in-pyth...
it's long, but i found it very helpful – it actually explains how it all works without handwaviness. at the end the author implements a toy "event loop" that can run a few timers concurrently, which really made it click for me!