The delay is all in compiling the .exs files. It uses Kernel.ParallelCompiler to compile every .exs file, so startup time depends heavily on CPU speed and core count. On my weaker laptop, `mix test` takes nearly 10 seconds just to start.
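If you want to check how much of the startup is spent loading test files on your own machine, a rough sketch like the one below can help. It only approximates what `mix test` does. The file name `time_require.exs` and the `test/**/*_test.exs` glob are assumptions, and it skips `test/test_helper.exs`, so suites that depend on setup done there may fail to load.

```elixir
# time_require.exs (hypothetical file name)
# Rough timing of how long it takes to load every test file.
# This is an approximation of what `mix test` does, not its actual code path.

# Start ExUnit without scheduling a test run, so `use ExUnit.Case` works
# but no tests are executed when the script exits.
ExUnit.start(autorun: false)

# Assumed layout: the default Mix convention for test files.
files = Path.wildcard("test/**/*_test.exs")

# :timer.tc/1 returns {elapsed_microseconds, return_value}.
{micros, result} =
  :timer.tc(fn -> Kernel.ParallelCompiler.require(files) end)

case result do
  {:ok, modules, _warnings} ->
    IO.puts("Loaded #{length(modules)} modules from #{length(files)} files")

  {:error, _errors, _warnings} ->
    IO.puts("Some files failed to load (test_helper.exs was not run)")
end

IO.puts("Elapsed: #{div(micros, 1000)} ms")
```

Run it inside the project with `MIX_ENV=test mix run time_require.exs` so that test-only dependencies and support code are available. Comparing the number across machines should show how much it scales with cores.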
I've looked into this in more detail in the past. We've had success writing our own test runner and avoiding .exs files, but re-implementing things like running tests by line number or integrating with external tools (like excoveralls) has been a dealbreaker.