That's C's limitation: it can only be as fast as the compiler geniuses can make it, or as concurrent as its users explicitly make it. Given that almost every computer where you care about raw performance is multiprocessing now, that's a huge deal. Languages without primitives like async/parallel map() are going to have a hard time keeping up with those that have them.
Sure, you can write ultra-fast code in C or assembly. The fastest way to do so might be to write it first in Erlang, Go, or Rust and then reverse-engineer it from the resulting machine language.
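To make the parallel map() point concrete, here is a rough sketch of what such a primitive can look like in Rust using only the standard library's scoped threads. The name `parallel_map` and its `workers` parameter are made up for illustration and don't refer to any real API; in practice you'd more likely reach for a library such as rayon, which provides this kind of thing ready-made.

```rust
use std::thread;

/// Hypothetical helper (not a standard API): applies `f` to every item,
/// splitting the slice into contiguous chunks, one thread per chunk.
/// Results come back in the same order as the inputs.
fn parallel_map<T, U, F>(items: &[T], workers: usize, f: F) -> Vec<U>
where
    T: Sync,
    U: Send,
    F: Fn(&T) -> U + Sync,
{
    if items.is_empty() {
        return Vec::new();
    }
    // At least one worker; ceiling division so every item lands in a chunk.
    let workers = workers.max(1);
    let chunk_len = (items.len() + workers - 1) / workers;
    let f = &f;

    thread::scope(|s| {
        // Spawn all workers first so they run concurrently...
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(f).collect::<Vec<U>>()))
            .collect();
        // ...then join them in order and stitch the partial results together.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let inputs: Vec<u64> = (1..=8).collect();
    let squares = parallel_map(&inputs, 4, |x| x * x);
    println!("{:?}", squares); // prints [1, 4, 9, 16, 25, 36, 49, 64]
}
```

The call site reads like an ordinary map(); the threading is hidden behind the function, and the compiler checks the `Sync`/`Send` bounds for you. Doing the equivalent in C means managing pthreads and shared buffers by hand, which is the "as concurrent as its users explicitly make it" part.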