But "most" users can live with a bit of overhead in return for safe parallelism. Only a handful want to squeeze the last bit of performance out of a CPU.
The other day, Intel revealed a processor that supports 66 threads per core. 64 of those threads are called "slow" because they have no prefetching or speculative execution, since they are expected to spend most of their time waiting (mainly for memory, but networking could be another option). Perhaps a large number of cheap hardware threads is a way out of this.