The compiler doesn't do a whole lot to minimize the LLVM IR it emits, and monomorphization produces quite a lot of it. This makes LLVM do a lot of work. (EDIT: maybe that's too harsh. What I mean is that there's been some work towards this, and more could possibly be done, but it's not a trivial problem.)
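To make the monomorphization point concrete, here's a minimal sketch. The `describe` function is a made-up example, not from any real project. Each concrete type it gets called with gets its own compiled copy, so one generic function in the source can turn into several functions' worth of IR for LLVM to chew through.

```rust
use std::fmt::Display;

// Hypothetical generic function, used only for illustration.
// rustc generates a separate copy of this body for every concrete `T`
// that the program actually uses.
fn describe<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

fn main() {
    // Three call sites with three different types means three instantiations:
    // describe::<i32>, describe::<f64>, and describe::<&str>.
    println!("{}", describe(1_i32));
    println!("{}", describe(2.5_f64));
    println!("{}", describe("three"));
}
```

In a real codebase the generic functions are bigger and the set of types is larger, which is where the volume of IR comes from.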
On my current project, "cargo check" takes 10 seconds, and "cargo build" takes 16. So check accounts for 62.5% of the build time, which leaves roughly 37.5% of the total compilation time for code generation (and linking, if you consider those two to be separate).
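For anyone who wants to run the same back-of-the-envelope split on their own project, here's a small sketch. It assumes (loosely) that whatever "cargo build" does beyond "cargo check" is code generation plus linking; `backend_share` is just a name I made up for this.

```rust
/// Fraction of the full build time that is not covered by the check time.
/// Assumption: the difference between build and check is roughly
/// code generation plus linking.
fn backend_share(check_secs: f64, build_secs: f64) -> f64 {
    (build_secs - check_secs) / build_secs
}

fn main() {
    let check = 10.0; // seconds for "cargo check"
    let build = 16.0; // seconds for "cargo build"
    let share = backend_share(check, build);
    println!("check: {:.1}%", (1.0 - share) * 100.0);
    println!("codegen + linking: {:.1}%", share * 100.0);
}
```

It's only a rough estimate, since check and build don't do exactly the same front-end work.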
In my understanding, there can sometimes be problems with -Z time-passes, but when checking my main crate, type_check_crate takes 0.003 seconds, and llvm_passes + codegen_crate take 0.056 seconds. Out of a 0.269 second total compilation time, most passes take less than 0.010 seconds. Other than the codegen passes just mentioned, the exceptions are monomorphization_collector_graph_walk at 0.157 seconds, generate_crate_metadata at 0.171 seconds, and linking at 0.700 seconds total.
This general shape of what takes longest is consistent every time I've looked at it, and is roughly in line with what I've seen folks who work on compiler performance talk about.