We have to print much more to the console due to the ANSI escape codes and we also have to do some conditional checks ON EACH BYTE in order to colorize them correctly.A few extra comparisons and output for each byte shouldn't be that much slower; fortunately the function of this program is extremely well-defined, so we can calculate some estimates. Assuming a billion instructions per second, taking ~1.5s to hexdump ~1 million bytes means each byte is consuming ~1500 instructions to process. In reality the time above is probably on a faster CPU, so that number maybe 2-3x more. That is a shockingly high number just to split a byte into two nybbles (expected to be 1-3 instructions), convert the nybbles into ASCII (~3 instructions), and decide on the colour (let's be very generous and say ~100 instructions.)
The fact that the binary itself is >1MB is also rather surprising, especially given that the source (not familiar with Rust, but still understandable) seems quite small and straightforward.