There were a bunch of reasons.
One important one is that accessing the PCI config space via IO ports 0xCF8/0xCFC is racy with the kernel, since a read or write requires writing the BDF address to 0xCF8, and then reading/writing the data from 0xCFC. If the kernel tries to do this dance while the X server is doing it as well one of them is going to read or write the wrong address.
Interestingly, this design required in the X server to run under binary translation in VMware's monitor, even though it was userspace code, because it had to elevate its IOPL to be able to read/write the IO ports. CSRSS.EXE in windows also ran in BT, since it too was driving the graphics card before NT4. After NT4 moved the graphics code into the kernel, no one remembered to take out the IOPL elevation code, so at least until XP (and probably later) CSRSS.EXE runs with elevated privileges that it didn't need.