Have a look at libdrm from the mesa project (the AMDGPU submodule), then it will give you pointers where to look into the kernel-DRM via the right IOCTLs.
Basically, the kernel code is initialization, quirks detection and restoration (firmware blobs are failing hard here), setting up of the various vram virtual address spaces (16 on latest GPUs) and the various circular buffers. The 3D/compute pipeline programing is done from userspace via those circular buffers.
If I am not too much mistaken, on lastest GPU "everything" should be in 1 PCIE 64bits bar (thx to bar size reprograming).
The challenge for AMD is to make all that dead simple and clean while keeping the extreme performance (GPU is all about performance). Heard rumors about near 0-driver hardware (namely "rdy" at power up).