Output was a bit-banged VGA signal into a cheap-ass monitor.
Like the first Voodoo-era 3D accelerators, it was a triangle rasterizer with Z-buffering and perspective-correct texture mapping. I.e. there were no 3D transformations done on the chip; they were done on the CPU. The limiting factor in the demos was actually the ARM CPU (synthesized on the FPGA), which couldn't push enough triangles to keep the GPU busy.
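For anyone who hasn't run into the term, here is a rough sketch of what "perspective-correct" means at a single pixel. It is generic textbook math in Python, not the actual hardware design; the function names and the barycentric-weight inputs are made up for illustration, and it assumes the CPU has already transformed and clipped the vertices so every w is positive.

```python
def perspective_correct_uv(bary, uvs, ws):
    """Interpolate texture coordinates at one pixel.

    bary: screen-space barycentric weights (b0, b1, b2), summing to 1
    uvs:  per-vertex (u, v) texture coordinates
    ws:   per-vertex clip-space w (assumed > 0, i.e. already clipped)
    """
    # 1/w, u/w and v/w vary linearly in screen space, so interpolate those.
    inv_w = sum(b / w for b, w in zip(bary, ws))
    u_over_w = sum(b * uv[0] / w for b, uv, w in zip(bary, uvs, ws))
    v_over_w = sum(b * uv[1] / w for b, uv, w in zip(bary, uvs, ws))
    # One divide per pixel recovers the true u and v.
    return u_over_w / inv_w, v_over_w / inv_w


def affine_uv(bary, uvs):
    """Naive screen-space interpolation, shown only for comparison."""
    u = sum(b * uv[0] for b, uv in zip(bary, uvs))
    v = sum(b * uv[1] for b, uv in zip(bary, uvs))
    return u, v
```

Halfway along an edge whose endpoints sit at w = 1 and w = 3, the affine version lands in the middle of the texture, while the perspective-correct version lands closer to the near vertex. That difference is the texture "swimming" you see on hardware that skips the per-pixel divide.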
It was a tile-based rasterizer (in two stages: coarse and fine) rather than a scanline rasterizer (like the software rasterizers of the Quake era).
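A minimal software sketch of the coarse/fine idea, for illustration only. The tile size, the function names, and the use of edge functions are assumptions, not details of the actual chip. The coarse stage throws away whole tiles that lie entirely outside one triangle edge; the fine stage tests individual pixel centers inside the surviving tiles.

```python
def edge(ax, ay, bx, by, px, py):
    """Edge function: >= 0 when p is on the interior side of edge a -> b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, width, height, tile=8):
    """Return (covered pixels, number of tiles passed to the fine stage).

    tri: three (x, y) vertices, ordered so interior points give
         non-negative edge values (hypothetical convention for this sketch).
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    edges = [(x0, y0, x1, y1), (x1, y1, x2, y2), (x2, y2, x0, y0)]
    covered = set()
    tiles_visited = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            corners = [(tx, ty), (tx + tile, ty),
                       (tx, ty + tile), (tx + tile, ty + tile)]
            # Coarse stage: the edge function is linear, so if all four
            # corners are outside the same edge, the whole tile is outside.
            if any(all(edge(*e, cx, cy) < 0 for cx, cy in corners)
                   for e in edges):
                continue
            tiles_visited += 1
            # Fine stage: test each pixel center in the surviving tile.
            for y in range(ty, min(ty + tile, height)):
                for x in range(tx, min(tx + tile, width)):
                    if all(edge(*e, x + 0.5, y + 0.5) >= 0 for e in edges):
                        covered.add((x, y))
    return covered, tiles_visited
```

The coarse test is conservative: it may let through a tile that ends up with no covered pixels, but it never rejects a tile that contains one. In hardware the appeal is that both stages are the same cheap linear evaluation, just at different granularity, and tiles can be processed independently.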
This is pretty much all I can remember about it.