The distinction that seems to matter is the warp-thread architecture (what NVIDIA calls SIMT): multiple compute units (lanes) share a single program counter, but rather than being exposed through a SIMD abstraction they are presented to the programmer as conceptually separate threads.
They also tend to lack interrupt mechanisms and virtualization, at least at the level of the programmer-facing API. NVIDIA systems usually do have these, but they are managed by the proprietary top-level scheduler rather than exposed to the programmer.
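
To make the shared-program-counter point concrete, here is a toy model in plain Python, not real GPU code. It assumes a warp is just a list of per-lane values and that a branch is handled by issuing each taken path once with a per-lane active mask; the names `run_warp` and `issued` are made up for illustration, and real hardware handles divergence and reconvergence in more involved ways.

```python
def run_warp(values):
    """Toy model of one warp executing:  if x % 2 == 0: x *= 2  else: x += 1

    Each lane looks like its own thread with its own x, but there is only
    one instruction stream. `issued` counts instructions issued by that
    single shared program counter (hypothetical bookkeeping, not a real API).
    """
    lanes = list(values)
    issued = 0

    # Every lane evaluates the branch condition on its own data.
    taken = [x % 2 == 0 for x in lanes]
    issued += 1

    # 'then' path: issued once for the whole warp if any lane needs it.
    # Lanes whose mask bit is off keep their old value (they sit idle).
    if any(taken):
        lanes = [x * 2 if t else x for x, t in zip(lanes, taken)]
        issued += 1

    # 'else' path: likewise issued once, with the complementary mask.
    if not all(taken):
        lanes = [x if t else x + 1 for x, t in zip(lanes, taken)]
        issued += 1

    return lanes, issued


# All lanes agree: only one side of the branch is issued.
print(run_warp([2, 4, 6, 8]))   # ([4, 8, 12, 16], 2)

# Lanes disagree: both sides are issued, each with half the lanes idle.
print(run_warp([0, 1, 2, 3]))   # ([0, 2, 4, 4], 3)
```

Each lane still gets the result it would get as an independent thread, which is the abstraction the programmer sees. The cost of sharing one program counter only shows up in the instruction count when the lanes disagree.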