There cannot be "video compression artifacts" because, as far as I can see, the model never saw any compressed video during training.
Seriously, how is this even a discussion? The article is clear that the novel thing is that this is real-time frame generation conditioned on the previous frame(s) AND player actions. Just generating video would be nothing new.
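
To make the distinction concrete, here is a rough sketch of what "conditioned on the previous frame(s) AND player actions" means as a loop. This is not the article's code: `rollout`, `predict_next`, `toy_model` and the `context` window size are all names I made up for illustration, and the "model" is a trivial stand-in where a frame is just a one-element list holding a player position.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Frame = List[int]  # toy stand-in for an image; a real frame would be a pixel array


def rollout(
    predict_next: Callable[[Sequence[Frame], Sequence[int]], Frame],
    initial_frames: Sequence[Frame],
    get_action: Callable[[], int],
    steps: int,
    context: int = 4,  # hypothetical window size, not taken from the article
) -> List[Frame]:
    """Generate frames one at a time, feeding each output back in as input."""
    frames: Deque[Frame] = deque(initial_frames, maxlen=context)
    actions: Deque[int] = deque(maxlen=context)
    generated: List[Frame] = []
    for _ in range(steps):
        # The player's input arrives live, so it cannot be baked in ahead of time.
        actions.append(get_action())
        # The next frame depends on recent frames AND recent actions.
        frame = predict_next(list(frames), list(actions))
        frames.append(frame)
        generated.append(frame)
    return generated


def toy_model(frames: Sequence[Frame], actions: Sequence[int]) -> Frame:
    """Hypothetical stand-in for the network: shift the position by the last action."""
    return [frames[-1][0] + actions[-1]]


if __name__ == "__main__":
    inputs = iter([1, 1, -1])  # pretend key presses: right, right, left
    print(rollout(toy_model, [[0]], lambda: next(inputs), steps=3))
```

A plain video generator would be the same loop without the `actions` argument, which is exactly why it would be nothing new.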