Focal depth (how far away your eyes have to focus) is one of the cues your brain uses to perceive distance, in addition to (potentially more than, depending on which cognitive scientist you listen to) binocular vision. You don't mind that monitors are a fixed distance from your eyes because you don't expect them to give you real depth (your eyes can just focus on that distance). If, however, you want something to be "indistinguishable from reality" you need to emulate changing focal depth, which means (I guess) changing how much the light rays from each point diverge by the time they reach your eyes, so your lenses have to refocus the way they would for a real object at that distance.
IMO that's one of the reasons 3D movies always looked so fakey: they could emulate binocular vision but not focal depth. Your eyes stay focused at the screen's distance no matter how near or far an object appears to be, and that mismatch causes a perceptual dissonance.
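
To put rough numbers on that mismatch, here is a back-of-the-envelope sketch in Python. It is only an illustration under assumptions of mine: the 63 mm eye separation and the function names are made up for the example, and it uses the textbook approximations that focus demand is 1/distance (in diopters) and that the eyes converge symmetrically on a point straight ahead.

```python
import math

# Assumed eye separation in metres (illustrative value, not from the comment above).
IPD_M = 0.063


def vergence_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two eyes' lines of sight when both fixate
    a point straight ahead at distance_m (the binocular cue)."""
    return math.degrees(2 * math.atan(ipd_m / (2 * distance_m)))


def focus_diopters(distance_m):
    """Focusing demand on the eye's lens for an object at distance_m
    (the focal-depth cue), in diopters."""
    return 1.0 / distance_m


def cue_mismatch_diopters(screen_m, apparent_m):
    """Gap between where the eyes must focus (the screen) and where the
    stereo image says the object is (the apparent distance)."""
    return abs(focus_diopters(screen_m) - focus_diopters(apparent_m))


if __name__ == "__main__":
    screen_m, apparent_m = 10.0, 2.0  # cinema screen vs. object "popping out"
    print(f"eyes converge as if at {apparent_m} m: {vergence_deg(apparent_m):.2f} deg")
    print(f"eyes converge for the screen itself: {vergence_deg(screen_m):.2f} deg")
    print(f"focus mismatch: {cue_mismatch_diopters(screen_m, apparent_m):.2f} diopters")
```

With a real object the mismatch would be zero, because the focus distance and the convergence distance are the same. On a fixed screen only the convergence cue changes, which is the dissonance described above.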