The people I've talked to who can visualize say they can consciously see objects in mind, and confirm it looks as if they were looking at them on a phone (as a rough analogy). There is apparently a wide range of ability here, and the abilities seem to be about as common as the limitations, such as not visualizing faces, or seeing historic data in black and white or in blocks like Minecraft. Some people can't do motion, some can't do high detail. My mom can do detailed trees with leaves and see landscapes from anywhere she has been, but no creative ones. She also doesn't do faces, nor does she have echoic recall. In a recent conversation, a visualizer confirmed they were doing creative rendering of scenes (mesh + texture + lighting) but had low detail on things like tree leaves. They said they were for sure "seeing" the imagery, as if it were on a type of "screen" in mind.
I've questioned maybe 200 people about this over the years, and when someone starts talking about spatial understanding and not seeing pixels, they aren't really talking about seeing anything in mind, but more about understanding it. I can do the "feeling" thing, which I describe as "I have the mesh, not the map". People with aphantasia appear to hold facts about objects, but don't actually generate imagery that they are conscious of seeing. It would be a little like having DALL-E 3 generate an image from a Claude Opus 3 prompt, then uploading it to Claude to look at. Claude can't generate images itself, but it can look at them and make inferences.
Maybe someone here who visualizes strongly can confirm that what you are seeing does indeed have aspects of light, shadow, color, and the other things we consider attributes of "pixels"?