Camera pixels are only one color at a time:
GGRR
GGRR
BBGG
BBGG
(This is quad-Bayer; Fujifilm uses a weirder pattern called X-Trans. Some pixels will also be missing, either because they're damaged or because they're used as focus pixels.)
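To make the "one color at a time" point concrete, here is a rough sketch in Python. It assumes the pattern above repeats as a 4x4 tile in which each color fills a 2x2 block, and the names `cfa_color`, `mosaic`, and `bin_demosaic` are made up for illustration. The "demosaic" is just per-tile averaging, which is far cruder than what a real camera pipeline does.

```python
# Assumed 4x4 quad-Bayer tile: each color covers a 2x2 block of pixels.
QUAD_BAYER_TILE = ["GGRR", "GGRR", "BBGG", "BBGG"]
CHANNEL = {"R": 0, "G": 1, "B": 2}


def cfa_color(row, col):
    """Return which color ('R', 'G' or 'B') the pixel at (row, col) records."""
    return QUAD_BAYER_TILE[row % 4][col % 4]


def mosaic(rgb_image):
    """Simulate the sensor: keep only the channel each pixel's filter passes.

    rgb_image is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [
        [pixel[CHANNEL[cfa_color(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb_image)
    ]


def bin_demosaic(raw):
    """Crude demosaic: average each 4x4 tile per color, one RGB pixel per tile.

    A value of None marks a missing sample (damaged or focus pixel) and is
    skipped, so the remaining samples of that color fill in for it.
    """
    out = []
    for r0 in range(0, len(raw) - 3, 4):
        out_row = []
        for c0 in range(0, len(raw[0]) - 3, 4):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for dr in range(4):
                for dc in range(4):
                    value = raw[r0 + dr][c0 + dc]
                    if value is None:
                        continue
                    color = cfa_color(r0 + dr, c0 + dc)
                    sums[color] += value
                    counts[color] += 1
            out_row.append(
                tuple(sums[k] / counts[k] if counts[k] else 0.0 for k in "RGB")
            )
        out.append(out_row)
    return out
```

Note that this throws away resolution (one output pixel per 16 sensor pixels); real demosaicing interpolates the two missing colors at every pixel instead, and that interpolation is where the choices start.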
And then you still have to do white balance and tone mapping, because your eyes do that and the camera sensor doesn't.
You need to do all of this if you want to see the image at all, and it involves a lot of subjective choices. The objective auto white balance algorithm usually described is objectively quite bad; for instance, it's always described as a single transformation applied to the whole image, which doesn't make sense when the scene has multiple light sources.
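As an illustration, here is a sketch of gray-world white balance, which is one commonly described auto white balance method (I'm assuming it's representative of the kind meant above). The function names are made up, and it assumes no channel averages to zero. It computes one set of per-channel gains for the whole image, and the example shows what happens when two neutral patches are lit by different light sources.

```python
def gray_world_gains(pixels):
    """Per-channel gains that make the image's average color neutral gray.

    pixels is a flat list of (r, g, b) tuples; assumes no channel mean is 0.
    """
    n = len(pixels)
    means = [sum(p[ch] for p in pixels) / n for ch in range(3)]
    gray = sum(means) / 3
    return [gray / m for m in means]


def apply_gains(pixels, gains):
    """Apply the same three gains to every pixel (a single global transform)."""
    return [tuple(v * g for v, g in zip(p, gains)) for p in pixels]


# Two gray patches: one under warm light, one under cool light.
mixed = [(0.6, 0.5, 0.3), (0.3, 0.5, 0.6)]
balanced = apply_gains(mixed, gray_world_gains(mixed))
# The warm patch is still reddish and the cool patch is still bluish,
# because no single set of gains can neutralize both at once.
```

With a single light source the same code does make a gray patch come out neutral, which is why the method looks fine in textbook examples.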
The reason you'd want to render humans differently in the image is that a) if you don't get skin tones just right, they'll look like corpses, and b) in real life you can choose to focus on a subject in a scene, and this will cause them to appear brighter (because your eyes will adapt to them). In an image there isn't that flexibility, so it helps to guess what the foreground of the image is and expose for that.
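"Expose for the foreground" can be sketched as subject-weighted metering. This is only an illustration: `metering_gain`, `subject_weight`, and the idea that a subject mask is handed to you (say, by a person detector) are all assumptions, and the 0.18 target is the conventional mid-gray reference rather than anything a particular camera is known to use.

```python
def metering_gain(luminance, subject_mask, target=0.18, subject_weight=8.0):
    """Exposure gain that brings the weighted mean luminance to `target`.

    luminance: flat list of linear luminance values.
    subject_mask: same-length list of booleans, True where the guessed
    foreground subject is. Subject pixels count `subject_weight` times.
    """
    total = 0.0
    weight_sum = 0.0
    for value, is_subject in zip(luminance, subject_mask):
        w = subject_weight if is_subject else 1.0
        total += w * value
        weight_sum += w
    return target / (total / weight_sum)


# Backlit scene: bright background, dark subject.
luminance = [0.6, 0.6, 0.6, 0.05]
subject_mask = [False, False, False, True]
plain_gain = metering_gain(luminance, subject_mask, subject_weight=1.0)
subject_gain = metering_gain(luminance, subject_mask)
# subject_gain is larger than plain_gain, so the subject ends up brighter.
```

How heavily to weight the subject is exactly the kind of subjective choice being talked about here.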
I forgot to say: recent iPhone cameras let you turn off the sharpening effects anyway; just move the photographic style control down to Natural. It is true that the sharpening is kind of bad. This is because someone taught everyone that digital images are bandlimited, so they use frequency-based sharpening algorithms. But images aren't bandlimited, so those algorithms just give you ringing artifacts. For some reason nobody knows about warp-sharpen anymore.
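You can see the artifact on a one-dimensional step edge. The sketch below uses unsharp masking with a box blur as a stand-in for linear, frequency-boosting sharpeners in general; I'm not claiming it's what any particular phone does, and the function names and parameters are made up. A clean step is about as non-bandlimited as a signal gets.

```python
def box_blur(signal, radius):
    """Average each sample with its neighbors within `radius` (clamped at the ends)."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def unsharp_mask(signal, radius=2, amount=1.0):
    """Boost high frequencies by adding back the difference from a blurred copy."""
    blurred = box_blur(signal, radius)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


# A clean step edge from black (0.0) to white (1.0).
edge = [0.0] * 8 + [1.0] * 8
sharpened = unsharp_mask(edge)
# Next to the edge the result dips below 0.0 on the dark side and rises
# above 1.0 on the bright side: the halo you see around sharpened edges.
```

The flat regions away from the edge are left alone; only the neighborhood of the edge gets the overshoot, which is why it shows up as an outline.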