As you say, N qubits can be thought of as a distribution over its possible 2^N values, but you can't ignore the
starting distribution, which may be very far from the set of classical bits that you want to operate over (and therefore may take an exponential number of quantum operations to realize).
So one very simplified model to think of what quantum computation is that you can do parallel operations to a distribution over your 2^N vector of classical bits, but you don't get to specify an arbitrary starting distribution, you can only start with relatively 'simple' distributions.
This restriction is what makes the claim 'you can compute over 2^N values simultaneously' at best very incomplete — yes, you can do that, as long as you're fine not being able to start from arbitrary starting values. But this is a big restriction!