I don't think that's true.
My understanding is that any image generated by Stable Diffusion has been influenced by every single parameter of the model, and those parameters were shaped by training on the whole dataset - so literally EVERY image in the training data has an impact on the final image.
What the prompt influences is how much of an impact each training image has.
One way to think about it: the Stable Diffusion model can be as small as 1.9GB (Web Stable Diffusion). It's trained on 2.3 billion images. That works out to about 6.6 bits of model data per image in the training set.
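
For anyone who wants to check that arithmetic, here's a quick sketch. It assumes "1.9GB" means decimal gigabytes (1.9e9 bytes), and the function name is just something I made up for illustration:

```python
# Back-of-the-envelope check of the bits-per-image figure.
# Assumption: "1.9GB" is decimal gigabytes (1.9e9 bytes), not GiB.
MODEL_SIZE_BYTES = 1.9e9   # Web Stable Diffusion size quoted above
TRAINING_IMAGES = 2.3e9    # training set size quoted above


def bits_per_image(model_bytes: float, n_images: float) -> float:
    """Average number of model bits available per training image."""
    return model_bytes * 8 / n_images


print(f"{bits_per_image(MODEL_SIZE_BYTES, TRAINING_IMAGES):.1f} bits per image")
# prints "6.6 bits per image"
```

If you read 1.9GB as GiB instead (1.9 * 2**30 bytes), it comes out closer to 7.1 bits. Either way it's well under one byte per image.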