If you had enough tokens to completely wash out the parent latent distribution you would have just trained a new model instead of fine tuning. That means by definition your model is still inheriting properties of the parent, and for your business customers who want predictable, understandable systems, knowing the inherited properties of your model is going to be useful.
I think the real reason is that it's a Chinese model (I mean, come on) and your parent company doesn't want any political blowback.