You can't even define toxicity in an objective and verifiable way, because it's inherently subjective.
Trying to make rules for a machine to behave in a decidedly nontoxic way is a fool's errand, then.
You're also assuming that AI is going to be used to heavily influence people's lives, but there's a good chance that all it's good for is ripping off copyrighted material and generating clipart that's good enough for PowerPoint presentations.
AI is probably going to change the world in the way that NFTs did. And self-driving cars. And the Alexa.