Think of an AI optimizing for a stated goal while ignoring the implied constraints (e.g. eradicating humanity to stop wars).
The same can happen with A/B testing. Sure, you can increase ad clicks by 30% ... if you trick users into clicking through a carefully timed layout jump.
I think GP's argument may go in this direction. I probably wouldn't say A/B testing itself is the problem (it's a tool, after all), but I can imagine it's sometimes not used very well.
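To make that concrete, here's a rough Python sketch of the difference between picking a winner on the stated metric alone and picking one with a guardrail metric for the implied constraint. Everything in it is an assumption for illustration: the numbers are made up, "bounced_clicks" is a hypothetical proxy for accidental clicks, and the 5-point threshold is arbitrary. It also skips significance testing entirely, which a real experiment would need.

```python
# Hypothetical numbers, invented purely for illustration.
variants = {
    "control":     {"views": 10_000, "clicks": 200, "bounced_clicks": 20},
    "layout_jump": {"views": 10_000, "clicks": 260, "bounced_clicks": 90},
}


def click_rate(v):
    """The stated goal: clicks per view."""
    return v["clicks"] / v["views"]


def bounce_share(v):
    """Share of clicks where the user left right away.

    Used here as a rough, assumed proxy for accidental clicks.
    """
    return v["bounced_clicks"] / v["clicks"]


def pick_naive(variants):
    """Pick the variant with the highest click rate and nothing else."""
    return max(variants, key=lambda name: click_rate(variants[name]))


def pick_with_guardrail(variants, baseline="control", max_bounce_increase=0.05):
    """Pick the highest click rate among variants that pass the guardrail.

    A variant is rejected if its bounce share exceeds the baseline's
    by more than max_bounce_increase (an arbitrary threshold).
    """
    limit = bounce_share(variants[baseline]) + max_bounce_increase
    eligible = {n: v for n, v in variants.items() if bounce_share(v) <= limit}
    return max(eligible, key=lambda name: click_rate(eligible[name]))


print(pick_naive(variants))
print(pick_with_guardrail(variants))
```

With these made-up numbers the naive picker goes for the layout jump (30% more clicks), while the guardrail version stays with the control because most of the extra clicks bounce. The tool is the same in both cases; what changes is whether the implied constraint is written down anywhere.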
Another point: Spotify's core flow changes so much (feels like almost daily) that I've lost all confidence in using it.