I mean that information is being lost
https://arxiv.org/abs/1906.08237
See XLNet for the rhetoric.
https://www.microsoft.com/en-us/research/publication/mpnet-m...
Or MPNet, which attempts to combine the best of both worlds information-wise but still finds that masked modeling is much less useful than autoregressive modeling.
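
To make "information is being lost" concrete, here is a toy sketch of one reading of the XLNet argument: when two tokens are masked together, a masked model predicts each one without seeing the other, while an autoregressive model conditions the second on the first. The corpus, the function names (p_independent, p_autoregressive, avg_bits), and the count-based estimates are all made up for illustration. Neither paper trains or evaluates anything this way.

```python
from collections import Counter
from math import log2

# Toy corpus of two-token phrases (illustrative data, not from either paper).
corpus = [("New", "York"), ("Los", "Angeles")] * 50
n = len(corpus)

# Empirical joint distribution and per-position marginals.
joint = {pair: c / n for pair, c in Counter(corpus).items()}
first = {w: c / n for w, c in Counter(a for a, _ in corpus).items()}
second = {w: c / n for w, c in Counter(b for _, b in corpus).items()}


def p_independent(a, b):
    """Both tokens masked: each is predicted on its own, ignoring the other."""
    return first.get(a, 0.0) * second.get(b, 0.0)


def p_autoregressive(a, b):
    """Chain rule p(a) * p(b | a): the second token sees the first."""
    pa = first.get(a, 0.0)
    if pa == 0.0:
        return 0.0
    return pa * (joint.get((a, b), 0.0) / pa)


def avg_bits(model):
    """Average code length in bits of the true phrases under a model."""
    return -sum(p * log2(model(a, b)) for (a, b), p in joint.items())


print(p_independent("New", "Angeles"))     # 0.25, mass on a phrase that never occurs
print(p_autoregressive("New", "Angeles"))  # 0.0
print(avg_bits(p_independent))             # 2.0
print(avg_bits(p_autoregressive))          # 1.0
```

In this toy case the independent factorization spends one extra bit per phrase, which is exactly the dependency between the two tokens that it cannot see.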