I think a lot of it depends on what you mean by “large enough”.
In principle, a data set could be infinitely large yet still miss little edge cases here and there, because much of it is repetitive. So you might be OK if you had both infinite size and infinite diversity.
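To make that concrete, here's a toy sketch in Python (all the numbers are invented for illustration): even ten million samples from a standard normal distribution will almost certainly contain zero 6-sigma events, so sheer volume alone doesn't buy you coverage of the rare tails.

```python
import numpy as np

rng = np.random.default_rng(42)

# Ten million "examples" drawn from a narrow distribution.
data = rng.normal(0, 1, 10_000_000)

# How many of them cover the rare tail beyond 6 standard deviations?
# P(|Z| > 6) is about 2e-9, so the expected count here is ~0.02.
print(np.sum(np.abs(data) > 6))  # almost certainly prints 0
```

A model trained on this data has effectively never seen a 6-sigma input, no matter how many samples you pile on.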
Even if you had a very large but finite data set, say all language ever produced by humankind, the second you finish training, what your overfit model knows is locked in.
The world as we know it would continue to generate vast amounts of new data that your model might not be able to generalize to.
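Here's a toy illustration of that staleness (again, every specific here is made up): fit a simple model on "pre-training" data, then let the world drift and evaluate on data generated after training finished.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training era": the world follows y = 2x + noise.
x_train = rng.uniform(0, 10, 1_000)
y_train = 2.0 * x_train + rng.normal(0, 0.5, x_train.size)

# Fit a linear model (ordinary least squares via polyfit) and freeze it.
slope, intercept = np.polyfit(x_train, y_train, 1)

# "Post-training era": the world drifts to y = 2x + 5 + noise.
x_new = rng.uniform(0, 10, 1_000)
y_new = 2.0 * x_new + 5.0 + rng.normal(0, 0.5, x_new.size)

def mse(x, y):
    pred = slope * x + intercept
    return np.mean((pred - y) ** 2)

print(f"MSE on training-era data: {mse(x_train, y_train):.2f}")  # small, ~0.25
print(f"MSE on post-shift data:   {mse(x_new, y_new):.2f}")      # large, ~25
```

The frozen model is still a perfect fit for the world it was trained on; it's the world that moved.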