Open training data is hard to the point of impracticality. It requires excluding private and proprietary data.
Meanwhile, the term "open source" is massively popular. So it will get used. The question is how.
Meta et al. would love for the choice to be between open weights only, on one hand, and open training data, on the other, because the latter is impractical. That dichotomy guarantees that when someone says "open source AI" they'll mean open weights. (The way open source software, today, generally means source available, not FOSS.)