However, the gray area is that the massive data set of which it is a part will spit out new code that has, in some way big or small, been influenced by the AGPL code, which... well, I don't think that sort of use was anticipated by the terms of the AGPL. I can see reasonable arguments in both directions. Personally though, I would favor an interpretation that limits GitHub's use for commercial purposes, if not because of strict licensing restrictions then at least for the spirit of these licenses.
In truth, I would very much have liked GitHub to have gone out big & loud with an aggressive awareness campaign asking repo owners to opt in to the use of their code for this. Again, for pure open source licenses I don't think that would be required, but I still think it would be the right thing to do. And it would certainly have been less damaging to their reputation & less likely to make project maintainers hesitant to trust GitHub with their code in the future.
I don't think this will be a tipping point by itself, but if this behavioral pattern continues I could imagine devs big & small shifting to hosted or on-prem instances of things like GitLab.