My impression (NOT A LAWYER) is that by hosting your code in a public repo on GitHub, you agree to their terms and give them the right to "read" your code, including to train AI models on it. Or at least that's what they're banking on.
Host on Sourcehut or self-host with Gitea instead, and I would think it unlikely (but not impossible) that any big company would use your code to train their AI.