> Not just the training code but the training data as well should be under a permissive license; otherwise you cannot call the project itself Open Source, which Facebook does here.
Does FB even have the capability to do that? I'd assume there's a bunch of data that's not theirs and that they can't release, let alone data that they might not want to admit is in the source.
If not, it is questionable whether they should train on such data in the first place.
Also, that doesn't matter in this discussion - if you are unable to release the source under an appropriate licence (for whatever reason), you should not call it Open Source.