e.g.:
GPT-2: paper https://d4mucfpksywv.cloudfront.net/better-language-models/l... with code https://github.com/openai/gpt-2, an output dataset https://github.com/openai/gpt-2-output-dataset, and a bunch of blog posts with all manner of details
GPT-3: paper https://arxiv.org/abs/2005.14165, dataset https://github.com/openai/gpt-3
etc.