But that doesn't mean the idea is worthless.
You could have said the same about Transformers: Google released the idea but didn't move forward with it, and it turned out to be a great one.
I don't think you can. Google looked at the research results and continued researching Transformers and related technologies because they saw the value in them, particularly for translation. Which direction to take is part of the original paper; give it a read, it's relatively approachable for a machine learning paper :)
Sure, it took OpenAI to make it into an "assistant" that answered questions, but it's not like Google was completely sleeping on the Transformer; they just had other research directions to pursue first.
> But that doesn't mean the idea is worthless.
I agree, it isn't; I hope that wasn't how my message read :) But ideas that don't actually pan out in reality are slightly less useful than ideas that do pan out once put into practice. The root commenter seems to be saying "This is a great idea, it's all ready, the only missing piece is for someone to do the training and it'll pan out!", which I'm a bit skeptical about, since it's been two years since they introduced the idea.
The core insight necessary for ChatGPT was not scaling (that was already widely accepted). The insight was that instead of fine-tuning for each individual task, you can fine-tune once for the meta-task of instruction following, which brings the problem specification directly into the data stream.
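To make that concrete, here's a rough sketch of the difference at the data level. Everything in it is made up for illustration: the task names, the toy examples, the instruction wording, and the `input`/`target` record layout are all assumptions, not how any particular lab formats its data.

```python
# Hypothetical toy data: task names and examples are invented for illustration.
TASK_DATA = {
    "translate_en_fr": [("cheese", "fromage")],
    "sentiment": [("I loved this movie", "positive")],
}

# Assumed natural-language description of each task.
INSTRUCTIONS = {
    "translate_en_fr": "Translate the following English text to French.",
    "sentiment": "Classify the sentiment of the following text as positive or negative.",
}


def per_task_datasets(task_data):
    """Per-task fine-tuning: one dataset (and so one tuned model) per task.

    The task identity lives outside the data, in which model you pick.
    """
    return {
        task: [{"input": x, "target": y} for x, y in pairs]
        for task, pairs in task_data.items()
    }


def instruction_dataset(task_data, instructions):
    """Instruction tuning: a single mixed dataset for a single model.

    The task specification is prepended to the input, so it travels
    in the same text stream as the data itself.
    """
    examples = []
    for task, pairs in task_data.items():
        for x, y in pairs:
            prompt = f"{instructions[task]}\n\n{x}"
            examples.append({"input": prompt, "target": y})
    return examples
```

With the second format, a task the model never saw during tuning can at least be attempted by writing a new instruction, rather than by collecting data and tuning another model.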
It was fun to come up with creative ways to get it to answer your question or generate data by setting up a completion scenario.
I guess "chat" became the universal completion scenario. But I still feel like it could be "smarter" without the RLHF layer of distortion.
Google released Transformers as research because they invented them while improving Google Translate. They had been running them for customers for years.
Beyond that, they had publicly used Transformer-based LMs ("MUM") integrated into search before GPT-3 (pre-chat mode) was even trained. They were shipping Transformer models generating text for years before the ChatGPT moment. Being available right on the Google SERP is probably the widest deployment a technology can have today.
Transformers are also used widely in ASR technologies, like Google Assistant, which of course was available to hundreds of millions of users.
Finally, they had experimental LLMs available privately to employees, as well as various released research initiatives (Meena, LaMDA, PaLM, BERT, etc.) and other experiments; they just didn't productize everything (but see earlier points). They even experimented with scaling (see "Chinchilla scaling laws").