No, we can't few-shot it, and we don't get there faster (though we develop a lot of other capabilities along the way). We train on a lot more data: the human brain, unlike an LLM, trains on all the data it processes during "inference", and it receives sensory data estimated at on the order of a billion bits per second. That means that by the time we start using language, we've already trained on a huge amount of data (the 15 trillion tokens, drawn from a ~17-bit token vocabulary, that Llama 3 was trained on amount to something like a few days of human sense data). Humans are simply trained on, and process, vastly richer multimodal data rather than text streams.
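
For anyone who wants to check the "few days" figure, here is a rough back-of-envelope sketch. It uses only the numbers quoted above (15 trillion tokens, ~17 bits per token, ~1 billion bits per second of sensory input) and additionally assumes the sensory rate is constant around the clock, which is a simplification of mine, not a measured fact.

```python
# Back-of-envelope comparison of an LLM training corpus with human sensory input.
# All figures are the rough estimates quoted in the comment, not measurements.

TOKENS = 15e12                    # training tokens quoted for Llama 3
BITS_PER_TOKEN = 17               # ~17 bits needed to index one vocabulary entry
SENSORY_BITS_PER_SECOND = 1e9     # estimated human sensory input rate
SECONDS_PER_DAY = 24 * 60 * 60    # assumption: input rate is constant all day

# Total raw information in the text corpus, in bits.
corpus_bits = TOKENS * BITS_PER_TOKEN

# How long a human would take to receive the same number of bits.
equivalent_seconds = corpus_bits / SENSORY_BITS_PER_SECOND
equivalent_days = equivalent_seconds / SECONDS_PER_DAY

print(f"corpus size: {corpus_bits:.3g} bits")
print(f"equivalent sensory exposure: {equivalent_days:.1f} days")
```

Under these assumptions the corpus works out to roughly three days of sensory input. This compares raw bit counts only; it says nothing about how much of either stream is redundant or actually useful for learning.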