Stepping one level out in the metacognition hierarchy is the key. "Learning to learn" as it were. It is only the relative ease of implementation and deployment of feedforward models like Transformers that makes it seem like we have reached an optimum but we desperately need to move beyond it before it's entrenched too thoroughly.