It might be possible to train a big generalist that is a composition of modules, some of which can be dropped dynamically at inference time, depending on the prompt.
Cool. Thanks for sharing. I am thinking about creating a series of smaller models for specific purposes and then orchestrating them so they mirror the human brain which is a bunch of subsystems that give multiple opinions about the same stimulus