The general consensus, imo, is that fine-tuning is more for tone and style than for factual accuracy. For accuracy, people use vector DBs to grab relevant data, throw it into the prompt, and call it Retrieval-Augmented Generation (RAG).
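To make the retrieval part concrete, here's a rough sketch of the idea. It's a toy: `embed` is a bag-of-words stand-in for a real embedding model, a plain list stands in for the vector DB, and `DOCS`, `retrieve`, and `build_prompt` are names I made up for illustration.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    # A real vector DB would use an index; here we just score every doc.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    # Stuff the top-k retrieved snippets into the prompt ahead of the question.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base.
DOCS = [
    "The refund window is 30 days from delivery.",
    "Support is available on weekdays from 9 to 5.",
    "Shipping to Canada takes 7 business days.",
]

if __name__ == "__main__":
    print(build_prompt("how long is the refund window", DOCS, k=1))
```

The model itself is untouched; the facts come in through the prompt, which is why this is the usual route for accuracy rather than fine-tuning.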
From what I can tell, this hosts multiple fine-tuning deltas and hot-swaps them as needed. Incredible optimization. It's like going from AMIs to ECS or Kubernetes.
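Here's how I picture the hot-swapping, as a minimal sketch and not how the actual project implements it. I'm assuming one shared set of base weights plus small per-task deltas, with an LRU cache deciding which deltas stay resident. `DeltaServer`, `DELTA_STORE`, and the weight names are all hypothetical.

```python
from collections import OrderedDict

# Shared base weights, loaded once (toy values).
BASE = {"w1": 1.0, "w2": 2.0}

# Hypothetical per-task fine-tuning deltas kept in cheap storage.
DELTA_STORE = {
    "legal": {"w1": 0.5},
    "support": {"w2": -1.0},
    "marketing": {"w1": -0.25, "w2": 0.25},
}

class DeltaServer:
    def __init__(self, base, store, max_resident=2):
        self.base = base
        self.store = store
        self.max_resident = max_resident
        self.resident = OrderedDict()  # LRU cache of deltas "in memory"
        self.loads = 0                 # how many times we loaded from the store

    def _get_delta(self, name):
        if name in self.resident:
            # Cache hit: mark as most recently used.
            self.resident.move_to_end(name)
        else:
            # Cache miss: evict the least recently used delta if full, then load.
            if len(self.resident) >= self.max_resident:
                self.resident.popitem(last=False)
            self.resident[name] = self.store[name]
            self.loads += 1
        return self.resident[name]

    def weights_for(self, name):
        # Effective weights = base + delta; the base itself is never modified.
        delta = self._get_delta(name)
        return {k: v + delta.get(k, 0.0) for k, v in self.base.items()}

if __name__ == "__main__":
    server = DeltaServer(BASE, DELTA_STORE)
    for task in ["legal", "support", "legal", "marketing"]:
        print(task, server.weights_for(task), list(server.resident))
```

The point of the analogy: the base is the shared, expensive thing, and the deltas are small enough to schedule in and out on demand instead of baking a full copy of the model per fine-tune.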