What I want is a directory of models that I can bind mount read-only into inference containers. But Ollama would force me to either prime the pump by importing with Modelfiles (where do I even get these?) every time I start the container, or store their specific version of the files?
Trying out vLLM and llama.cpp was my next step in this, so I'm glad to hear you are able to share a directory between them.
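
For concreteness, here is a rough sketch of the kind of read-only bind mount I mean, assuming Docker. The host path `/srv/models`, the image name `example/inference-server:latest`, the `--model` argument, and the helper `docker_run_args` are all made up for illustration; only `-v host:container:ro` is the actual Docker syntax for a read-only bind mount.

```python
import shlex


def docker_run_args(image, host_models_dir, container_args, mount_point="/models"):
    """Build a `docker run` argv that bind-mounts the model directory read-only."""
    return [
        "docker", "run", "--rm",
        # The trailing `:ro` makes the bind mount read-only inside the container.
        "-v", f"{host_models_dir}:{mount_point}:ro",
        image,
        *container_args,
    ]


# Hypothetical image, host path, and server arguments, for illustration only.
args = docker_run_args(
    "example/inference-server:latest",
    "/srv/models",
    ["--model", "/models/example.gguf"],
)

# Print the command so it can be inspected or pasted into a shell.
print(shlex.join(args))
```

The same host directory could then be passed to a second container with a different image, which is the sharing I'm after.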