No single inference engine does all of the following:
- Model switching
- Unloading a model after an idle period
- Dynamic layer offload to CPU to avoid out-of-memory (OOM) errors
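
To make the combination concrete, here is a minimal Python sketch of how the three features might fit together in one manager. Everything in it is hypothetical (`ModelManager`, `ModelSpec`, `plan_gpu_layers`, the byte counts) and not taken from any real engine. It only does bookkeeping, loads no actual weights, and decides the GPU/CPU layer split once at load time rather than rebalancing during inference.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class ModelSpec:
    """Hypothetical description of a model: name, layer count, per-layer size."""
    name: str
    n_layers: int
    bytes_per_layer: int


@dataclass
class LoadedModel:
    spec: ModelSpec
    gpu_layers: int  # layers placed on the GPU
    cpu_layers: int  # layers offloaded to CPU to stay within the memory budget


def plan_gpu_layers(spec: ModelSpec, free_gpu_bytes: int) -> int:
    """Layer offload: return how many layers fit in free GPU memory."""
    if spec.bytes_per_layer <= 0:
        return spec.n_layers
    fit = free_gpu_bytes // spec.bytes_per_layer
    return max(0, min(spec.n_layers, fit))


class ModelManager:
    """Keeps at most one model loaded and tracks when it was last used."""

    def __init__(
        self,
        specs: Iterable[ModelSpec],
        free_gpu_bytes: Callable[[], int],  # assumed probe for free GPU memory
        idle_timeout_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.specs = {s.name: s for s in specs}
        self.free_gpu_bytes = free_gpu_bytes
        self.idle_timeout_s = idle_timeout_s
        self.clock = clock
        self.current: Optional[LoadedModel] = None
        self.last_used = clock()

    def acquire(self, name: str) -> LoadedModel:
        """Model switching: swap in the requested model if it is not loaded."""
        if self.current is None or self.current.spec.name != name:
            self.unload()
            spec = self.specs[name]
            on_gpu = plan_gpu_layers(spec, self.free_gpu_bytes())
            self.current = LoadedModel(spec, on_gpu, spec.n_layers - on_gpu)
        self.last_used = self.clock()
        return self.current

    def unload(self) -> None:
        """Drop the current model (a real engine would free its memory here)."""
        self.current = None

    def reap_idle(self) -> bool:
        """Unload after idle: call periodically; returns True if it unloaded."""
        idle_for = self.clock() - self.last_used
        if self.current is not None and idle_for >= self.idle_timeout_s:
            self.unload()
            return True
        return False
```

A caller would invoke `acquire(name)` on each request and run `reap_idle()` from a timer or background thread. The `free_gpu_bytes` and `clock` callables are injected so the policy can be exercised without a GPU.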