Python Docker images with CUDA, Python, PyTorch, etc. ship 5GB to 10GB of third-party code. Here is an example of bringing this down to 1.13GB thanks to WasmEdge, the 10MB LlamaEdge API server (compatible with the ChatGPT API), plus 1.17GB for the TinyLlama model.
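A minimal sketch of such an image, assuming the LlamaEdge release artifact name, the TinyLlama GGUF file on Hugging Face, and the wasmedge flags shown here (all illustrative, not verified pins):

```dockerfile
# Sketch: slim inference image. URLs, file names, and flags below are
# assumptions for illustration; check the WasmEdge/LlamaEdge docs.
FROM debian:bookworm-slim

RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Install the WasmEdge runtime with the GGML (llama.cpp) inference plugin
RUN curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh \
    | bash -s -- --plugins wasi_nn-ggml

WORKDIR /app
# ~10MB Wasm binary: the LlamaEdge OpenAI/ChatGPT-compatible API server
ADD https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm .
# ~1.17GB quantized TinyLlama model (example GGUF file)
ADD https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf model.gguf

EXPOSE 8080
CMD ["/root/.wasmedge/bin/wasmedge", "--dir", ".:.", \
     "--nn-preload", "default:GGML:AUTO:model.gguf", \
     "llama-api-server.wasm", "--prompt-template", "chatml"]
```

The model file dominates the image size; everything else (runtime plus server) stays in the tens of MB.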
If you are not actually running ML code, you can of course remove Python and the ML libraries entirely. You could get this down to a few MB by dropping Wasm and shipping a single static Go binary.
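The Go-binary route can be sketched as a standard multi-stage build; the stage names and build flags here are illustrative:

```dockerfile
# Sketch: multi-stage build producing a few-MB image from a static Go binary.
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 yields a fully static binary that can run in an empty image;
# -s -w strips debug info to shrink the binary further
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /server .

# Final image contains nothing but the binary itself
FROM scratch
COPY --from=build /server /server
ENTRYPOINT ["/server"]
```

With `FROM scratch` the final image is exactly the size of the binary, typically single-digit MB for a small HTTP service.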