This creates two image layers: the first layer contains everything install-foo added, including any intermediate artifacts. The second layer then removes the intermediate artifacts, but it is saved as a diff against the previous layer:
RUN ./install-foo
RUN ./cleanup-foo
Instead, you need to do them in the same RUN command:
RUN ./install-foo && ./cleanup-foo
This creates a single layer which has only the foo artifacts you need. This is why the official Dockerfile best practices show[1] the apt cache being cleaned up in the same RUN command:
RUN apt-get update && apt-get install -y \
package-bar \
package-baz \
package-foo \
&& rm -rf /var/lib/apt/lists/*
[1] https://docs.docker.com/develop/develop-images/dockerfile_be...
https://docs.docker.com/engine/reference/commandline/build/#...
The downside of trying to jam all of your commands into one gigantic RUN invocation is that if it isn't correct, or you need to troubleshoot it, you can wind up waiting 10-20 minutes after every single-line change, just waiting for your build to finish.
You lose all the layer caching benefits and it has to re-do the entire build.
Just a heads up for anyone that's not suffered through this before.
I’m confused why they haven’t implemented a COMMIT instruction.
It’s so common to have people chain “command && command && command && command” to group things into a single layer. Surely it would be better to put something like “AUTOCOMMIT off” at the start of the Dockerfile and then “COMMIT” whenever you want to explicitly close the current layer. It seems much simpler than everybody hacking around it with shell side-effects.
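To make the proposal concrete, a Dockerfile under this hypothetical scheme might look like the sketch below. Note that AUTOCOMMIT and COMMIT are not real Dockerfile instructions; this is purely an illustration of the suggestion above:

```dockerfile
# Hypothetical syntax -- AUTOCOMMIT/COMMIT do NOT exist in Dockerfiles,
# this only illustrates the proposed alternative to chaining with &&
AUTOCOMMIT off
RUN ./install-foo
RUN ./cleanup-foo
COMMIT
```

Both RUN steps would then land in one explicitly closed layer, so the cleanup's deletions would actually shrink the image rather than being recorded as a diff.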
> "--squash" is only supported on a Docker daemon with experimental features enabled
Up until now, our biggest improvement was with "FROM scratch".
Thanks for the tip!
RUN ./setup.sh
I have seen that in some cases as a way to reduce layer count while avoiding complex, hard-to-read RUN commands. I've also seen it as a way to share common setup across multiple Docker images: RUN ./common_setup_for_all_images.sh
RUN ./custom_setup_for_this_image.sh
However, this approach of doing most of the work in scripts does not seem common, so I'm wondering if there is a downside to doing that.

As a concrete example... if your setup.sh were:
#!/bin/bash
./update_static_assets.sh
./install_libraries.sh
./clone_application_repo.sh
then any time a static asset is updated, a library is changed, or your application code changes, the digest of the Docker layer for `RUN ./setup.sh` will change. Your team will then have to re-download the result of all three of those sub-scripts next time they `docker pull`.

However, if you found that static assets changed less often than libraries, which changed less often than your application code, then splitting setup.sh into three correspondingly-ordered `RUN` statements would put the result of each sub-script in its own layer. Then, if just your application code changed, you and your team wouldn't need to re-download the library and static asset layers.
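Under that assumption (assets change least often, application code most often), the split might look like this, reusing the script names from the example:

```dockerfile
# Ordered from least- to most-frequently changing, so a change to the
# application repo only invalidates the last layer, not all three
RUN ./update_static_assets.sh
RUN ./install_libraries.sh
RUN ./clone_application_repo.sh
```

Each RUN now produces its own independently cached (and independently downloadable) layer.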
Personally, I would not merge steps that have nothing to do with each other, unless I am sure they are basically set in stone forever.
With public and widely popular base images, which are not changed once they are released, the choices might be weighed differently: everyone who builds on top of your image will want fast downloads and a small resulting image.
Simply put: don't make your development more annoying than necessary by introducing long wait times for building Docker images.
In particular, cache mounts (RUN --mount=type=cache) can help with the package-manager cache size issue, and heredocs are a game-changer for inline scripts. Forget all that && nonsense; write clean multiline RUN commands:
RUN <<EOF
apt-get update
apt-get install -y foo bar baz
# ...
EOF
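For the apt case specifically, the cache-mount idea mentioned above might look something like this sketch (BuildKit syntax; the package names are the placeholders from the example):

```dockerfile
# Sketch: persist apt's caches across builds via BuildKit cache mounts,
# so repeated builds don't re-download package indexes and .debs
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y foo bar baz
```

Because the cache lives in a mount rather than in the layer, there's also nothing to `rm -rf` afterwards.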
All of this works right now in the plain old desktop Docker you already have installed; you just need to use the buildx command (the BuildKit engine) and reference the Docker labs BuildKit frontend image above. Unfortunately it's barely mentioned in the docs or anywhere other than their blog right now.

> Distroless images are very small. The smallest distroless image, gcr.io/distroless/static-debian11, is around 2 MiB. That's about 50% of the size of alpine (~5 MiB), and less than 2% of the size of debian (124 MiB).
docker run --rm -it --pid=container:distroless-app ubuntu:20.04
You can then see processes in the 'distroless-app' container from the new container, and then you can install as many debugging tools as you like without affecting the original container.
Alternatively, distroless has debug images you could use as a base instead, which are probably still smaller than many other base images:
https://github.com/GoogleContainerTools/distroless#debug-ima...
I was an early adopter of distroless, though, so I'm probably just used to not having a shell in the container. If you use it everyday I'm sure it must be helpful in some way. My philosophy is as soon as you start having a shell on your cattle, it becomes a pet, though. Easy to leave one-off fixes around that are auto-reverted when you reschedule your deployment or whatever. This has never happened to me but I do worry about it. I'd also say that if you are uncomfortable about how "exec" lets people do anything in a container, you'd probably be even more uncomfortable giving them root on the node itself. And of course it's very easy to break things at that level as well.
(This benefit compounds the more frequently you rebuild your app containers.)
Either way, as long as all your containers share the same base layer it doesn't really matter, since those layers will be deduplicated.
So if you have N containers on a host you only end up with one set of tooling across all of them, and it's compressed until you need it.
You can decouple your test tooling from your images/containers, which has a number of benefits. One that's perhaps understated is reducing attacker capabilities in the container.
With log4j, some of the payloads were essentially just calling out to various binaries on Linux. If you don't have those binaries, the payloads die instantly.
https://github.com/wagoodman/dive
I've found 100MB fonts and other waste.
All the tips are good, but until you actually inspect your images, you won't know why they are so bloated.
The UX is great for the tool, gives me absolutely everything I need to see, in such a clear fashion, and with virtually no learning curve at all for using it.
Ex: https://gist.github.com/sigma/9887c299da60955734f0fff6e2faee...
Since it captures exact dependencies, it becomes easier to put just what you need in the image. Prior to Nix, my team (many years ago) built a Redis image that was about 15 MB in size by tracking the used files and removing unused files. Nix does that reliably.
[1]: https://github.com/PostgREST/postgrest/tree/main/nix/tools/d...
[0]: https://github.com/jvolkman/bazel-nix-example/blob/e0208355f...
For example, a Dockerfile containing 'my-package-manager install foo' will create an image with foo and my-package-manager (which usually involves an entire OS, and at least a shell, etc.). An image built with Nix will only contain foo and its dependencies.
Note that it's actually quite easy to make container images "externally", using just `tar`, `jq` and `sha256sum`. The nice thing about using Nix for this (rather than, e.g. Make) is the tracking of dependencies, all the way down to the particular libc, etc.
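As a rough illustration of the "externally with `tar` and `sha256sum`" point: the digest of an image layer (its DiffID) is just the SHA-256 of the uncompressed layer tarball. The sketch below fabricates a tiny rootfs (the directory contents are made up) and computes that digest; a full image would additionally need a config and manifest JSON, which is where `jq` comes in:

```shell
# Minimal sketch: compute a layer's DiffID "externally" with tar and
# sha256sum (directory contents here are made up for illustration)
mkdir -p rootfs/etc
echo 'hello' > rootfs/etc/motd
# A reproducible tar needs fixed ordering and ownership (GNU tar flags)
tar --sort=name --owner=0 --group=0 --numeric-owner -C rootfs -cf layer.tar .
diffid="sha256:$(sha256sum layer.tar | cut -d' ' -f1)"
echo "$diffid"
```

The value printed is what would go into the image config's `rootfs.diff_ids` array.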
Lots of edge cases around specific libraries come up that you don't expect. I spent hours tearing my hair out trying to get Selenium and python working on an alpine image that worked out-of-the-box on the Ubuntu image.
Also, are you going to update those libraries as soon as a security issue arises? Debian/Ubuntu and friends have teams dedicated to that type of thing.
I start all my projects based on Alpine (alpine-node, for example). I'll sometimes need to install a few libraries like ImageMagick, but if that list starts to grow, I'll just use Ubuntu.
This way, the Node process itself will run as PID 1 of the container (instead of just being a child process of NPM).
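In Dockerfile terms, the practice being described is presumably the exec-form CMD invoking node directly (the filename here is illustrative):

```dockerfile
# Exec form, invoking node directly: node becomes PID 1 and receives
# signals such as SIGTERM itself
CMD ["node", "server.js"]

# By contrast, CMD ["npm", "start"] would make npm PID 1 and node a
# child process, so signals may never reach the app
```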
The same can be found in other collections of best practices such as [2].
What I do is a bit more complex: an entrypoint.sh which ends up running
exec node main.js "$@"
Docs then tell users to use "docker run --init"; this flag will tell Docker to use the Tini minimal init system as PID 1, which handles system SIGnals appropriately.

[0]: https://github.com/nodejs/docker-node/blob/main/docs/BestPra...
[1]: https://nodejs.org/en/docs/guides/nodejs-docker-webapp/
[2]: https://dev.to/nodepractices/docker-best-practices-with-node...
Edit: corrected the part about using --init for proper handling of signals.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "/path/to/main/process.js"]
[0]: https://github.com/krallin/tini

edit: formatting, sorry.
Edit: Consequently, this should make the container logs a bit more useful, beyond the better signal handling.
https://fedoramagazine.org/build-smaller-containers/
I don't avoid large images because of their size, I avoid them because it's an indicator that I'm packaging much more than is necessary. If I package a lot more than is necessary then perhaps I do not understand my dependencies well enough or my container is doing too much.
Starting with: Use the ones that are supposed to be small. Ubuntu does this by default, I think, but debian:stable-slim is 30 MB (down from the non-slim 52MB), node has slim and alpine tags, etc. If you want to do more intensive changes that's fine, but start with the nearly-zero-effort one first.
EDIT: Also, where is the author getting these numbers? They've got a chart that shows Debian at 124MB, but just clicking that link lands you at a page listing it at 52MB.
I've been on both sides of this argument, and I really think it's a case-by-case thing.
A highly compliant environment? As minimal as possible. A hobbyist/developer that wants to debug? Go as big of an image as you want.
It shouldn't be an expensive operation to update your image base and deploy a new one, regardless of size.
Network/resource constraints should be becoming less of an issue. In a lot of cases, a local registry cache is all you need.
I worry partly about how much time is spent on this quest, or secondary effects.
Has the situation with name resolution been dealt with in musl?
For example, something like /etc/hosts overrides not taking proper precedence (or working at all). To be sure, that's not a great thing to rely on, but it does get used, and it leads to a lot of head scratching.
Hah, I go the other way; at work hardware is cheap and the company wants me to ship yesterday, so sure I'll ship the big image now and hope to optimize later. At home, I'm on a slow internet connection and old hardware and I have no deadlines, so I'm going to carefully cut down what I pull and what I build.
Our development teams at work have a lot of [vulnerability scanning] trouble from bundling things they don't need. In that light, I suggest keeping things small - but that's the 'later' part you alluded towards :)
In the quest for the smallest possible image, one can bring about many unwarranted problems.
stargz is a gamechanger for startup time.
kubernetes and podman support it, and docker support is likely coming. It lazy loads the filesystem on start-up, making network requests for things as needed and therefore can often start up large images very fast.
Take a look at the startup graph here:
The only thing they didn't seem to cover is: consider your target. My general policy is that dev images are almost always whatever lets me do one of the following:
- Easily install the tool I need
- All things being equal, if multiple base OSes satisfy the above, I go with Alpine, because it's smallest
One thing I've noticed is that simple, purpose-built images are faster, even when there are a lot of them (I'm a big docker-compose user for this reason), rather than stuffing a lot of services inside a single container or even "fewer" containers.
EDIT: spelling, nuisance -> nuance
I'd highly suggest not to do that. If you do this, you directly throw away reproducibility, since you can't simply revert back to an older image if something stops working - you need to also check the node_modules directory. You also can't simply run old images or be sure that you have the same setup on your local machine as in production, since you also need to copy the state. Not to mention problems that might appear when your servers have differing versions of the folder or the headaches when needing to upgrade it together with your image.
Reducing your image size is important, but this way you'll lose a lot of what Docker actually offers. It might make sense in some specific cases, but you should be very aware of the drawbacks.
By chance, did you mean nuance? Because while I can agree it you can quickly get into some messy weeds optimizing an image...hearing someone call it a "nuisance" made me chuckle this afternoon
So here it is.
As a process for getting stuff done, a standard buildpack will get you a better result than a manual Dockerfile for all but the most extreme end of advanced users. Even those users are typically advanced in a single domain (e.g. image layering, but not security). While buildpacks are not available for all use cases, when they are available I can't see a reason to use a manual Dockerfile for prod packaging.
For our team of 20+ people, we actively discourage Dockerfiles for production usage. There are just too many things to be an expert on; packers get us a pretty decent (not perfect) result. Once we add the packer to the build toolchain it becomes a single command to get an image that has most security considerations factored in, layer and cache optimization done far better than a human, etc. No need for 20+ people to be trained to be a packaging expert, no need to hire additional build engineers that become a global bottleneck, etc. I also love that our ops team could, if they needed, write their own buildpack to participate in the packaging process and we could slot it in without a huge amount of pain
Is there anything one can do to help this issue?
With uwsgi you can control which file to watch. I usually just set it to watch the index.py so when I want to restart it, I just switch to that and save the file.
Similarly you could do this with "entr" https://github.com/eradman/entr
Do you mean container? So you'd like to have your long running dev container, and a separate test container that keeps running but you only use it every now and then, right? Because you neither want to include the test stuff in your dev container, nor use file watchers for the tests?
Then while I don't know your exact environment and flow, could you start the container with `docker run ... sh -c "while true; do sleep 1; done"` to "keep it warm" and then `docker exec ...` to run the tests?
Supports Go, Python, and Java out of the box.
Am I being paranoid? Is it reasonable to connect my images to a random third party service like this?
Depends on your line of work I suppose
On the flip side, it's a shell script that calls various buildah sub-commands rather than a nicer declarative DSL. You also don't get the implicit cache-reuse behaviour of Dockerfiles, since everything runs anew on the next invocation. You'd have to implement your own scheme for that; IIRC, breaking the script down into segments, committing after each one and writing the image id to a file at the end of it, combined with make, worked for me.