One cause of bloated images is installing additional packages on top of a base image. The developer installs them as they would on a real system, not realising that the downloaded packages are cached and end up in the final image.
The solution is pretty simple: either tell the package manager not to cache them, or ensure the package installation files are removed once installation has completed.
Alpine
When using the alpine base image, or images based on it, you can simply tell the apk command not to cache by passing the --no-cache parameter:
FROM golang:alpine AS build

RUN apk add --no-cache curl git tzdata zip
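For comparison, the other approach mentioned earlier (removing the installation files afterwards) might look something like the sketch below. It assumes apk keeps its downloaded index under /var/cache/apk, which is the usual location on Alpine images; the --no-cache form above is shorter and normally preferable.

```dockerfile
FROM golang:alpine AS build

# Fetch the package index, install, then delete the index again.
# All three steps share one RUN so the index never lands in a layer.
# Assumption: the index is stored under /var/cache/apk.
RUN apk update &&\
    apk add curl git tzdata zip &&\
    rm -rf /var/cache/apk/*
```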
Debian/Ubuntu
For images that use apt or apt-get it's a little bit more complicated.
Here you need to run an update first, install the packages, and then finally remove the downloaded package lists and temporary files, all in the same command:
FROM debian:11-slim
RUN apt-get update &&\
    apt-get install -y ca-certificates chromium nodejs npm &&\
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Here it runs an update first, as the image would not have the current remote repository state available. Then it performs the installation of the required packages. Finally, it removes the package lists that apt downloaded, along with any temporary files.
This last step is the important part, and keeping all three on the same RUN command ensures that just one layer is generated for this step, so the removed files are never stored in the image.
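To illustrate why the single RUN matters, here is a sketch of the version to avoid. The package choice is arbitrary and only there for illustration.

```dockerfile
# Anti-pattern: three RUN commands produce three layers.
FROM debian:11-slim
RUN apt-get update
RUN apt-get install -y ca-certificates
# This only hides the files in a new layer on top. The two layers
# above still contain them, so the image does not get any smaller.
RUN rm -rf /var/lib/apt/lists/*
```

Running docker history against the resulting image should show the size added by each layer, which makes the difference between the two forms easy to see.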
Don't try to do too much in this step, as the result will be cached, so subsequent builds will reuse the same layer until either this step or something earlier in the Dockerfile changes. This will save a lot of repeated downloading.
See Stage Ordering for another example of this and why it's better to do package installation early on in a Dockerfile.
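As a rough sketch of that ordering idea, the package installation can sit above the lines that change often. The ./app and /opt/app paths here are hypothetical placeholders, not taken from a real project.

```dockerfile
FROM debian:11-slim

# Package installation first: this layer is reused from the build
# cache as long as this command and the base image are unchanged.
RUN apt-get update &&\
    apt-get install -y ca-certificates &&\
    rm -rf /var/lib/apt/lists/*

# Frequently changing files last (hypothetical paths), so editing
# them does not force the packages to be downloaded again.
COPY ./app /opt/app
```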
When using apt or apt-get in Dockerfiles I'd advise you to always use apt-get, as it can handle being run from a script.
The apt command doesn't like running without a tty, so it will write a warning to the output stating so.
Also, with apt-get install and related commands, always include the -y parameter so that it doesn't prompt to ask if you want to continue.
This applies to any type of script, not just Dockerfiles.