One cause of bloated images is the installation of additional packages on top of a base image. Developers install them as they would on a real system, not realising that the downloaded packages are cached and end up in the final image.
The solution is pretty simple: either tell the package manager not to cache them, or ensure the package installation files are removed once installation has completed.
When using the alpine base images, or images based on them, you can simply tell the apk command not to cache by passing the --no-cache parameter:
    FROM golang:alpine as build

    RUN apk add --no-cache curl git tzdata zip
For images that use apt-get it's a little more complicated.
Here you need to run an update first, then install, and finally remove the downloaded package files, all in the same command:
    FROM debian:11-slim
    RUN apt-get update && \
        apt-get install -y ca-certificates chromium nodejs npm && \
        rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Here it runs an update first as the image would not have the current remote repository state available. Then it performs the installation of the required packages. Finally, it removes all traces of the files written by apt.
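This last step is the important part, but it only helps because all three commands are on the same RUN command, which ensures that just one layer is generated for this step. If the cleanup ran in a separate RUN, the files would still be stored in the earlier layer and the image would be no smaller.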
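To see why the single RUN matters, here is a sketch of the same installation split across three RUN commands. The shortened package list is used purely for illustration. Each RUN produces its own layer, so the final rm only hides the files; they remain stored in the layers created before it.

```dockerfile
# Anti-pattern, shown only for comparison: do not copy this.
FROM debian:11-slim

# Layer 1: the downloaded package lists are written into this layer.
RUN apt-get update

# Layer 2: the installed packages, plus any temporary files, go here.
RUN apt-get install -y ca-certificates

# Layer 3: the files disappear from the final filesystem view, but
# layers 1 and 2 still contain them, so the image is no smaller.
RUN rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
```

If you build both variants, running docker history against each image should show where the extra size is being kept.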
Don't try to do too much in this step, as the result will be cached and subsequent builds will reuse that layer until something changes in this step or earlier in the Dockerfile. This will save a lot of repeated downloading.
See Stage Ordering for another example of this and why it's better to do package installation early on in a Dockerfile.
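As a rough sketch of that ordering, the Dockerfile below installs packages before copying anything that changes often. The /app directory and the COPY of the build context are hypothetical placeholders, not part of the original example.

```dockerfile
FROM debian:11-slim

# Package layer: only rebuilt when this command or the base image changes,
# so most builds reuse the cached result and download nothing.
RUN apt-get update && \
    apt-get install -y ca-certificates nodejs npm && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Hypothetical application files: these change frequently, so they come
# after the package layer and do not invalidate its cache.
WORKDIR /app
COPY . .
```

Had the COPY come first, any edit to the application files would invalidate the cache for the package installation and force it to run again.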
In Dockerfiles I'd advise you to always use apt-get, as it can handle being run from a script.
The apt command doesn't like running without a tty, so it will write a warning to the output saying so.
When running apt-get install and related commands, always include the -y parameter so that it doesn't prompt to ask whether you want to continue.
This applies to any type of script, not just a Dockerfile.
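A minimal sketch putting both pieces of advice together is shown below. The curl package and the autoremove step are only illustrative; autoremove is included as an example of a related command that can also prompt.

```dockerfile
FROM debian:11-slim

# apt-get rather than apt, with -y on every command that could prompt.
# A build has no terminal to answer the question, so without -y the
# command would typically abort and fail the build.
RUN apt-get update && \
    apt-get install -y curl && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/*
```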