How to Significantly Reduce Your Docker Images with Multi-Stage Builds

Reducing Docker image sizes is crucial for optimizing storage, improving deployment speed, and enhancing overall performance.

Multi-stage builds offer an effective solution to achieve smaller, more efficient Docker images.

In this article, we’ll explore how to implement this technique.

Layering: A Prerequisite

Before we dig into multi-stage builds, I want to start with a quick note on layering.

Docker images are composed of multiple read-only layers, each representing a set of filesystem changes.

When you build an image, each instruction in the Dockerfile that changes the filesystem (RUN, COPY, and ADD) creates a new layer, while the other instructions only add metadata. These layers are stacked on top of each other to form the final image filesystem.

Docker caches these intermediate layers to speed up builds, but every layer you stack also adds to the size of the image.

And some images can be pretty big!
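
To make the stacking concrete, here is a minimal sketch of a hypothetical Dockerfile (not part of the walkthrough in this article) that creates a file in one layer and deletes it in the next. The file name /bigfile and the 50MB size are made up purely for illustration.

```dockerfile
FROM alpine:latest

# This RUN adds a layer that contains a 50MB file of zeros
RUN dd if=/dev/zero of=/bigfile bs=1M count=50

# This RUN adds another layer that only marks the file as deleted.
# The earlier layer is still part of the image, so the 50MB remains.
RUN rm /bigfile
```

If you build this and compare it with plain alpine:latest, the image should come out roughly 50MB heavier even though /bigfile is no longer visible in the final filesystem.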

Multi-stage builds let you leave most of these excess layers out of the final image.

How? Keep reading.

Understanding Multi-stage Builds

Multi-stage builds allow you to use multiple FROM statements in your Dockerfile, each representing a new build stage.

This approach enables you to copy artifacts from one stage to another, leaving behind unnecessary components.

So if your build produces a single binary, there is no need to keep all the extra packages used during the build process. Just include, say, your runtime and the single binary in your final image.

Implementing Multi-stage Builds

Let’s walk through an example of how to implement a multi-stage build for a Go application.

Step 1: Create a Simple Go App

After installing Go, run:

go mod init example/hello

Then create a file called hello.go in the same directory and add this code:

package main

import "fmt"

func main() {
    fmt.Println("Hello, World!")
}

Step 2: Create a Basic Dockerfile

Now let’s start with a single-stage Dockerfile. Create a file called Dockerfile in your directory and add the following:

FROM golang:1.23-alpine

WORKDIR /app

COPY . .

RUN go build -o myapp

CMD ["./myapp"]

And build it with:

docker build -t myapp .

This Dockerfile produces a large 277MB image containing the Go toolchain and build dependencies.

Step 3: Implement a Multi-stage Build

Now, let’s refactor the Dockerfile to use multi-stage builds:

# Build stage
FROM golang:1.23-alpine AS builder

WORKDIR /app

COPY . .

RUN go build -o myapp

# Final stage
FROM alpine:latest

WORKDIR /root/

COPY --from=builder /app/myapp .

CMD ["./myapp"]

This Dockerfile produces a much smaller image containing only the compiled binary and the Alpine base image needed to run it. At only 9MB, that is a 96.75% decrease from the 277MB single-stage image!

That’s huge!

Step 4: Understanding Our Multi-stage Process

The first stage uses golang:1.23-alpine to build the application.
The second stage starts with a clean alpine:latest image, discarding the bulk produced from the build process (the first stage).
Only the compiled binary is copied from the builder stage to the final image.

Benefits of Multi-stage Builds

Smaller image sizes: By discarding intermediate build artifacts, the final image contains only what’s necessary to run the application.
Enhanced security: Fewer components mean a reduced attack surface, making your containers more secure.
Faster deployments: Smaller images lead to quicker transfer times and improved CI/CD performance.
Simplified Dockerfile: Multi-stage builds eliminate the need for complex shell scripts to clean up artifacts.

A Few Considerations

Base Image Selection: Choose minimal base images like Alpine for smaller footprints.
.dockerignore: Use a .dockerignore file to exclude unnecessary files from the build context, such as a node_modules or dist folder.
Combine RUN Commands: Reduce the number of layers by chaining related commands into a single RUN instruction with &&.
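
As a rough sketch of that last point, the final stage below chains three commands into one RUN instruction. The ca-certificates package is only an illustrative assumption here; the hello app from this article does not need it.

```dockerfile
# Build stage (same as in the walkthrough above)
FROM golang:1.23-alpine AS builder

WORKDIR /app

COPY . .

RUN go build -o myapp

# Final stage
FROM alpine:latest

# One RUN instruction, one layer: the package index is fetched, used,
# and removed in the same step, so it never ends up in the image.
RUN apk update \
    && apk add ca-certificates \
    && rm -rf /var/cache/apk/*

WORKDIR /root/

COPY --from=builder /app/myapp .

CMD ["./myapp"]
```

Had the cleanup been a separate RUN, the cached index would still sit in the earlier layer. On Alpine, apk add --no-cache is a common shortcut for the same effect.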

Results and Benefits

By implementing multi-stage builds, you can significantly reduce your Docker image size. For example, the multi-stage approach shrank our Go application image above from 277MB to 9MB, a 96.75% reduction.

Conclusion

Multi-stage builds are a powerful technique for creating lean, efficient Docker images. By separating the build environment from the runtime environment, you can achieve smaller image sizes, improved security, and faster deployments. This approach is particularly effective for compiled languages like Go but can be applied to various technology stacks.

Remember, optimizing Docker images is an ongoing process. Regularly review and refine your Dockerfiles to ensure you’re leveraging the latest best practices and technologies for image optimization.