Container Competition

Last year, Nathan invited us to take part in a little internal competition. The idea was to get people to play with Docker in a way that they might not otherwise get to during their day to day.

The challenge

Build and publish a Docker image that prints the phrase “hello IDBS engineering” (or similar) when run. At a minimum, you will need to build a Docker image, run that image locally to test it and publish it to a Docker registry. The idea is to do this as a bit of fun, so the effort should be relatively low and fit in whenever you have a bit of downtime.

How will entries be judged?

We’ll keep it simple and the following dimensions will be taken into account: The image runs and prints to console. Image size - the smaller, the better.

Challenge accepted!


Humble beginnings

We don’t have to do much work at all to submit our first entry.
docker run ubuntu:latest echo hello world is a perfectly valid command and the following Dockerfile would have you on your way with the first 73.9MB submission:

FROM ubuntu:latest
CMD ["echo", "Hello IDBS Engineering! RE"]

Obviously there are smaller base images available than Ubuntu. Alpine for starters.. or, for our purpose, Busybox is the Docker equivalent of a Swiss Army knife and would serve us equally well:

REPOSITORY          TAG           SIZE
cheating            ubuntu        73.9MB
cheating            alpine        5.57MB
cheating            busybox       1.22MB

Our first submission could be as small as 1.22MB, with not much work at all.

Using base images like Alpine or Busybox still comes with varying amounts of bloat that we don’t need for the simple purpose of printing a string, of course.

To get the image any smaller, the solution must be a base image that has less bloat - ideally one with nothing in it.

This is known as a scratch image (because you can create images FROM scratch). See the Docker documentation for more details.

scratch

Scratch images are purposefully empty - they contain nothing and are exactly 0 bytes in size. This also means no operating system - here be dragons..

We’ve decided that we don’t need a full-blown OS for our container, we just want to print a string.. all we need is echo:

ls -lh /bin/echo
-rwxr-xr-x  1 root  wheel    31K 23 Jan 13:59 /bin/echo

31KB in size! Well, hello there..

Let’s try this:

FROM scratch
COPY bin/echo /echo
ENTRYPOINT [ "/echo" ]
CMD [ "hello IDBS Engineering! RE" ]

reschenburg:echo $ mkdir bin/ && cp /bin/echo ./bin/
reschenburg:echo $ docker build -t container-comp:echo . && docker run --rm container-comp:echo
Sending build context to Docker daemon  34.82kB
Step 1/3 : FROM scratch
 ---> 
Step 2/3 : COPY bin/echo /echo
 ---> Using cache
 ---> 820ddcfe419d
Step 3/3 : CMD [  "/echo", "hello IDBS Engineering! RE" ]
 ---> Running in 78051797374c
Removing intermediate container 78051797374c
 ---> 381e664419f1
Successfully built 381e664419f1
Successfully tagged container-comp:echo
standard_init_linux.go:211: exec user process caused "exec format error"

Now what is going on here?

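Depending on how familiar you are with binaries and compiling them, you may have realised straight away that a 31kB binary is a bit too good to be true. echo is in fact a dynamically linked binary. This means the binary relies on the OS to provide its dependencies (libraries) to be able to run. As we said earlier, a scratch image is completely empty except for the stuff we copy in, so the libraries are not there and echo is missing its dependencies.

Strictly speaking, “exec format error” means the kernel did not recognise the file format at all. The root/wheel ownership in the listing above suggests this echo was copied from macOS, so it was never a Linux binary to begin with. A dynamically linked Linux echo would typically fail with “no such file or directory” instead, but the way forward is the same either way.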

So now what?

We need a statically linked binary to work in a scratch image. This means the binary is packaged with its dependencies built in.

Note: If you have come to this stage, the Docker part of the competition is pretty much over. Getting the image size any smaller is no longer down to Docker optimisations, but down to binary size.

Making our own binaries

Down the rabbit hole we go

So at this point, we are looking at writing and compiling our own binary. There are a few options to choose from as you will see.

To make the process easier, I don’t want to bother setting up lots of languages/dependencies/build environments on my laptop, so I want to do this in Docker too. Obviously we said I can’t do this in a scratch image as it doesn’t have the required build tools - but if I go back to Alpine or Ubuntu, I get way more stuff than I need. To solve that problem, I’m going to use multi-stage builds.

Multi-stage Docker builds

The trick here is that I can run my build in independent stages, each with its own FROM statement. This means I don’t have to care about optimising my Docker layers for the build stage, as it is not used for the final image.

Golang

Golang is modern, popular and most importantly a compiled language. This means we can statically link it. Since I have prior experience with Go, this was my obvious first choice.

All we have to do is print a string, so we can keep it very simple:

package main

func main() {
    println("hello IDBS engineering from reschenburg")
}

The multi-stage Dockerfile looks like this:

FROM golang:alpine AS builder

WORKDIR $GOPATH/src/hello
COPY app/*.go ./
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o /go/bin/hello

FROM scratch
COPY --from=builder /go/bin/hello .

CMD ["./hello"]

The trick for size with Go binaries is the linker flags -s and -w, which remove debug bloat. See the difference without and with the flags:

comptest-go     noflags         95a55df9a483    42 seconds ago  1.13MB
comptest-go     linkerflags     3f3be5150bc7    6 seconds ago   793kB

793kB is a good start. It’s pushed us under the 1MB line and we didn’t have to set up anything Go related on the laptop either.
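To see where the saving comes from, a rough Go sketch like the one below could add up the sections that -s (symbol table) and -w (DWARF debug info) are meant to drop. The helper names are hypothetical, and the list of section names is an assumption about a typical Linux ELF build rather than an exhaustive rule.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// isStrippable reports whether a section name belongs to the symbol table
// or the DWARF debug data. Assumption: these are the sections that the
// -s and -w linker flags remove from a Linux build.
func isStrippable(name string) bool {
	return name == ".symtab" || name == ".strtab" ||
		strings.HasPrefix(name, ".debug_") || strings.HasPrefix(name, ".zdebug_")
}

// strippableBytes sums the on-disk size of those sections in an ELF file.
func strippableBytes(path string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	var total uint64
	for _, s := range f.Sections {
		if isStrippable(s.Name) {
			total += s.FileSize
		}
	}
	return total, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: debugsize <path-to-binary>")
		return
	}
	n, err := strippableBytes(os.Args[1])
	if err != nil {
		fmt.Println("could not read binary:", err)
		return
	}
	fmt.Printf("%d bytes of symbol and debug sections\n", n)
}
```

Pointed at a binary built without the flags, it should report a figure in the region of the difference between the two images; pointed at the stripped build, it should report next to nothing.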

I haven’t really found a way of getting the Go binary smaller using native methods. Enter upx.

UPX

UPX is a free, portable, extendable, high-performance executable packer for several executable formats.

The idea is that the binary is stored compressed and only decompressed at runtime. This of course adds startup overhead, so it is not a great idea for command-line tools like ls - but startup time isn’t important for the competition, so… ¯\_(ツ)_/¯

Now we’re talking:

comptest-go     upx             df1dfe7d3f11    4 seconds ago   257kB

Great result - and the first time my image was smaller than the benchmark image provided by Nathan!

C++

Naturally, at this point I was curious and hooked. Can we go smaller? Last I dabbled in C was back in school, some 15 years ago, but I figured C++ might be a good candidate to try and squeeze a bit more bloat out of the binary.

#include <stdio.h>

int main() {
    printf("hello IDBS engineering from reschenburg\n");
    return 0;
}

first attempt

I’m using ubuntu as a builder image, because I can:

FROM ubuntu as builder
RUN apt-get update
RUN apt-get install build-essential upx -y
COPY app/hello.cc /hello.cc
RUN gcc hello.cc -o hello --static
# RUN upx --brute hello

FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]

comptest-cpp    ubuntu-no-upx   608c5b668a22    12 seconds ago  872kB

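Another trick I’ve read about to optimise binary size is -O3 in the gcc command. Strictly speaking, -O3 tells the compiler to optimise for speed rather than size (-Os is the size-oriented level), so it was never likely to shrink anything: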

comptest-cpp    o3-no-upx       3b6f665976a8    12 seconds ago  872kB

No change :( Let’s try upx at least..

comptest-cpp    ubuntu-upx      0cec9c817466    2 seconds ago   274kB

Not bad, but still slightly bigger than Golang. If you ever wondered which language is better.. here’s your obvious answer :)

second attempt

At this point I thought, it’s a competition about image size - Why am I using ubuntu as a builder image? I should set an example and use alpine:

FROM alpine as builder
RUN apk update && apk add --no-cache build-base upx
COPY app/hello.cc /hello.cc
RUN gcc hello.cc -o hello --static -O3
RUN upx --brute hello

FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]

Interesting:

Step 5/8 : RUN upx --brute hello
 ---> Running in c43a1e7d185a
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2018
UPX 3.95        Markus Oberhumer, Laszlo Molnar & John Reiser   Aug 26th 2018

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
upx: hello: NotCompressibleException

What’s happening here? Well, let’s not use upx..

comptest-cpp    alpine          475901b336bb    3 hours ago     93.3kB

O_O now look at that.. 93.3kB - statically linked and only 3x as big as echo!

As it turns out, for upx to add its compression/decompression stuff, it has to bloat the binary a bit, and at some point that bloat would make the binary bigger than its original size, which is a bit pointless. Hence upx fails when binaries are too small.

More interesting is the fact that the same build command on alpine made the same C++ code 9x smaller. Any ideas why? I’m assuming that build-essential on ubuntu installs a different version of gcc and dependencies than build-base on alpine, or that there are some default settings built into alpine and not into ubuntu (-O3 makes no difference on alpine either, btw). The most likely culprit is the C library: alpine links against musl, while ubuntu links against glibc, and a statically linked glibc binary tends to carry far more baggage.

Assembly

Can we go smaller though? At this point, I needed more. My last idea was straight up ASM - all we need to do is echo some stuff, right?

I won’t claim to know assembly - also last touched it in school ~15 years ago.
But when has that ever stopped anybody? ¯\_(ツ)_/¯

This took a fair amount of googling to get it to compile and run..

FROM alpine as builder
RUN apk update && apk add --no-cache build-base nasm
COPY app/hello.asm /

RUN nasm -f elf64 -F dwarf -g hello.asm 
RUN ld -m elf_x86_64 -o hello hello.o

FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]

Suffice it to say.. hand-written assembly is way smaller than even our nice alpine C++ binary:

comptest-asm    alpine64        ee4959d737f8    2 hours ago     9.94kB

And it still works:

reschenburg:assembly $ docker run --rm comptest-asm:alpine64
hello IDBS engineering by reschenburg

All my images up to this point

comptest-asm    alpine64            9.94kB
comptest-cpp    alpine              93.3kB
comptest-cpp    alpine-no-O3        93.3kB
comptest-go     upx                 257kB
comptest-cpp    ubuntu-upx          274kB
comptest-go     linkerflags         793kB
comptest-cpp    o3-no-upx           872kB
comptest-cpp    ubuntu-no-upx       872kB
comptest-go     noflags             1.13MB

And then the competition started to heat up

Looking at my original ASM Dockerfile now, it’s painfully obvious how clueless I was. I googled nasm commands until I found one that, together with ld, produced a working binary. When I saw the results, I was super happy - my image was several times smaller than the benchmark or anything anybody else had done at this point.

Then the competition started.

Phil managed to produce an image that was 8.82kB. I had achieved the same with a first attempt at 32-bit, but I couldn’t get it to run in the scratch image, so I gave up on it.

So back to my asm I went. Well actually, I went back to my compiler and linker as I was convinced the asm was as small as it is ever going to be. (Narrator: It wasn’t.)

Just for Nathan to throw a spanner into the works..

09:55
quay.io/idbs/container-competition nkuroczycki ea49ccb4c7c0 4 minutes ago 8.48kB

So after some more googling and a lot of muttering, I managed to match it:

10:11
quay.io/idbs/container-competition reschenburg e5a20b5c0fc2 49 seconds ago 8.48kB

But then, disaster struck:

10:38
quay.io/idbs/container-competition pjones 19e9228c5859 4 minutes ago 976B

Phil managed to get his image below a kB - we are in byte-size territory.

At this point, I was still using the same asm but was knee deep in nasm and ld documentation to see if I could optimise my build process.. and I made progress:

11:15
4.54kB

I was getting somewhere.. until I struck gold (I thought):

11:19
comptest short fd79fdceaa9e 3 seconds ago 464B

I thought this was a mic-drop moment.

Until a little while later:

13:07
quay.io/idbs/container-competition pjones c06e3ef0251d 2 minutes ago 360B

I was ready to give up at this point. Whilst I can’t remember the last time I had this much fun on a computer, I thought I had exhausted my linker flags.

..But of course I hadn’t

13:53
quay.io/idbs/container-competition reschenburg 4f2d0ced3219 48 seconds ago 336B

Then it went quiet for the day.. we did have to do some work after all, and I was convinced I had it in the bag. Surely you can’t go smaller than that.

Until Phil sent us into the weekend with this scorcher:

17:47
quay.io/idbs/container-competition pjones 3b27b4c79f92 2 minutes ago 73B

How is that possible?

In Phil’s own words:

I have to hand credit to another source but I have adjusted the source code to do what it needs to do. I had to change how I was compiling it too… Without shortening the message, I don’t think there’s going to be anything smaller…

Absolutely mad what’s possible though!

ELF VS BIN

ELF

From my 9.94kB image down to the 336B image, it was a battle of the compiler/linker. Phil was using musl, I was using nasm. musl produces way smaller binaries out of the box, which was a very clever discovery - Phil looked at the ‘hello-world’ Docker image, which echoes a Hello from Docker message and is only 13.3kB. He looked at their GitHub and found them using musl. Resourceful and effective!

My optimisations from 9.94kB onwards were done in the build stage exclusively. It was a matter of finding a combination of linker flags that removed as much unneeded bloat as possible. What I ended up with was:

RUN nasm -f elf -Ox hello.asm
RUN ld -m elf_i386 -N -s -x -nostdlib -nostartfiles --exclude-libs ALL -o hello hello.o
RUN strip --strip-all hello

As you can see, the last step to get it below Phil’s 360B image was to link to elf_i386, i.e. 32-bit.

At this point, we were still assembling to an ELF object file and statically linking it with ld to make a Linux-compatible ELF binary.

BIN

When you’re furiously googling how to optimise your elf binary build, eventually you’ll come across this:

http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html

The premise is that you can write out the CPU instructions directly as machine code. To make them Linux compatible, you have to follow a certain standard, the ELF format, which requires a bunch of headers to be in the right place for Linux to be able to run your binary. This way, you assemble straight to bin and no longer need to link your file:

FROM alpine as builder
RUN apk update && apk add --no-cache build-base nasm
COPY app/hello.asm /hello.asm

RUN nasm -f bin hello.asm -o hello
RUN chmod +x hello

FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]

The main problem with the teensy example is that the binary’s sole purpose is to return 42 - which isn’t a lot of CPU instructions when compared to echoing a string, which requires a bit more memory and two kernel calls (assuming you want your binary to return 0) as opposed to just one.

Since the author is using all manner of ungodly tricks to get the ELF header to overlap with the section and program header tables AND squeeze his return 42 instructions into some empty space in the header as well, we can’t simply take his example (which is 45 bytes in the end) and add our print string code.

Whilst I have managed to adapt some of his code to do my bidding:

reschenburg:assembly $ docker images tiny-asm
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
tiny-asm            3                   0c176011814e        21 hours ago        117B

I have not managed to get it down to 73 bytes.

What I failed to realise for some time was that the same author published a hello-world teensy binary too! It uses the same tricks, but differs slightly to make it print a string.

That code, when adapted to print our message instead of his, results in a 73-byte image.

And there you have it, folks. 73 bytes is as small as it is going to get, assuming the same message (Hi IDBS Engineering! RE).

Here are all my submitted images over time:

REPOSITORY                           TAG                 IMAGE ID            SIZE
quay.io/idbs/container-competition   reschenburg         1f74237c8786        73B
quay.io/idbs/container-competition   <none>              4f38daa04077        129B
quay.io/idbs/container-competition   <none>              ee606221f610        137B
quay.io/idbs/container-competition   <none>              4f2d0ced3219        336B
quay.io/idbs/container-competition   <none>              c06e3ef0251d        360B
quay.io/idbs/container-competition   <none>              9ec5a3ce5495        464B
quay.io/idbs/container-competition   <none>              fd79fdceaa9e        448B
quay.io/idbs/container-competition   <none>              01c9e3bfa844        4.54kB
quay.io/idbs/container-competition   <none>              19e9228c5859        976B
quay.io/idbs/container-competition   <none>              e5a20b5c0fc2        8.48kB
quay.io/idbs/container-competition   <none>              ccea700d2b8a        8.94kB
quay.io/idbs/container-competition   <none>              922195853cc3        13kB
quay.io/idbs/container-competition   <none>              b20ef1cc232e        9.91kB
quay.io/idbs/container-competition   <none>              6dc857ab74bd        8.82kB
quay.io/idbs/container-competition   reschenburg1        ee4959d737f8        9.94kB
quay.io/idbs/container-competition   reschenburgold      475901b336bb        93.3kB
quay.io/idbs/container-competition   reschenburgCpp      885117221990        274kB

Conclusion

Phil wins - hands down.

Also, you may have noticed that 3/4 of this article was a bit of a descent into madness - by the time you find yourself googling Linux kernel instructions and binary standards to win what started as a Docker competition, you may have taken the wrong turn in the rabbit hole - or the right turn, if you’re into that kind of thing (I am! :) )