Container Competition
Last year, Nathan invited us to take part in a little internal competition. The idea was to get people to play with Docker in a way that they might not otherwise get to in their day-to-day work.
The challenge
To build and publish a Docker image that prints the phrase “hello IDBS engineering” (or similar) when run. At a minimum, you will need to build a Docker image, run that image locally to test it, and publish it to a Docker registry. The idea is to do this as a bit of fun, so the effort should be relatively low and fit into whatever downtime you have.
How will entries be judged?
We’ll keep it simple and the following dimensions will be taken into account: The image runs and prints to console. Image size - the smaller, the better.
Challenge accepted!
Humble beginnings
We don’t have to do much work at all to submit our first entry.
docker run ubuntu:latest echo hello world
is a perfectly valid command, and the following Dockerfile would have you on your way with a first 73.9MB submission:
FROM ubuntu:latest
CMD ["echo", "Hello IDBS Engineering! RE"]
Obviously there are smaller base images available than Ubuntu. Alpine for starters.. or, for our purpose, Busybox, which is the Docker equivalent of a Swiss Army knife and would serve us equally well:
REPOSITORY TAG SIZE
cheating ubuntu 73.9MB
cheating alpine 5.57MB
cheating busybox 1.22MB
Our first submission could be as small as 1.22MB, with not much work at all.
Using base images like Alpine or Busybox still comes with varying amounts of bloat that we don’t need for the simple purpose of printing a string, of course.
To get the image any smaller, the solution must be a base image that has less bloat - ideally one with nothing in it.
This is also known as a scratch image (because you can create images FROM scratch).
scratch
Scratch images are purposefully empty - they contain nothing and are exactly 0 bytes in size. That also means no operating system - here be dragons..
We’ve decided that we don’t need a full-blown OS for our container, we just want to print a string.. all we need is echo:
ls -lh /bin/echo
-rwxr-xr-x 1 root wheel 31K 23 Jan 13:59 /bin/echo
31kB in size! Well, hello there..
Let’s try this:
FROM scratch
COPY bin/echo /echo
CMD [ "/echo", "hello IDBS Engineering! RE" ]
reschenburg:echo $ mkdir bin/ && cp /bin/echo ./bin/
reschenburg:echo $ docker build -t container-comp:echo . && docker run --rm container-comp:echo
Sending build context to Docker daemon 34.82kB
Step 1/3 : FROM scratch
--->
Step 2/3 : COPY bin/echo /echo
---> Using cache
---> 820ddcfe419d
Step 3/3 : CMD [ "/echo", "hello IDBS Engineering! RE" ]
---> Running in 78051797374c
Removing intermediate container 78051797374c
---> 381e664419f1
Successfully built 381e664419f1
Successfully tagged container-comp:echo
standard_init_linux.go:211: exec user process caused "exec format error"
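Now what is going on here?
Depending on how familiar you are with binaries and compiling them, you may have realised straight away that a 31kB binary is a bit too good to be true.
There are two problems. The first is what the error message is really reporting: the wheel group in the listing above suggests this echo was copied from a macOS (or BSD) host, in which case it isn’t even in the executable format (ELF) that the Linux kernel behind the container expects. The second would bite even with a Linux copy: echo is in fact a dynamically linked binary. This means the binary relies on the OS to provide its dependencies (libraries) to be able to run.
As we said earlier, a scratch image is completely empty except for the stuff we copy in, so the libraries are not there and echo is missing its dependencies.
So now what?
We need a statically linked binary to work in a scratch image. This means the binary is packaged with its dependencies built in.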
Note: If you have come to this stage, the Docker part of the competition is pretty much over. Getting the image size any smaller is no longer down to Docker optimisations, but down to binary size.
Making our own binaries
Down the rabbit hole we go
So at this point, we are looking at writing and compiling our own binary. There are a few options to choose from as you will see.
To make the process easier, I don’t want to bother setting up lots of languages/dependencies/build environments on my laptop, so I want to do this in Docker too. As we established, I can’t build in a scratch image because it doesn’t have the required build tools - but if I go back to Alpine or Ubuntu, the final image ends up with way more stuff than I need. To solve that problem, I’m going to use multi-stage builds.
Multi-stage Docker builds
The trick here is that I can run my build in independent stages, each with its own FROM statement, and copy only the finished binary into the final stage. This means I don’t have to care about optimising my Docker layers for the build stage, as it is not used for the final image.
Golang
Golang is modern, popular and most importantly a compiled language. This means we can statically link it. Since I have prior experience with Go, this was my obvious first choice.
All we have to do is print a string, so we can keep it very simple:
package main
func main() {
println("hello IDBS engineering from reschenburg")
}
The multi-stage Dockerfile looks like this:
FROM golang:alpine AS builder
WORKDIR $GOPATH/src/hello
COPY app/*.go ./
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o /go/bin/hello
FROM scratch
COPY --from=builder /go/bin/hello .
CMD ["./hello"]
The trick for size with Go binaries is the linker flags -s and -w, which remove debug bloat. See the difference without and with the flags:
comptest-go noflags 95a55df9a483 42 seconds ago 1.13MB
comptest-go linkerflags 3f3be5150bc7 6 seconds ago 793kB
793kB is a good start: it’s pushed us under the 1MB line and we didn’t have to set up anything Go-related on the laptop either.
I haven’t really found a way of getting the Go binary smaller using native methods. Enter upx
UPX
UPX is a free, portable, extendable, high-performance executable packer for several executable formats.
The idea is that the binary is stored compressed and only decompressed at runtime.
This of course adds startup overhead, so it’s not a great idea for command-line tools like ls - but startup time isn’t important for the competition, so… ¯\_(ツ)_/¯
Now we’re talking:
comptest-go upx df1dfe7d3f11 4 seconds ago 257kB
Great result - and the first time my image was smaller than the benchmark image provided by Nathan!
C++
Naturally, at this point I was curious and hooked. Can we go smaller? The last time I dabbled in C was back in school, some 15 years ago, but I figured C++ might be a good candidate to try and squeeze a bit more bloat out of the binary.
#include <stdio.h>
int main(){
printf("hello IDBS engineering from reschenburg\n");
return 0;
}
first attempt
I’m using ubuntu as a builder image, because I can:
FROM ubuntu as builder
RUN apt-get update
RUN apt-get install build-essential upx -y
COPY app/hello.cc /hello.cc
RUN gcc hello.cc -o hello --static
# RUN upx --brute hello
FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]
comptest-cpp ubuntu-no-upx 608c5b668a22 12 seconds ago 872kB
Another trick I’ve read about for optimising the binary is -O3 in the gcc command. Strictly speaking, that is a compiler flag tuned for speed rather than size (-Os is the size-oriented one), but it was worth a try:
comptest-cpp o3-no-upx 3b6f665976a8 12 seconds ago 872kB
No change :( Let’s try upx at least..
comptest-cpp ubuntu-upx 0cec9c817466 2 seconds ago 274kB
Not bad! But still slightly bigger than Golang. If you ever wondered which language is better.. here’s your obvious answer :)
second attempt
At this point I thought: it’s a competition about image size, so why am I using Ubuntu as a builder image? I should set an example and use Alpine:
FROM alpine as builder
RUN apk update && apk add --no-cache build-base upx
COPY app/hello.cc /hello.cc
RUN gcc hello.cc -o hello --static -O3
RUN upx --brute hello
FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]
Interesting:
Step 5/8 : RUN upx --brute hello
---> Running in c43a1e7d185a
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2018
UPX 3.95 Markus Oberhumer, Laszlo Molnar & John Reiser Aug 26th 2018
File size Ratio Format Name
-------------------- ------ ----------- -----------
upx: hello: NotCompressibleException
What’s happening here? Well, let’s not use upx..
comptest-cpp alpine 475901b336bb 3 hours ago 93.3kB
O_O now look at that.. 93.3kB - statically linked and only 3x as big as echo!
As it turns out, for upx to add its compression/decompression stuff, it has to bloat the binary a bit, and at some point that bloat would make the binary bigger than its original size, which is a bit pointless.
Hence upx fails when binaries are too small.
More interesting is the fact that the same build command on Alpine made the same C++ code 9x smaller. Any ideas why? I’m assuming that build-essential on Ubuntu installs a different version of gcc and dependencies than build-base on Alpine, or that there are some default settings built into Alpine and not into Ubuntu (-O3 makes no difference on Alpine either, btw). The most likely culprit is the C library: Alpine’s toolchain links against musl, whereas Ubuntu’s links against glibc, and a static glibc binary drags in a lot more code.
Assembly
Can we go smaller though?
At this point, I needed more. My last idea was straight up ASM - all we need to do is echo some stuff, right?
I won’t claim to know assembly - also last touched it in school ~15 years ago.
But when has that ever stopped anybody? ¯\_(ツ)_/¯
This took a fair amount of googling to get it to compile and run..
FROM alpine as builder
RUN apk update && apk add --no-cache build-base nasm
COPY app/hello.asm /
RUN nasm -f elf64 -F dwarf -g hello.asm
RUN ld -m elf_x86_64 -o hello hello.o
FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]
Suffice to say.. machine code is way smaller than even our nice alpine C++ binary:
comptest-asm alpine64 ee4959d737f8 2 hours ago 9.94kB
And it still works:
reschenburg:assembly $ docker run --rm comptest-asm:alpine64
hello IDBS engineering by reschenburg
All my images up to this point
comptest-asm alpine64 9.94kB
comptest-cpp alpine 93.3kB
comptest-cpp alpine-no-O3 93.3kB
comptest-go upx 257kB
comptest-cpp ubuntu-upx 274kB
comptest-go linkerflags 793kB
comptest-cpp o3-no-upx 872kB
comptest-cpp ubuntu-no-upx 872kB
comptest-go noflags 1.13MB
And then the competition started to heat up
Looking at my original ASM Dockerfile now, it’s painfully obvious how clueless I was. I googled nasm commands until I found one that worked and produced a working binary together with ld.
When I saw the results, I was super happy - my image was several times smaller than the benchmark or anything anybody else had done at this point.
Then the competition started.
Phil managed to produce an image that was 8.82kB. I had achieved the same size with a first attempt at 32-bit, but I couldn’t get it to run in the scratch image, so I gave up on it.
So back to my asm I went. Well actually, I went back to my compiler and linker as I was convinced the asm was as small as it is ever going to be. (Narrator: It wasn’t.)
Just for Nathan to throw a spanner into the works..
09:55
quay.io/idbs/container-competition nkuroczycki ea49ccb4c7c0 4 minutes ago 8.48kB
So after some more googling and a lot of muttering, I managed to match it:
10:11
quay.io/idbs/container-competition reschenburg e5a20b5c0fc2 49 seconds ago 8.48kB
But then, disaster struck:
10:38
quay.io/idbs/container-competition pjones 19e9228c5859 4 minutes ago 976B
Phil managed to get his image smaller than a kB - we are in byte-size territory.
At this point, I was still using the same asm but was knee-deep in nasm and ld documentation to see if I could optimise my build process.. and I made progress
11:15
4.54kB
I was getting somewhere.. until I struck gold (I thought):
11:19
comptest short fd79fdceaa9e 3 seconds ago 464B
I thought this was a mic-drop moment.
Until a little while later:
13:07
quay.io/idbs/container-competition pjones c06e3ef0251d 2 minutes ago 360B
I was ready to give up at this point. Whilst I can’t remember the last time I had this much fun on a computer, I thought I had exhausted my linker flags.
..But of course I hadn’t
13:53
quay.io/idbs/container-competition reschenburg 4f2d0ced3219 48 seconds ago 336B
Then it went quiet for the day.. we did have to do some work after all, and I was convinced I had it in the bag. Surely you can’t go smaller than that.
Until Phil sent us into the weekend with this scorcher:
17:47
quay.io/idbs/container-competition pjones 3b27b4c79f92 2 minutes ago 73B
How is that possible?
In Phil’s own words:
I have to hand credit to another source but I have adjusted the source code to do what it needs to do. I had to change how I was compiling it too… Without shortening the message, I don’t think there’s going to be anything smaller…
Absolutely mad what’s possible though!
ELF VS BIN
ELF
From my 9.94kB image down to the 336B image, it was a battle of the compiler/linker. Phil was using musl, I was using nasm. musl produces way smaller binaries out of the box, which was a very clever discovery - Phil looked at the ‘hello-world’ Docker image, which echoes a Hello from Docker message and is only 13.3kB. He looked at their GitHub and found them using musl. Resourceful and effective!
My optimisations from 9.94kB onwards were done in the build stage exclusively. It was a matter of finding a combination of linker flags that removed as much un-needed bloat as possible. What I ended up with was:
RUN nasm -f elf -Ox hello.asm
RUN ld -m elf_i386 -N -s -x -nostdlib -nostartfiles --exclude-libs ALL -o hello hello.o
RUN strip --strip-all hello
As you can see, the last step to get it below Phil’s 360B image was to link to elf_i386, i.e. 32-bit.
At this point, we were still assembling to an ELF object file and statically linking it with ld to make a Linux-compatible ELF binary.
BIN
When you’re furiously googling how to optimise your elf binary build, eventually you’ll come across this:
http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
The premise is that you can use machine code to write out CPU instructions directly. To make them Linux-compatible, you have to follow a certain standard, the ELF format, which requires a bunch of headers to be in the right place for Linux to be able to run your binary. You write those headers by hand in the source, so you assemble straight to bin and no longer need to link your file:
FROM alpine as builder
RUN apk update && apk add --no-cache build-base nasm
COPY app/hello.asm /hello.asm
RUN nasm -f bin hello.asm -o hello
RUN chmod +x hello
FROM scratch
COPY --from=builder hello hello
CMD ["/hello"]
The main problem with the teensy example is that the binary’s only purpose is to return 42 - which isn’t a lot of CPU instructions when compared to echoing a string, which requires a bit more memory and two kernel calls (assuming you want your binary to return 0) as opposed to just one.
Since the author is using all manner of ungodly tricks to get the ELF headers to overlap with the section and program header tables AND squeeze his return 42 instructions into some empty space in the header as well, we can’t simply take his example (which is 45 bytes in the end) and add our print-string code.
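To see why those tricks are needed at all, it helps to add up the fixed structures a well-behaved ELF file starts with: one file header plus at least one program header. The sketch below measures the struct layouts that Go's debug/elf package defines for them. The function name minimalOverhead is made up for the example.

```go
package main

import (
	"debug/elf"
	"encoding/binary"
	"fmt"
)

// minimalOverhead returns the bytes taken by one ELF file header plus one
// program header, before a single instruction or message byte is added.
func minimalOverhead(is64 bool) int {
	if is64 {
		return binary.Size(elf.Header64{}) + binary.Size(elf.Prog64{})
	}
	return binary.Size(elf.Header32{}) + binary.Size(elf.Prog32{})
}

func main() {
	fmt.Println("32-bit:", minimalOverhead(false), "bytes") // prints "32-bit: 84 bytes"
	fmt.Println("64-bit:", minimalOverhead(true), "bytes")  // prints "64-bit: 120 bytes"
}
```

A 32-bit file laid out by the book therefore spends 84 bytes on headers alone (52 for the file header, 32 for the program header), which is why switching from 64-bit to 32-bit helps, and why a 45-byte executable, smaller than a complete 32-bit ELF header on its own, is only possible by overlapping and truncating those structures.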
I did manage to adapt some of his code to do my bidding:
reschenburg:assembly $ docker images tiny-asm
REPOSITORY TAG IMAGE ID CREATED SIZE
tiny-asm 3 0c176011814e 21 hours ago 117B
However, I did not manage to get it down to 73 bytes.
What I failed to realise for some time was that the same author published a hello-world teensy binary too! It uses the same tricks, but differs slightly to make it print a string.
That code, when adapted to print our message instead of his, results in a 73-byte image.
And there you have it, folks. 73 bytes is as small as it is going to get, assuming the same message (Hi IDBS Engineering! RE).
Here are all my submitted images over time:
REPOSITORY TAG IMAGE ID SIZE
quay.io/idbs/container-competition reschenburg 1f74237c8786 73B
quay.io/idbs/container-competition <none> 4f38daa04077 129B
quay.io/idbs/container-competition <none> ee606221f610 137B
quay.io/idbs/container-competition <none> 4f2d0ced3219 336B
quay.io/idbs/container-competition <none> c06e3ef0251d 360B
quay.io/idbs/container-competition <none> 9ec5a3ce5495 464B
quay.io/idbs/container-competition <none> fd79fdceaa9e 448B
quay.io/idbs/container-competition <none> 01c9e3bfa844 4.54kB
quay.io/idbs/container-competition <none> 19e9228c5859 976B
quay.io/idbs/container-competition <none> e5a20b5c0fc2 8.48kB
quay.io/idbs/container-competition <none> ccea700d2b8a 8.94kB
quay.io/idbs/container-competition <none> 922195853cc3 13kB
quay.io/idbs/container-competition <none> b20ef1cc232e 9.91kB
quay.io/idbs/container-competition <none> 6dc857ab74bd 8.82kB
quay.io/idbs/container-competition reschenburg1 ee4959d737f8 9.94kB
quay.io/idbs/container-competition reschenburgold 475901b336bb 93.3kB
quay.io/idbs/container-competition reschenburgCpp 885117221990 274kB
Conclusion
Phil wins - hands down.
Also, you may have noticed that 3/4 of this article was a bit of a descent into madness - by the time you find yourself googling Linux kernel instructions and binary standards to win what started as a Docker competition, you may have taken the wrong turn in the rabbit hole - or the right turn, if you’re into that kind of thing (I am! :) )