Go in a scratch VM

distributionless linux.

Mon, Jun 4, 2018

Many of us know that you can run Go binaries in “scratch” containers. Your container doesn’t need to be based on Alpine or Ubuntu. It can be based on nothing and contain just the binary you built from Go source. This is largely because Go code can be statically linked, and so requires no installed libraries.

But what about VMs? Normally you start from Ubuntu, Alpine or whatever, and then you install your stuff on top. What would happen if you didn’t? Could you have a VM that’s just a linux kernel and your Go binary?

I thought I’d find out.

Getting started

When a linux machine starts, first some low-level magic happens to mount the root file system, and load and run the kernel. Once the kernel is ready it hands control to user-space by running /sbin/init as process ID 1. Everything else that happens on the machine then happens because /sbin/init makes it happen. Every other user-space process is started by init or by a process started by init. And the OS only keeps running while process 1 keeps running.

If I replace /sbin/init with a static Go binary I’ve effectively replaced all the user-space components of the distribution.

So, what happens if we replace /sbin/init with a statically linked Go binary that just prints “Hello World!” and then sleeps a lot?

We need a playground

I’m going to start with the simplest linux distribution I can find, replace /sbin/init with my Go binary, then try to work out what else I need to do to get a running system.

Vagrant gives me a very convenient way to do this. This Vagrantfile is all I need to configure a local VM.

Vagrant.configure("2") do |config|
  config.vm.box = "alpine/alpine64"
  config.vm.network "forwarded_port", guest: 80, host: 8080, host_ip: "127.0.0.1"
end

This gives me an easy-to-recycle local VM to play with. I can start it with vagrant up, and if things go wrong I can completely delete it with vagrant destroy -f.

I chose Alpine linux as my distribution as it has a reputation for being small & simple, which hopefully will make it easier to understand.

Once I start experimenting with this I expect lots of things will stop working, so I won’t be able to look at logs written to file or connect to the VM over a network. My debugging is likely to depend on getting access to the VM console. So I use VirtualBox to run my VM, as I know that will show me the console via the VirtualBox app.

Attempt 1: Hello World

This is our first attempt at a new world of distribution-less linux. A simple “hello world” program that I’ll build as a statically-linked binary. The program repeatedly sleeps rather than exiting, as the kernel will panic if process 1 exits.

package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Printf("Hello World!\n")

	for {
		time.Sleep(time.Second)
	}
}

I can build a linux version of this on my Mac using GOOS=linux go build. Since I’ve called my directory scratchmachine the output binary is called scratchmachine. I then do vagrant up followed by vagrant ssh and suddenly I’m in the Alpine VM, with my Mac directory mounted as /vagrant. I then run sudo cp /vagrant/scratchmachine /sbin/init to replace the init binary, followed by sudo reboot to restart the machine.

When the machine reboots, first the linux kernel will load, then the kernel will start the first user-space process, process 1, using my “hello world” binary that it finds at /sbin/init.

Success! scratchmachine is running as process 1

If we open VirtualBox and look at the machine console we can see the output of this experiment. It’s a success!

But this is all our machine can do. Our new init is the only thing running in user-space on this machine. And all it does is say hello and go to sleep.

What I’d really like to do is run a web-server. For that I need a network connection.

Attempt 2: Getting on the network

My new mantra is vagrant destroy -f; vagrant up; vagrant ssh, which quickly restores a fully working alpine machine.

To get the network working I know I will need an active network interface. Perhaps I should just copy what happens when running alpine normally? ifconfig -a shows me the interfaces on the VM.

alpine:~$ ifconfig -a
eth0      Link encap:Ethernet  HWaddr 08:00:27:9E:9E:E5
inet addr:10.0.2.15  Bcast:0.0.0.0  Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe9e:9ee5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

The VM has a single network interface eth0 using IP address 10.0.2.15.

So, what would happen if I tried to just assign 10.0.2.15 to eth0 and set the UP and RUNNING flags from my code? Some digging turned up the linux netdevice interface, which is what I need to do this.

This netdevice interface is extremely weird. To use it you open any old internet socket, then send commands to the kernel using the SYS_IOCTL syscall, referencing the socket (IOCTL stands for input/output control).

Luckily there’s support for making the syscalls and for some of the structures I needed in golang.org/x/sys/unix.

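// Imports needed by this snippet.
import (
	"unsafe"

	"github.com/pkg/errors"
	"golang.org/x/sys/unix"
)

// Both request types stand in for the kernel's struct ifreq, which is
// 40 bytes on 64-bit linux, so each is padded out to that size.
type socketAddrRequest struct {
	name [unix.IFNAMSIZ]byte
	addr unix.RawSockaddrInet4
	pad  [8]byte
}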

type socketFlagsRequest struct {
	name  [unix.IFNAMSIZ]byte
	flags uint16
	pad   [22]byte
}

func configureEthernet() error {
	fd, err := unix.Socket(unix.AF_INET, unix.SOCK_DGRAM, 0)
	if err != nil {
		return errors.Wrap(err, "could not open control socket")
	}

	defer unix.Close(fd)

	// We want to associate an IP address with eth0, then set flags to
	// activate it

	sa := socketAddrRequest{}
	copy(sa.name[:], "eth0")
	sa.addr.Family = unix.AF_INET
	copy(sa.addr.Addr[:], []byte{10, 0, 2, 15})

	// Set address
	if err := ioctl(fd, unix.SIOCSIFADDR, uintptr(unsafe.Pointer(&sa))); err != nil {
		return errors.Wrap(err, "failed setting address for eth0")
	}

	// Set netmask
	copy(sa.addr.Addr[:], []byte{255, 255, 255, 0})
	if err := ioctl(fd, unix.SIOCSIFNETMASK, uintptr(unsafe.Pointer(&sa))); err != nil {
		return errors.Wrap(err, "failed setting netmask for eth0")
	}

	// Get flags
	sf := socketFlagsRequest{}
	sf.name = sa.name
	if err := ioctl(fd, unix.SIOCGIFFLAGS, uintptr(unsafe.Pointer(&sf))); err != nil {
		return errors.Wrap(err, "failed getting flags for eth0")
	}

	sf.flags |= unix.IFF_UP | unix.IFF_RUNNING
	if err := ioctl(fd, unix.SIOCSIFFLAGS, uintptr(unsafe.Pointer(&sf))); err != nil {
		return errors.Wrap(err, "failed setting flags for eth0")
	}

	return nil
}

func ioctl(fd int, code, data uintptr) error {
	_, _, errno := unix.Syscall(unix.SYS_IOCTL, uintptr(fd), code, data)
	if errno != 0 {
		return errno
	}
	return nil
}

Not so easy this time

Unfortunately it’s not that easy. The eth0 device I’ve tried to configure does not exist. /sbin/init must normally do something to make the device appear.
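One way to see this from inside the init binary is to print whatever interfaces the kernel does know about. This is a sketch using the standard library's net package, not something from my original code; describeInterfaces is a name I've made up for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// describeInterfaces returns one line per network interface the kernel
// currently exposes, with its flags and hardware address.
func describeInterfaces() ([]string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(ifaces))
	for _, ifc := range ifaces {
		lines = append(lines, fmt.Sprintf("%s flags=%s mac=%s", ifc.Name, ifc.Flags, ifc.HardwareAddr))
	}
	return lines, nil
}

func main() {
	lines, err := describeInterfaces()
	if err != nil {
		fmt.Println("could not list interfaces:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

Printed to the console at boot, a list like this would show at a glance whether eth0 is missing entirely or just not configured.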

Finding eth0

I can now be heard muttering vagrant destroy -f; vagrant up; vagrant ssh as I stomp around trying to think how to make eth0 appear. It must be something /sbin/init does when the machine boots.

So what does /sbin/init do when the machine boots? Well, one thing it does is run “init scripts”. These are arcane scripts that have been handed down by the ancient ones to make machines start. The scripts usually live in /etc but the exact details vary between unixes. Using ancient wisdom, I go looking in /etc for files and directories related to “init”, to “rc” and to “run levels”.

/etc/runlevels has boot and sysinit subdirectories

And it turns out /etc/runlevels exists and has subdirectories sysinit and boot, each with a bunch of scripts that get run to start the system. I try deleting scripts and rebooting to see what’s crucial for setting up ethernet. Cutting a long story short, the interesting file is /etc/runlevels/sysinit/hwdrivers. This is quite a short script that boils down to the following.

find /sys -name modalias -type f -print0 | xargs -0 sort -u \
| xargs modprobe -b -a

This finds every modalias file under /sys, de-duplicates their contents, and passes the resulting module aliases to modprobe. man modprobe tells us

modprobe — program to add and remove modules from the Linux Kernel

So perhaps we need to load a driver for eth0? If we poke around in /sys for things related to eth0 we find /sys/class/net/eth0/device. And from there we can discover that the driver is called e1000.

The eth0 driver is e1000

So how do we load the driver? I don’t want bash, find or modprobe in my final system, so I need to load the driver directly from my Go code.

Looking for clues, I found some source code for modprobe here. This shows modprobe reading the bytes of a driver binary, then calling init_module, which turns out to be another syscall. The man page says there’s a newer version called finit_module. So, obviously, I go with the f’ing one.

The modprobe code also contains another hint. It looks for modules under /lib/modules. A quick find /lib/modules -print | grep e1000 shows us the driver we want is /lib/modules/4.9.73-0-virthardened/kernel/drivers/net/ethernet/intel/e1000/e1000.ko. All I need to do is open this file and pass the file descriptor to the finit_module syscall.

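// fakeString is zero-filled, so it reads as an empty C string: we pass
// no module parameters to finit_module.
var fakeString [3]byte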

func addDriverModule() error {
	// We need a file descriptor for our file
	driverPath := "/lib/modules/4.9.73-0-virthardened/kernel/drivers/net/ethernet/intel/e1000/e1000.ko"
	f, err := os.Open(driverPath)
	if err != nil {
		return errors.Wrap(err, "open of driver file failed")
	}
	defer f.Close()
	fd := f.Fd()

	_, _, errno := unix.Syscall(unix.SYS_FINIT_MODULE, fd, uintptr(unsafe.Pointer(&fakeString)), 0)
	if errno != 0 && errno != unix.EEXIST {
		return errors.Wrap(errno, "init module failed")
	}

	return nil
}
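The second argument to finit_module is a NUL-terminated C string of module parameters, which is why a zero-filled byte array works as "no parameters". If I ever wanted to pass real parameters, something like the sketch below could build that string. moduleParams is a hypothetical helper, and "debug=1" in the comment is a made-up value, not an e1000 option I've checked.

```go
package main

import (
	"fmt"
	"syscall"
)

// moduleParams converts a parameter string such as "debug=1" (a
// made-up example) into a pointer to a NUL-terminated byte sequence.
// It fails if the string itself contains a NUL byte.
func moduleParams(params string) (*byte, error) {
	return syscall.BytePtrFromString(params)
}

func main() {
	p, err := moduleParams("")
	if err != nil {
		fmt.Println("bad parameter string:", err)
		return
	}
	// For an empty string the very first byte is the terminator.
	fmt.Println("first byte is NUL:", *p == 0)
}
```

The returned pointer could then be handed to the syscall as uintptr(unsafe.Pointer(p)) in place of the fakeString pointer.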

Putting it all together

Ever optimistic, I add some code to start an HTTP server after the code to load the ethernet driver and configure the interface. This is what the code looks like now:

func run() error {
	fmt.Printf("Hello World!\n")

	// Before we can configure ethernet we need to load hardware drivers
	if err := addDriverModule(); err != nil {
		return errors.Wrap(err, "failed to add driver")
	}

	if err := configureEthernet(); err != nil {
		return errors.Wrap(err, "failed to configure ethernet")
	}

	fmt.Printf("Ethernet configured\n")

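	// ListenAndServe only returns on failure, so pass its error back
	// to the caller rather than discarding it.
	err := http.ListenAndServe(":80", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello from Scratch Machine!\n")
	}))

	return errors.Wrap(err, "HTTP server stopped")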
}

I rebuild, copy the binary over /sbin/init and reboot. And wait a minute. And then…

$ curl localhost:8080
Hello from Scratch Machine!

It works!

Attempt 3: Cutting it down

So I can build a Go web server and install it as /sbin/init in a linux VM. The web server is the only user-space process running on the VM, and I can convince myself that it’s really the only user-space code that counts. But I really wanted a VM with only my code & the kernel on it and nothing else. How can I achieve that?

This turns out to be really quite hard. Not many people do this kind of thing, so there aren’t many clues out in the world. And all the clues that are there are arcane and somewhat contradictory.

In the end (several weeks later!) I find a working formula.

I build a CD/DVD image (.iso) using the xorriso package.
I configure this to boot linux using what’s called an “initial RAM file system” and the isolinux boot loader.
Because I don’t need much else I stop there. Normally the “initial RAM file system” is just enough code to work out where the real root file system is, mount it and boot from it. In my case the “initial RAM file system” contains just my binaries and the ethernet driver, and I have no “real root” with additional stuff.
I boot a VirtualBox VM from this iso with no other hard drive configured.

To reiterate, the initial RAM file system contains the following.

e1000.ko (the ethernet driver).
My Go program, renamed to ‘init’ (the /sbin prefix is unnecessary for an initramfs).

My image just contains the following.

isolinux.bin & ldlinux.c32 (the ISOLINUX bootloader)
an isolinux.cfg configuration file.
vmlinuz-virthardened (the linux kernel copied from alpine).
initramfs.gz, which is the gzipped cpio archive of the initial RAM file system.

Building the initramfs

The “initial RAM file system” is just a gzipped cpio archive with the files I need. I can build it as follows. All these commands are run inside the alpine virtual machine.

# build our initial RAM file system
mkdir -p ramfs
cp /vagrant/scratchmachine ramfs/init
cp /lib/modules/4.9.73-0-virthardened/kernel/drivers/net/ethernet/intel/e1000/e1000.ko ramfs/e1000.ko
# Make our own initramfs, with just our binary and the ethernet driver
pushd ramfs
cat <<EOF | cpio -o -H newc | gzip > initramfs.gz
init
e1000.ko
EOF
popd

Building the ISO

To build the ISO I again just need to build a directory containing the files I need in my alpine VM and run a command.

mkdir cdroot
mkdir cdroot/dev
mkdir cdroot/kernel
# Copy the kernel from alpine
cp /boot/vmlinuz-virthardened cdroot/kernel
# Copy the initramfs.gz file just created
cp ramfs/initramfs.gz cdroot
# Copy in the ISOLINUX bootloader
mkdir -p cdroot/isolinux
cp /usr/share/syslinux/isolinux.bin cdroot/isolinux
cp /usr/share/syslinux/ldlinux.c32 cdroot/isolinux
# Create the ISOLINUX config file
cat <<EOF > cdroot/isolinux/isolinux.cfg
DEFAULT linux
SERIAL 0 115200
SAY Now booting the kernel from ISOLINUX...
LABEL linux
KERNEL /kernel/vmlinuz-virthardened
INITRD /initramfs.gz
APPEND root=/dev/ram0 ro console=tty0 console=ttyS0,115200
EOF

Finally we can build the iso.

mkisofs -o /vagrant/output.iso \
  -cache-inodes -J -l \
  -b isolinux/isolinux.bin -c isolinux/boot.cat \
  -no-emul-boot -boot-load-size 4 -boot-info-table \
  cdroot/

Results

My Go program is 6,749,734 bytes. My ISO boot image is 7,114,752 bytes, which compares well with the ~38 MB for the alpine VM iso. It takes about 24s to boot under VirtualBox on my laptop (which I think is far too slow!). I suspect it is not vulnerable to many known linux security issues as it contains zero standard user-space components.

On the down side it isn’t very configurable (hardwired IP address!) or debuggable.

Personally I think this might not be a crazy avenue to pursue. It wouldn’t be too difficult to add a few things like a DHCP client or a log forwarder as either libraries or additional executables. Then you might have a useful system that’s trivial to audit for known security vulnerabilities.

If you want to take a closer look, the code is on github.com

Other things to consider

None of this is a terribly new idea. If you’re interested in this area, you might want to take a look at some of the following.