Not the pointer you were expecting. Picture by Harold Meerveld https://www.flickr.com/photos/haroldmeerveld/17810348563

What is a Go function variable?

I was surprised - perhaps you will be too!

Me: What’s the audience for this post?

Also Me: People who write code in Go and care what a function variable actually is.

Me: …

Also Me: I mean right down to the bits and bytes

Me: …

Also Me: …

What is it that got me interested in writing this blog post, given I think it’s likely to have an incredibly small audience? Well, I wrote some code recently that uses a list of function variables, and I wanted to test that every function in the list had a unit test, and that every unit tested function was included in the list. So I needed to be able to compare function variables.

But Go said “no”.

Well, actually it said ./prog.go:10:5: invalid operation: f1 == f2 (func can only be compared to nil), but that’s just a long-winded way of saying no.

How could I make it say “yes”? And what is it about function variables that means you might not want to compare them?

So what is a function variable?

Our first clue is that it can be compared to nil. To my mind that makes it a pointer or something like a slice or interface variable which contains a pointer.

On 64 bit architectures, pointers are 8 bytes. How big is a function variable?

	var f func()
	fmt.Println(unsafe.Sizeof(f))

It turns out it is 8 bytes, the same size as a pointer. So it is very likely that a function variable (at least in Go as it is presently) is a pointer.

If we add some parameters and a return value it’s still 8 bytes.

	var f func(a int, b string) error
	fmt.Println(unsafe.Sizeof(f))
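
To put that 8 bytes in context, here is a minimal sketch that prints the size of a function variable next to a plain pointer, a slice and an interface. The helper name funcIsPointerSized is my own invention for illustration, and the sizes in the comments assume a 64 bit architecture and the standard Go toolchain.

```go
package main

import (
	"fmt"
	"unsafe"
)

// funcIsPointerSized is a hypothetical helper: it reports whether a function
// variable occupies exactly as much memory as an ordinary pointer.
func funcIsPointerSized() bool {
	var p *int
	var f func(a int, b string) error
	return unsafe.Sizeof(f) == unsafe.Sizeof(p)
}

func main() {
	var p *int
	var f func(a int, b string) error
	var s []int
	var i interface{}

	// On a 64 bit architecture with the standard toolchain these are
	// one, one, three and two 8 byte words respectively.
	fmt.Println("pointer:  ", unsafe.Sizeof(p))
	fmt.Println("func:     ", unsafe.Sizeof(f))
	fmt.Println("slice:    ", unsafe.Sizeof(s))
	fmt.Println("interface:", unsafe.Sizeof(i))

	fmt.Println("func is pointer sized:", funcIsPointerSized())
}
```

A slice or interface is bigger than a single pointer because it carries extra words alongside its pointer, so a function variable looks like a bare pointer rather than one of those.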

The obvious guess at this point is that a function variable is a pointer to the code in memory that implements the function. Can we prove that? Well, the runtime package allows us to extract information about the code that calls a piece of code. We can make a function that prints its own location in memory.

func a() {
	pc, _, _, _ := runtime.Caller(0)
	fun := runtime.FuncForPC(pc)
	fmt.Printf("a: entry 0x%x\n", fun.Entry())
}

runtime.Caller(0) returns information about the caller of runtime.Caller. The pc return value is the “program counter” - the location in memory of the code that called runtime.Caller. runtime.FuncForPC returns information about the function that includes a given program counter value. That information includes its entry point: the location of the start of the function.

We can also print the value of a function variable. So we can do the following.

package main

import (
	"fmt"
	"runtime"
)

func main() {
	a()
	f := a
	fmt.Printf("function variable: %p\n", f)
}

func a() {
	pc, _, _, _ := runtime.Caller(0)
	fun := runtime.FuncForPC(pc)
	fmt.Printf("a: entry 0x%x\n", fun.Entry())
}

The output is as follows

a: entry 0x482a00
function variable: 0x482a00

We have our answer

A function variable is a pointer to the code, and we can trick Go into allowing us to compare them.

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	fmt.Println(sameFunction(a, b))
	fmt.Println(sameFunction(a, a))
	fmt.Println(sameFunction(b, b))
}

func sameFunction(f1, f2 func()) bool {
	return *(*uintptr)((unsafe.Pointer)(&f1)) == *(*uintptr)((unsafe.Pointer)(&f2))
}

func a() {}

func b() {}

Errrr.. wait a minute

If it’s so easy why doesn’t Go let you make this comparison without all this palaver?

Let’s take a look at what happens when we pass a method as the function.

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	one := integer(1)
	two := integer(2)
	oneAgain := integer(1)

	fmt.Printf("%p %p %p\n", one.v, two.v, oneAgain.v)

	fmt.Println("one matches two:", sameFunction(one.v, two.v))
	fmt.Println("one matches a different one:", sameFunction(one.v, oneAgain.v))
	fmt.Println("one matches itself:", sameFunction(one.v, one.v))
}

func sameFunction(f1, f2 func() int) bool {
	fmt.Printf("sameFunction: %p %p\n", f1, f2)
	return *(*uintptr)((unsafe.Pointer)(&f1)) == *(*uintptr)((unsafe.Pointer)(&f2))
}

type integer int

func (i integer) v() int { return int(i) }

Here’s the result. Everything about it is wrong. The fmt.Printf("%p") values all look the same, but the comparisons all fail.

0x482dc0 0x482dc0 0x482dc0
sameFunction: 0x482dc0 0x482dc0
one matches two: false
sameFunction: 0x482dc0 0x482dc0
one matches a different one: false
sameFunction: 0x482dc0 0x482dc0
one matches itself: false

Intuitively one.v should not be the same function as two.v as they always give different results. But they’re implemented with the same code, so in another interpretation they should be the same function.

We could expect one.v to equal oneAgain.v as they always give the same value. But here they’re different instances, so it would also be reasonable to say they are different.

We certainly would hope that one.v would equal one.v. Surely? Apparently not.

Well, that’s weird

Let’s make our function print its own location again. And let’s print the values we’re actually comparing as well as the “%p” values now we know they are different. And we’ll call our v method on each of our objects and get it to print where it thinks it is.

package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

func main() {
	one := integer(1)
	two := integer(2)
	oneAgain := integer(1)

	fmt.Printf("%p %p %p\n", one.v, two.v, oneAgain.v)

	fmt.Println("one matches two:", sameFunction(one.v, two.v))
	fmt.Println("one matches a different one:", sameFunction(one.v, oneAgain.v))
	fmt.Println("one matches itself:", sameFunction(one.v, one.v))

	one.v()
	two.v()
	oneAgain.v()
}

func sameFunction(f1, f2 func() int) bool {
	f1Val := *(*uintptr)((unsafe.Pointer)(&f1))
	f2Val := *(*uintptr)((unsafe.Pointer)(&f2))
	fmt.Printf("sameFunction: %p(0x%x) %p(0x%x)\n", f1, f1Val, f2, f2Val)
	return f1Val == f2Val
}

type integer int

func (i integer) v() int {
	pc, _, _, _ := runtime.Caller(0)
	fun := runtime.FuncForPC(pc)
	fmt.Printf("v,%d: entry 0x%x\n", i, fun.Entry())
	return int(i)
}

There are a few things to notice in the output

  1. The values we’re comparing are nothing like the values displayed by "%p"
  2. The functions called are at slightly different locations to the "%p" of the function values we’re calling.

0x483200 0x483200 0x483200
sameFunction: 0x483200(0xc0000b2050) 0x483200(0xc0000b2060)
one matches two: false
sameFunction: 0x483200(0xc0000b2080) 0x483200(0xc0000b2090)
one matches a different one: false
sameFunction: 0x483200(0xc0000b20b0) 0x483200(0xc0000b20c0)
one matches itself: false
v,1: entry 0x483120
v,2: entry 0x483120
v,1: entry 0x483120

OK, OK, OK, let’s just back up for a minute

We believe our function variables are pointers, but they don’t seem to be pointing to the code to execute. Let’s try looking at what happens when you call a function variable.

func call(f func() int) {
	f()
}

We can disassemble this at the wonderful godbolt.org. Below is what we find (I’ve assumed amd64 architecture below - I’m sure other architectures will be largely similar).

        MOVQ    (AX), CX
        MOVQ    AX, DX
        PCDATA  $1, $1
        CALL    CX

Unfortunately this is a little hard to interpret. MOVQ means “move quad-word”. In this context a word is 2 bytes, so “quad-word” means this is about moving 4 * 2 = 8 byte values. AX, CX and DX are CPU registers. PCDATA appears to be annotation for the Go toolchain which we can ignore. CALL is where the function call is actioned.

(AX) means take the value of the AX register, go to that location in memory and read the value from there. If AX contains the value of the function variable, then MOVQ (AX), CX treats this as a pointer and puts the 8 byte value it points to in CX.

CALL CX calls that value.

So the function variable doesn’t point to the function code. It points to a pointer to the function code.
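
We can check this from Go itself. The following is a minimal sketch, leaning on unsafe and on the layout described above rather than on anything the language promises. The helper names variableValue, codePointer and reflectPointer are mine. The reflect package documents that Value.Pointer returns an underlying code pointer for a func, so it makes a handy cross-check.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func a() int { return 1 }

// variableValue returns the raw word stored in the function variable itself.
func variableValue(f func() int) uintptr {
	return *(*uintptr)(unsafe.Pointer(&f))
}

// codePointer follows the function variable twice: once to reach whatever
// it points at, and once more to read the first word stored there.
// This assumes the present layout and must not be called with a nil function.
func codePointer(f func() int) uintptr {
	return **(**uintptr)(unsafe.Pointer(&f))
}

// reflectPointer asks the reflect package for the code pointer instead.
func reflectPointer(f func() int) uintptr {
	return reflect.ValueOf(f).Pointer()
}

func main() {
	f := a
	fmt.Printf("variable value: 0x%x\n", variableValue(f))
	fmt.Printf("code pointer:   0x%x\n", codePointer(f))
	fmt.Printf("reflect:        0x%x\n", reflectPointer(f))
	fmt.Printf("%%p:             %p\n", f)
}
```

The last three lines should agree with each other, and the first should be something else entirely. That also explains the earlier confusion: "%p" shows the code pointer, not the value held in the variable.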

MOVQ AX, DX puts the value of AX into DX, which means the function variable value will be available to the function called by CALL CX (it can read the DX register). This is quite clever. Our function variable could be pointing to something larger than just a code pointer. It could be followed by, say, receiver and parameter values for method calls and closures. By passing the memory pointed to by the function variable to the called function, that called function could be a wrapper around the real function that knows how to apply the receiver and/or parameter values.

This does indeed appear to be how it works. If we disassemble code that puts a method in a function variable we can see this being set up. (Again, unfortunately the output is hideous to interpret - you’ll have to trust that I’ve pulled out the interesting things below!)

	one := integer(1)

	var f func() int = one.v
        LEAQ    type.noalg.struct { F uintptr; R "".integer }(SB), AX
        CALL    runtime.newobject(SB)
        MOVQ    AX, ""..autotmp_3+24(SP)
        LEAQ    "".integer.v-fm(SB), CX
        MOVQ    CX, (AX)
        MOVQ    $1, 8(AX)

The first line above gives us a huge clue what’s going on. The code is essentially as follows.

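// Pseudocode only: this is not valid Go, and integer.v-fm is the
// compiler's name for the wrapper, not something we can write ourselves.
var f func() int = &struct {
    F uintptr
    R integer
}{
    F: integer.v-fm,
    R: one,
}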

Note the function used isn’t integer.v, it is integer.v-fm. We can use Godbolt again to see that this is an autogenerated wrapper around integer.v which loads the integer receiver (one) from this struct and sets up a proper call to one.v(). I’ve pulled out the key part below. It pulls the receiver value from 8(DX) (an offset of 8 bytes from the value of the DX register, which we saw being set up above), puts it in AX and calls integer.v. (Go now has a register calling convention, and the receiver on a method call is placed in AX.)

        MOVQ    8(DX), AX
        PCDATA  $1, $0
        CALL    "".integer.v(SB)
        MOVQ    8(SP), BP
        ADDQ    $16, SP
        RET
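
If you’d rather not read assembly, the wrapper can also be spotted from inside a running program. This is a rough sketch: codeName is a name I’ve made up, and I’ve marked v with //go:noinline so that the wrapper has to make a real call, as it does in the disassembly above. It asks the runtime which function owns the code pointer of a method value.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

type integer int

// v is kept out of line so the autogenerated wrapper must really call it.
//
//go:noinline
func (i integer) v() int { return int(i) }

// codeName is a hypothetical helper: it reports the name the runtime has
// recorded for the code that a function value points at.
func codeName(f func() int) string {
	return runtime.FuncForPC(reflect.ValueOf(f).Pointer()).Name()
}

func main() {
	one := integer(1)
	// The name printed is the wrapper's, not that of integer.v itself.
	fmt.Println(codeName(one.v))
}
```

The name printed should end in -fm, matching the symbol in the LEAQ instruction earlier.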

And finally

Finally we have our conclusions. In Go as it is presently…

  1. Function variables are pointers
  2. They point to small structs
  3. Those structs contain either
    1. just the pointer to the function code for simple functions
    2. or a pointer to the code (an autogenerated wrapper function in the case of method values) followed by the receiver and/or captured values, in the case of method values and closures.

And why can’t you compare them? I expect it was just too hard to agree to a convention about which pointers would be considered equal when there are closures and method calls and function wrappers around.
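
For my original problem of checking a list of plain functions against a list of tests, one possible approach is to compare code pointers and accept the limitations. This is a minimal sketch under that assumption, with sameCode being a name of my own choosing. The reflect documentation warns that the code pointer is not necessarily enough to identify a single function uniquely, and the method values below show why.

```go
package main

import (
	"fmt"
	"reflect"
)

type integer int

func (i integer) v() int { return int(i) }

func a() int { return 1 }

func b() int { return 2 }

// sameCode reports whether two function values share a code pointer.
// It says nothing about receivers or captured variables, so two method
// values on different receivers still count as the same code.
func sameCode(f1, f2 func() int) bool {
	return reflect.ValueOf(f1).Pointer() == reflect.ValueOf(f2).Pointer()
}

func main() {
	one, two := integer(1), integer(2)

	fmt.Println("a vs a:", sameCode(a, a))
	fmt.Println("a vs b:", sameCode(a, b))
	fmt.Println("one.v vs one.v:", sameCode(one.v, one.v))
	fmt.Println("one.v vs two.v:", sameCode(one.v, two.v))
}
```

Here one.v and two.v compare as the same, even though calling them gives different answers, because both point at the same wrapper code. That is fine for a list of top-level functions, and exactly the kind of ambiguity that makes a general-purpose == hard to define.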