Louis Armstrong

It ain't necessarily slow

Reflection isn't always slow

Don’t use reflection. Unless you really have to. But when you’re not using reflection, don’t let it be because you think reflection is fundamentally slow. It doesn’t have to be slow.

Reflection allows you to obtain information about Go types at runtime. We’ll look at how you can use it to populate structs if you were ever foolish enough to try to write a new version of something like json.Unmarshal.

We’ll deal with a simple case. We’ll have a struct with two integer fields called A and B.

type SimpleStruct struct {
    A int
    B int
}

Imagine we’ve received some JSON, {"B":42}, and parsed it, and now we know we want to set field B to 42. We’re going to write some functions that do just that: each will simply set B to 42.

If our code only works on SimpleStruct this is totally trivial.

func populateStruct(in *SimpleStruct) {
    in.B = 42
}

But if we’re writing a JSON parser we don’t know the struct type in advance. Our parser code needs to accept any type. In Go, that usually means taking an interface{} parameter.

We can then use the reflect package to inspect the value passed in via that interface{} parameter, check that it is a pointer to a struct, find the field B and populate it with our value. Our code will look something like the following.

func populateStructReflect(in interface{}) error {
	val := reflect.ValueOf(in)
	if val.Type().Kind() != reflect.Ptr {
		return fmt.Errorf("you must pass in a pointer")
	}
	elmv := val.Elem()
	if elmv.Type().Kind() != reflect.Struct {
		return fmt.Errorf("you must pass in a pointer to a struct")
	}

	fval := elmv.FieldByName("B")
	fval.SetInt(42)

	return nil
}

Let’s see how fast that is. A quick benchmark.

func BenchmarkPopulateReflect(b *testing.B) {
	b.ReportAllocs()
	var m SimpleStruct
	for i := 0; i < b.N; i++ {
		if err := populateStructReflect(&m); err != nil {
			b.Fatal(err)
		}
		if m.B != 42 {
			b.Fatalf("unexpected value %d for B", m.B)
		}
	}
}

We get the following results.

BenchmarkPopulateReflect-16   15941916	   68.3 ns/op	 8 B/op	    1 allocs/op

Is that good or bad? Well, allocations are never good. And you might wonder why you need to allocate heap memory just to set a struct field to 42 (this issue is at the heart of it). But in the grand scheme of things, 68ns isn’t a lot of time: you can fit a lot of 68ns operations into the time it takes to make any kind of request over a network.

Can we do better? Well, normally the programs we run don’t just do one thing then stop. They do very similar things over and over again. Could we set something up once to make things faster for the repeats?

If we look carefully at the checks we’re doing we notice they all depend on the type of the value that’s been passed in. We can do these checks only once when we first see a type and cache the result.

We also need to track down that allocation. It turns out Value.FieldByName calls Type.FieldByName, which calls structType.FieldByName, which calls structType.Field, which allocates. Can we call FieldByName on the type just once and cache something that gets us the value’s B field? It turns out that if we cache Field.Index, we can use it to get the field value without an allocation.

Here’s our new version.

var cache = make(map[reflect.Type][]int)

func populateStructReflectCache(in interface{}) error {
	typ := reflect.TypeOf(in)

	index, ok := cache[typ]
	if !ok {
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		index = f.Index
		cache[typ] = index
	}

	val := reflect.ValueOf(in)
	elmv := val.Elem()

	fval := elmv.FieldByIndex(index)
	fval.SetInt(42)

	return nil
}

The new benchmark is faster and we don’t have any allocations.

BenchmarkPopulateReflectCache-16  35881779	   30.9 ns/op   0 B/op   0 allocs/op

Can we do any better? Well, if we know the offset of field B within the struct, and we know it’s an int, we can write to the memory directly. We can recover the pointer to the struct from the interface, because an interface is actually syntactic sugar for a struct with two pointers: the first points to information about the type, and the second points to the value. Getting this second pointer gives us the start of the struct, and adding the offset of field B lets us address the field directly.

Here’s our new code.

var unsafeCache = make(map[reflect.Type]uintptr)

type intface struct {
	typ   unsafe.Pointer
	value unsafe.Pointer
}

func populateStructUnsafe(in interface{}) error {
	typ := reflect.TypeOf(in)

	offset, ok := unsafeCache[typ]
	if !ok {
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		if f.Type.Kind() != reflect.Int {
			return fmt.Errorf("field B should be an int")
		}
		offset = f.Offset
		unsafeCache[typ] = offset
	}

	structPtr := (*intface)(unsafe.Pointer(&in)).value
	*(*int)(unsafe.Pointer(uintptr(structPtr) + offset)) = 42

	return nil
}

The new benchmark shows this is quite a bit quicker.

BenchmarkPopulateUnsafe-16 	62726018    19.5 ns/op     0 B/op     0 allocs/op

Can we go even quicker? If we run a CPU profile, we see that most of the time is spent accessing the map. The profile also shows the map access calling runtime.interhash and runtime.interequal, the functions for hashing interfaces and comparing them for equality. Perhaps a simpler key would speed things up? We could use the address of the type information from the interface rather than the reflect.Type itself.
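A sketch of how to capture such a profile, assuming the benchmarks live in the current package:

```shell
# Run just this benchmark and write a CPU profile.
go test -bench='PopulateUnsafe$' -cpuprofile cpu.prof

# Show the hottest functions; runtime.interhash and runtime.interequal
# show up under the map access.
go tool pprof -top cpu.prof
```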

var unsafeCache2 = make(map[uintptr]uintptr)

func populateStructUnsafe2(in interface{}) error {
	inf := (*intface)(unsafe.Pointer(&in))

	offset, ok := unsafeCache2[uintptr(inf.typ)]
	if !ok {
		typ := reflect.TypeOf(in)
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		if f.Type.Kind() != reflect.Int {
			return fmt.Errorf("field B should be an int")
		}
		offset = f.Offset
		unsafeCache2[uintptr(inf.typ)] = offset
	}

	*(*int)(unsafe.Pointer(uintptr(inf.value) + offset)) = 42

	return nil
}

Here’s the benchmark result for our new version. It’s quite a bit faster.

BenchmarkPopulateUnsafe2-16  230836136    5.16 ns/op    0 B/op     0 allocs/op

Can we go even faster still? Well, we could change our function’s interface. If you’re unmarshaling into a struct, it’s often the same struct every time. So we could split our function in two: one function checks that the struct is suitable for our purpose and returns a descriptor; subsequent populate calls then pass in that descriptor.

Here’s our new version. Our caller should call describeType on initialisation to obtain a typeDescriptor for later calls to populateStructUnsafe3. In this very simple case our typeDescriptor is just the offset of the B field in the struct.

type typeDescriptor uintptr

func describeType(in interface{}) (typeDescriptor, error) {
	typ := reflect.TypeOf(in)
	if typ.Kind() != reflect.Ptr {
		return 0, fmt.Errorf("you must pass in a pointer")
	}
	if typ.Elem().Kind() != reflect.Struct {
		return 0, fmt.Errorf("you must pass in a pointer to a struct")
	}
	f, ok := typ.Elem().FieldByName("B")
	if !ok {
		return 0, fmt.Errorf("struct does not have field B")
	}
	if f.Type.Kind() != reflect.Int {
		return 0, fmt.Errorf("field B should be an int")
	}
	return typeDescriptor(f.Offset), nil
}

func populateStructUnsafe3(in interface{}, ti typeDescriptor) error {
	structPtr := (*intface)(unsafe.Pointer(&in)).value
	*(*int)(unsafe.Pointer(uintptr(structPtr) + uintptr(ti))) = 42
	return nil
}

Here’s the new benchmark showing how the describeType call is used.

func BenchmarkPopulateUnsafe3(b *testing.B) {
	b.ReportAllocs()
	var m SimpleStruct

	descriptor, err := describeType((*SimpleStruct)(nil))
	if err != nil {
		b.Fatal(err)
	}

	for i := 0; i < b.N; i++ {
		if err := populateStructUnsafe3(&m, descriptor); err != nil {
			b.Fatal(err)
		}
		if m.B != 42 {
			b.Fatalf("unexpected value %d for B", m.B)
		}
	}
}

Here are the benchmark results. It’s getting quite quick now.

BenchmarkPopulateUnsafe3-16  1000000000     0.359 ns/op    0 B/op   0 allocs/op

Just how good is this? To find out, we can benchmark the original populateStruct function from the start of this article, which populates the struct without any reflection. Unsurprisingly, it is a little faster than even our best reflection-based version, but there’s not much in it.

BenchmarkPopulate-16       	1000000000      0.234 ns/op    0 B/op   0 allocs/op

So, reflection isn’t necessarily slow at all. But you have to go to quite some effort and liberally sprinkle your code with unsafe and knowledge of Go internals to make it really quick.

If you’re interested in real-world uses of this approach, jsoniter uses reflect2 to implement a very similar approach, and I’ve used that as inspiration for plenc, which is a protobuf-like codec that uses Go structs to describe messages instead of proto files.