It ain't necessarily slow
Reflection isn't always slow
Don’t use reflection. Unless you really have to. But if you’re avoiding reflection because you think it’s fundamentally slow, think again: it doesn’t have to be.
Reflection allows you to obtain information about Go types at runtime. We’ll look at how you can use it to populate structs if you were ever foolish enough to try to write a new version of something like json.Unmarshal.
We’ll deal with a simple case. We’ll have a struct with two integer fields called A and B.
type SimpleStruct struct {
	A int
	B int
}
Imagine we’ve received some JSON, say {"B":42}, we’ve parsed it, and we know we want to set field B to 42. We’re going to write some functions that do just that: they will all simply set B to 42.
If our code only has to work on SimpleStruct, this is totally trivial.
func populateStruct(in *SimpleStruct) {
	in.B = 42
}
But if we’re writing a JSON parser we don’t know the struct type in advance. Our parser code needs to accept any type, and in Go that usually means taking an interface{} parameter.
We can then use the reflect package to inspect the value passed in via that interface{} parameter, check that it is a pointer to a struct, find the field B and populate it with our value. Our code will look something like the following.
func populateStructReflect(in interface{}) error {
	val := reflect.ValueOf(in)
	if val.Type().Kind() != reflect.Ptr {
		return fmt.Errorf("you must pass in a pointer")
	}
	elmv := val.Elem()
	if elmv.Type().Kind() != reflect.Struct {
		return fmt.Errorf("you must pass in a pointer to a struct")
	}
	fval := elmv.FieldByName("B")
	fval.SetInt(42)
	return nil
}
Let’s see how fast that is. A quick benchmark.
func BenchmarkPopulateReflect(b *testing.B) {
	b.ReportAllocs()
	var m SimpleStruct
	for i := 0; i < b.N; i++ {
		if err := populateStructReflect(&m); err != nil {
			b.Fatal(err)
		}
		if m.B != 42 {
			b.Fatalf("unexpected value %d for B", m.B)
		}
	}
}
We get the following results:
BenchmarkPopulateReflect-16 15941916 68.3 ns/op 8 B/op 1 allocs/op
Is that good or bad? Well, allocations are never good, and you might wonder why you need to allocate memory on the heap just to set a struct field to 42 (this issue is at the heart of it). But in the grand scheme of things 68ns isn’t a lot of time: you can fit a lot of 68ns into the time it takes to make any kind of request over a network.
Can we do better? Well, normally the programs we run don’t just do one thing then stop. They do very similar things over and over again. Could we set something up once to make things faster for the repeats?
If we look carefully at the checks we’re doing we notice they all depend on the type of the value that’s been passed in. We can do these checks only once when we first see a type and cache the result.
We also need to track down that allocation. It turns out we call Value.FieldByName, which calls Type.FieldByName, which calls structType.FieldByName, which calls structType.Field, which allocates. Can we call FieldByName on the type just once and cache something that lets us get at the value’s B field? It turns out that if we cache Field.Index we can use it to get the field value without an allocation.
Here’s our new version.
var cache = make(map[reflect.Type][]int)

func populateStructReflectCache(in interface{}) error {
	typ := reflect.TypeOf(in)
	index, ok := cache[typ]
	if !ok {
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		index = f.Index
		cache[typ] = index
	}

	val := reflect.ValueOf(in)
	elmv := val.Elem()
	fval := elmv.FieldByIndex(index)
	fval.SetInt(42)
	return nil
}
The new benchmark is faster and we don’t have any allocations.
BenchmarkPopulateReflectCache-16 35881779 30.9 ns/op 0 B/op 0 allocs/op
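One caveat worth noting as an aside (it isn’t part of the benchmarks in this article): a plain map like cache isn’t safe for concurrent use, so a real parser that might be called from several goroutines would need to guard it. Here’s a minimal sketch of one way to do that with sync.Map; the function name is made up for illustration, and it needs the sync package imported alongside reflect and fmt.

var syncCache sync.Map // maps reflect.Type to []int

func populateStructReflectSyncCache(in interface{}) error {
	typ := reflect.TypeOf(in)
	cached, ok := syncCache.Load(typ)
	if !ok {
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		// LoadOrStore keeps whichever index won the race.
		cached, _ = syncCache.LoadOrStore(typ, f.Index)
	}
	reflect.ValueOf(in).Elem().FieldByIndex(cached.([]int)).SetInt(42)
	return nil
}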
Can we do any better? Well, if we know the offset of field B within the struct, and we know it’s an int, we can just write to the memory directly. We can recover a pointer to the struct from the interface, because we know an interface is really syntactic sugar for a struct containing two pointers: the first points to information about the type, and the second points to the value. That pointer gives us the start of the struct, and we can then use the offset of field B to address the field directly.
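As a quick sanity check on that two-word picture, here’s a small sketch. It relies on the current runtime representation rather than anything the language spec guarantees.

func interfaceIsTwoWords() {
	var in interface{} = &SimpleStruct{}
	// An interface value is two machine words: a pointer to type information
	// and a pointer to the value. On a 64-bit platform both prints give 16.
	fmt.Println(unsafe.Sizeof(in))
	fmt.Println(2 * unsafe.Sizeof(uintptr(0)))
}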
Here’s our new code.
var unsafeCache = make(map[reflect.Type]uintptr)

// intface mirrors the runtime's two-word layout of an interface value:
// a pointer to type information and a pointer to the data.
type intface struct {
	typ   unsafe.Pointer
	value unsafe.Pointer
}

func populateStructUnsafe(in interface{}) error {
	typ := reflect.TypeOf(in)
	offset, ok := unsafeCache[typ]
	if !ok {
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		if f.Type.Kind() != reflect.Int {
			return fmt.Errorf("field B should be an int")
		}
		offset = f.Offset
		unsafeCache[typ] = offset
	}

	// Pull the data pointer out of the interface, then write 42 directly
	// at the cached offset of field B.
	structPtr := (*intface)(unsafe.Pointer(&in)).value
	*(*int)(unsafe.Pointer(uintptr(structPtr) + offset)) = 42
	return nil
}
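The benchmark for this version isn’t shown above; it follows exactly the same pattern as BenchmarkPopulateReflect, something along these lines (and the same shape works for the cached-reflection and later unsafe variants too):

func BenchmarkPopulateUnsafe(b *testing.B) {
	b.ReportAllocs()
	var m SimpleStruct
	for i := 0; i < b.N; i++ {
		if err := populateStructUnsafe(&m); err != nil {
			b.Fatal(err)
		}
		if m.B != 42 {
			b.Fatalf("unexpected value %d for B", m.B)
		}
	}
}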
The new benchmark shows this is quite a bit quicker.
BenchmarkPopulateUnsafe-16 62726018 19.5 ns/op 0 B/op 0 allocs/op
Can we go even quicker? If we run a CPU profile we see that most of the time is spent accessing the map. The profile also shows the map access calling runtime.interhash and runtime.interequal, the functions for hashing interface keys and comparing them for equality. Perhaps a simpler key would speed things up? We could use the address of the type information from the interface rather than the reflect.Type itself.
var unsafeCache2 = make(map[uintptr]uintptr)

func populateStructUnsafe2(in interface{}) error {
	inf := (*intface)(unsafe.Pointer(&in))
	// Key the cache on the address of the interface's type information
	// rather than on a reflect.Type.
	offset, ok := unsafeCache2[uintptr(inf.typ)]
	if !ok {
		typ := reflect.TypeOf(in)
		if typ.Kind() != reflect.Ptr {
			return fmt.Errorf("you must pass in a pointer")
		}
		if typ.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("you must pass in a pointer to a struct")
		}
		f, ok := typ.Elem().FieldByName("B")
		if !ok {
			return fmt.Errorf("struct does not have field B")
		}
		if f.Type.Kind() != reflect.Int {
			return fmt.Errorf("field B should be an int")
		}
		offset = f.Offset
		unsafeCache2[uintptr(inf.typ)] = offset
	}

	*(*int)(unsafe.Pointer(uintptr(inf.value) + offset)) = 42
	return nil
}
Here’s the benchmark result for our new version. It’s quite a bit faster.
BenchmarkPopulateUnsafe2-16 230836136 5.16 ns/op 0 B/op 0 allocs/op
Can we go even faster still? Well, we could change our function’s API. Often, if you’re unmarshaling into a struct, it’s the same struct type over and over. We could split our function in two: one function would check that the struct is correct for our purpose and return a descriptor, and we could then pass that descriptor to future populate calls.
Here’s our new version. Our caller should call describeType at initialisation time to obtain a typeDescriptor for later calls to populateStructUnsafe3. In this very simple case our typeDescriptor is just the offset of field B within the struct.
type typeDescriptor uintptr

func describeType(in interface{}) (typeDescriptor, error) {
	typ := reflect.TypeOf(in)
	if typ.Kind() != reflect.Ptr {
		return 0, fmt.Errorf("you must pass in a pointer")
	}
	if typ.Elem().Kind() != reflect.Struct {
		return 0, fmt.Errorf("you must pass in a pointer to a struct")
	}
	f, ok := typ.Elem().FieldByName("B")
	if !ok {
		return 0, fmt.Errorf("struct does not have field B")
	}
	if f.Type.Kind() != reflect.Int {
		return 0, fmt.Errorf("field B should be an int")
	}
	return typeDescriptor(f.Offset), nil
}

func populateStructUnsafe3(in interface{}, ti typeDescriptor) error {
	structPtr := (*intface)(unsafe.Pointer(&in)).value
	*(*int)(unsafe.Pointer(uintptr(structPtr) + uintptr(ti))) = 42
	return nil
}
Here’s the new benchmark, which shows how the describeType call is used.
func BenchmarkPopulateUnsafe3(b *testing.B) {
	b.ReportAllocs()
	var m SimpleStruct
	descriptor, err := describeType((*SimpleStruct)(nil))
	if err != nil {
		b.Fatal(err)
	}
	for i := 0; i < b.N; i++ {
		if err := populateStructUnsafe3(&m, descriptor); err != nil {
			b.Fatal(err)
		}
		if m.B != 42 {
			b.Fatalf("unexpected value %d for B", m.B)
		}
	}
}
Here are the benchmark results. It’s getting quite quick now.
BenchmarkPopulateUnsafe3-16 1000000000 0.359 ns/op 0 B/op 0 allocs/op
Just how good is this? We can see how fast we can populate this struct without using reflection by writing a benchmark for our original populateStruct function from the start of this article. Unsurprisingly the direct version is a little faster than even our best reflection-based effort, but there’s not much in it.
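That benchmark isn’t shown here either, but it would mirror the earlier ones, just calling populateStruct directly, roughly:

func BenchmarkPopulate(b *testing.B) {
	b.ReportAllocs()
	var m SimpleStruct
	for i := 0; i < b.N; i++ {
		populateStruct(&m)
		if m.B != 42 {
			b.Fatalf("unexpected value %d for B", m.B)
		}
	}
}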
BenchmarkPopulate-16 1000000000 0.234 ns/op 0 B/op 0 allocs/op
So, reflection isn’t necessarily slow at all. But you have to go to quite some effort and liberally sprinkle your code with unsafe and knowledge of Go internals to make it really quick.
If you’re interested in real-world uses of this approach, jsoniter uses reflect2 to implement a very similar approach, and I’ve used that as inspiration for plenc, which is a protobuf-like codec that uses Go structs to describe messages instead of proto files.