Photo by Sandrachile . on Unsplash

What’s all that memory for?

Perhaps it’s for storing strings?

If you actually want to use the memory on your computer with Go — really use it, with gigabytes of it allocated — then you may pay a big penalty for the Go garbage collector (GC). But there are things you can do about it.

The Go GC checks what parts of the memory you have allocated are still in use. It does this by looking at all the memory for references to other pieces of memory. If you’ve allocated millions of pieces of memory, then all that ‘looking’ necessarily takes some CPU time to do. So if you actually want to use the gigabytes of memory in your computer, you might want to be a little careful about how you do things.
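If you want a rough idea of how much CPU your own program is already handing to the collector, the runtime will tell you. This is just a minimal sketch using runtime.ReadMemStats, nothing specific to the code in this article:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// NumGC is the number of completed GC cycles so far; GCCPUFraction is the
	// fraction of this program's available CPU time that the GC has used.
	fmt.Printf("GC cycles: %d, GC CPU fraction: %.4f\n", ms.NumGC, ms.GCCPUFraction)
}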

How bad is it?

Imagine you have a desperate need to remember 100 million random 20-byte strings. What kind of overhead does the GC impose if you do this in a normal way?

Here’s some code to allocate those strings. This uses about 3.5 GB of RAM.

space := make([]string, 100*1000*1000)
for i := range space {
	space[i] = fmt.Sprintf("%20.20d", rand.Int())
}

So what impact does this have on GC? Well, one easy thing we can do to measure this is call the Go runtime to force GC, and measure how long that takes.

start := time.Now()
runtime.GC()
fmt.Printf("GC takes %s\n", time.Since(start))

How long does that take?

GC takes 730.93697ms

Oh. That’s quite a long time.

Well, it’s quite quick for looking at 100 million things (about 7ns a thing). But burning 700ms of CPU time each time the GC runs is definitely edging into the realm of “not ideal”.

And if we run the GC again, it takes approximately the same time again. We’re stuck with ~700ms of GC work every time the GC runs until we’re done with these strings.
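For reference, here's the whole measurement as a self-contained program. This is a sketch of what I ran; your timings will vary with hardware and Go version:

package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"time"
)

func main() {
	// Allocate 100 million 20-byte strings.
	space := make([]string, 100*1000*1000)
	for i := range space {
		space[i] = fmt.Sprintf("%20.20d", rand.Int())
	}

	// Force a collection and see how long it takes.
	start := time.Now()
	runtime.GC()
	fmt.Printf("GC takes %s\n", time.Since(start))

	// Keep the slice alive so the GC really does have to scan it.
	runtime.KeepAlive(space)
}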

How can we fix it?

Luckily for us, the Go GC is clever enough not to look at every piece of memory allocated. If it knows a piece of memory contains no pointers, it does not scan it. Memory without pointers cannot reference other memory, so the GC doesn't need to examine it to work out which allocations are still referenced and which can be freed.

If we can arrange things so we can store the strings without any pointers, we can save this GC overhead.
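You can see the effect without any of the machinery that follows. The same 2 GB of string bytes held in a single pointer-free allocation barely registers with the GC. A minimal sketch to illustrate the point, not part of the final solution:

package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	// The same 2 GB of character data, but as one slice of fixed-size byte
	// arrays: there are no pointers anywhere, so the GC never scans it.
	data := make([][20]byte, 100*1000*1000)

	start := time.Now()
	runtime.GC()
	fmt.Printf("GC takes %s\n", time.Since(start))

	runtime.KeepAlive(data)
}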

Oh, strings contain pointers?

Yes, strings contain pointers. The reflect package's StringHeader type shows us what a string actually is.

type StringHeader struct {
	Data uintptr
	Len int
}

A string is a pointer to a piece of memory containing the bytes of the string, and a length of the string. So our slice of 100 million strings contains 100 million pointers and 100 million lengths. And 100 million separate allocations which hold the bytes for the strings.
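You can check the size of that header for yourself. On a 64-bit platform a string value is 16 bytes before you count the bytes it points at. A quick sketch:

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	s := "12345678901234567890"
	// 16 bytes on 64-bit: an 8-byte pointer plus an 8-byte length.
	// The 20 bytes of character data live in a separate allocation.
	fmt.Println(unsafe.Sizeof(s)) // prints 16
}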

Our solution

Instead of having 100 million separate allocations and 100 million pointers, we can allocate a single slice of bytes to contain all the bytes for all the strings, and make our own string-like objects that contain offsets into this slice.

We define a string bank to contain the string bytes.

type stringbank []byte

And this is our “banked” version of a string with offsets instead of pointers.

type bankedString struct {
	offset int
	len int
}

We can make a function to add a string to the string bank and return a bankedString. This copies the bytes from the string into our string bank, and saves the offset of the string and the length of the string.

func (sb *stringbank) store(in string) bankedString {
	offset := len(*sb)
	*sb = append(*sb, in...)
	return bankedString{
		offset: offset,
		len: len(in),
	}
}

This bankedString can then be used to retrieve the original string.

func (sb stringbank) get(in bankedString) string {
	return string(sb[in.offset : in.offset+in.len])
}
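Using the two together is a simple round trip: store returns a bankedString, and get turns it back into an ordinary string. Note that get copies the bytes out of the bank, so the returned string is independent of it.

var sb stringbank
id := sb.store("hello, world")
fmt.Println(sb.get(id)) // prints "hello, world"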

Storing our random strings needs just a little modification. We need to allocate a stringbank, which we make big enough to hold all our string data, and we keep a []bankedString instead of a []string.

sb := make(stringbank, 0, 20*100*1000*1000)
space := make([]bankedString, 100*1000*1000)
for i := range space {
	space[i] = sb.store(fmt.Sprintf("%20.20d", rand.Int()))
}

If we now time GC we get a marked improvement.

GC takes 108.166528ms

This is still quite a long time for GC, but if we run GC again we see a further big drop. The first run of the GC frees up temporary strings we’ve created (rather carelessly) while we build our slice of strings. Once this is done, the GC overhead is practically nil.

GC takes 348.923µs
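Those temporaries come from fmt.Sprintf building a throwaway string that we immediately copy into the bank. If we cared, we could format straight into the bank instead. Here's a sketch of a hypothetical storef variant, not part of the code above; it assumes Go 1.19 or later for fmt.Appendf:

// storef formats directly into the bank, avoiding the temporary string
// that fmt.Sprintf would otherwise allocate.
func (sb *stringbank) storef(format string, args ...any) bankedString {
	offset := len(*sb)
	*sb = fmt.Appendf(*sb, format, args...)
	return bankedString{
		offset: offset,
		len:    len(*sb) - offset,
	}
}

We'd then build the slice with space[i] = sb.storef("%20.20d", rand.Int()), and there would be far fewer temporary strings for that first GC run to clean up.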

Conclusions

I doubt it makes sense to do this kind of thing normally. It only really makes sense if you are going to keep the strings for the lifetime of your process, as there's no way to delete individual strings from the bank.

What does this say about other situations? Perhaps you don’t want to store a huge amount of data. Perhaps you’re building some kind of API service. Does this stuff apply? Well, if across all your goroutines and API handlers you use a significant amount of RAM then perhaps it does. If you can avoid using pointers here and there, perhaps some of your allocations will end up being pointer-free, and this may reduce the overall CPU usage of the GC. Which might make your program perform better, or cost less to run. Just make sure you measure things before and after any change to be sure you actually make an improvement.