Examining Evil
Evil is in the eye of the beholder
It feels trite to write about the famous quote about premature optimisation being the root of all evil (I double-checked the definition of “trite” before I wrote that sentence!). But I do have a strongly-held opinion about it. What I like to call “normal” levels of optimisation are not only not evil: they’re entirely necessary in many circumstances.
I keep hitting examples where performance was apparently not considered before something was released. Or it was considered but simplicity was preferred. One example is described in my recent rant about the Go encoding/json Marshaler & Unmarshaler interfaces. Another recent issue I came across was ~50,000 unnecessary allocations for a single execution of a TensorFlow model. The underlying cause was that the TensorFlow Go library uses encoding/binary, and the documentation for encoding/binary says “This package favors simplicity over efficiency”.
You could blame the authors of the Go TensorFlow library for this. They should have cared more about performance. But I have sympathy for them assuming that an apparent building block in the Go standard library would be the right thing to use.
I really think if you are creating building blocks then your users implicitly expect you to have considered efficiency and done at least a normal level of basic optimisation.
What do I think of as “normal” optimisation and when should it be applied? The first question I ask myself is “does the performance of this code matter at all?” The second is “do I have any idea how this code is likely to perform?” I’m increasingly thinking I want to know something about the performance of any code before it goes live. If the code is different from anything we’ve run before then perhaps it’s worth measuring. If it’s unlikely to surprise us then we can let it go.
If you’re producing a package that’s going to be used by others or is part of a platform on which a large amount of other code rests then I think it is important to think about performance.
- Small inefficiencies add up. If your software is built on a myriad of inefficient blocks then it stands a fair chance of being inefficient overall. And attempts to profile the system won’t show you a particular candidate to improve. I call this the “grey death” of efficiency.
- You don’t know if your component might be used in the future in a way where its performance & efficiency do matter. A little effort up-front may avoid performance problems in future.
- If you don’t think about performance up-front you may end up committing to an API with built-in efficiency issues that can’t easily be fixed. This is particularly true if you’re providing a library to a large number of users and provide API compatibility guarantees.
If you do think performance might matter then write a benchmark. With Go it’s pretty easy to write a simple benchmark that gives you some idea of what the performance of your code is like. And once you have a benchmark you can tell if any changes you make actually improve things.
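As a minimal sketch of what such a benchmark can look like: `joinInts` below is a hypothetical function standing in for whatever code you care about, not anything from a real library. A benchmark would normally live in a `_test.go` file and be run with `go test -bench=. -benchmem`; here `testing.Benchmark` is used only so the sketch runs as a plain program.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// sink stores the result so the call under test cannot be optimised away.
var sink string

// joinInts is a hypothetical function under test: it renders ints as a
// comma-separated string.
func joinInts(vals []int) string {
	var sb strings.Builder
	for i, v := range vals {
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(strconv.Itoa(v))
	}
	return sb.String()
}

// BenchmarkJoinInts has the shape `go test -bench` expects.
func BenchmarkJoinInts(b *testing.B) {
	vals := []int{1, 22, 333, 4444, 55555}
	b.ReportAllocs() // include allocations per operation in the result
	for i := 0; i < b.N; i++ {
		sink = joinInts(vals)
	}
}

func main() {
	// testing.Benchmark runs the benchmark outside of `go test`.
	res := testing.Benchmark(BenchmarkJoinInts)
	fmt.Println(res.String(), res.MemString())
}
```

The figures to watch are time per operation and allocations per operation. Re-running the same benchmark after a change tells you whether the change helped.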
Once you have your benchmark, review the results and the code. The big things to look for in a Go context are unnecessary allocations, but you can also think about whether there’s any work done unnecessarily or work that’s repeated. Traditional advice is to ‘pick the right algorithm’ - but when thinking about the performance of ordinary code there’s often no identifiable “algorithm” in play.
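To illustrate the kind of everyday allocation waste worth looking for, here is a small sketch. The two `squares` functions are hypothetical examples; they return the same result, but one grows its slice from nil while the other sizes it up-front. `testing.AllocsPerRun` reports the average number of allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results reachable so the slices must be heap-allocated.
var sink []int

// squaresNaive grows the result from nil, so append has to reallocate
// and copy as the slice fills up.
func squaresNaive(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// squaresSized requests the full capacity once, because the final
// length is known before the loop starts.
func squaresSized(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	naive := testing.AllocsPerRun(100, func() { sink = squaresNaive(1000) })
	sized := testing.AllocsPerRun(100, func() { sink = squaresSized(1000) })
	fmt.Printf("naive: %.0f allocs/run, sized: %.0f allocs/run\n", naive, sized)
}
```

Neither version involves a cleverer algorithm. The second simply avoids work the first does for no benefit.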
If you’re writing a package with an API then it’s worth thinking about whether it can be used efficiently when someone does care about performance. This might be as simple as adding a Reset call to an API object so it can be re-used, or allowing the caller to pass in a slice that results are appended to instead of allocating a fresh slice each time.
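Both ideas can be sketched in a few lines. `AppendFields` and `RecordBuilder` below are hypothetical names invented for illustration. The append style follows the precedent of `strconv.AppendInt` in the standard library, and `Reset` follows `bytes.Buffer.Reset`.

```go
package main

import (
	"fmt"
	"strings"
)

// AppendFields appends the comma-separated fields of line to dst and
// returns the extended slice. A caller that cares about allocations can
// pass dst[:0] to reuse the backing array from a previous call.
func AppendFields(dst []string, line string) []string {
	for {
		i := strings.IndexByte(line, ',')
		if i < 0 {
			return append(dst, line)
		}
		dst = append(dst, line[:i])
		line = line[i+1:]
	}
}

// RecordBuilder is a hypothetical API object that joins fields into an
// internal buffer.
type RecordBuilder struct {
	buf []byte
	n   int // number of fields added since the last Reset
}

// Reset empties the builder but keeps its buffer for the next record.
func (r *RecordBuilder) Reset() {
	r.buf = r.buf[:0]
	r.n = 0
}

// Add appends one field, separated by a comma from any previous field.
func (r *RecordBuilder) Add(field string) {
	if r.n > 0 {
		r.buf = append(r.buf, ',')
	}
	r.buf = append(r.buf, field...)
	r.n++
}

// String returns a copy of the record built so far.
func (r *RecordBuilder) String() string { return string(r.buf) }

func main() {
	lines := []string{"a,b,c", "d,e"}
	var fields []string // reused across iterations
	var rb RecordBuilder
	for _, line := range lines {
		fields = AppendFields(fields[:0], line)
		rb.Reset()
		for _, f := range fields {
			rb.Add(strings.ToUpper(f))
		}
		fmt.Println(rb.String())
	}
}
```

Callers who don’t care about performance can pass nil and ignore Reset, so the efficient path costs the simple case very little.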
And in nearly all cases I’d say stop there. If your code isn’t obviously wasteful then that’s normally good enough. I’ve written a fair amount of code that needs to perform well and fixed a fair number of performance issues and I’ve never had to write any assembly or ever found that unrolling a loop has made a measurable difference. Reserve clever tricks for when there’s an actual performance problem. But don’t ignore the everyday inefficiencies.