Improvements in Go 1.5

Diving into the changes in the latest Go release, particularly with runtime and garbage collector improvements

September 20, 2015

With the relatively recent release Go 1.5, there are a variety of interesting areas to look at in regards to what’s changed with the language. As one would expect based on both Go’s philosophy and it’s future compatibility guidelines, not much has changed from a language feature standpoint. Still, there a number of exciting under-the-hood enhancements in the latest release.

Package system

New functionality has been introduced to the packaging system in two ways. First, support for internal packages has been brought out of it’s experimental phase in Go 1.4 where it was being testing within GOROOT to now apply to packages within the GOPATH. This will allow Go packages to divide their different functions up into more logical components without exposing APIs in unwanted ways. This is a pattern that the App Engine SDK uses well, wrapping supporting functionality that is largely auto-generated in internal packages, making the logical code exposed through the API more readable and easier to maintain.

The second addition to the package system was experimental support for vendored packages, adding behavior to the go tools that will perform a local lookup of packages within a vendor/ directory before going into the GOPATH. This mirrors what tools such as godep have previously been doing through manipulation of the GOPATH environment variable. Additionally, because this is an experimental change, it is only enabled if GO15VENDOREXPERIMENT=1 is set in the environment the go command is executed in.

Compiler and Runtime written in Go

The entire go tool chain has been overhauled to be written entirely in Go. Previously, much of go was written in C because it provided a much easier path to get started with a new language. The bootstrapping process for Go was much easier because it could be built with just a C compiler. Still, the reasons were listed in a 2013 proposal of this change.

It is easier to write correct Go code than to write correct C code.

It is easier to debug incorrect Go code than to debug incorrect C code.

Work on a Go compiler necessarily requires a good understanding of Go. Implementing the compiler in C adds an unnecessary second requirement.

Go makes parallel execution trivial compared to C.

Go has better standard support than C for modularity, for automated rewriting, for unit testing, and for profiling.

Go is much more fun to use than C.

This change was primarily facilitated by an automated translation process, with certain code translated by hand. The process is detailed in Rob Pike’s Go to Go talk and slides which steps through the motivation, history, and how this translation took place.

There are also a variety of new tools in Go 1.5 that were much easier or newly possible due to this change. Some changes in 1.5, including stack maps, continuous stacks, and write barriers were nearly impossible to accomplish previously with C in the tool chain because of a lack of type safety and uncertainty that optimizations in C compilers introduce, limiting what’s possible for an external language (Go) to accomplish.

As one might expect, C code that is literally translated into Go is not optimally efficient. After some basic cleaning up of the raw translation using a purpose-specific tool called grind, the translation was still around ten times slower than the C compiler it was based on. Most of this slowdown was solved through further optimization of c-specific patterns such as complex for loops, treatment of stack variables, unions converted to bloated structs, misplaced declarations on unused variables, and more, which were converted into more idiomatic and efficient forms.

Still, in the short term this switch has slowed the Go build process by about a factor of two. However, what the translation process did preserve was the correctness of the compiler, thus avoiding the introduction of new bugs that would inevitably come with a rewrite. In this way, the translated code can be incrementally improved through more idiomatic use of the language to bring performance of the already lightning fast compilation up to where it was previously.

True Parallelism

The changes I am most excited about in Go 1.5 are the performance improvements. One large step forward has occurred in the ability to perform parallel computation. If you’re a more experienced Gopher, you’ve most likely seen Rob Pike’s talk, Concurrency is not Parallelism, which explains the important distinction between these concepts. Previously, Go offered a default of concurrency without parallelism. In the latest release, that default has changed.

The technical change that occurred was switching GOMAXPROCS, the maximum number of CPUs that can be executing simultaneously, from it’s previous default value of 1 to the number of logical CPUs. The default of one was the logical choice up until now. Due to some limitations in the goroutine scheduler, programs in earlier versions of Go would often get slower when parallelism was introduced by raising GOMAXPROCS. The documentation on this change illustrates how the performance of a number of reference algorithms has steadily improved when they are parallelized, including in worst case scenarios that are particularly difficult to deal with.

Because of Go’s emphasis on concurrency primarily as a design technique, it wasn’t a logical step to enable parallelism by default if it would punish those programs that used concurrency in ways that may not be inherently parallel, or are difficult to parallelize efficiently.

There were a number of improvements that enabled this shift…

GC improvements

The changes to the garbage collector that have come to fruition in Go 1.5 are primarily focused on reducing latency. Go has historically had a stop the world (STW) garbage collector. This type of collector prevents mutation of state while the garbage collection algorithm is taking place, which basically creates a binary set of applications states, one for program execution, and another for garbage collection. In 1.5, the garbage collection becomes concurrent. This means that instead of two distinct and wholly separate application states, the garbage collection phases happen concurrently with program execution.

Garbage collection was and still is activated by the growth of the heap, with a default trigger of a 100% increase in heap size, though that value is configurable. With concurrency added, a variety of things now occur during the GC phase. There is still a STW phase, but it is much shorter than before, on the order of 1ms, just setting up the GC process rather than executing it. With a call to internal runtime methods, the operation of the program is resumed, but now garbage collection is going on as well on a single dedicated CPU, while the rest remain focused on application work. Interestingly, the threads of the application itself are periodically requested to perform work by the GC, called an “assist”.

These changes are not all that needs to occur for concurrency to be enabled. We still have the problem of mutations to application state invalidating the information collected during the mark phase. The solution to this problem is called a “write barrier”, which doesn’t actually block writes from occurring, but rather notifies the GC algorithm of all changes to state that are made while GC is in progress. These notifications allow the GC process and mutations made by the application to coexist. Once the majority of the processing is complete, another short STW phase, on the order of 3ms, performs cleanup and bookkeeping, returning the program to normal execution state.

The performance improvements that were seen as a result of this are quite remarkable, with GC pause time for large, multi-gigabyte heaps dropping three orders of magnitude from seconds to milliseconds. With these changes, the Go runtime team hopes that wait times for GC will no longer be a barrier for the use of Go, expanding it’s applicability to more performance-critical systems. Additionally, these improvements allow Go programs to improve without modification as hardware improves, particularly through increases in the number of available cores.

Google App Engine support

At Meta we use Google App Engine to host the majority of our infrastructure as I’ve blogged about. We are very much looking forward to the availability of Go 1.5 on App Engine, and is planned, but unfortunately there is no definitive date as of yet. We’ll be moving to is as soon as it’s available!

Conclusions

Go 1.5 solidifies the Go tool chain by moving it to Go and brings performance improvements that increase it’s competitiveness with other high-performance languages. There are plenty of other changes that I didn’t get a chance to touch on, so be sure to take a look at the full Go 1.5 Release Notes for other details.

Improvements in Go 1.5

Diving into the changes in the latest Go release, particularly with runtime and garbage collector improvements

Package system

Compiler and Runtime written in Go

True Parallelism

GC improvements

Google App Engine support

Conclusions

References

Other posts

Speaking

Posts elsewhere

September 20, 2015