Death by a thousand error checks

You probably already knew this but I deeply realize the problem with golang errors only recently. But let me tell a story first.

Beginning

One day I was angry at Go for some reason I don’t remember exactly. This leads me to remember all nasty things that exists in the Go (and I think you will admit that there are plenty of them). One especially crazy example of unexpected go compiler behaviour is the following code snippet:

package main

import "testing"

type E struct{ Desc string }

func (e *E) Error() string { return e.Desc }
func OnlyZero(n int) *E {
	if n != 0 {
		return &E{Desc: "only zeroes allowed"}
	}
	return nil
}
func IsZero(n int) bool {
	var err error = OnlyZero(n)
	return err == nil
}

func TestIsZero(t *testing.T) {
	if IsZero(0) {
		t.Fail() // this Fail() is never called, surprisingly this test passes!
	}
}

You can find detailed description of the Go behavior for this case in the great blog post. In short, assignment on line 15 forces Golang compiler to “box” error and hide its concrete type behind the error interface. In order to not loose type information, Golang internally stores *E type nearby the pointer value. This makes Golang think that err not equal nil, because plain nil don’t have any type information attached to it!

(that’s why you should always use error interface in Go instead of more specific types despite that in other languages you usually return the most specific type, accept the most generic type)

Assembly

I discovered this nuance of Go by myself one year ago but this time I decided to share this information in internal Slack channel. We had nice discussion with coworkers about Go behaviour which forces me to dig into the example a bit more and look at the generated assembly for our snippet (by the way, Golang uses Plan9 assember syntax which is annoying sometimes as it’s very hard to find description of instructions):

TEXT     main.IsZero(SB), NOSPLIT|NOFRAME|ABIInternal, $0-8
FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
FUNCDATA $5, main.IsZero.arginfo1(SB)
FUNCDATA $6, main.IsZero.argliveinfo(SB)
PCDATA   $3, $1
XORL     AX, AX
RET

Ha, that’s interesting! Compiler removes IsZero function code completely and just generates assembly which always produce true value. This is kind of expected behaviour: optimizing compiler should simplify code and remove “useless” operations – that’s why we love them. But from the developer perspective it’s obvious that removal of this code is a sign that something wrong happening: we wrote code return err == nil on purpose and it’s shouldn’t result in constant result independent of the function inputs.

Can we easily understand that Go removes some part of our code without Godbolt? Luckily, Go provides an easy way to inspect generated assembler code (the same one which Godbolt shows to us, actually): we can just pass -gcflags=-S argument to the go build command. The output of the compiler looks like following:

# command-line-arguments
main.(*E).Error STEXT nosplit size=11 args=0x8 locals=0x0 funcid=0x0 align=0x0
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	TEXT	main.(*E).Error(SB), NOSPLIT|NOFRAME|ABIInternal, $0-8
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	FUNCDATA	$0, gclocals·wgcWObbY2HYnK2SU/U22lA==(SB)
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	FUNCDATA	$1, gclocals·J5F+7Qw7O7ve2QcWC7DpeQ==(SB)
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	FUNCDATA	$5, main.(*E).Error.arginfo1(SB)
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	FUNCDATA	$6, main.(*E).Error.argliveinfo(SB)
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	PCDATA	$3, $1
    0x0000 00000 (/home/sivukhin/code/gval/main.go:7)	MOVQ	(AX), CX
    0x0003 00003 (/home/sivukhin/code/gval/main.go:7)	MOVQ	8(AX), BX
    0x0007 00007 (/home/sivukhin/code/gval/main.go:7)	MOVQ	CX, AX
    0x000a 00010 (/home/sivukhin/code/gval/main.go:7)	RET
    ...

Here we can see generated assembly and whats more important – every instruction annotated with the line of code which “produced” the sequence of assembler operations. That’s cool! Let’s look at which lines are actually in use in the final assembler code based on the compiler output:

$> go test -c -gcflags=-S main_test.go 2>&1 | grep -Po "main_test.go:\d+" | uniq
main_test.go:7
main_test.go:8
main_test.go:9
main_test.go:10
main_test.go:12
main_test.go:8
main_test.go:14
main_test.go:16
main_test.go:19
main_test.go:23

We can see that line 15 is missing in the output and lines 14, 16 are actually produces same assembly which we saw in the Godbolt.

That’s nice… So, what if we will try to utilize this combination of compiler output introspection and the fact that sometimes compiler can delete parts of the code which we definitely wanted to see in the binary in some form? Can we create a linter out from that idea which will allow us to find non-trivial bugs in the go code? How noisy this linter will be? Let’s find out!

naming is hard

Death by a thousand error checks

Beginning

Assembly

Govanish