Defer - 兜的破烂

defer的一些东西

看看个坑

我们先来看一个例子:

func() {
  var run func() = nil
  defer run()
  fmt.Println("runs")
}

结果是:

panic: runtime error: invalid memory address or nil pointer dereference

详情

在g和p上都有相关结构,其中p上的为一个defer pool//todo

一些相关结构

_defer的结构：

// A _defer holds an entry on the list of deferred calls.
// If you add a field here, add code to clear it in freedefer and deferProcStack
// This struct must match the code in cmd/compile/internal/gc/reflect.go:deferstruct
// and cmd/compile/internal/gc/ssa.go:(*state).call.
// Some defers will be allocated on the stack and some on the heap.
// All defers are logically part of the stack, so write barriers to
// initialize them are not required. All defers must be manually scanned,
// and for heap defers, marked.

type _defer struct {
    //参数和return的大小
	siz     int32 // includes both arguments and results
	started bool//标记开始
	heap    bool
	// openDefer indicates that this _defer is for a frame with open-coded
	// defers. We have only one defer record for the entire frame (which may
	// currently have 0, 1, or more defers active).
	//这里的openCoded是在编译时的一些优化，在编译时会直接将defer的方法插入到运行中函数尾部,避免deferproc和deferprocStack操作

    openDefer bool
    //栈的指针
	sp        uintptr  // sp at time of defer
    //调用方的程序计数器
    pc        uintptr  // pc at time of defer
    //defer传入的函数
    fn        *funcval // can be nil for open-coded defers
    //触发延迟调用的结构体，可能为空
    _panic    *_panic  // panic that is running defer，每次添加到链表头部
	link      *_defer //每次添加到链表头部

	// If openDefer is true, the fields below record values about the stack
	// frame and associated function that has the open-coded defer(s). sp
	// above will be the sp for the frame, and pc will be address of the
	// deferreturn call in the function.
	fd   unsafe.Pointer // funcdata for the function associated with the frame
	varp uintptr        // value of varp for the stack frame
	// framepc is the current pc associated with the stack frame. Together,
	// with sp above (which is the sp associated with the stack frame),
	// framepc/sp can be used as pc/sp pair to continue a stack trace via
	// gentraceback().
	framepc uintptr
}

//只允许在stack上
//go:notinheap
type _panic struct {
	argp      unsafe.Pointer // pointer to arguments of deferred call run during panic; cannot move - known to liblink
	arg       interface{}    // argument to panic
	link      *_panic        // link to earlier panic
	//标记return的位置
	pc        uintptr        // where to return to in runtime if this panic is bypassed
	sp        unsafe.Pointer // where to return to in runtime if this panic is bypassed
	recovered bool           // whether this panic is over，标记是否已经调用recover
	aborted   bool           // the panic was aborted
	goexit    bool
}

defer在整个程序时实现主要分成编译期和运行时有不同的动作：

编译期(有三种不同的编译方式,1.14新增一种)

defer 关键字转换成了deferproc,堆上分配 ;
还会在所有调用defer的函数末尾插入deferreturn，栈上分配(1.13新增), ssa会预留defer空间(1.13，在函数体内最多执行一次就会调用cmd/compile/internal/gc.state.call将结构体分配到栈并调用runtime.deferprocStack);
open-coded(1.14新增),只会在以下情况:
- 函数的 defer 数量少于或者等于 8 个；
- 函数的 defer 关键字不能在循环中执行(包括goto)；
- 函数的 return 语句与 defer 语句的乘积小于或者等于 15 个；编译时会根据以上条件判断是否开启;

转换defer关键字

cmd/compile/internal/gc.state.stmt会处理defer关键字

cmd/compile/internal/gc.state.call 该函数负责所有函数和方法调用生成中间代码:

获取需要执行的函数名、闭包指针、代码指针和函数调用的接收方；
获取栈地址并将函数或者方法的参数写入栈中；
使用 cmd/compile/internal/gc.state.newValue1A 以及相关函数生成函数调用的中间代码；
如果当前调用的函数是 defer，那么就会单独生成相关的结束代码块；
获取函数的返回值地址并结束当前调用；

针对open-coded，主要就是在某些情况下使用,如下，默认超过 maxOpenDefers = 8就不能使用Open-coded

func walkstmt(n *Node) *Node {
	...
	case ODEFER:
			Curfn.Func.SetHasDefer(true)
			Curfn.Func.numDefers++
			if Curfn.Func.numDefers > maxOpenDefers {
				// Don't allow open-coded defers if there are more than
				// 8 defers in the function, since we use a single
				// byte to record active defers.
				Curfn.Func.SetOpenCodedDeferDisallowed(true)
			}
			if n.Esc != EscNever {
				// If n.Esc is not EscNever, then this defer occurs in a loop,
				// so open-coded defers cannot be used in this function.
				Curfn.Func.SetOpenCodedDeferDisallowed(true)
			}
			fallthrough
	...
}

栈上分配(消耗更少)

// deferprocStack queues a new deferred function with a defer record on the stack.
// The defer record must have its siz and fn fields initialized.
// All other fields can contain junk.
// The defer record must be immediately followed in memory by
// the arguments of the defer.
// Nosplit because the arguments on the stack won't be scanned
// until the defer record is spliced into the gp._defer list.
//go:nosplit
func deferprocStack(d *_defer) {
	gp := getg()
	if gp.m.curg != gp {
		// go code on the system stack can't defer
		throw("defer on system stack")
	}
	// siz and fn are already set.
	// The other fields are junk on entry to deferprocStack and
	// are initialized here.
	d.started = false
	d.heap = false
	d.openDefer = false
	d.sp = getcallersp()
	d.pc = getcallerpc()
	d.framepc = 0
	d.varp = 0
	// The lines below implement:
	//   d.panic = nil
	//   d.fd = nil
	//   d.link = gp._defer
	//   gp._defer = d
	// But without write barriers. The first three are writes to
	// the stack so they don't need a write barrier, and furthermore
	// are to uninitialized memory, so they must not use a write barrier.
	// The fourth write does not require a write barrier because we
	// explicitly mark all the defer structures, so we don't need to
	// keep track of pointers to them with a write barrier.
	*(*uintptr)(unsafe.Pointer(&d._panic)) = 0
	*(*uintptr)(unsafe.Pointer(&d.fd)) = 0
	*(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
	*(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))

	return0()
	// No code can go here - the C return register has
	// been set and must not be clobbered.
}

运行时

deferproc会将一个新的_defer结构体追加到当前goroutine的链表头(分配一个_defer对象并加入延迟参数)
deferreturn会从goroutine的链表(已经在函数调用栈中)取出_defer并执行

创建延迟调用(堆上)

// Create a new deferred function fn with siz bytes of arguments.
// The compiler turns a defer statement into a call to this.
//go:nosplit
func deferproc(siz int32, fn *funcval) { // arguments of fn follow fn
	gp := getg()
	if gp.m.curg != gp {
		// go code on the system stack can't defer
		throw("defer on system stack")
    }

    //-------------------1. 创建一个_defer延迟调用
    // the arguments of fn are in a perilous state. The stack map
	// for deferproc does not describe them. So we can't let garbage
	// collection or stack copying trigger until we've copied them out
	// to somewhere safe. The memmove below does that.
	// Until the copy completes, we can only call nosplit routines.
	sp := getcallersp()
	argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
	callerpc := getcallerpc()

    //新的_defer结构，重点！
	d := newdefer(siz)
	if d._panic != nil {
		throw("deferproc: d.panic != nil after newdefer")
	}
    d.link = gp._defer
    //赋值fn,pc,sp等
	gp._defer = d
    d.fn = fn
	d.pc = callerpc
	d.sp = sp
	switch siz {
	case 0:
		// Do nothing.
	case sys.PtrSize:
		*(*uintptr)(deferArgs(d)) = *(*uintptr)(unsafe.Pointer(argp))
	default:
		memmove(deferArgs(d), unsafe.Pointer(argp), uintptr(siz))
    }

    //避免无限递归调用deferreturn，其是唯一一个不会触发由延迟调用的函数
    // deferproc returns 0 normally.
	// a deferred func that stops a panic
	// makes the deferproc return 1.
	// the code the compiler generates always
	// checks the return value and jumps to the
    // end of the function if deferproc returns != 0.
    //其会在deferproc的最后去发信号给不会跳转到deferreturn的goroutine
    // return0 is a stub used to return 0 from deferproc.
    // It is called at the very end of deferproc to signal
    // the calling Go function that it should not jump
    // to deferreturn.
    // in asm_*.s
	return0()
}

其中newdefer()的作用就是要获取一个_defer结构,大概分为几种方式:

从全局调度器的延迟调用缓存池sched.deferpool中取出结构体并将该结构体追加到当前goroutine的缓存池
从goroutine绑定的p的延迟调用缓存池pp.deferpool中取出
mallocgc()创建一个新的结构体

无论哪种，最后都会被放入到link链表的最前面

// Allocate a Defer, usually using per-P pool.
// Each defer must be released with freedefer.  The defer is not
// added to any defer chain yet.
//
// This must not grow the stack because there may be a frame without
// stack map information when this is called.
//
//go:nosplit
func newdefer(siz int32) *_defer {
	var d *_defer
	sc := deferclass(uintptr(siz))
	gp := getg()
	if sc < uintptr(len(p{}.deferpool)) {
        pp := gp.m.p.ptr()
        //--------------1. 从sched.deferpool里面拿--------------
		if len(pp.deferpool[sc]) == 0 && sched.deferpool[sc] != nil {
			// Take the slow path on the system stack so
			// we don't grow newdefer's stack.
			systemstack(func() {
				lock(&sched.deferlock)
				for len(pp.deferpool[sc]) < cap(pp.deferpool[sc])/2 && sched.deferpool[sc] != nil {
                    //拿出_defer
					d := sched.deferpool[sc]
					sched.deferpool[sc] = d.link
                    d.link = nil
                    //append到当前goroutine的缓存池中
					pp.deferpool[sc] = append(pp.deferpool[sc], d)
				}
				unlock(&sched.deferlock)
			})
        }
        //-----------------2. 从gourtine的pp.deferpool中拿
		if n := len(pp.deferpool[sc]); n > 0 {
			d = pp.deferpool[sc][n-1]
			pp.deferpool[sc][n-1] = nil
			pp.deferpool[sc] = pp.deferpool[sc][:n-1]
		}
    }
    

	if d == nil {
        //-----------------3. mallogc一个新的结构体--------
		// Allocate new defer+args.
		systemstack(func() {
			total := roundupsize(totaldefersize(uintptr(siz)))
			d = (*_defer)(mallocgc(total, deferType, true))
		})
		if debugCachedWork {
			// Duplicate the tail below so if there's a
			// crash in checkPut we can tell if d was just
			// allocated or came from the pool.
            d.siz = siz
            //追加到link上面
			d.link = gp._defer
			gp._defer = d
			return d
		}
	}
	d.siz = siz
	d.heap = true
	return d
}

上面defer关键字插入是从后到前，但是其执行是从前到后，即为什么运行好像一个栈一样

执行延迟调用(堆上)

// Run a deferred function if there is one.
// The compiler inserts a call to this at the end of any
// function which calls defer.
// If there is a deferred function, this will call runtime·jmpdefer,
// which will jump to the deferred function such that it appears
// to have been called by the caller of deferreturn at the point
// just before deferreturn was called. The effect is that deferreturn
// is called again and again until there are no more deferred functions.
//
// Declared as nosplit, because the function should not be preempted once we start
// modifying the caller's frame in order to reuse the frame to call the deferred
// function.
//
// The single argument isn't actually used - it just has its address
// taken so it can be matched against pending defers.
//go:nosplit
func deferreturn(arg0 uintptr) {
    gp := getg()
    //取出_defer
	d := gp._defer
	if d == nil {
		return
	}
	sp := getcallersp()
	if d.sp != sp {
		return
	}
	if d.openDefer {
		//1.14 opencoded
		done := runOpenDeferFrame(gp, d)
		if !done {
			throw("unfinished open-coded defers in deferreturn")
		}
		gp._defer = d.link
		freedefer(d)
		return
	}

	// Moving arguments around.
	//
	// Everything called after this point must be recursively
	// nosplit because the garbage collector won't know the form
	// of the arguments until the jmpdefer can flip the PC over to
	// fn.
	switch d.siz {
	case 0:
		// Do nothing.
	case sys.PtrSize:
		*(*uintptr)(unsafe.Pointer(&arg0)) = *(*uintptr)(deferArgs(d))
	default:
		memmove(unsafe.Pointer(&arg0), deferArgs(d), uintptr(d.siz))
	}
	fn := d.fn
	d.fn = nil
	gp._defer = d.link
	freedefer(d)
	// If the defer function pointer is nil, force the seg fault to happen
	// here rather than in jmpdefer. gentraceback() throws an error if it is
	// called with a callback on an LR architecture and jmpdefer is on the
	// stack, because the stack trace can be incorrect in that case - see
	// issue #8153).
    _ = fn.fn
    //传入_defer的fn和其参数
	jmpdefer(fn, uintptr(unsafe.Pointer(&arg0)))
}

jmpdefer是汇编实现的runtime函数，目的就是跳转defer所在代码，并在执行结束之后跳转回deferreturn

// func jmpdefer(fv *funcval, argp uintptr)
// argp is a caller SP.
// called from deferreturn.
// 1. pop the caller
// 2. sub 5 bytes from the callers return
// 3. jmp to the argument
TEXT runtime·jmpdefer(SB), NOSPLIT, $0-16
	MOVQ	fv+0(FP), DX	// fn
	MOVQ	argp+8(FP), BX	// caller sp
	LEAQ	-8(BX), SP	// caller sp after CALL
	MOVQ	-8(SP), BP	// restore BP as if deferreturn returned (harmless if framepointers not in use)
	SUBQ	$5, (SP)	// return to CALL again
	MOVQ	0(DX), BX
	JMP	BX	// but first run the deferred function

//todo ??? deferreturn 函数会多次判断当前 Goroutine 的 _defer 链表中是否有未执行的剩余结构，在所有的延迟函数调用都执行完成之后，该函数才会返回;

调用deferproc创建新的延迟调用时就会立刻copy函数的参数，所以参数在defer声明时就已经开始执行计算

open-coded

对于运行中才确定的defer，可以看到其中1.14新增的open-coded，采用位操作(defer bits，即变量df)来确认分支:

首先直接在defer func()这种会直接插入;
其次，在条件判断分支中的 defer,则要在运行时记录每个defer是否被执行，从而便于判断最后的延迟调用该执行哪些函数;

原理：

同一个函数内每出现一个 defer 都会为其分配 var df byte，如果被执行到则设为 1，否则设为 0(比如df|=1)，当到达函数返回之前需要判断延迟调用时，则用掩码(比如df&1>0判断第一个defer是否存在)判断每个位置的比特，若为 1 则调用延迟函数，否则跳过。

为了轻量，官方将延迟比特限制为 1 个字节，即 8 个比特，这就是为什么不能超过 8 个 defer 的原因，若超过依然会选择堆栈分配，但显然大部分情况不会超过 8 个;

// runOpenDeferFrame runs the active open-coded defers in the frame specified by
// d. It normally processes all active defers in the frame, but stops immediately
// if a defer does a successful recover. It returns true if there are no
// remaining defers to run in the frame.
func runOpenDeferFrame(gp *g, d *_defer) bool {
	done := true
	fd := d.fd

	// Skip the maxargsize
	_, fd = readvarintUnsafe(fd)
	deferBitsOffset, fd := readvarintUnsafe(fd)
	//defer的数量
	nDefers, fd := readvarintUnsafe(fd)
	//获得deferBits，因为open-coded最多支持8个，所以默认bits数量就是8,等于一个字节
	//而且这些bits是运行时设置的
	deferBits := *(*uint8)(unsafe.Pointer(d.varp - uintptr(deferBitsOffset)))

	for i := int(nDefers) - 1; i >= 0; i-- {
		// read the funcdata info for this defer
		var argWidth, closureOffset, nArgs uint32
		argWidth, fd = readvarintUnsafe(fd)
		closureOffset, fd = readvarintUnsafe(fd)
		nArgs, fd = readvarintUnsafe(fd)
		//移位来判定,掩码
		if deferBits&(1<<i) == 0 {
			for j := uint32(0); j < nArgs; j++ {
				_, fd = readvarintUnsafe(fd)
				_, fd = readvarintUnsafe(fd)
				_, fd = readvarintUnsafe(fd)
			}
			continue
		}
		closure := *(**funcval)(unsafe.Pointer(d.varp - uintptr(closureOffset)))
		d.fn = closure
		deferArgs := deferArgs(d)
		// If there is an interface receiver or method receiver, it is
		// described/included as the first arg.
		for j := uint32(0); j < nArgs; j++ {
			var argOffset, argLen, argCallOffset uint32
			argOffset, fd = readvarintUnsafe(fd)
			argLen, fd = readvarintUnsafe(fd)
			argCallOffset, fd = readvarintUnsafe(fd)
			memmove(unsafe.Pointer(uintptr(deferArgs)+uintptr(argCallOffset)),
				unsafe.Pointer(d.varp-uintptr(argOffset)),
				uintptr(argLen))
		}
		//移位到下一位
		deferBits = deferBits &^ (1 << i)
		*(*uint8)(unsafe.Pointer(d.varp - uintptr(deferBitsOffset))) = deferBits
		p := d._panic
		reflectcallSave(p, unsafe.Pointer(closure), deferArgs, argWidth)
		if p != nil && p.aborted {
			break
		}
		d.fn = nil
		// These args are just a copy, so can be cleared immediately
		memclrNoHeapPointers(deferArgs, uintptr(argWidth))
		if d._panic != nil && d._panic.recovered {
			done = deferBits == 0
			break
		}
	}

	return done
}

缺点

官方给出的测试中提高了几乎有一个0，但是要注意到一种问题，如果在插入的多个oepn-coded代码之间panic或者goExit()，下面的代码就不会进入，而是去找注册的defer; 针对这种情况，程序还会去进行一次栈扫描;

当然针对这些情况，_defer结构体就新增了以下几个字段来帮忙找到未注册到链表的defer函数:

type _defer struct {
	...
    openDefer bool
	// If openDefer is true, the fields below record values about the stack
	// frame and associated function that has the open-coded defer(s). sp
	// above will be the sp for the frame, and pc will be address of the
	// deferreturn call in the function.
	fd   unsafe.Pointer // funcdata for the function associated with the frame
	varp uintptr        // value of varp for the stack frame
	// framepc is the current pc associated with the stack frame. Together,
	// with sp above (which is the sp associated with the stack frame),
	// framepc/sp can be used as pc/sp pair to continue a stack trace via
	// gentraceback().
	framepc uintptr
}

其实就导致在1.14中，如果使用opencoded，defer的确会变快了，但是在有panic的情况，却更慢了;

golang

本博客所有文章除特别声明外，均采用 CC BY-SA 4.0 协议，转载请注明出处！

Interview Previous

Kafka notes Next