runtime/trace: enable frame pointer unwinding on amd64
Change tracer to use frame pointer unwinding by default on amd64. The
expansion of inline frames is delayed until the stack table is dumped at
the end of the trace. This requires storing the skip argument in the
stack table, which now resides in pcBuf[0]. For stacks that are not
produced by traceStackID (e.g. CPU samples), a logicalStackSentinel
value in pcBuf[0] indicates that no inline expansion is needed.
Add new GODEBUG=tracefpunwindoff=1 option to use the old unwinder if
needed.
Benchmarks show a considerable decrease in CPU overhead when using frame
pointer unwinding for trace events:
GODEBUG=tracefpunwindoff=1 ../bin/go test -run '^$' -bench '.+PingPong' -count 20 -v -trace /dev/null ./runtime | tee tracefpunwindoff1.txt
GODEBUG=tracefpunwindoff=0 ../bin/go test -run '^$' -bench '.+PingPong' -count 20 -v -trace /dev/null ./runtime | tee tracefpunwindoff0.txt
goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
│ tracefpunwindoff1.txt │ tracefpunwindoff0.txt │
│ sec/op │ sec/op vs base │
PingPongHog-32 3782.5n ± 0% 740.7n ± 2% -80.42% (p=0.000 n=20)
For #16638
Change-Id: I2928a2fcd8779a31c45ce0f2fbcc0179641190bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/463835 Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>