Ian Lance Taylor [Mon, 13 Apr 2020 23:26:39 +0000 (23:26 +0000)]
Revert "time/tzdata: new package"
This reverts CL 224588.
Reason for revert: Test failing on secondary platforms.
Change-Id: Ic15fdc73a0d2b860e776733abb82c58809e13160
Reviewed-on: https://go-review.googlesource.com/c/go/+/228200 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Eric [Mon, 13 Apr 2020 21:28:47 +0000 (21:28 +0000)]
io: simplify Examples
- CopyN: 5 creates ambiguity with respect to whitespace and upperbound
- TeeReader less boilerplate and displays a common usage of it
- SectionReader_* all sections unified to 5:17 for clarity
- SectionReader_Seek uses io.Copy to stdout like other examples
- Seeker_Seek remove useless prints
- Pipe print reader like other examples
Updates #36417
Change-Id: Ibd01761d5a5786cdb1ea934f7a98f8302430c8a5
GitHub-Last-Rev: 4c17f9a8e32d89743b7eaec7c52032256972cc0a
GitHub-Pull-Request: golang/go#38379
Reviewed-on: https://go-review.googlesource.com/c/go/+/227868
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/link: don't split container symbols when write blocks
We split the output into blocks and write them in parallel. The
block boundary is placed at symbol boundary. In the case of outer
symbols and sub symbols, currently we may split an outer symbol
into two blocks. This will be bad, as the two blocks will have
overlapping address range, since outer symbol and its sub symbols
occupies the same address range.
Make sure we place block boundary only at top-level symbol
boundaries.
Cherry Zhang [Fri, 24 Jan 2020 15:43:09 +0000 (10:43 -0500)]
cmd/compile: debug rewrite
If -d=ssa/PASS/debug=N is specified (N >= 2) for a rewrite pass
(e.g. lower), when a Value (or Block) is rewritten, print the
Value (or Block) before and after.
Ian Lance Taylor [Sun, 22 Mar 2020 01:28:16 +0000 (18:28 -0700)]
time/tzdata: new package
Importing the time/tzdata package will embed a copy of the IANA
timezone database into the program. This will let the program work
correctly when the timezone database is not available on the system.
It will increase the size of the binary by about 800K.
You can also build a program with -tags timetzdata to embed the
timezone database in the program being built.
Michael Pratt [Tue, 7 Apr 2020 19:17:25 +0000 (15:17 -0400)]
runtime/pprof: clarify recursive inline heuristic
Following CL 226818, the compiler will allow inlining a single cycle in
an inline chain. Immediately-recursive functions are still disallowed,
which is what this heuristic refers to.
Add a regression test for this case.
Note that in addition to this check, if the compiler were to inline
multiple cycles via a loop (i.e., rather than appending duplicate code),
much more work would be required here to handle a single address
appearing in multiple different inline frames.
Updates #29737
Change-Id: I88de15cfbeabb9c04381e1c12cc36778623132a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/227346
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dan Scales <danscales@google.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Michael Pratt [Tue, 7 Apr 2020 16:12:44 +0000 (12:12 -0400)]
runtime/pprof: try to use real stack in TestTryAdd
TestTryAdd is particularly brittle because it tests some real cases by
constructing fake sample stack frames. If those frames don't correctly
represent what the runtime would generate then they may fail to catch
regressions.
Instead, call runtime.Callers at the bottom of real function calls to
generate real frames as a base for truncation, etc in tests. Several of
these tests still have to fake parts of the frames to test the right
thing, but this is a bit less fragile.
This change is equivalent to the original 0dfb0513ec6a0e97db166bd91a2dc0a1ceb154f7 (golang.org/cl/227484), except
that the test skips if the test functions aren't inline (e.g., noopt
builders).
Change-Id: Ie9e32b5660cfe28a924f9cfcddcd887ea2effd66
Reviewed-on: https://go-review.googlesource.com/c/go/+/227922
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
Joel Sing [Mon, 13 Apr 2020 16:36:36 +0000 (02:36 +1000)]
cmd/compile: update TestIntendedInlining for riscv64
Mark nextFreeFast as not inline, as it is too expensive to inline on riscv64.
Also remove riscv64 from non-atomic inline architectures, as we now have
atomic intrisics.
Hana (Hyang-Ah) Kim [Fri, 10 Apr 2020 18:56:59 +0000 (14:56 -0400)]
cmd/trace: use the focustask mode for user task/region trace views
The taskid mode is based on the goroutine-oriented trace view,
which displays each goroutine as a separate row. This is good when
inspecting the interaction and timeline among related goroutines,
and the user region information (associated with each goroutine)
in detail, but when many goroutines are involved, this mode does
not scale.
The focustask mode is based on the default trace view with the
user task hierarchy at the top. Each row is a P and there are only
a handful number of Ps in most cases, so browsers can handle
this mode more gracefully. But, I had difficulty in displaying
the user region information (because a goroutine can start/stop/
migrate across Ps, and visualizing the stack of regions nicely
was complicated). It may be doable, but it's a work.
This CL surfaces the hidden focustask mode. Moreover, use it
as the default user task view mode. The taskid mode can be still
accessible through 'goroutine view' links.
Unlike taskid-based user annotation view that extends goroutine-based
trace view, the focustask view
Change-Id: Ib691a5e1dd14695fa70a0ae67bff62817025e8c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/227921 Reviewed-by: Michael Matloob <matloob@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
HWCap and HWCap2 are no longer linknamed into package runtime. Also,
merge two sentences both starting with "These are..." and don't mention
any file name where archauxv is defined, as it become outdated if
support for a new $GOOS/$GOARCH combination is added. This is e.g.
already the case for arm64, where archauxv is also defined for
freebsd/arm64.
Change-Id: I9314a66633736b12e777869a832d8b79d442a6f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/228057 Reviewed-by: Ian Lance Taylor <iant@golang.org>
The repository of delve has already switched from the personal
account github.com/derekparker/delve to the organization account
github.com/go-delve/delve. According to go-delve/delve#1456.
Change-Id: Ie64f72c2808a8aca5059a75e2c2f11d8691e66b3
GitHub-Last-Rev: f90120c3b33f2134a572a62aaf291aa2abe58749
GitHub-Pull-Request: golang/go#38387
Reviewed-on: https://go-review.googlesource.com/c/go/+/227999 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Joel Sing [Sun, 12 Apr 2020 16:47:11 +0000 (02:47 +1000)]
cmd/compile: run TestLogOpt for riscv64 on amd64
Run TestLogOpt for riscv64 on amd64, as is done for other architectures.
This would have caught the test failure on riscv64 introduced in 47ade08141b23cfeafed92943e16012d5dc5eb8b.
Change-Id: If29dea2ef383b087154d046728f6d1c96811f5a3
Reviewed-on: https://go-review.googlesource.com/c/go/+/227806
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
cmd/compile: correct comment for len check when make slice
CL 226737 optimizes len check when make slice. The comment that cap is
constrainted to [0, 2^31) is not quite true, it's 31 or 63 depends on
whether it's 32/64-bit systems.
Change-Id: I6f54e41827ffe4d0b67a44975da3ce07b2fabbad
Reviewed-on: https://go-review.googlesource.com/c/go/+/227803
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
Keith Randall [Fri, 10 Apr 2020 21:07:13 +0000 (14:07 -0700)]
cmd/link: turn ASLR off for netbsd+race
The race detector can't handle ASLR (adddress space layout randomization).
On some platforms it can re-exec the binary with ASLR off. But not NetBSD.
For NetBSD we have to introduce a special ELF header note that tells
the kernel not to use ASLR.
This works fine for internal linking. For external linking it also works,
but "readelf -n" shows multiple notes in the resulting binary. Maybe the
last one wins? Not sure, but it appears to work.
David Chase [Mon, 11 Nov 2019 18:18:38 +0000 (13:18 -0500)]
cmd/compile: add explanations to escape-analysis JSON/LSP logging
For 1.15.
From the test:
{"range":{"start":{"line":7,"character":13},"end":{...},"severity":3,"code":"leaks","source":"go compiler","message":"parameter z leaks to ~r2 with derefs=0","relatedInformation":[
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":13},"end":{...}},"message":"escflow: flow: y = z:"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":13},"end":{...}},"message":"escflow: from y = \u003cN\u003e (assign-pair)"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":13},"end":{...}},"message":"escflow: flow: ~r1 = y:"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":4,"character":11},"end":{...}},"message":"inlineLoc"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":13},"end":{...}},"message":"escflow: from y.b (dot of pointer)"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":4,"character":11},"end":{...}},"message":"inlineLoc"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":13},"end":{...}},"message":"escflow: from \u0026y.b (address-of)"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":4,"character":9},"end":...}},"message":"inlineLoc"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":13},"end":{...}},"message":"escflow: from ~r1 = \u003cN\u003e (assign-pair)"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":3},"end":...}},"message":"escflow: flow: ~r2 = ~r1:"},
{"location":{"uri":"file://T/file.go","range":{"start":{"line":9,"character":3},"end":...}},"message":"escflow: from return (*int)(~r1) (return)"}]}
Lynn Boger [Tue, 31 Mar 2020 14:08:29 +0000 (10:08 -0400)]
cmd/internal/obj/ppc64: add support for pcalign 32 on ppc64x
Previous PCALIGN support on ppc64x only accepted 8 and 16 byte
alignment since the default function alignment was 16. Now that
the function's alignment can be set to a larger value when needed,
PCALIGN can accept 32. When this happens then the function's
alignment will be changed to 32.
Test has been updated to recognized this new value.
Andy Pan [Fri, 10 Apr 2020 02:45:58 +0000 (10:45 +0800)]
runtime: replace the type of netpollWakeSig from a uintptr to a uint32
There's no need for netpollWakeSig to use a uintptr type, a uint32 is enough.
Relevant CL: CL 212737
Change-Id: Ide24478b217a02bad62f7e000a9680c26a8c5366
Reviewed-on: https://go-review.googlesource.com/c/go/+/227798
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Keith Randall [Thu, 19 Mar 2020 23:25:08 +0000 (16:25 -0700)]
cmd/compile: start implementing strongly typed aux and auxint fields
Right now the Aux and AuxInt fields of ssa.Values are typed as
interface{} and int64, respectively. Each rule that uses these values
must cast them to the type they actually are (*obj.LSym, or int32, or
ValAndOff, etc.), use them, and then cast them back to interface{} or
int64.
We know for each opcode what the types of the Aux and AuxInt fields
should be. So let's modify the rule generator to declare the types to
be what we know they should be, autoconverting to and from the generic
types for us. That way we can make the rules more type safe.
It's difficult to make a single CL for this, so I've coopted the "=>"
token to indicate a rule that is strongly typed. "->" rules are
processed as before. That will let us migrate a few rules at a time in
separate CLs. Hopefully we can reach a state where all rules are
strongly typed and we can drop the distinction.
This CL changes just a few rules to get a feel for what this
transition would look like.
I've decided not to put explicit types in the rules. I think it
makes the rules somewhat clearer, but definitely more verbose.
In particular, the passthrough rules that don't modify the fields
in question are verbose for no real reason.
Change-Id: I63a1b789ac5702e7caf7934cd49f784235d1d73d
Reviewed-on: https://go-review.googlesource.com/c/go/+/190197
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
Than McIntosh [Fri, 13 Mar 2020 15:09:05 +0000 (11:09 -0400)]
test: deflaking measures for runtime gdb test
Tweak the runtime's GDB python test to try to reduce flake failures.
Background: the intent of the testpoint in question is to make sure
that python-supported commands like "info goroutines" or "goroutine 1
backtrace" work properly. The Go code being run under the debugger as
part of the test is single-threaded, but the test is written assuming
that in addition to the primary goroutine there will be other
background goroutines available (owned by the runtime). The flakiness
seems to crop up the most when requesting a backtrace for one of these
background goroutines; the speculation is that if we catch a
runtime-owned goroutine in an odd state, this could interfere with the
test.
The change in this patch is to explicitly start an additional
goroutine from the main thread, so that when the debugger stops the
main thread we can be sure that there is some other non-main goroutine
in a known state.
This change authored by Josh Bleecher Snyder <josharian@gmail.com>.
net: convert many Close tests to use parallel subtests
Also set a deadline in TestCloseWrite so that we can more easily
determine which kind of connection is getting stuck on the
darwin-arm64-corellium builder (#34837).
Change-Id: I8ccacbf436e8e493fb2298a79b17e0af8fc6eb81
Reviewed-on: https://go-review.googlesource.com/c/go/+/227588
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/compile: do not allocate bucket for non-escaping map
For map with hint larger than BUCKETSIZE, makemap ignore allocated
bucket and allocate buckets itself. So do not allocate bucket in
this case, save us the cost of zeroing+assignment to the bucket.
The previous change moved code around to create slicesym.
This change simplifies slicesym and its callsites
by accepting an int64 for lencap instead of a node,
and by removing all the calls to gdata.
It also stops modifying n,
which avoids the need to make a copy of it.
cmd/compile,runtime: pass only ptr and len to some runtime calls
Some runtime calls accept a slice, but only use ptr and len.
This change modifies most such routines to accept only ptr and len.
After this change, the only runtime calls that accept an unnecessary
cap arg are concatstrings and slicerunetostring.
Neither is particularly common, and both are complicated to modify.
Negligible compiler performance impact. Shrinks binaries a little.
There are only a few regressions; the one I investigated was
due to register allocation fluctuation.
Passes 'go test -race std cmd', modulo #38265 and #38266.
Wow, does that take a long time to run.
cmd/compile: handle some additional phis in shortcircuit
Prior to this change, the shortcircuit pass could only
handle blocks containing only a single phi control value,
possibly wrapped in some OpNot and OpCopy values.
This change partially lifts this limitation.
It handles some cases in which the block contains other phi values.
This appears to happen most commonly in cases in which
the conditionals being checked involve the memory state,
in which case there is a phi memory value in the block.
The general idea here is to use the information we have about
the CFG to (1) move the other phi values into other blocks
and/or (2) rewrite uses of the other phi values in other blocks.
For example, consider this CFG:
p q
\ /
b
/ \
t u
And consider a phi value v in block b.
We'll write v = Phi(p: x, q: y) to say that v has value x corresponding
to inbound block p, and value y for block q.
We will rewrite this CFG to:
p q
| /
| b
|/ \
t u
What should we do with v?
Any uses of v in u can be replaced with y. Why?
If we are in block u, we came from b, and before that from q.
If prior to b we came from p, then we would have gone to t, not u.
Since we came from q, we know that v took the value y.
Uses of v in t are a bit more complicated.
It is going to end up being a phi value: Phi(p: ?, b: ?).
Suppose, after the rewrite, we came from block p.
Then, before the rewrite, we would have gone to b,
where v would have the value x.
So we have Phi(p: x, b: ?).
Suppose, after the rewrite, we came from block b.
Then we must have come from block q.
If we come from block q, v has value y.
So we have Phi(p: x, b: y).
Uses of v in t can thus be replaced with a new phi value,
with the same values as v, but with altered predecessors.
Similar reasoning can be employed to rewrite or replace
other uses of v elsewhere in the CFG, so that v itself can be eliminated,
and the CFG rewrite can proceed.
This change sets up the infrastructure for such optimizations
and adds a few cheap ones. All optimizations in this change depend
only on the shape of the CFG; future changes may also depend on where
v's uses are. That analysis is more powerful but more expensive,
and should be done incrementally.
The use of closures here is perhaps a bit unusual,
but during development it proved critical to having readable code.
We must decide early on whether we can safely do the CFG modifications,
and then later fix up the phis if so.
Safely storing state and decisions across these two phases is hard to do readably.
Closures solve the problem neatly.
I manually instrumented the code paths in shortcircuitPhiPlan.
During make.bash there are nearly 6000 invocations.
The least-visited code path gets run 85 times,
so all the code in this CL is reasonably well-exercised.
Here is a concrete example of code improved by this change:
func f(e interface{}) int {
if x, ok := e.(int); ok {
return x
}
return 0
}
Omitting PCDATA, FUNCDATA, and the like, it used to compile to:
I noticed a timeout in TestIgnore in
https://build.golang.org/log/52d83a72f3a5ea9a16eb5d670c729694144f9624,
which suggests that the settle time is currently set too low.
I've also added a check for the same GO_TEST_TIMEOUT_SCALE used in
TestTerminalSignal, so that if this builder remains too slow we can
increase the builder's scale factor rather than the test's baseline
running time.
Updates #33174
Change-Id: I18b10eaa3bb5ae2f604300aedaaf6f79ee7ad567
Reviewed-on: https://go-review.googlesource.com/c/go/+/227649
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
net/url: add URL.Redacted to return a password scrubbed string
Returning an URL.String() without the password is very useful for
situations where the URL is supposed to be logged and the password is
not useful to be shown.
This method re-uses URL.String() but with the password scrubbed and
substituted for a "xxxxx" in order to make it obvious that there was a
password. If the URL had no password then no "xxxxx" will be shown.
Ruixin(Peter) Bao [Thu, 21 Nov 2019 15:44:23 +0000 (10:44 -0500)]
cmd/compile: allow floating point Ops to produce flags on s390x
On s390x, some floating point arithmetic instructions (FSUB, FADD) generate flag.
This patch allows those related SSA ops to return a tuple, where the second argument of
the tuple is the generated flag. We can use the flag and remove the
subsequent comparison instruction (e.g: LTDBR).
This CL also reduces the .text section for math.test binary by 0.4KB.
Rémy Oudompheng [Thu, 5 Mar 2020 06:59:00 +0000 (07:59 +0100)]
math/big: correct off-by-one access in divBasic
The divBasic function computes the quotient of big nats u/v word by word.
It estimates each word qhat by performing a long division (top 2 words of u
divided by top word of v), looks at the next word to correct the estimate,
then perform a full multiplication (qhat*v) to catch any inaccuracy in the
estimate.
In the latter case, "negative" values appear temporarily and carries
must be carefully managed, and the recursive division refactoring
introduced a case where qhat*v has the same length as v, triggering an
out-of-bounds write in the case it happens when computing the top word
of the quotient.
Fixes #37499
Change-Id: I15089da4a4027beda43af497bf6de261eb792f94
Reviewed-on: https://go-review.googlesource.com/c/go/+/221980 Reviewed-by: Robert Griesemer <gri@golang.org>
The cgo build tag is not necessary for root_darwin_arm64.go. We can't
build for darwin/arm64 without cgo, and even if we did 1) this code
would work fine 2) the no-cgo code that shells out to
/usr/bin/security would not work.
Michael Matloob [Fri, 14 Feb 2020 22:44:00 +0000 (17:44 -0500)]
cmd/go: allow configuring module cache directory with GOMODCACHE
Adds a GOMODCACHE environment variable that's used by cmd/go to determine the
location of the module cache. The default value of GOMODCACHE will be
GOPATH[0]/pkg/mod, the default location of the module cache before this change.
Replace the cmd/go/internal/modfetch.PkgMod variable which previously held the
location of the module cache with the new cmd/go/internal/cfg.GOMODCACHE
variable, for consistency with many of the other environment variables that
affect the behavior of cmd/go. (Most of the changes in this CL are due to
moving/renaming the variable.)
The value of cfg.GOMODCACHE is now set using a variable initializer. It was
previously set in cmd/go/internal/modload.Init.
The location of GOPATH/pkg/sumdb is unchanged by this CL. While it was
previously determined using the value of PkgMod, it now is determined
independently dirctly from the value of GOPATH[0].
Fixes #34527
Change-Id: Id4d31d217b3507d6057c8ef7c52af1a0606603e4
Reviewed-on: https://go-review.googlesource.com/c/go/+/219538
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com> Reviewed-by: Jay Conrod <jayconrod@google.com>
Michael Pratt [Tue, 7 Apr 2020 16:12:44 +0000 (12:12 -0400)]
runtime/pprof: try to use real stack in TestTryAdd
TestTryAdd is particularly brittle because it tests some real cases by
constructing fake sample stack frames. If those frames don't correctly
represent what the runtime would generate then they may fail to catch
regressions.
Instead, call runtime.Callers at the bottom of real function calls to
generate real frames as a base for truncation, etc in tests. Several of
these tests still have to fake parts of the frames to test the right
thing, but this is a bit less fragile.
Change-Id: I62522a9ded5544b06d1bf28550af5400f3af667b
Reviewed-on: https://go-review.googlesource.com/c/go/+/227484
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Xiangdong Ji [Fri, 27 Mar 2020 11:04:21 +0000 (11:04 +0000)]
runtime: fix infinite callstack of cgo on arm64
This change adds CFA information to the assembly function 'crosscall1'
and reorgnizes its code to establish well-formed prologue and epilogue.
It will fix an infinite callstack issue when debugging cgo program with
GDB on arm64.
Brief root cause analysis:
GDB's aarch64 unwinder parses prologue to determine current frame's size
and previous PC&SP if CFA information is not available.
The unwinder parses the prologue of 'crosscall1' to determine a frame size
of 0x10, then turns to its next frame trying to compute its previous PC&SP
as they are not saved on current frame's stack as per its 'traditional frame
unwind' rules, which ends up getting an endless frame chain like:
[callee] : pc:<pc0>, sp:<sp0>
crosscall1: pc:<pc1>, sp:<sp0>+0x10
[caller] : pc:<pc1>, sp:<sp0>+0x10+0x10
[caller] : pc:<pc1>, sp:<sp0>+0x10+0x10+0x10
...
GDB fails to detect the 'caller' frame is same as 'crosscall1' and terminate
unwinding since SP increases everytime.
Fixes #37238
Change-Id: Ia6bd8555828541a3a61f7dc9b94dfa00775ec52a
Reviewed-on: https://go-review.googlesource.com/c/go/+/226999
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Dan Scales [Thu, 14 Nov 2019 01:34:47 +0000 (17:34 -0800)]
runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT)
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.
Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.
Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.
See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.
I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.
I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).
Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.
For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.
Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.
To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.
Fixes #38029
Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619 Reviewed-by: Dan Scales <danscales@google.com> Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Michael Munday [Fri, 6 Mar 2020 22:20:45 +0000 (22:20 +0000)]
cmd/compile: delete the floating point Greater and Geq ops
Extend CL 220417 (which removed the integer Greater and Geq ops) to
floating point comparisons. Greater and Geq can always be
implemented using Less and Leq.
Fixes #37316.
Change-Id: Ieaddb4877dd0ff9037a1dd11d0a9a9e45ced71e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/222397
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
cmd/compile: use MOVBQZX for OpAMD64LoweredHasCPUFeature
In the commit message of CL 212360, I wrote:
> This new intrinsic ... generates MOVB+TESTB+NE.
> (It is possible that MOVBQZX+TESTQ+NE would be better.)
I should have tested. MOVBQZX+TESTQ+NE does in fact appear to be better.
For the benchmark in #36196, on my machine:
name old time/op new time/op delta
FMA-8 0.86ns ± 6% 0.70ns ± 5% -18.79% (p=0.000 n=98+97)
NonFMA-8 0.61ns ± 5% 0.60ns ± 4% -0.74% (p=0.001 n=100+97)
Interestingly, these are both considerably faster than
the measurements I took a couple of months ago (1.4ns/2ns).
It appears that CL 219131 (clearing VZEROUPPER in asyncPreempt) helped a lot.
And FMA is now once again slower than NonFMA, although this change
helps it regain some ground.
Ruixin(Peter) Bao [Tue, 7 Apr 2020 13:20:08 +0000 (09:20 -0400)]
cmd/internal: add MVCIN instruction to s390x assembler
On s390x, we already have MVCIN opcode in asmz.go,
but we did not use it. This CL uses that opcode and adds MVCIN
instruction.
MVCIN instruction can be used to move data from one storage location
to another while reversing the order of bytes within the field. This
could be useful when transforming data from little-endian to big-endian.
Change-Id: Ifa1a911c0d3442f4a62f91f74ed25b196d01636b
Reviewed-on: https://go-review.googlesource.com/c/go/+/227478 Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The darwin/arm port is removed in Go 1.15. Setting GOOS=darwin
GOARCH=arm will fail, therefore "go test cmd/link" on macOS will
fail (in non -short mode). Remove this test point.
Updates #37611.
Change-Id: Ia9531c4b4a6692a0c49153517af9fdddd1f3e0bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/227341 Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/compile: lay out exit post-dominated blocks at the end
Complete a long-standing TODO in the code.
Exit blocks are cold code, so we lay them out at the end of the function.
Blocks that are post-dominated by exit blocks are also ipso facto exit blocks.
Treat them as such.
Implement using a simple loop, because there are generally very few exit blocks.
In addition to improved instruction cache, this empirically yields
better register allocation.
This also appears to improve compiler performance
(-0.15% geomean time/op, -1.20% geomean user time/op),
but that could just be alignment effects.
Compiler benchmarking hasn't been super reliably recently,
and there's no particular reason to think this should
speed up the compiler that much.
Multiple instances of testDWARF run in parallel, with a shared
backing store of the env input slice. Do modification of the
environment locally, instead of on the shared slice.
'go get' will now check absolute paths without wildcards the same way
it checks relative paths. modload.DirImportPath may be used for both
without converting path separators.
Fixes #38038
Change-Id: I453299898ece58f3b5002a5e80021d6bfe815fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/226857
Run-TryBot: Jay Conrod <jayconrod@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com> Reviewed-by: Michael Matloob <matloob@golang.org>
Lynn Boger [Mon, 30 Mar 2020 19:23:19 +0000 (15:23 -0400)]
cmd/compile: improve lowered moves and zeros for ppc64le
This change includes the following:
- Generate LXV/STXV sequences instead of LXVD2X/STXVD2X on power9.
These instructions do not require an index register, which
allows more loads and stores within a loop without initializing
multiple index registers. The LoweredQuadXXX generate LXV/STXV.
- Create LoweredMoveXXXShort and LoweredZeroXXXShort for short
moves that don't generate loops, and therefore don't clobber the
address registers or flags.
- Use registers other than R3 and R4 to avoid conflicting with
registers that have already been allocated to avoid unnecessary
register moves.
- Eliminate the use of R14 as scratch register and use R31
instead.
- Add PCALIGN when the LoweredMoveXXX or LoweredZeroXXX generates a
loop with more than 3 iterations.
This performance opportunity was noticed in github.com/golang/snappy
benchmarks. Results on power9:
If the final pass(es) are identical during ssa.html generation,
they are persisted in-memory as "pendingPhases" but never get
written as a column in the html. This change flushes those
in-memory phases.
Change-Id: I05c73d848b8f40dc864a18c733ca3f47b1eab54d
Reviewed-on: https://go-review.googlesource.com/c/go/+/227004 Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/compile: refactor around HTMLWriter removing logger in favor of Func
Replace HTMLWriter's Logger field with a *Func. Implement Fatalf method
for HTMLWriter which gets the Frontend() from the Func and calls down
into it's Fatalf method, passing the msg and args along. Replace
remaining calls to the old Logger with calls to logging methods on
the Func.
cmd/compile: add intrinsic HasCPUFeature for checking cpu features
Before using some CPU instructions, we must check for their presence.
We use global variables in the runtime package to record features.
Prior to this CL, we issued a regular memory load for these features.
The downside to this is that, because it is a regular memory load,
it cannot be hoisted out of loops or otherwise reordered with other loads.
This CL introduces a new intrinsic just for checking cpu features.
It still ends up resulting in a memory load, but that memory load can
now be floated to the entry block and rematerialized as needed.
One downside is that the regular load could be combined with the comparison
into a CMPBconstload+NE. This new intrinsic cannot; it generates MOVB+TESTB+NE.
(It is possible that MOVBQZX+TESTQ+NE would be better.)
This CL does only amd64. It is easy to extend to other architectures.
For the benchmark in #36196, on my machine, this offers a mild speedup.
name old time/op new time/op delta
FMA-8 1.39ns ± 6% 1.29ns ± 9% -7.19% (p=0.000 n=97+96)
NonFMA-8 2.03ns ±11% 2.04ns ±12% ~ (p=0.618 n=99+98)
Updates #15808
Updates #36196
Change-Id: I75e2fcfcf5a6df1bdb80657a7143bed69fca6deb
Reviewed-on: https://go-review.googlesource.com/c/go/+/212360
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>