]> Cypherpunks.ru repositories - gostls13.git/log
gostls13.git
6 months agopath/filepath: fix various issues in parsing Windows paths
Damien Neil [Fri, 1 Sep 2023 18:17:19 +0000 (11:17 -0700)]
path/filepath: fix various issues in parsing Windows paths

On Windows, A root local device path is a path which begins with
\\?\ or \??\.  A root local device path accesses the DosDevices
object directory, and permits access to any file or device on the
system. For example \??\C:\foo is equivalent to common C:\foo.

The Clean, IsAbs, IsLocal, and VolumeName functions did not
recognize root local device paths beginning with \??\.

Clean could convert a rooted path such as \a\..\??\b into
the root local device path \??\b. It will now convert this
path into .\??\b.

IsAbs now correctly reports paths beginning with \??\
as absolute.

IsLocal now correctly reports paths beginning with \??\
as non-local.

VolumeName now reports the \??\ prefix as a volume name.

Join(`\`, `??`, `b`) could convert a seemingly innocent
sequence of path elements into the root local device path
\??\b. It will now convert this to \.\??\b.

In addition, the IsLocal function did not correctly
detect reserved names in some cases:

  - reserved names followed by spaces, such as "COM1 ".
  - "COM" or "LPT" followed by a superscript 1, 2, or 3.

IsLocal now correctly reports these names as non-local.

Fixes #63713
Fixes CVE-2023-45283
Fixes CVE-2023-45284

Change-Id: I446674a58977adfa54de7267d716ac23ab496c54
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/2040691
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Tatiana Bradley <tatianabradley@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/540277
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Heschi Kreinick <heschi@google.com>

6 months agocmd/internal/obj/riscv: support subtraction with a constant
Joel Sing [Fri, 25 Aug 2023 18:19:40 +0000 (04:19 +1000)]
cmd/internal/obj/riscv: support subtraction with a constant

Allow SUB and SUBW to be specified with a constant, which are mapped
to ADDI and ADDIW with negated values.

Change-Id: I7dc55692febc81ea87393b0a3a7d23a43c30313b
Reviewed-on: https://go-review.googlesource.com/c/go/+/538915
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Wang Yaduo <wangyaduo@linux.alibaba.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
6 months agosyscall: provide and use ioctlPtr for all BSD platforms
Joel Sing [Sat, 4 Nov 2023 15:32:08 +0000 (02:32 +1100)]
syscall: provide and use ioctlPtr for all BSD platforms

Provide ioctlPtr for all BSD platforms, then use this for BPF.
This reduces darwin specific code, as well as avoiding the use of
an indirect system call on OpenBSD.

Updates #63900

Change-Id: I81f3e74a3149150abe972f106903310e3cf26929
Reviewed-on: https://go-review.googlesource.com/c/go/+/540019
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Josh Rickmar <jrick@zettaport.com>
6 months agosyscall: provide and use fcntlPtr for all BSD platforms
Joel Sing [Sat, 4 Nov 2023 15:19:46 +0000 (02:19 +1100)]
syscall: provide and use fcntlPtr for all BSD platforms

Provide fcntlPtr for all BSD platforms, then use this for FcntlFlock.
This reduces darwin and openbsd specific code, as well as avoiding
the use of an indirect system call on OpenBSD.

Updates #63900

Change-Id: I5c701f0d8413fab5477b9e21381395621d1fb6d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/540018
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Josh Rickmar <jrick@zettaport.com>
6 months agoboring: update documentation to include arm64
Christian Kruse [Thu, 2 Nov 2023 22:32:08 +0000 (15:32 -0700)]
boring: update documentation to include arm64

Support for boring has been extended to include linux/arm64. This change
updates the docs to reflect that.

Fixes #63920

Change-Id: If8d6eca713e8245dcc222c3e38d140874d48725d
Reviewed-on: https://go-review.googlesource.com/c/go/+/539298
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agocmd/internal/obj/riscv: fix the offset of JALR transformed from JAL
Wang Yaduo [Fri, 27 Oct 2023 06:48:25 +0000 (14:48 +0800)]
cmd/internal/obj/riscv: fix the offset of JALR transformed from JAL

Currently, the offset of JALR is zero all the time, which is transformed
from JAL with over ±1MB offset. This causes the segment fault for the
wrong address.

Change-Id: I4dcb3eb13bd1ea71e9eb27f07c03ffec376608ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/538135
Run-TryBot: M Zhuo <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agogo/version: add new package
Russ Cox [Tue, 31 Oct 2023 23:51:18 +0000 (19:51 -0400)]
go/version: add new package

go/version provides basic comparison of Go versions,
for use when deciding whether certain language features
are allowed, and so on.

See the proposal issue #62039 for more details.

Fixes #62039

Change-Id: Ibdfd4fe15afe406c46da568cb31feb42ec30b530
Reviewed-on: https://go-review.googlesource.com/c/go/+/538895
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agonet/http/cgi: the PATH_INFO should be empty or start with a slash
aimuz [Fri, 3 Nov 2023 23:42:21 +0000 (23:42 +0000)]
net/http/cgi: the PATH_INFO should be empty or start with a slash

fixed PATH_INFO not starting with a slash as described in RFC 3875
for PATH_INFO.

Fixes #63925

Change-Id: I1ead98dff190c53eb7a50546569ef6ded3199a0a
GitHub-Last-Rev: 1c532e330b0d74ee42afc412611a005bc565bb26
GitHub-Pull-Request: golang/go#63926
Reviewed-on: https://go-review.googlesource.com/c/go/+/539615
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agonet/http: remove Content-Encoding header in roundtrip_js
carl.tao [Thu, 21 Sep 2023 15:38:33 +0000 (23:38 +0800)]
net/http: remove Content-Encoding header in roundtrip_js

The fetch api will decode the gzip, but Content-Encoding not be deleted.
To ensure that the behavior of roundtrip_js is consistent with native. delete the Content-Encoding header when the response body is decompressed by js fetch api.

Fixes #63139

Change-Id: Ie35b3aa050786e2ef865f9ffa992e30ab060506e
Reviewed-on: https://go-review.googlesource.com/c/go/+/530155
Commit-Queue: Damien Neil <dneil@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Damien Neil <dneil@google.com>

6 months agonet/netip: allow only valid prefix digits in ParsePrefix
Mauri de Souza Meneguzzo [Wed, 1 Nov 2023 23:19:39 +0000 (23:19 +0000)]
net/netip: allow only valid prefix digits in ParsePrefix

The prefix bits for a call to ParsePrefix are passed raw to
strconv.Atoi, this means that it can accept +- signs as well as leading
zeroes, which are not allowed prefix values following RFC 4632 Section
3.1 and RFC 4291 Section 2.3.

Validate non-digit characters as well as leading zeroes and return an
error accordingly.

Fixes #63850

Change-Id: I412a7e1cecc6ee9ea1582d4b04cb40d79ee714f1
GitHub-Last-Rev: 462d97fc5f412e18376356dbc10b63711c084144
GitHub-Pull-Request: golang/go#63859
Reviewed-on: https://go-review.googlesource.com/c/go/+/538860
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agonet/http: set/override Content-Length for encoded range requests
Mitar [Mon, 6 Nov 2023 11:00:51 +0000 (11:00 +0000)]
net/http: set/override Content-Length for encoded range requests

Currently, http.ServeContent returns invalid Content-Length header if:

* Request is a range request.
* Content is encoded (e.g., gzip compressed).
* Content-Length of the encoded content has been set before calling
  http.ServeContent, as suggested in https://github.com/golang/go/issues/19420.

Example:

w.Header().Set("Content-Type", "application/json")
w.Header().Set("Content-Length", strconv.Itoa(len(compressedJsonBody)))
w.Header().Set("Content-Encoding", "gzip")
w.Header().Set("Etag", etag)
http.ServeContent(
w, req, "", time.Time{},
bytes.NewReader(compressedJsonBody),
)

The issue is that http.ServeContent currently sees Content-Length as
something optional when Content-Encoding is set, but that is a problem
with range request which can send a payload of different size. So this
reverts https://go.dev/cl/4538111 and makes Content-Length be set
always to the number of bytes which will actually be send (both for
range and non-range requests).

Without this fix, this is an example response:

HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Encoding: gzip
Content-Length: 351
Content-Range: bytes 100-350/351
Content-Type: application/json; charset=UTF-8
Etag: "amCTP_vgT5PQt5OsAEI7NFJ6Hx1UfEpR5nIaYEInfOA"
Date: Sat, 29 Jan 2022 14:42:15 GMT

As you see, Content-Length is invalid and should be 251.

Change-Id: I4d2ea3a8489a115f92ef1f7e98250d555b47a94e
GitHub-Last-Rev: 3aff9126f5d62725c7d539df2d0eb2b860a84ca6
GitHub-Pull-Request: golang/go#50904
Reviewed-on: https://go-review.googlesource.com/c/go/+/381956
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: t hepudds <thepudds1460@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

6 months agonet/http/httptest: remove unnecessary creation of http.Transport
Zeke Lu [Wed, 1 Nov 2023 13:47:23 +0000 (13:47 +0000)]
net/http/httptest: remove unnecessary creation of http.Transport

In (*Server).StartTLS, it's unnecessary to create an http.Client
with a Transport, because a new one will be created with the
TLSClientConfig later.

Change-Id: I086e28717e9739787529006c3f0296c8224cd790
GitHub-Last-Rev: 33724596bd901a05a91654f8c2df233aa6563ea6
GitHub-Pull-Request: golang/go#60124
Reviewed-on: https://go-review.googlesource.com/c/go/+/494355
Run-TryBot: t hepudds <thepudds1460@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
6 months agocmd/go/internal/vcs: error out if the requested repo does not support a secure protocol
Bryan C. Mills [Thu, 2 Nov 2023 19:06:35 +0000 (15:06 -0400)]
cmd/go/internal/vcs: error out if the requested repo does not support a secure protocol

Fixes #63845.

Change-Id: If86d6b13d3b55877b35c087112bd76388c9404b8
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/539321
Reviewed-by: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>

6 months agocmd/link/internal/loadpe: allocate comdat definitions map lazily
Than McIntosh [Mon, 6 Nov 2023 19:18:43 +0000 (14:18 -0500)]
cmd/link/internal/loadpe: allocate comdat definitions map lazily

Switch the "comdatDefinitions" map to lazy allocation; we only need it
for loading PE objects, no point doing an allocation during package
init if we don't need it.

Change-Id: Ie33f2c56e964f35ac2e137840ac021cfaaa897c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/540255
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
6 months agoruntime/internal/atomic: add 386/amd64 And/Or operators
Mauri de Souza Meneguzzo [Wed, 1 Nov 2023 02:36:36 +0000 (02:36 +0000)]
runtime/internal/atomic: add 386/amd64 And/Or operators

This CL adds the atomic primitives for the And/Or operators on x86-64.
It also includes missing benchmarks for the ops.

For #61395

Change-Id: I23ef5192866d21fc3a479d0159edeafc3aeb5c47
GitHub-Last-Rev: df800be1925a9f3929456844b4e6d1524e627990
GitHub-Pull-Request: golang/go#62621
Reviewed-on: https://go-review.googlesource.com/c/go/+/528315
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>

6 months agounicode: add available godoc link
cui fliter [Sat, 4 Nov 2023 08:42:48 +0000 (16:42 +0800)]
unicode: add available godoc link

Change-Id: I2273274249f05b0492950c27dc5a654422cefc79
Reviewed-on: https://go-review.googlesource.com/c/go/+/539856
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Run-TryBot: shuang cui <imcusg@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
6 months agocmd/compile: adding rule to eliminate ANDCCconst
Jayanth Krishnamurthy [Wed, 1 Nov 2023 19:43:42 +0000 (14:43 -0500)]
cmd/compile: adding rule to eliminate ANDCCconst

For example, the Slicemask rule in PPC64 generates a sequence wherein there is andi operation, after an  sradi, which can be replaced by srdi. This new rule eliminates ANDCCconst.

Change-Id: I27aaadf76b9c749a60bcdc5e87b1ebb8167d2fd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/539055
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agoplugin: add available godoc link
cui fliter [Fri, 3 Nov 2023 11:19:58 +0000 (19:19 +0800)]
plugin: add available godoc link

Change-Id: I371b52215d3f9efdcab1439e7215f340dbf1ec08
Reviewed-on: https://go-review.googlesource.com/c/go/+/539598
Run-TryBot: shuang cui <imcusg@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agoruntime: add crash stack support for mips64x
Mauri de Souza Meneguzzo [Thu, 2 Nov 2023 22:15:03 +0000 (22:15 +0000)]
runtime: add crash stack support for mips64x

Change-Id: I240ea7dd6430f4c89cfdadbfa790e4a70a4fd79d
GitHub-Last-Rev: 585742b5eebad7c03ec313b590e2171491b78b37
GitHub-Pull-Request: golang/go#63905
Reviewed-on: https://go-review.googlesource.com/c/go/+/539295
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>

6 months agocmd/internal/asm/ppc64: avoid generating exser nops
Paul E. Murphy [Thu, 26 Oct 2023 14:23:12 +0000 (09:23 -0500)]
cmd/internal/asm/ppc64: avoid generating exser nops

"OR $0, R31, R31" is the execution serializing nop called "exser"
on ISA 3.1 processors such as Power10.

In general, the "OR $0, Rx, Rx" where Rx != 0 form should be avoided
unless used explicitly for the uarch side-effects.

Change-Id: Id76e3a703c902676ba4a3ffb64dd90dad9a320bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/537855
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
6 months agosyscall: fix syntax error in mkall.sh
Tobias Klauser [Wed, 1 Nov 2023 10:46:59 +0000 (11:46 +0100)]
syscall: fix syntax error in mkall.sh

Fix the following error introduced by CL 518627:

    ./mkall.sh: line 370: syntax error near unexpected token `)'
    ./mkall.sh: line 370: `openbsd_riscv64)'

Change-Id: I044563759bf07c94840f2024734d32a0ad663aab
Reviewed-on: https://go-review.googlesource.com/c/go/+/538935
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Heschi Kreinick <heschi@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agotesting: correct comments on runCleanup
Evan Jones [Tue, 31 Oct 2023 13:04:15 +0000 (09:04 -0400)]
testing: correct comments on runCleanup

The comment on runCleanup states "If catchPanic is true ...", but
there is no catchPanic argument or variable. This was introduced
in CL 214822, which introduced the panicHandling type. The code was
updated during code review, but the comment was missed.

Change-Id: Id14c5397e7a026bfdf98ea10ecb1e4c61ce2f924
Reviewed-on: https://go-review.googlesource.com/c/go/+/538695
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
6 months agocmd/go: handle '@' in local path when running 'go mod edit -replace'
Quan Tong [Wed, 1 Nov 2023 11:08:23 +0000 (18:08 +0700)]
cmd/go: handle '@' in local path when running 'go mod edit -replace'

The existing implementation considers everything after '@' as a version.

Fixes #61500

Change-Id: I72c32529c2726c2b59c089f5ffd6a2e361ef2c65
Reviewed-on: https://go-review.googlesource.com/c/go/+/538916
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agoruntime: fix badmorestackg0 never called on wasm
Mauri de Souza Meneguzzo [Sun, 5 Nov 2023 16:48:15 +0000 (16:48 +0000)]
runtime: fix badmorestackg0 never called on wasm

Previously, badmorestackg0 was never called since it was behind a g ==
R1 check, R1 holding g.m. This is clearly wrong, since we want to check
if g == g0. Fixed by using R2 that holds the value of g0.

Fixes #63953

Change-Id: I1e2a1c3be7ad9e7ae8dbf706ef6783e664a44764
GitHub-Last-Rev: b3e92cf28603ef6f6fafc9f9b724b84253a4355c
GitHub-Pull-Request: golang/go#63954
Reviewed-on: https://go-review.googlesource.com/c/go/+/539840
Reviewed-by: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
6 months agocmd/go: allow 'go mod download' to switch toolchains if called with explicit arguments
Bryan C. Mills [Wed, 30 Aug 2023 14:06:00 +0000 (10:06 -0400)]
cmd/go: allow 'go mod download' to switch toolchains if called with explicit arguments

Fixes #62054.

Change-Id: I4ea24070f7d9aa4964c2f215836602068058f718
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/537480
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
6 months agoruntime: donate racectx to g0 in ReadMetricsSlow
Michael Anthony Knyszek [Fri, 3 Nov 2023 17:16:51 +0000 (17:16 +0000)]
runtime: donate racectx to g0 in ReadMetricsSlow

ReadMetricsSlow was updated to call the core of readMetrics on the
systemstack to prevent issues with stat skew if the stack gets moved
between readmemstats_m and readMetrics. However, readMetrics calls into
the map implementation, which has race instrumentation. The system stack
typically has no racectx set, resulting in crashes.

Donate racectx to g0 like the tracer does, so that these accesses don't
crash.

For #60607.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-race
Change-Id: Ic0251af2d9b60361f071fe97084508223109480c
Reviewed-on: https://go-review.googlesource.com/c/go/+/539695
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>

6 months agoruntime: eliminate possible stack movements in ReadMetricsSlow
Michael Anthony Knyszek [Thu, 2 Nov 2023 05:51:20 +0000 (05:51 +0000)]
runtime: eliminate possible stack movements in ReadMetricsSlow

Currently it's possible (and even probable, with mayMoreStackMove mode)
for a stack allocation to occur between readmemstats_m and readMetrics
in ReadMetricsSlow. This can cause tests to fail by producing metrics
that are inconsistent between the two sources.

Fix this by breaking out the critical section of readMetrics and calling
that from ReadMetricsSlow on the systemstack. Our main constraint in
calling readMetrics on the system stack is the fact that we can't
acquire the metrics semaphore from the system stack. But if we break out
the critical section, then we can acquire that semaphore before we go on
the system stack.

While we're here, add another readMetrics call before readmemstats_m.
Since we're being paranoid about ways that metrics could get skewed
between the two calls, let's eliminate all uncertainty. It's possible
for readMetrics to allocate new memory, for example for histograms, and
fail while it's reading metrics. I believe we're just getting lucky
today with the order in which the metrics are produced. Another call to
readMetrics will preallocate this data in the samples slice. One nice
thing about this second read is that now we effectively have a way to
check if readMetrics really will allocate if called a second time on the
same samples slice.

Fixes #60607.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Change-Id: If6ce666530903239ef9f02dbbc3f1cb6be71e425
Reviewed-on: https://go-review.googlesource.com/c/go/+/539117
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agocmd/go/internal/modfetch: show real URL in response body read errors
Dmitri Shuralyov [Sun, 1 Oct 2023 21:17:44 +0000 (17:17 -0400)]
cmd/go/internal/modfetch: show real URL in response body read errors

CL 233437 added a redactedURL field to proxyRepo, a struct that already
had a field named 'url'. Neither fields were documented, so the similar
names suggest the most natural interpretation that proxyRepo.redactedURL
is equivalent to proxyRepo.url.Redacted() rather than something else.
That's possibly why it was joined with the module version in CL 406675.

It turns out the two URLs differ in more than just redaction: one is the
base proxy URL with (escaped) module path joined, the other is just the
base proxy URL, in redacted form.

Document and rename the fields to make the distinction more clear, and
include all 3 of base module proxy URL + module path + module version
in the reported URL, rather than just the first and third bits as seen
in the errors at https://go.dev/issue/51323#issuecomment-1735812250.

For #51323.
Updates #38680.
Updates #52727.

Change-Id: Ib4b134b548adeec826ee88fe51a2cf580fde0516
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/532035
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
6 months agocrypto/x509: fix certificate policy marshaling
Roland Shoemaker [Thu, 2 Nov 2023 17:04:21 +0000 (10:04 -0700)]
crypto/x509: fix certificate policy marshaling

CL 520535 added the new OID type, and the Certificate field Policies to
replace PolicyIdentifiers. During review I missed three problems: (1)
the marshaling of Certificate didn't take into account the case where
both fields were populated with the same OIDs (which would be the case
if you parsed a certificate and used it as a template), (2)
buildCertExtensions only generated the certificate policies extension if
PolicyIdentifiers was populated, and (3) how we would marshal an empty
OID (i.e. OID{}).

This change makes marshaling a certificate with an empty OID an error,
and only adds a single copy of any OID that appears in both Policies and
PolicyIdentifiers to the certificate policies extension. This should
make the round trip behavior for certificates reasonable.

Additionally this change documents that CreateCertificate uses the
Policies field from the template, and fixes buildCertExtensions to
populate the certificate policies extension if _either_
PolicyIdentifiers or Policies is populated, not just PolicyIdentifiers.

Fixes #63909

Change-Id: I0fcbd3ceaab7a376e7e991ff8b37e2145ffb4a61
Reviewed-on: https://go-review.googlesource.com/c/go/+/539297
Reviewed-by: Mateusz Poliwczak <mpoliwczak34@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agocmd/link/internal/ld: use strings.TrimPrefix in expandFile
Jes Cok [Wed, 1 Nov 2023 15:34:27 +0000 (15:34 +0000)]
cmd/link/internal/ld: use strings.TrimPrefix in expandFile

Change-Id: Iea00d1951fa222a6e4e54320d204958dbdeabfe4
GitHub-Last-Rev: f9a1e4415cdfb8f61e57142e5a2e182465fa5cda
GitHub-Pull-Request: golang/go#63874
Reviewed-on: https://go-review.googlesource.com/c/go/+/538863
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agoruntime: remove getcallerpc on riscv64
Joel Sing [Thu, 2 Nov 2023 10:46:38 +0000 (21:46 +1100)]
runtime: remove getcallerpc on riscv64

This was converted to a compiler intrinsic and no longer needs to exist
in assembly.

Change-Id: I7495c435d4642e0e71d8f7677d70af3a3ca2a6ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/539195
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
6 months agomisc/wasm: silence Wasmtime 14 CLI warning
Dmitri Shuralyov [Thu, 2 Nov 2023 03:08:53 +0000 (23:08 -0400)]
misc/wasm: silence Wasmtime 14 CLI warning

The latest version of Wasmtime, 14.0.4 as of writing this, offers a new
CLI while also supporting the old CLI. Since this is known and tracked
in issue #63718, silence the warning that otherwise causes many tests
to fail.

Since Wasmtime 13 and older don't pay attention to WASMTIME_NEW_CLI,
this change increases compatibility of the script, letting it work
with Wasmtime 9.0.1 as currently tested by the old cmd/coordinator, and
with Wasmtime 14.0.4 as currently tested in the new LUCI infrastructure.

The rest of the transition is left as future work.

For #63718.
For #61116.

Change-Id: I77d4f74cc1d34a657e48dcaaceb6fbda7d1e9428
Cq-Include-Trybots: luci.golang.try:gotip-wasip1-wasm_wasmtime
Reviewed-on: https://go-review.googlesource.com/c/go/+/538699
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
6 months agosyscall: copy rlimit.go's build constraint to rlimit_test.go
Dmitri Shuralyov [Thu, 2 Nov 2023 18:36:30 +0000 (14:36 -0400)]
syscall: copy rlimit.go's build constraint to rlimit_test.go

Tests in rlimit_test.go exist to test the behavior of automatically
bumping RLIMIT_NOFILE on Unix implemented in rlimit.go (issue #46279),
with darwin-specific behavior split out into rlimit_darwin.go and
the rest left empty in rlimit_stub.go.

Since the behavior happens only on Unix, it doesn't make sense to test
it on other platforms. Copy rlimit.go's 'unix' build constraint to
rlimit_test.go to accomplish that.

Also simplify the build constraint in rlimit_stub.go while here,
so that its maintenance is easier and it starts to match all
non-darwin Unix GOOS values (previously, 'hurd' happened to be missed).

In particular, this fixes a problem where TestOpenFileLimit was
failing in some environments when testing the wasip1/wasm port.
The RLIMIT_NOFILE bumping behavior isn't implemented there, so
the test was testing the environment and not the Go project.

Updates #46279.
For #61116.

Change-Id: Ic993f9cfc021d4cda4fe3d7fed8e2e180f78a2ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/539435
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>

6 months agocmd/compile: fix unstable selection of hottest edge
Michael Pratt [Thu, 2 Nov 2023 18:16:16 +0000 (14:16 -0400)]
cmd/compile: fix unstable selection of hottest edge

When selecting the hottest edge to use for PGO-based devirtualization,
edges are order by:

1. Edge weight
2. If weights are equal, prefer the edge with IR available in the
   package.
3. Otherwise, simply sort lexicographically.

The existing logic for (2) is incomplete.

If the hottest edge so far is missing IR, but the new edge has IR, then
it works as expected and selects the new edge.

But if the hottest edge so far has IR and the new edge is missing IR, we
want to always keep the hottest edge so far, but this logic will fall
through and use lexicographical ordering instead.

Adjust the check to always make an explicit choice when IR availability
differs.

Change-Id: Ia7fcc286aa9a62ac209fd978cfce60463505f4cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/539475
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agonet/http: remove arbitrary timeouts in tests of Server.ErrorLog
Bryan C. Mills [Thu, 2 Nov 2023 14:29:08 +0000 (10:29 -0400)]
net/http: remove arbitrary timeouts in tests of Server.ErrorLog

This also allows us to remove the chanWriter helper from the test,
using a simpler strings.Builder instead, relying on
clientServerTest.close for synchronization.
(I don't think this runs afoul of #38370, because the handler
functions themselves in these tests should never be executed,
let alone result in an asynchronous write to the error log.)

Fixes #57599.

Change-Id: I45c6cefca0bb218f6f9a9659de6bde454547f704
Reviewed-on: https://go-review.googlesource.com/c/go/+/539436
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agodebug/elf,cmd/link: add additional relocations for loong64
Guoqi Chen [Tue, 24 Oct 2023 08:24:39 +0000 (16:24 +0800)]
debug/elf,cmd/link: add additional relocations for loong64

The Linker Relaxation feature on Loong64 is already supported in binutils 2.41.
The intermediate code generated after enabling this feature introduces three
reloc types R_LARCH_B26, R_LARCH_ADD32 and R_LARCH_SUB32.

The other relocation types are not currently used when running all.bash, but
in order to avoid the host tool chain making the decision to use it we don't
have to catch it every time.

The LoongArch ABI at here:
https://github.com/loongson/la-abi-specs/blob/release/la-abi.adoc

Corresponding binutils implementation:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=be1ebb6710a8f707bd4b0eecbd00f4f4964050e5
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1b6fccd28db14fffe75ff6755307047ef932c81e

Fixes #63725

Change-Id: I891115cfdbcf785ab494c881d5f9d1bf8748da8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/537615
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agoerrors: add available godoc link
cui fliter [Fri, 13 Oct 2023 06:59:10 +0000 (14:59 +0800)]
errors: add available godoc link

Change-Id: Ie86493ebad3c3d7ea914754451985d7ee3e8e270
Reviewed-on: https://go-review.googlesource.com/c/go/+/535080
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: shuang cui <imcusg@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
6 months agoruntime: remove unused getOSRev on openbsd
Tobias Klauser [Thu, 2 Nov 2023 09:29:29 +0000 (10:29 +0100)]
runtime: remove unused getOSRev on openbsd

It's unused since CL 538458.

Change-Id: Ic8d30b0fb54f3f1d723626c5db56fbf4cf181dea
Reviewed-on: https://go-review.googlesource.com/c/go/+/539155
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agocmd/go/internal/modfetch: avoid path.Join in URL errors, part 2
Dmitri Shuralyov [Sun, 1 Oct 2023 20:03:01 +0000 (16:03 -0400)]
cmd/go/internal/modfetch: avoid path.Join in URL errors, part 2

CL 406675 added more detail to bare errors from net/http in two places.
CL 461682 improved one of the two places to stop folding "//" into "/".
This CL applies the same change to the other place.

For #52727.

Change-Id: I3fc13f30cf0f054949ce78269c52b7fafd477e70
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/532015
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agocmd/compile/internal/syntax: use strings.TrimPrefix in typeOf
Jes Cok [Thu, 2 Nov 2023 12:59:24 +0000 (12:59 +0000)]
cmd/compile/internal/syntax: use strings.TrimPrefix in typeOf

Change-Id: I38c2a1afa1684b069522cd1b74529ae10f019ce8
GitHub-Last-Rev: 8c726f1f01f2827081a0afc161360497c50a9c7a
GitHub-Pull-Request: golang/go#63894
Reviewed-on: https://go-review.googlesource.com/c/go/+/539057
Run-TryBot: qiulaidongfeng <2645477756@qq.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agoos/signal: skip nohup tests on darwin builders
Michael Pratt [Wed, 1 Nov 2023 15:55:10 +0000 (11:55 -0400)]
os/signal: skip nohup tests on darwin builders

The new LUCI builders have a temporary limitation that breaks nohup.
Skip nohup tests there.

For #63875.

Cq-Include-Trybots: luci.golang.try:gotip-darwin-arm64_13
Change-Id: Ia9ffecea7310f84a21f6138d8f8cdfc5e1392307
Reviewed-on: https://go-review.googlesource.com/c/go/+/538698
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agoruntime: move userArenaHeapBitsSetType into mbitmap.go
Michael Anthony Knyszek [Sat, 28 Oct 2023 17:42:51 +0000 (17:42 +0000)]
runtime: move userArenaHeapBitsSetType into mbitmap.go

This will make the upcoming GOEXPERIMENT easier to implement, since this
function relies on a lot of heap bitmap internals.

Change-Id: I2ab76e928e7bfd383dcdb5bfe72c9b23c2002a4e
Reviewed-on: https://go-review.googlesource.com/c/go/+/537979
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agoruntime: split out pointer/scalar metadata from heapArena
Michael Anthony Knyszek [Sat, 28 Oct 2023 16:20:04 +0000 (16:20 +0000)]
runtime: split out pointer/scalar metadata from heapArena

We're going to want to fork this data in the near future for a
GOEXPERIMENT, so break it out now.

Change-Id: Ia7ded850bb693c443fe439c6b7279dcac638512c
Reviewed-on: https://go-review.googlesource.com/c/go/+/537978
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>

6 months agoreflect: pass the right element type in verifyGCBitsSlice
Michael Anthony Knyszek [Sat, 28 Oct 2023 16:30:57 +0000 (16:30 +0000)]
reflect: pass the right element type in verifyGCBitsSlice

Currently verifyGCBitsSlice creates a new array type to represent the
slice backing store, but passes the element type as the slice type in
this construction. This is incorrect, but the tests currently don't care
about it. They will in a follow-up CL, so fix it now.

Change-Id: I6ed8a9808ae78c624be316db1566376fa0e12758
Reviewed-on: https://go-review.googlesource.com/c/go/+/537981
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agocmd/cgo: disable #cgo noescape/nocallback until Go 1.23
Russ Cox [Thu, 2 Nov 2023 12:15:44 +0000 (08:15 -0400)]
cmd/cgo: disable #cgo noescape/nocallback until Go 1.23

Go 1.21 and earlier do not understand this line, causing
"go mod vendor" of //go:build go1.22-tagged code that
uses this feature to fail.

The solution is to include the go/build change to skip over
the line in Go 1.22 (making "go mod vendor" from Go 1.22 onward
work with this change) and then wait to deploy the cgo change
until Go 1.23, at which point Go 1.21 and earlier will be unsupported.

For #56378.
Fixes #63293.

Change-Id: Ifa08b134eac5a6aa15d67dad0851f00e15e1e58b
Reviewed-on: https://go-review.googlesource.com/c/go/+/539235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agoos/signal: remove go t.Run from TestNohup
Michael Pratt [Wed, 1 Nov 2023 21:06:08 +0000 (17:06 -0400)]
os/signal: remove go t.Run from TestNohup

Since CL 226138, TestNohup has a bit of a strange construction: it wants
to run the "uncaught" subtests in parallel with each other, and the
"nohup" subtests in parallel with each other, but also needs join
between "uncaught" and "nohop" so it can Stop notifying for SIGHUP.

It achieves this by doing `go t.Run` with a WaitGroup rather than using
`t.Parallel` in the subtest (which would make `t.Run` return immediately).

However, this makes things more difficult to understand than necessary.
As noted on https://pkg.go.dev/testing#hdr-Subtests_and_Sub_benchmarks,
a second layer of subtest can be used to join parallel subtests.

Switch to this form, which makes the test simpler to follow
(particularly the cleanup that goes with "uncaught").

For #63799.

Change-Id: Ibfce0f439508a7cfca848c7ccfd136c9c453ad8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/538899
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agoruntime: add crash stack support for riscv64
Joel Sing [Tue, 31 Oct 2023 14:34:33 +0000 (01:34 +1100)]
runtime: add crash stack support for riscv64

Change-Id: Ib89a71e20f9c6b86c97814c75cb427e9bd7075e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/538735
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>

6 months agogo/parser: better error messages for incorrect type parameter list
Robert Griesemer [Tue, 31 Oct 2023 21:22:05 +0000 (14:22 -0700)]
go/parser: better error messages for incorrect type parameter list

This is a port of CL 538856 from the syntax parser to go/parser.
As part of the port, make more portions of parseParameterList
matching the equivalent paramList method (from the syntax parser).
As a result, this now also produces a better error message in cases
where the missing piece might not be a type parameter name but a
constraint (this fixes a TODO in a test).

Improve comments in the code and adjust the corresponding comments
in the syntax parser.

Change references to issues to use the format go.dev/issue/ddddd.

For #60812.

Change-Id: Ia243bd78161ed8543d3dc5deb20ca4a215c5b1e4
Reviewed-on: https://go-review.googlesource.com/c/go/+/538858
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

6 months agosyscall: call getfsstat via libc on openbsd
Joel Sing [Wed, 1 Nov 2023 12:43:15 +0000 (23:43 +1100)]
syscall: call getfsstat via libc on openbsd

On openbsd, call getfsstat directly via libc, instead of calling it
via syscall.Syscall.

Updates #63900

Change-Id: Ib4c581160b170e6cc6017c42e959e647d97ac993
Reviewed-on: https://go-review.googlesource.com/c/go/+/538736
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Josh Rickmar <jrick@zettaport.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>

6 months agoruntime: remove map stack version handling for openbsd
Joel Sing [Mon, 30 Oct 2023 13:25:52 +0000 (00:25 +1100)]
runtime: remove map stack version handling for openbsd

OpenBSD 6.3 is more than five years old and has not been supported for
the last four years (only 7.3 and 7.4 are currently supported). As such,
remove special handling of MAP_STACK for 6.3 and earlier.

Change-Id: I1086c910bbcade7fb3938bb1226813212794b587
Reviewed-on: https://go-review.googlesource.com/c/go/+/538458
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Aaron Bieber <aaron@bolddaemon.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>

6 months agotest: run range-over-integer tests without need for -goexperiment
Robert Griesemer [Wed, 1 Nov 2023 23:09:45 +0000 (16:09 -0700)]
test: run range-over-integer tests without need for -goexperiment

Move the range-over-function tests into range4.go.

Change-Id: Idccf30a0c7d7e8d2a17fb1c5561cf21e00506135
Reviewed-on: https://go-review.googlesource.com/c/go/+/539095
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>

6 months agogo/types, types2: enable range over int w/o need for goexperiment
Robert Griesemer [Tue, 31 Oct 2023 23:13:21 +0000 (16:13 -0700)]
go/types, types2: enable range over int w/o need for goexperiment

For #61405.

Change-Id: I047ec31bc36b1707799ffef25506070613477d1f
Reviewed-on: https://go-review.googlesource.com/c/go/+/538718
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
6 months agospec: document range over integer expression
Robert Griesemer [Tue, 31 Oct 2023 22:38:35 +0000 (15:38 -0700)]
spec: document range over integer expression

This CL is partly based on CL 510535.

For #61405.

Change-Id: Ic94f6726f9eb34313f11bec7b651921d7e5c18d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/538859
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
6 months agoos: fix PathError.Op for dirFS.Open
Joe Tsai [Wed, 1 Nov 2023 20:12:43 +0000 (13:12 -0700)]
os: fix PathError.Op for dirFS.Open

This appears to be a copy-paste error from CL 455362.

The operation name used to be "open"
but seems to have been accidentally changed to "stat".
This CL reverts back to "open".

Change-Id: I3fc5168095e2d9eee3efa3cc091b10bcf4e3ecde
Reviewed-on: https://go-review.googlesource.com/c/go/+/539056
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agointernal/bytealg: optimize indexbyte in amd64
qiulaidongfeng [Wed, 1 Nov 2023 11:36:55 +0000 (11:36 +0000)]
internal/bytealg: optimize indexbyte in amd64

goos: windows
goarch: amd64
pkg: bytes
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
                         │   old.txt   │               new.txt               │
                         │   sec/op    │   sec/op     vs base                │
IndexByte/10-16            2.613n ± 1%   2.558n ± 1%   -2.09% (p=0.014 n=10)
IndexByte/32-16            3.034n ± 1%   3.010n ± 2%        ~ (p=0.305 n=10)
IndexByte/4K-16            57.20n ± 2%   39.58n ± 2%  -30.81% (p=0.000 n=10)
IndexByte/4M-16            34.48µ ± 1%   33.83µ ± 2%   -1.87% (p=0.023 n=10)
IndexByte/64M-16           1.493m ± 2%   1.450m ± 2%   -2.89% (p=0.000 n=10)
IndexBytePortable/10-16    3.172n ± 4%   3.163n ± 2%        ~ (p=0.684 n=10)
IndexBytePortable/32-16    8.465n ± 2%   8.375n ± 3%        ~ (p=0.631 n=10)
IndexBytePortable/4K-16    852.0n ± 1%   846.6n ± 3%        ~ (p=0.971 n=10)
IndexBytePortable/4M-16    868.2µ ± 2%   856.6µ ± 2%        ~ (p=0.393 n=10)
IndexBytePortable/64M-16   13.81m ± 2%   13.88m ± 3%        ~ (p=0.684 n=10)
geomean                    1.204µ        1.148µ        -4.63%

                         │   old.txt    │               new.txt                │
                         │     B/s      │     B/s       vs base                │
IndexByte/10-16            3.565Gi ± 1%   3.641Gi ± 1%   +2.15% (p=0.015 n=10)
IndexByte/32-16            9.821Gi ± 1%   9.899Gi ± 2%        ~ (p=0.315 n=10)
IndexByte/4K-16            66.70Gi ± 2%   96.39Gi ± 2%  +44.52% (p=0.000 n=10)
IndexByte/4M-16            113.3Gi ± 1%   115.5Gi ± 2%   +1.91% (p=0.023 n=10)
IndexByte/64M-16           41.85Gi ± 2%   43.10Gi ± 2%   +2.98% (p=0.000 n=10)
IndexBytePortable/10-16    2.936Gi ± 4%   2.945Gi ± 2%        ~ (p=0.684 n=10)
IndexBytePortable/32-16    3.521Gi ± 2%   3.559Gi ± 3%        ~ (p=0.631 n=10)
IndexBytePortable/4K-16    4.477Gi ± 1%   4.506Gi ± 3%        ~ (p=0.971 n=10)
IndexBytePortable/4M-16    4.499Gi ± 2%   4.560Gi ± 2%        ~ (p=0.393 n=10)
IndexBytePortable/64M-16   4.525Gi ± 2%   4.504Gi ± 3%        ~ (p=0.684 n=10)
geomean                    10.04Gi        10.53Gi        +4.86%

For #63678

Change-Id: I0571c2b540a816d57bd6ed8bb1df4191c7992d92
GitHub-Last-Rev: 7e95b8bfb035b53175f5a1b7d8750113933a7e17
GitHub-Pull-Request: golang/go#63847
Reviewed-on: https://go-review.googlesource.com/c/go/+/538715
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>

6 months agocrypto/x509: optimize the performance of checkSignature
dchaofei [Thu, 19 May 2022 02:02:50 +0000 (02:02 +0000)]
crypto/x509: optimize the performance of checkSignature

The loop should be terminated immediately when `algo` has been found

Fixes #52955

Change-Id: Ib3865c4616a0c1af9b72daea45f5a1750f84562f
GitHub-Last-Rev: 721322725fb2d3a3ea410d09fd8320dfef865d8d
GitHub-Pull-Request: golang/go#52987
Reviewed-on: https://go-review.googlesource.com/c/go/+/407215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>

6 months agobytes,internal/bytealg: add func bytealg.LastIndexRabinKarp
Jes Cok [Tue, 31 Oct 2023 18:34:07 +0000 (18:34 +0000)]
bytes,internal/bytealg: add func bytealg.LastIndexRabinKarp

Also rename 'substr' to 'sep' in IndexRabinKarp for consistency.

Change-Id: Icc2ad1116aecaf002c8264daa2fa608306c9a88a
GitHub-Last-Rev: 1784b93f53d569991f86585f9011120ea26f193f
GitHub-Pull-Request: golang/go#63854
Reviewed-on: https://go-review.googlesource.com/c/go/+/538716
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
6 months agoos: report IO_REPARSE_TAG_DEDUP files as regular in Stat and Lstat
Bryan C. Mills [Thu, 26 Oct 2023 16:06:04 +0000 (12:06 -0400)]
os: report IO_REPARSE_TAG_DEDUP files as regular in Stat and Lstat

Prior to CL 460595, Lstat reported most reparse points as regular
files. However, reparse points can in general implement unusual
behaviors (consider IO_REPARSE_TAG_AF_UNIX or IO_REPARSE_TAG_LX_CHR),
and Windows allows arbitrary user-defined reparse points, so in
general we must not assume that an unrecognized reparse tag represents
a regular file; in CL 460595, we began marking them as irregular.

As it turns out, the Data Deduplication service on Windows Server runs
an Optimization job that turns regular files into reparse files with
the tag IO_REPARSE_TAG_DEDUP. Those files still behave more-or-less
like regular files, in that they have well-defined sizes and support
random-access reads and writes, so most programs can treat them as
regular files without difficulty. However, they are still reparse
files: as a result, on servers with the Data Deduplication service
enabled, files could arbitrarily change from “regular” to “irregular”
without explicit user intervention.

Since dedup files are converted in the background and otherwise behave
like regular files, this change adds a special case to report DEDUP
reparse points as regular.

Fixes #63429.

No test because to my knowledge we don't have any Windows builders
that have the deduplication service enabled, nor do we have a way to
reliably guarantee the existence of an IO_REPARSE_TAG_DEDUP file.

(In theory we could add a builder with the service enabled on a
specific volume, write a test that encodes knowledge of that volume,
and use the GO_BUILDER_NAME environment variable to run that test only
on the specially-configured builders. However, I don't currently have
the bandwidth to reconfigure the builders in this way, and given the
simplicity of the change I think it is unlikely to regress
accidentally.)

Change-Id: I649e7ef0b67e3939a980339ce7ec6a20b31b23a1
Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/537915
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
Auto-Submit: Bryan Mills <bcmills@google.com>

6 months agocmd/compile/internal/syntax: better error messages for incorrect type parameter list
Robert Griesemer [Tue, 31 Oct 2023 20:29:25 +0000 (13:29 -0700)]
cmd/compile/internal/syntax: better error messages for incorrect type parameter list

When parsing a declaration of the form

        type a [b[c]]d

where a, b, c, d stand for identifiers, b[c] is parsed as a type
constraint (because an array length must be constant and an index
expression b[c] is never constant, even if b is a constant string
and c a constant index - this is crucial for disambiguation of the
various possibilities).

As a result, the error message referred to a missing type parameter
name and not an invalid array declaration.

Recognize this special case and report both possibilities (because
we can't be sure without type information) with the new error:

       "missing type parameter name or invalid array length"

ALso, change the previous error message

        "type parameter must be named"

to

        "missing type parameter name"

which is more fitting as the error refers to an absent type parameter
(rather than a type parameter that's somehow invisibly present but
unnamed).

Fixes #60812.

Change-Id: Iaad3b3a9aeff9dfe2184779f3d799f16c7500b34
Reviewed-on: https://go-review.googlesource.com/c/go/+/538856
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
6 months agocmd/compile/internal/syntax: fix/update various comments
Robert Griesemer [Tue, 31 Oct 2023 16:50:00 +0000 (09:50 -0700)]
cmd/compile/internal/syntax: fix/update various comments

Change-Id: I30b448c8fcdbad94afcd7ff0dfc5cfebb485bdd7
Reviewed-on: https://go-review.googlesource.com/c/go/+/538855
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
6 months agoos/signal: use syscall.Wait4 directly in tests
Joel Sing [Sat, 28 Oct 2023 15:07:01 +0000 (02:07 +1100)]
os/signal: use syscall.Wait4 directly in tests

Rather than using syscall.Syscall6 with SYS_WAIT4, use syscall.Wait4
directly.

Updates #59667

Change-Id: I50fea3b7d10003dbc632aafd5e170a9fe96d6f42
Reviewed-on: https://go-review.googlesource.com/c/go/+/538459
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

6 months agosyscall: regen zsyscall for openbsd/riscv64
Joel Sing [Sat, 28 Oct 2023 16:14:32 +0000 (03:14 +1100)]
syscall: regen zsyscall for openbsd/riscv64

This removes the unused writelen function, which was cleaned up for other
platforms in CL#529035.

Change-Id: I1999dc81276763bdc73d8590c16729447c4e8538
Reviewed-on: https://go-review.googlesource.com/c/go/+/538119
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Joel Sing <joel@sing.id.au>

6 months agosyscall: regenerate zsyscall for dragonfly/freebsd/netbsd
Joel Sing [Mon, 30 Oct 2023 13:51:25 +0000 (00:51 +1100)]
syscall: regenerate zsyscall for dragonfly/freebsd/netbsd

The sysctl declaration was moved in CL 141639, however the files were
presumably not regenerated. There is no functional change, however
regenerating avoids unrelated noise in future diffs.

Change-Id: Ifb840b5853f3f1c3c88a3f94df21b6f6d3c635d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/538118
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agointernal/cpu: add comments to copied functions
apocelipes [Wed, 25 Oct 2023 10:29:01 +0000 (10:29 +0000)]
internal/cpu: add comments to copied functions

Just as same as other copied functions,
like stringsTrimSuffix in "os/executable_procfs.go"

Change-Id: I9c9fbd75b009a5ae0e869cf1fddc77c0e08d9a67
GitHub-Last-Rev: 4c18865e15ede0f53121b6845a1879cdd70d1a38
GitHub-Pull-Request: golang/go#63704
Reviewed-on: https://go-review.googlesource.com/c/go/+/537056
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
6 months agoruntime: use testenv.Command in TestG0StackOverflow
Cherry Mui [Tue, 31 Oct 2023 20:26:52 +0000 (16:26 -0400)]
runtime: use testenv.Command in TestG0StackOverflow

For debugging timeouts.

Change-Id: I08dc86ec0264196e5fd54066655e94a9d062ed80
Reviewed-on: https://go-review.googlesource.com/c/go/+/538697
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agoruntime: make select fairness test less picky
Keith Randall [Tue, 31 Oct 2023 16:54:54 +0000 (09:54 -0700)]
runtime: make select fairness test less picky

Allow up to 10 standard deviations from the mean, instead of
~5 that the current test allows.

10 standard deviations allows up to a 4500/5500 split.

Fixes #52465

Change-Id: Icb21c1d31fafbcf4723b75435ba5e98863e812c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/538815
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agocmd/compile: ensure pointer arithmetic happens after the nil check
Keith Randall [Wed, 25 Oct 2023 20:35:13 +0000 (13:35 -0700)]
cmd/compile: ensure pointer arithmetic happens after the nil check

Have nil checks return a pointer that is known non-nil. Users of
that pointer can use the result, ensuring that they are ordered
after the nil check itself.

The order dependence goes away after scheduling, when we've fixed
an order. At that point we move uses back to the original pointer
so it doesn't change regalloc any.

This prevents pointer arithmetic on nil from being spilled to the
stack and then observed by a stack scan.

Fixes #63657

Change-Id: I1a5fa4f2e6d9000d672792b4f90dfc1b7b67f6ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/537775
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
6 months agocmd/compile: handle constant pointer offsets in dead store elimination
Keith Randall [Mon, 30 Oct 2023 04:00:29 +0000 (21:00 -0700)]
cmd/compile: handle constant pointer offsets in dead store elimination

Update #63657
Update #45573

Change-Id: I163c6038c13d974dc0ca9f02144472bc05331826
Reviewed-on: https://go-review.googlesource.com/c/go/+/538595
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
6 months agoruntime: on arm32, detect whether we have sync instructions
Keith Randall [Tue, 29 Aug 2023 21:23:59 +0000 (14:23 -0700)]
runtime: on arm32, detect whether we have sync instructions

Make the choice of using these instructions dynamic (triggered by cpu
feature detection) rather than static (trigered by GOARM setting).

if GOARM>=7, we know we have them.
For GOARM=5/6, dynamically dispatch based on auxv information.

Update #17082
Update #61588

Change-Id: I8a50481d942f62cf36348998a99225d0d242f8af
Reviewed-on: https://go-review.googlesource.com/c/go/+/525637
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agocrypto/x509: add new OID type and use it in Certificate
Mateusz Poliwczak [Tue, 31 Oct 2023 17:26:43 +0000 (17:26 +0000)]
crypto/x509: add new OID type and use it in Certificate

Fixes #60665

Change-Id: I814b7d4b26b964f74443584fb2048b3e27e3b675
GitHub-Last-Rev: 693c741c76e6369e36aa2a599ee6242d632573c7
GitHub-Pull-Request: golang/go#62096
Reviewed-on: https://go-review.googlesource.com/c/go/+/520535
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Mateusz Poliwczak <mpoliwczak34@gmail.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6 months agobytes,internal/bytealg: eliminate IndexRabinKarpBytes using generics
Jes Cok [Fri, 27 Oct 2023 16:20:44 +0000 (16:20 +0000)]
bytes,internal/bytealg: eliminate IndexRabinKarpBytes using generics

This is a follow-up to CL 538175.

Change-Id: Iec2523b36a16d7e157c17858c89fcd43c2470d58
GitHub-Last-Rev: 812d36e57c71ea3bf44d2d64bde0703ef02a1b91
GitHub-Pull-Request: golang/go#63770
Reviewed-on: https://go-review.googlesource.com/c/go/+/538195
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agocmd/compile/internal/ssa: adjust default to the end in *Block.AuxIntString
Jes Cok [Sun, 29 Oct 2023 06:51:14 +0000 (06:51 +0000)]
cmd/compile/internal/ssa: adjust default to the end in *Block.AuxIntString

Change-Id: Id48cade7811e2dfbf78d3171fe202ad272534e37
GitHub-Last-Rev: ea6abb2dc216ebd4a42fc3dc25f39ea6869d2dad
GitHub-Pull-Request: golang/go#63808
Reviewed-on: https://go-review.googlesource.com/c/go/+/538377
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
6 months agohash/maphash: weaken avalanche test a bit more
Cuong Manh Le [Tue, 31 Oct 2023 11:34:42 +0000 (18:34 +0700)]
hash/maphash: weaken avalanche test a bit more

CL 495415 weaken avalanche, making allowed range from 43% to 57%. Since
then, we only see a failure with 58% on linux-386-longtest builder, so
let give the test a bit more wiggle room: 40% to 59%.

Fixes #60170

Change-Id: I9528ebc8601975b733c3d9fd464ce41429654273
Reviewed-on: https://go-review.googlesource.com/c/go/+/538655
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>

6 months agointernal/bytealg: optimize Count/CountString in arm64
cui fliter [Sat, 28 Oct 2023 13:21:31 +0000 (21:21 +0800)]
internal/bytealg: optimize Count/CountString in arm64

For #63678

goos: darwin
goarch: arm64
pkg: strings
                          │ count_old.txt │            count_new.txt            │
                          │    sec/op     │   sec/op     vs base                │
CountHard1-8                 368.7µ ± 11%   332.0µ ± 1%   -9.95% (p=0.002 n=10)
CountHard2-8                 348.8µ ±  5%   333.1µ ± 1%   -4.51% (p=0.000 n=10)
CountHard3-8                 402.7µ ± 25%   359.5µ ± 1%  -10.75% (p=0.000 n=10)
CountTorture-8              10.536µ ± 23%   9.913µ ± 0%   -5.91% (p=0.000 n=10)
CountTortureOverlapping-8    74.86µ ±  9%   67.56µ ± 1%   -9.75% (p=0.000 n=10)
CountByte/10-8               6.905n ±  3%   6.690n ± 1%   -3.11% (p=0.001 n=10)
CountByte/32-8               3.247n ± 13%   3.207n ± 2%   -1.23% (p=0.030 n=10)
CountByte/4096-8             83.72n ±  1%   82.58n ± 1%   -1.36% (p=0.007 n=10)
CountByte/4194304-8          85.17µ ±  5%   84.02µ ± 8%        ~ (p=0.075 n=10)
CountByte/67108864-8         1.497m ±  8%   1.397m ± 2%   -6.69% (p=0.000 n=10)
geomean                      9.977µ         9.426µ        -5.53%

                     │ count_old.txt │            count_new.txt            │
                     │      B/s      │     B/s       vs base               │
CountByte/10-8         1.349Gi ±  3%   1.392Gi ± 1%  +3.20% (p=0.002 n=10)
CountByte/32-8         9.180Gi ± 11%   9.294Gi ± 2%  +1.24% (p=0.029 n=10)
CountByte/4096-8       45.57Gi ±  1%   46.20Gi ± 1%  +1.38% (p=0.007 n=10)
CountByte/4194304-8    45.86Gi ±  5%   46.49Gi ± 7%       ~ (p=0.075 n=10)
CountByte/67108864-8   41.75Gi ±  8%   44.74Gi ± 2%  +7.16% (p=0.000 n=10)
geomean                16.10Gi         16.55Gi       +2.85%

Change-Id: Ifc2173ba3a926b0fa9598372d4404b8645929d45
Reviewed-on: https://go-review.googlesource.com/c/go/+/538116
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: shuang cui <imcusg@gmail.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agoruntime: allocate crash stack via stackalloc
Joel Sing [Mon, 30 Oct 2023 13:27:58 +0000 (00:27 +1100)]
runtime: allocate crash stack via stackalloc

On some platforms (notably OpenBSD), stacks must be specifically allocated
and marked as being stack memory. Allocate the crash stack using stackalloc,
which ensures these requirements are met, rather than using a global Go
variable.

Fixes #63794

Change-Id: I6513575797dd69ff0a36f3bfd4e5fc3bd95cbf50
Reviewed-on: https://go-review.googlesource.com/c/go/+/538457
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

6 months agocmd/compile/internal/syntax: set up dummy name and type if func name is missing
Robert Griesemer [Tue, 31 Oct 2023 00:00:07 +0000 (17:00 -0700)]
cmd/compile/internal/syntax: set up dummy name and type if func name is missing

We do the same elsewhere (e.g. in parser.name when a name is missing).
This ensures functions have a (dummy) name and a non-nil type.
Avoids a crash in the type-checker (verified manually).
A test was added here (rather than the type checker) because type-
checker tests are shared between types2 and go/types and error
recovery in this case is different.

Fixes #63835.

Change-Id: I1460fc88d23d80b8d8c181c774d6b0a56ca06317
Reviewed-on: https://go-review.googlesource.com/c/go/+/538059
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>

6 months agogo/types, types2: more concise error if conversion fails due to integer overflow
Robert Griesemer [Mon, 30 Oct 2023 20:51:04 +0000 (13:51 -0700)]
go/types, types2: more concise error if conversion fails due to integer overflow

This change brings the error message for this case back in line
with the pre-Go1.18 error message.

Fixes #63563.

Change-Id: I3c6587d420907b34ee8a5f295ecb231e9f008380
Reviewed-on: https://go-review.googlesource.com/c/go/+/538058
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
6 months agocmd/dist,internal/platform: enable openbsd/ppc64 port
Joel Sing [Wed, 9 Aug 2023 17:32:21 +0000 (03:32 +1000)]
cmd/dist,internal/platform: enable openbsd/ppc64 port

Updates #56001

Change-Id: I16440114ecf661e9fc17d304ab3b16bc97ef82f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/517935
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
6 months agocmd/compile/internal/ssa: add missing space in comment
Jes Cok [Sun, 29 Oct 2023 02:11:19 +0000 (02:11 +0000)]
cmd/compile/internal/ssa: add missing space in comment

Change-Id: I54c3e8e0d61ceb6533284098dc32944f9f14459e
GitHub-Last-Rev: 9793d9d039911b74396b315ce47ad0a53169d25c
GitHub-Pull-Request: golang/go#63806
Reviewed-on: https://go-review.googlesource.com/c/go/+/538375
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: qiulaidongfeng <2645477756@qq.com>

6 months agointernal/fmtsort: makeChans pin pointer
qiulaidongfeng [Sat, 28 Oct 2023 07:09:01 +0000 (07:09 +0000)]
internal/fmtsort: makeChans pin pointer

Complete TODO.

For #49431

Change-Id: I1399205e430ebd83182c3e0c4becf1fde32d433e
GitHub-Last-Rev: 02cdea740bccbe0993c53fecdd32608f08861e59
GitHub-Pull-Request: golang/go#62673
Reviewed-on: https://go-review.googlesource.com/c/go/+/528796
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Commit-Queue: Keith Randall <khr@golang.org>
Run-TryBot: qiulaidongfeng <2645477756@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
6 months agocmd/go/internal/help: update the documentation to match the design and implementation
Quan Tong [Mon, 16 Oct 2023 08:39:48 +0000 (15:39 +0700)]
cmd/go/internal/help: update the documentation to match the design and implementation

The existing documentation imply that the build constraints
should be ignored after a block comments, but actually it's not.

Fixes #63502

Change-Id: I0597934b7a7eeab8908bf06e1312169b3702bf05
Reviewed-on: https://go-review.googlesource.com/c/go/+/535635
Reviewed-by: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Mark Pictor <mark.pictor@contrastsecurity.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
6 months agolog/slog: Reorder doc comment for level constants
Allen Li [Thu, 10 Aug 2023 22:01:32 +0000 (22:01 +0000)]
log/slog: Reorder doc comment for level constants

pkgsite and go doc print the doc comment *after* the code, resulting in:

    const (
            LevelDebug Level = -4
            ...
    )

    Many paragraphs...

    Names for common levels.

The "Names for common levels." feels out of place and confusing at the bottom.

This is also consistent with the recommendation for the first sentence in doc comments to be the "summary".

Change-Id: I656e85e27d2a4b23eaba5f2c1f4f811a88848c83
GitHub-Last-Rev: d9f7ee9b94df6779fcaef64edf3a480459e3ef16
GitHub-Pull-Request: golang/go#61943
Reviewed-on: https://go-review.googlesource.com/c/go/+/518537
Reviewed-by: Alan Donovan <alan@alandonovan.net>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: qiulaidongfeng <2645477756@qq.com>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
6 months agomath/rand/v2: delete Mitchell/Reeds source
Russ Cox [Tue, 6 Jun 2023 17:50:26 +0000 (13:50 -0400)]
math/rand/v2: delete Mitchell/Reeds source

These slowdowns are because we are now using PCG instead of the
Mitchell/Reeds LFSR for the benchmarks. PCG is in fact a bit slower
(but generates statically far better random numbers).

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 01ff938549.amd64 │           afa459a2f0.amd64           │
                        │      sec/op      │    sec/op     vs base                │
PCG_DXSM-32                    1.490n ± 0%    1.488n ± 2%        ~ (p=0.408 n=20)
SourceUint64-32                1.352n ± 1%    1.450n ± 3%   +7.21% (p=0.000 n=20)
GlobalInt64-32                 2.083n ± 0%    2.067n ± 2%        ~ (p=0.223 n=20)
GlobalInt64Parallel-32        0.1035n ± 1%   0.1044n ± 2%        ~ (p=0.010 n=20)
GlobalUint64-32                2.038n ± 1%    2.085n ± 0%   +2.28% (p=0.000 n=20)
GlobalUint64Parallel-32       0.1006n ± 1%   0.1008n ± 1%        ~ (p=0.733 n=20)
Int64-32                       1.687n ± 2%    1.779n ± 1%   +5.48% (p=0.000 n=20)
Uint64-32                      1.674n ± 2%    1.854n ± 2%  +10.69% (p=0.000 n=20)
GlobalIntN1000-32              3.135n ± 1%    3.140n ± 3%        ~ (p=0.794 n=20)
IntN1000-32                    2.478n ± 1%    2.496n ± 1%   +0.73% (p=0.006 n=20)
Int64N1000-32                  2.455n ± 1%    2.510n ± 2%   +2.22% (p=0.000 n=20)
Int64N1e8-32                   2.467n ± 2%    2.471n ± 2%        ~ (p=0.050 n=20)
Int64N1e9-32                   2.454n ± 1%    2.488n ± 2%   +1.39% (p=0.000 n=20)
Int64N2e9-32                   2.482n ± 1%    2.478n ± 2%        ~ (p=0.066 n=20)
Int64N1e18-32                  3.349n ± 2%    3.088n ± 1%   -7.81% (p=0.000 n=20)
Int64N2e18-32                  3.537n ± 1%    3.493n ± 1%   -1.24% (p=0.002 n=20)
Int64N4e18-32                  4.917n ± 0%    5.060n ± 2%   +2.91% (p=0.000 n=20)
Int32N1000-32                  2.386n ± 1%    2.620n ± 1%   +9.76% (p=0.000 n=20)
Int32N1e8-32                   2.366n ± 1%    2.652n ± 0%  +12.11% (p=0.000 n=20)
Int32N1e9-32                   2.355n ± 2%    2.644n ± 1%  +12.32% (p=0.000 n=20)
Int32N2e9-32                   2.371n ± 1%    2.619n ± 2%  +10.48% (p=0.000 n=20)
Float32-32                     2.245n ± 2%    2.261n ± 1%        ~ (p=0.625 n=20)
Float64-32                     2.235n ± 1%    2.241n ± 2%        ~ (p=0.393 n=20)
ExpFloat64-32                  3.813n ± 3%    3.716n ± 1%   -2.53% (p=0.000 n=20)
NormFloat64-32                 3.652n ± 2%    3.718n ± 1%   +1.79% (p=0.006 n=20)
Perm3-32                       33.12n ± 3%    34.11n ± 2%        ~ (p=0.021 n=20)
Perm30-32                      205.1n ± 1%    200.6n ± 0%   -2.17% (p=0.000 n=20)
Perm30ViaShuffle-32            110.8n ± 1%    109.7n ± 1%   -0.99% (p=0.002 n=20)
ShuffleOverhead-32             113.0n ± 1%    107.2n ± 1%   -5.09% (p=0.000 n=20)
Concurrent-32                  2.100n ± 0%    2.108n ± 6%        ~ (p=0.103 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                       │ 01ff938549.arm64 │           afa459a2f0.arm64           │
                       │      sec/op      │    sec/op     vs base                │
PCG_DXSM-8                    2.531n ± 0%    2.531n ± 0%        ~ (p=0.763 n=20)
SourceUint64-8                2.258n ± 1%    2.531n ± 0%  +12.09% (p=0.000 n=20)
GlobalInt64-8                 2.167n ± 0%    2.177n ± 1%        ~ (p=0.213 n=20)
GlobalInt64Parallel-8        0.4310n ± 0%   0.4319n ± 0%        ~ (p=0.027 n=20)
GlobalUint64-8                2.182n ± 1%    2.185n ± 1%        ~ (p=0.683 n=20)
GlobalUint64Parallel-8       0.4297n ± 0%   0.4295n ± 1%        ~ (p=0.941 n=20)
Int64-8                       2.472n ± 1%    4.104n ± 0%  +66.00% (p=0.000 n=20)
Uint64-8                      2.449n ± 1%    4.080n ± 0%  +66.60% (p=0.000 n=20)
GlobalIntN1000-8              2.814n ± 2%    2.814n ± 1%        ~ (p=0.972 n=20)
IntN1000-8                    2.998n ± 2%    4.140n ± 0%  +38.09% (p=0.000 n=20)
Int64N1000-8                  2.949n ± 2%    4.139n ± 0%  +40.35% (p=0.000 n=20)
Int64N1e8-8                   2.953n ± 2%    4.140n ± 0%  +40.22% (p=0.000 n=20)
Int64N1e9-8                   2.950n ± 0%    4.139n ± 0%  +40.32% (p=0.000 n=20)
Int64N2e9-8                   2.946n ± 2%    4.140n ± 0%  +40.53% (p=0.000 n=20)
Int64N1e18-8                  3.779n ± 1%    5.273n ± 0%  +39.52% (p=0.000 n=20)
Int64N2e18-8                  4.370n ± 1%    6.059n ± 0%  +38.65% (p=0.000 n=20)
Int64N4e18-8                  6.544n ± 1%    8.803n ± 0%  +34.52% (p=0.000 n=20)
Int32N1000-8                  2.950n ± 0%    4.131n ± 0%  +40.06% (p=0.000 n=20)
Int32N1e8-8                   2.950n ± 2%    4.131n ± 0%  +40.03% (p=0.000 n=20)
Int32N1e9-8                   2.951n ± 2%    4.131n ± 0%  +39.99% (p=0.000 n=20)
Int32N2e9-8                   2.950n ± 2%    4.131n ± 0%  +40.03% (p=0.000 n=20)
Float32-8                     3.441n ± 0%    4.110n ± 0%  +19.44% (p=0.000 n=20)
Float64-8                     3.442n ± 0%    4.104n ± 0%  +19.24% (p=0.000 n=20)
ExpFloat64-8                  4.481n ± 0%    5.338n ± 0%  +19.11% (p=0.000 n=20)
NormFloat64-8                 4.725n ± 0%    5.731n ± 0%  +21.28% (p=0.000 n=20)
Perm3-8                       26.55n ± 0%    26.62n ± 0%   +0.28% (p=0.000 n=20)
Perm30-8                      181.9n ± 0%    194.6n ± 2%   +6.98% (p=0.000 n=20)
Perm30ViaShuffle-8            142.9n ± 0%    156.4n ± 0%   +9.45% (p=0.000 n=20)
ShuffleOverhead-8             120.8n ± 2%    125.8n ± 0%   +4.10% (p=0.000 n=20)
Concurrent-8                  2.421n ± 6%    2.654n ± 6%   +9.67% (p=0.002 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 01ff938549.386 │            afa459a2f0.386             │
                        │     sec/op     │    sec/op     vs base                 │
PCG_DXSM-32                  7.613n ± 1%    7.793n ± 2%    +2.38% (p=0.000 n=20)
SourceUint64-32              2.069n ± 0%    7.680n ± 1%  +271.19% (p=0.000 n=20)
GlobalInt64-32               3.456n ± 1%    3.474n ± 3%         ~ (p=0.654 n=20)
GlobalInt64Parallel-32      0.3252n ± 0%   0.3253n ± 0%         ~ (p=0.952 n=20)
GlobalUint64-32              3.573n ± 1%    3.433n ± 2%    -3.92% (p=0.000 n=20)
GlobalUint64Parallel-32     0.3159n ± 0%   0.3156n ± 0%         ~ (p=0.223 n=20)
Int64-32                     2.562n ± 2%    7.707n ± 1%  +200.74% (p=0.000 n=20)
Uint64-32                    2.592n ± 0%    7.714n ± 1%  +197.65% (p=0.000 n=20)
GlobalIntN1000-32            6.266n ± 2%    6.236n ± 1%         ~ (p=0.039 n=20)
IntN1000-32                  4.724n ± 2%   10.410n ± 1%  +120.39% (p=0.000 n=20)
Int64N1000-32                5.490n ± 2%   10.975n ± 2%   +99.89% (p=0.000 n=20)
Int64N1e8-32                 5.513n ± 2%   10.980n ± 1%   +99.15% (p=0.000 n=20)
Int64N1e9-32                 5.476n ± 1%   10.950n ± 0%   +99.96% (p=0.000 n=20)
Int64N2e9-32                 5.501n ± 2%   11.110n ± 1%  +101.96% (p=0.000 n=20)
Int64N1e18-32                9.043n ± 2%   15.180n ± 2%   +67.86% (p=0.000 n=20)
Int64N2e18-32                9.601n ± 2%   15.610n ± 1%   +62.60% (p=0.000 n=20)
Int64N4e18-32                12.00n ± 1%    19.23n ± 2%   +60.14% (p=0.000 n=20)
Int32N1000-32                4.829n ± 2%   10.345n ± 1%  +114.25% (p=0.000 n=20)
Int32N1e8-32                 4.825n ± 2%   10.330n ± 1%  +114.09% (p=0.000 n=20)
Int32N1e9-32                 4.830n ± 2%   10.350n ± 1%  +114.26% (p=0.000 n=20)
Int32N2e9-32                 4.750n ± 2%   10.345n ± 1%  +117.81% (p=0.000 n=20)
Float32-32                   10.89n ± 4%    13.57n ± 1%   +24.61% (p=0.000 n=20)
Float64-32                   19.60n ± 4%    22.95n ± 4%   +17.12% (p=0.000 n=20)
ExpFloat64-32                12.96n ± 3%    15.23n ± 2%   +17.47% (p=0.000 n=20)
NormFloat64-32               7.516n ± 1%   13.780n ± 1%   +83.34% (p=0.000 n=20)
Perm3-32                     36.78n ± 2%    46.62n ± 2%   +26.72% (p=0.000 n=20)
Perm30-32                    238.9n ± 2%    400.7n ± 1%   +67.73% (p=0.000 n=20)
Perm30ViaShuffle-32          189.7n ± 2%    350.5n ± 1%   +84.79% (p=0.000 n=20)
ShuffleOverhead-32           159.8n ± 1%    326.0n ± 2%  +104.01% (p=0.000 n=20)
Concurrent-32                3.286n ± 1%    3.290n ± 0%         ~ (p=0.743 n=20)

On the other hand, compared to the original "update benchmarks" CL,
the cleanups we've made more than compensate for PCG being a bit
slower than LFSR, at least on 64-bit x86. ARM64 (Apple M1) is a bit
slower: perhaps the 64x64→128 multiply is slower there for some reason.
386 is noticeably slower, but it's also a non-SSA backend.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.amd64 │            afa459a2f0.amd64            │
                        │      sec/op      │    sec/op     vs base                  │
SourceUint64-32                1.555n ± 1%    1.450n ± 3%   -6.78% (p=0.000 n=20)
GlobalInt64-32                 2.071n ± 1%    2.067n ± 2%        ~ (p=0.673 n=20)
GlobalInt63Parallel-32        0.1023n ± 1%
GlobalInt64Parallel-32                       0.1044n ± 2%
GlobalUint64-32                5.193n ± 1%    2.085n ± 0%  -59.86% (p=0.000 n=20)
GlobalUint64Parallel-32       0.2341n ± 0%   0.1008n ± 1%  -56.93% (p=0.000 n=20)
Int64-32                       2.056n ± 2%    1.779n ± 1%  -13.47% (p=0.000 n=20)
Uint64-32                      2.077n ± 2%    1.854n ± 2%  -10.74% (p=0.000 n=20)
GlobalIntN1000-32              4.077n ± 2%    3.140n ± 3%  -22.98% (p=0.000 n=20)
IntN1000-32                    3.476n ± 2%    2.496n ± 1%  -28.19% (p=0.000 n=20)
Int64N1000-32                  3.059n ± 1%    2.510n ± 2%  -17.96% (p=0.000 n=20)
Int64N1e8-32                   2.942n ± 1%    2.471n ± 2%  -15.98% (p=0.000 n=20)
Int64N1e9-32                   2.932n ± 1%    2.488n ± 2%  -15.14% (p=0.000 n=20)
Int64N2e9-32                   2.925n ± 1%    2.478n ± 2%  -15.30% (p=0.000 n=20)
Int64N1e18-32                  3.116n ± 1%    3.088n ± 1%        ~ (p=0.013 n=20)
Int64N2e18-32                  4.067n ± 1%    3.493n ± 1%  -14.11% (p=0.000 n=20)
Int64N4e18-32                  4.054n ± 1%    5.060n ± 2%  +24.80% (p=0.000 n=20)
Int32N1000-32                  2.951n ± 1%    2.620n ± 1%  -11.22% (p=0.000 n=20)
Int32N1e8-32                   3.102n ± 1%    2.652n ± 0%  -14.50% (p=0.000 n=20)
Int32N1e9-32                   3.535n ± 1%    2.644n ± 1%  -25.20% (p=0.000 n=20)
Int32N2e9-32                   3.514n ± 1%    2.619n ± 2%  -25.47% (p=0.000 n=20)
Float32-32                     2.760n ± 1%    2.261n ± 1%  -18.06% (p=0.000 n=20)
Float64-32                     2.284n ± 1%    2.241n ± 2%        ~ (p=0.016 n=20)
ExpFloat64-32                  3.757n ± 1%    3.716n ± 1%        ~ (p=0.034 n=20)
NormFloat64-32                 3.837n ± 1%    3.718n ± 1%   -3.09% (p=0.000 n=20)
Perm3-32                       35.23n ± 2%    34.11n ± 2%   -3.19% (p=0.000 n=20)
Perm30-32                      208.8n ± 1%    200.6n ± 0%   -3.93% (p=0.000 n=20)
Perm30ViaShuffle-32            111.7n ± 1%    109.7n ± 1%   -1.84% (p=0.000 n=20)
ShuffleOverhead-32             101.1n ± 1%    107.2n ± 1%   +6.03% (p=0.000 n=20)
Concurrent-32                  2.108n ± 7%    2.108n ± 6%        ~ (p=0.644 n=20)
PCG_DXSM-32                                   1.488n ± 2%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 220860f76f.arm64 │            afa459a2f0.arm64            │
                       │      sec/op      │    sec/op     vs base                  │
SourceUint64-8                2.316n ± 1%    2.531n ± 0%   +9.33% (p=0.000 n=20)
GlobalInt64-8                 2.183n ± 1%    2.177n ± 1%        ~ (p=0.533 n=20)
GlobalInt63Parallel-8        0.4331n ± 0%
GlobalInt64Parallel-8                       0.4319n ± 0%
GlobalUint64-8                4.377n ± 2%    2.185n ± 1%  -50.07% (p=0.000 n=20)
GlobalUint64Parallel-8       0.9237n ± 0%   0.4295n ± 1%  -53.50% (p=0.000 n=20)
Int64-8                       2.538n ± 1%    4.104n ± 0%  +61.68% (p=0.000 n=20)
Uint64-8                      2.604n ± 1%    4.080n ± 0%  +56.68% (p=0.000 n=20)
GlobalIntN1000-8              3.857n ± 2%    2.814n ± 1%  -27.04% (p=0.000 n=20)
IntN1000-8                    3.822n ± 2%    4.140n ± 0%   +8.32% (p=0.000 n=20)
Int64N1000-8                  3.318n ± 0%    4.139n ± 0%  +24.74% (p=0.000 n=20)
Int64N1e8-8                   3.349n ± 1%    4.140n ± 0%  +23.64% (p=0.000 n=20)
Int64N1e9-8                   3.317n ± 2%    4.139n ± 0%  +24.80% (p=0.000 n=20)
Int64N2e9-8                   3.317n ± 2%    4.140n ± 0%  +24.81% (p=0.000 n=20)
Int64N1e18-8                  3.542n ± 1%    5.273n ± 0%  +48.85% (p=0.000 n=20)
Int64N2e18-8                  5.087n ± 0%    6.059n ± 0%  +19.12% (p=0.000 n=20)
Int64N4e18-8                  5.084n ± 0%    8.803n ± 0%  +73.16% (p=0.000 n=20)
Int32N1000-8                  3.208n ± 2%    4.131n ± 0%  +28.79% (p=0.000 n=20)
Int32N1e8-8                   3.610n ± 1%    4.131n ± 0%  +14.43% (p=0.000 n=20)
Int32N1e9-8                   4.235n ± 0%    4.131n ± 0%   -2.44% (p=0.000 n=20)
Int32N2e9-8                   4.229n ± 1%    4.131n ± 0%   -2.33% (p=0.000 n=20)
Float32-8                     3.468n ± 0%    4.110n ± 0%  +18.50% (p=0.000 n=20)
Float64-8                     3.447n ± 0%    4.104n ± 0%  +19.05% (p=0.000 n=20)
ExpFloat64-8                  4.567n ± 0%    5.338n ± 0%  +16.86% (p=0.000 n=20)
NormFloat64-8                 4.821n ± 0%    5.731n ± 0%  +18.89% (p=0.000 n=20)
Perm3-8                       28.89n ± 0%    26.62n ± 0%   -7.84% (p=0.000 n=20)
Perm30-8                      175.7n ± 0%    194.6n ± 2%  +10.76% (p=0.000 n=20)
Perm30ViaShuffle-8            153.5n ± 0%    156.4n ± 0%   +1.86% (p=0.000 n=20)
ShuffleOverhead-8             119.8n ± 1%    125.8n ± 0%   +4.97% (p=0.000 n=20)
Concurrent-8                  2.433n ± 3%    2.654n ± 6%   +9.13% (p=0.001 n=20)
PCG_DXSM-8                                   2.531n ± 0%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.386 │             afa459a2f0.386              │
                        │     sec/op     │    sec/op     vs base                   │
SourceUint64-32             2.370n ±  1%    7.680n ± 1%  +224.05% (p=0.000 n=20)
GlobalInt64-32              3.569n ±  1%    3.474n ± 3%    -2.66% (p=0.001 n=20)
GlobalInt63Parallel-32     0.3221n ±  1%
GlobalInt64Parallel-32                     0.3253n ± 0%
GlobalUint64-32             8.797n ± 10%    3.433n ± 2%   -60.98% (p=0.000 n=20)
GlobalUint64Parallel-32    0.6351n ±  0%   0.3156n ± 0%   -50.31% (p=0.000 n=20)
Int64-32                    2.612n ±  2%    7.707n ± 1%  +195.04% (p=0.000 n=20)
Uint64-32                   3.350n ±  1%    7.714n ± 1%  +130.25% (p=0.000 n=20)
GlobalIntN1000-32           5.892n ±  1%    6.236n ± 1%    +5.82% (p=0.000 n=20)
IntN1000-32                 4.546n ±  1%   10.410n ± 1%  +128.97% (p=0.000 n=20)
Int64N1000-32               14.59n ±  1%    10.97n ± 2%   -24.75% (p=0.000 n=20)
Int64N1e8-32                14.76n ±  2%    10.98n ± 1%   -25.58% (p=0.000 n=20)
Int64N1e9-32                16.57n ±  1%    10.95n ± 0%   -33.90% (p=0.000 n=20)
Int64N2e9-32                14.54n ±  1%    11.11n ± 1%   -23.62% (p=0.000 n=20)
Int64N1e18-32               16.14n ±  1%    15.18n ± 2%    -5.95% (p=0.000 n=20)
Int64N2e18-32               18.10n ±  1%    15.61n ± 1%   -13.73% (p=0.000 n=20)
Int64N4e18-32               18.65n ±  1%    19.23n ± 2%    +3.08% (p=0.000 n=20)
Int32N1000-32               3.560n ±  1%   10.345n ± 1%  +190.55% (p=0.000 n=20)
Int32N1e8-32                3.770n ±  2%   10.330n ± 1%  +174.01% (p=0.000 n=20)
Int32N1e9-32                4.098n ±  0%   10.350n ± 1%  +152.53% (p=0.000 n=20)
Int32N2e9-32                4.179n ±  1%   10.345n ± 1%  +147.52% (p=0.000 n=20)
Float32-32                  21.18n ±  4%    13.57n ± 1%   -35.93% (p=0.000 n=20)
Float64-32                  20.60n ±  2%    22.95n ± 4%   +11.41% (p=0.000 n=20)
ExpFloat64-32               13.07n ±  0%    15.23n ± 2%   +16.48% (p=0.000 n=20)
NormFloat64-32              7.738n ±  2%   13.780n ± 1%   +78.08% (p=0.000 n=20)
Perm3-32                    36.73n ±  1%    46.62n ± 2%   +26.91% (p=0.000 n=20)
Perm30-32                   211.9n ±  1%    400.7n ± 1%   +89.05% (p=0.000 n=20)
Perm30ViaShuffle-32         165.2n ±  1%    350.5n ± 1%  +112.20% (p=0.000 n=20)
ShuffleOverhead-32          133.9n ±  1%    326.0n ± 2%  +143.37% (p=0.000 n=20)
Concurrent-32               3.287n ±  2%    3.290n ± 0%         ~ (p=0.365 n=20)
PCG_DXSM-32                                 7.793n ± 2%

For #61716.

Change-Id: I4e9c0525b5f84a2ac46f23da9e365495e2d05777
Reviewed-on: https://go-review.googlesource.com/c/go/+/502506
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: add PCG-DXSM
Russ Cox [Tue, 6 Jun 2023 17:50:08 +0000 (13:50 -0400)]
math/rand/v2: add PCG-DXSM

For the original math/rand, we ported Plan 9's random number
generator, which was a refinement by Ken Thompson of an algorithm
by Don Mitchell and Jim Reeds, which Mitchell in turn recalls as
having been derived from an algorithm by Marsaglia. At its core,
it is an additive lagged Fibonacci generator (ALFG).

Whatever the details of the history, this generator is nowhere
near the current state of the art for simple, pseudo-random
generators.

This CL adds an implementation of Melissa O'Neill's PCG, specifically
the variant PCG-DXSM, which she defined after writing the PCG paper
and which is now the default in Numpy. The update is slightly slower
(a few multiplies and adds, instead of a few adds), but the state
is dramatically smaller (2 words instead of 607). The statistical
output properties are better too.

A followup CL will delete the old generator.

PCG is the only change here, so no benchmarks should be affected.
Including them anyway as further evidence for caution.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 8993506f2f.amd64 │           01ff938549.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.325n ± 1%    1.352n ± 1%   +2.00% (p=0.000 n=20)
GlobalInt64-32                 2.240n ± 1%    2.083n ± 0%   -7.03% (p=0.000 n=20)
GlobalInt64Parallel-32        0.1041n ± 1%   0.1035n ± 1%        ~ (p=0.064 n=20)
GlobalUint64-32                2.072n ± 3%    2.038n ± 1%        ~ (p=0.089 n=20)
GlobalUint64Parallel-32       0.1008n ± 1%   0.1006n ± 1%        ~ (p=0.804 n=20)
Int64-32                       1.716n ± 1%    1.687n ± 2%        ~ (p=0.045 n=20)
Uint64-32                      1.665n ± 1%    1.674n ± 2%        ~ (p=0.878 n=20)
GlobalIntN1000-32              3.335n ± 1%    3.135n ± 1%   -6.00% (p=0.000 n=20)
IntN1000-32                    2.484n ± 1%    2.478n ± 1%        ~ (p=0.085 n=20)
Int64N1000-32                  2.502n ± 2%    2.455n ± 1%   -1.88% (p=0.002 n=20)
Int64N1e8-32                   2.484n ± 2%    2.467n ± 2%        ~ (p=0.048 n=20)
Int64N1e9-32                   2.502n ± 0%    2.454n ± 1%   -1.92% (p=0.000 n=20)
Int64N2e9-32                   2.502n ± 0%    2.482n ± 1%   -0.76% (p=0.000 n=20)
Int64N1e18-32                  3.201n ± 1%    3.349n ± 2%   +4.62% (p=0.000 n=20)
Int64N2e18-32                  3.504n ± 1%    3.537n ± 1%        ~ (p=0.185 n=20)
Int64N4e18-32                  4.873n ± 1%    4.917n ± 0%   +0.90% (p=0.000 n=20)
Int32N1000-32                  2.639n ± 1%    2.386n ± 1%   -9.57% (p=0.000 n=20)
Int32N1e8-32                   2.686n ± 2%    2.366n ± 1%  -11.91% (p=0.000 n=20)
Int32N1e9-32                   2.636n ± 1%    2.355n ± 2%  -10.70% (p=0.000 n=20)
Int32N2e9-32                   2.660n ± 1%    2.371n ± 1%  -10.88% (p=0.000 n=20)
Float32-32                     2.261n ± 1%    2.245n ± 2%        ~ (p=0.752 n=20)
Float64-32                     2.280n ± 1%    2.235n ± 1%   -1.97% (p=0.007 n=20)
ExpFloat64-32                  3.891n ± 1%    3.813n ± 3%        ~ (p=0.087 n=20)
NormFloat64-32                 3.711n ± 1%    3.652n ± 2%        ~ (p=0.021 n=20)
Perm3-32                       32.60n ± 2%    33.12n ± 3%        ~ (p=0.107 n=20)
Perm30-32                      204.2n ± 0%    205.1n ± 1%        ~ (p=0.358 n=20)
Perm30ViaShuffle-32            121.7n ± 2%    110.8n ± 1%   -8.96% (p=0.000 n=20)
ShuffleOverhead-32             106.2n ± 2%    113.0n ± 1%   +6.36% (p=0.000 n=20)
Concurrent-32                  2.190n ± 5%    2.100n ± 0%   -4.13% (p=0.001 n=20)
PCG_DXSM-32                                   1.490n ± 0%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 8993506f2f.arm64 │           01ff938549.arm64           │
                       │      sec/op      │    sec/op     vs base                │
SourceUint64-8                2.271n ± 0%    2.258n ± 1%        ~ (p=0.167 n=20)
GlobalInt64-8                 2.161n ± 1%    2.167n ± 0%        ~ (p=0.693 n=20)
GlobalInt64Parallel-8        0.4303n ± 0%   0.4310n ± 0%        ~ (p=0.051 n=20)
GlobalUint64-8                2.164n ± 1%    2.182n ± 1%        ~ (p=0.042 n=20)
GlobalUint64Parallel-8       0.4287n ± 0%   0.4297n ± 0%        ~ (p=0.082 n=20)
Int64-8                       2.478n ± 1%    2.472n ± 1%        ~ (p=0.151 n=20)
Uint64-8                      2.460n ± 1%    2.449n ± 1%        ~ (p=0.013 n=20)
GlobalIntN1000-8              2.814n ± 2%    2.814n ± 2%        ~ (p=0.821 n=20)
IntN1000-8                    3.003n ± 2%    2.998n ± 2%        ~ (p=0.024 n=20)
Int64N1000-8                  2.954n ± 0%    2.949n ± 2%        ~ (p=0.192 n=20)
Int64N1e8-8                   2.956n ± 0%    2.953n ± 2%        ~ (p=0.109 n=20)
Int64N1e9-8                   3.325n ± 0%    2.950n ± 0%  -11.26% (p=0.000 n=20)
Int64N2e9-8                   2.956n ± 2%    2.946n ± 2%        ~ (p=0.027 n=20)
Int64N1e18-8                  3.780n ± 1%    3.779n ± 1%        ~ (p=0.815 n=20)
Int64N2e18-8                  4.385n ± 0%    4.370n ± 1%        ~ (p=0.402 n=20)
Int64N4e18-8                  6.527n ± 0%    6.544n ± 1%        ~ (p=0.140 n=20)
Int32N1000-8                  2.964n ± 1%    2.950n ± 0%   -0.47% (p=0.002 n=20)
Int32N1e8-8                   2.964n ± 1%    2.950n ± 2%        ~ (p=0.013 n=20)
Int32N1e9-8                   2.963n ± 2%    2.951n ± 2%        ~ (p=0.062 n=20)
Int32N2e9-8                   2.961n ± 2%    2.950n ± 2%   -0.37% (p=0.002 n=20)
Float32-8                     3.442n ± 0%    3.441n ± 0%        ~ (p=0.211 n=20)
Float64-8                     3.442n ± 0%    3.442n ± 0%        ~ (p=0.067 n=20)
ExpFloat64-8                  4.472n ± 0%    4.481n ± 0%   +0.20% (p=0.000 n=20)
NormFloat64-8                 4.734n ± 0%    4.725n ± 0%   -0.19% (p=0.003 n=20)
Perm3-8                       26.55n ± 0%    26.55n ± 0%        ~ (p=0.833 n=20)
Perm30-8                      181.9n ± 0%    181.9n ± 0%   -0.03% (p=0.004 n=20)
Perm30ViaShuffle-8            143.1n ± 0%    142.9n ± 0%        ~ (p=0.204 n=20)
ShuffleOverhead-8             120.6n ± 1%    120.8n ± 2%        ~ (p=0.102 n=20)
Concurrent-8                  2.357n ± 2%    2.421n ± 6%        ~ (p=0.016 n=20)
PCG_DXSM-8                                   2.531n ± 0%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 8993506f2f.386 │           01ff938549.386            │
                        │     sec/op     │    sec/op     vs base               │
SourceUint64-32              2.102n ± 2%    2.069n ± 0%       ~ (p=0.021 n=20)
GlobalInt64-32               3.542n ± 2%    3.456n ± 1%  -2.44% (p=0.001 n=20)
GlobalInt64Parallel-32      0.3202n ± 0%   0.3252n ± 0%  +1.56% (p=0.000 n=20)
GlobalUint64-32              3.507n ± 1%    3.573n ± 1%  +1.87% (p=0.000 n=20)
GlobalUint64Parallel-32     0.3170n ± 1%   0.3159n ± 0%       ~ (p=0.167 n=20)
Int64-32                     2.516n ± 1%    2.562n ± 2%       ~ (p=0.016 n=20)
Uint64-32                    2.544n ± 1%    2.592n ± 0%  +1.85% (p=0.000 n=20)
GlobalIntN1000-32            6.237n ± 1%    6.266n ± 2%       ~ (p=0.268 n=20)
IntN1000-32                  4.670n ± 2%    4.724n ± 2%       ~ (p=0.644 n=20)
Int64N1000-32                5.412n ± 1%    5.490n ± 2%       ~ (p=0.159 n=20)
Int64N1e8-32                 5.414n ± 2%    5.513n ± 2%       ~ (p=0.129 n=20)
Int64N1e9-32                 5.473n ± 1%    5.476n ± 1%       ~ (p=0.723 n=20)
Int64N2e9-32                 5.487n ± 1%    5.501n ± 2%       ~ (p=0.481 n=20)
Int64N1e18-32                8.901n ± 2%    9.043n ± 2%       ~ (p=0.330 n=20)
Int64N2e18-32                9.521n ± 1%    9.601n ± 2%       ~ (p=0.703 n=20)
Int64N4e18-32                11.92n ± 1%    12.00n ± 1%       ~ (p=0.489 n=20)
Int32N1000-32                4.785n ± 1%    4.829n ± 2%       ~ (p=0.402 n=20)
Int32N1e8-32                 4.748n ± 1%    4.825n ± 2%       ~ (p=0.218 n=20)
Int32N1e9-32                 4.810n ± 1%    4.830n ± 2%       ~ (p=0.794 n=20)
Int32N2e9-32                 4.812n ± 1%    4.750n ± 2%       ~ (p=0.057 n=20)
Float32-32                   10.48n ± 4%    10.89n ± 4%       ~ (p=0.162 n=20)
Float64-32                   19.79n ± 3%    19.60n ± 4%       ~ (p=0.668 n=20)
ExpFloat64-32                12.91n ± 3%    12.96n ± 3%       ~ (p=1.000 n=20)
NormFloat64-32               7.462n ± 1%    7.516n ± 1%       ~ (p=0.051 n=20)
Perm3-32                     35.98n ± 2%    36.78n ± 2%       ~ (p=0.033 n=20)
Perm30-32                    241.5n ± 1%    238.9n ± 2%       ~ (p=0.126 n=20)
Perm30ViaShuffle-32          187.3n ± 2%    189.7n ± 2%       ~ (p=0.387 n=20)
ShuffleOverhead-32           160.2n ± 1%    159.8n ± 1%       ~ (p=0.256 n=20)
Concurrent-32                3.308n ± 3%    3.286n ± 1%       ~ (p=0.038 n=20)
PCG_DXSM-32                                 7.613n ± 1%

For #61716.

Change-Id: Icb274ca1f782504d658305a40159b4ae6a2f3f1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/502505
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rob Pike <r@golang.org>
6 months agomath/rand/v2: simplify Perm
Russ Cox [Tue, 6 Jun 2023 14:51:09 +0000 (10:51 -0400)]
math/rand/v2: simplify Perm

The compiler says Perm is being inlined into BenchmarkPerm,
and yet BenchmarkPerm30ViaShuffle, which you'd think is the
same code, still runs significantly faster.

The benchmarks are mystifying but this is clearly still a step in
the right direction, since BenchmarkPerm30ViaShuffle is still
the fastest and we avoid having two copies of that logic.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ e1bbe739fb.amd64 │           8993506f2f.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.316n ± 2%    1.325n ± 1%        ~ (p=0.208 n=20)
GlobalInt64-32                 2.048n ± 1%    2.240n ± 1%   +9.38% (p=0.000 n=20)
GlobalInt64Parallel-32        0.1037n ± 1%   0.1041n ± 1%        ~ (p=0.774 n=20)
GlobalUint64-32                2.039n ± 2%    2.072n ± 3%        ~ (p=0.115 n=20)
GlobalUint64Parallel-32       0.1013n ± 1%   0.1008n ± 1%        ~ (p=0.417 n=20)
Int64-32                       1.692n ± 2%    1.716n ± 1%        ~ (p=0.122 n=20)
Uint64-32                      1.643n ± 2%    1.665n ± 1%        ~ (p=0.062 n=20)
GlobalIntN1000-32              3.287n ± 1%    3.335n ± 1%        ~ (p=0.147 n=20)
IntN1000-32                    2.678n ± 2%    2.484n ± 1%   -7.24% (p=0.000 n=20)
Int64N1000-32                  2.684n ± 2%    2.502n ± 2%   -6.80% (p=0.000 n=20)
Int64N1e8-32                   2.663n ± 2%    2.484n ± 2%   -6.76% (p=0.000 n=20)
Int64N1e9-32                   2.633n ± 1%    2.502n ± 0%   -4.98% (p=0.000 n=20)
Int64N2e9-32                   2.657n ± 1%    2.502n ± 0%   -5.87% (p=0.000 n=20)
Int64N1e18-32                  3.125n ± 2%    3.201n ± 1%   +2.43% (p=0.000 n=20)
Int64N2e18-32                  3.476n ± 1%    3.504n ± 1%   +0.83% (p=0.009 n=20)
Int64N4e18-32                  4.795n ± 1%    4.873n ± 1%        ~ (p=0.106 n=20)
Int32N1000-32                  2.485n ± 2%    2.639n ± 1%   +6.20% (p=0.000 n=20)
Int32N1e8-32                   2.457n ± 1%    2.686n ± 2%   +9.34% (p=0.000 n=20)
Int32N1e9-32                   2.452n ± 1%    2.636n ± 1%   +7.52% (p=0.000 n=20)
Int32N2e9-32                   2.453n ± 1%    2.660n ± 1%   +8.44% (p=0.000 n=20)
Float32-32                     2.254n ± 1%    2.261n ± 1%        ~ (p=0.888 n=20)
Float64-32                     2.262n ± 1%    2.280n ± 1%        ~ (p=0.040 n=20)
ExpFloat64-32                  3.777n ± 2%    3.891n ± 1%   +3.03% (p=0.000 n=20)
NormFloat64-32                 3.606n ± 1%    3.711n ± 1%   +2.91% (p=0.000 n=20)
Perm3-32                       33.12n ± 2%    32.60n ± 2%        ~ (p=0.045 n=20)
Perm30-32                      176.1n ± 1%    204.2n ± 0%  +15.96% (p=0.000 n=20)
Perm30ViaShuffle-32            109.3n ± 1%    121.7n ± 2%  +11.30% (p=0.000 n=20)
ShuffleOverhead-32             112.5n ± 1%    106.2n ± 2%   -5.56% (p=0.000 n=20)
Concurrent-32                  2.099n ± 0%    2.190n ± 5%   +4.36% (p=0.001 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ e1bbe739fb.arm64 │           8993506f2f.arm64           │
                       │      sec/op      │    sec/op     vs base                │
SourceUint64-8                2.290n ± 1%    2.271n ± 0%        ~ (p=0.015 n=20)
GlobalInt64-8                 2.180n ± 1%    2.161n ± 1%        ~ (p=0.180 n=20)
GlobalInt64Parallel-8        0.4294n ± 0%   0.4303n ± 0%   +0.19% (p=0.001 n=20)
GlobalUint64-8                2.170n ± 1%    2.164n ± 1%        ~ (p=0.673 n=20)
GlobalUint64Parallel-8       0.4283n ± 0%   0.4287n ± 0%        ~ (p=0.128 n=20)
Int64-8                       2.481n ± 1%    2.478n ± 1%        ~ (p=0.867 n=20)
Uint64-8                      2.464n ± 1%    2.460n ± 1%        ~ (p=0.763 n=20)
GlobalIntN1000-8              2.814n ± 0%    2.814n ± 2%        ~ (p=0.969 n=20)
IntN1000-8                    2.934n ± 2%    3.003n ± 2%   +2.35% (p=0.000 n=20)
Int64N1000-8                  2.957n ± 1%    2.954n ± 0%        ~ (p=0.285 n=20)
Int64N1e8-8                   2.935n ± 2%    2.956n ± 0%   +0.73% (p=0.002 n=20)
Int64N1e9-8                   2.935n ± 2%    3.325n ± 0%  +13.29% (p=0.000 n=20)
Int64N2e9-8                   2.933n ± 4%    2.956n ± 2%        ~ (p=0.163 n=20)
Int64N1e18-8                  3.781n ± 1%    3.780n ± 1%        ~ (p=0.805 n=20)
Int64N2e18-8                  4.362n ± 0%    4.385n ± 0%        ~ (p=0.077 n=20)
Int64N4e18-8                  6.576n ± 1%    6.527n ± 0%        ~ (p=0.024 n=20)
Int32N1000-8                  2.942n ± 2%    2.964n ± 1%        ~ (p=0.073 n=20)
Int32N1e8-8                   2.941n ± 1%    2.964n ± 1%        ~ (p=0.058 n=20)
Int32N1e9-8                   2.938n ± 2%    2.963n ± 2%   +0.87% (p=0.003 n=20)
Int32N2e9-8                   2.982n ± 2%    2.961n ± 2%        ~ (p=0.056 n=20)
Float32-8                     3.441n ± 0%    3.442n ± 0%        ~ (p=0.030 n=20)
Float64-8                     3.441n ± 0%    3.442n ± 0%   +0.03% (p=0.001 n=20)
ExpFloat64-8                  4.472n ± 0%    4.472n ± 0%        ~ (p=0.877 n=20)
NormFloat64-8                 4.716n ± 0%    4.734n ± 0%   +0.38% (p=0.000 n=20)
Perm3-8                       26.66n ± 0%    26.55n ± 0%   -0.39% (p=0.000 n=20)
Perm30-8                      143.3n ± 0%    181.9n ± 0%  +26.97% (p=0.000 n=20)
Perm30ViaShuffle-8            142.9n ± 0%    143.1n ± 0%        ~ (p=0.669 n=20)
ShuffleOverhead-8             121.1n ± 1%    120.6n ± 1%   -0.41% (p=0.004 n=20)
Concurrent-8                  2.379n ± 2%    2.357n ± 2%        ~ (p=0.337 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ e1bbe739fb.386 │            8993506f2f.386            │
                        │     sec/op     │    sec/op     vs base                │
SourceUint64-32              2.087n ± 1%    2.102n ± 2%        ~ (p=0.507 n=20)
GlobalInt64-32               3.538n ± 2%    3.542n ± 2%        ~ (p=0.425 n=20)
GlobalInt64Parallel-32      0.3207n ± 1%   0.3202n ± 0%        ~ (p=0.963 n=20)
GlobalUint64-32              3.543n ± 1%    3.507n ± 1%        ~ (p=0.034 n=20)
GlobalUint64Parallel-32     0.3170n ± 0%   0.3170n ± 1%        ~ (p=0.920 n=20)
Int64-32                     2.548n ± 1%    2.516n ± 1%        ~ (p=0.139 n=20)
Uint64-32                    2.565n ± 2%    2.544n ± 1%        ~ (p=0.394 n=20)
GlobalIntN1000-32            6.300n ± 1%    6.237n ± 1%        ~ (p=0.029 n=20)
IntN1000-32                  4.750n ± 0%    4.670n ± 2%        ~ (p=0.034 n=20)
Int64N1000-32                5.515n ± 2%    5.412n ± 1%   -1.86% (p=0.009 n=20)
Int64N1e8-32                 5.527n ± 0%    5.414n ± 2%   -2.05% (p=0.002 n=20)
Int64N1e9-32                 5.531n ± 2%    5.473n ± 1%        ~ (p=0.047 n=20)
Int64N2e9-32                 5.514n ± 2%    5.487n ± 1%        ~ (p=0.298 n=20)
Int64N1e18-32                9.059n ± 1%    8.901n ± 2%        ~ (p=0.037 n=20)
Int64N2e18-32                9.594n ± 1%    9.521n ± 1%        ~ (p=0.051 n=20)
Int64N4e18-32                12.05n ± 2%    11.92n ± 1%        ~ (p=0.357 n=20)
Int32N1000-32                4.840n ± 2%    4.785n ± 1%        ~ (p=0.189 n=20)
Int32N1e8-32                 4.832n ± 2%    4.748n ± 1%        ~ (p=0.042 n=20)
Int32N1e9-32                 4.815n ± 2%    4.810n ± 1%        ~ (p=0.878 n=20)
Int32N2e9-32                 4.813n ± 1%    4.812n ± 1%        ~ (p=0.542 n=20)
Float32-32                   10.90n ± 2%    10.48n ± 4%   -3.85% (p=0.007 n=20)
Float64-32                   20.32n ± 4%    19.79n ± 3%        ~ (p=0.553 n=20)
ExpFloat64-32                12.95n ± 3%    12.91n ± 3%        ~ (p=0.909 n=20)
NormFloat64-32               7.570n ± 1%    7.462n ± 1%   -1.44% (p=0.004 n=20)
Perm3-32                     37.80n ± 2%    35.98n ± 2%   -4.79% (p=0.000 n=20)
Perm30-32                    214.0n ± 1%    241.5n ± 1%  +12.85% (p=0.000 n=20)
Perm30ViaShuffle-32          188.7n ± 2%    187.3n ± 2%        ~ (p=0.029 n=20)
ShuffleOverhead-32           160.8n ± 1%    160.2n ± 1%        ~ (p=0.180 n=20)
Concurrent-32                3.288n ± 0%    3.308n ± 3%        ~ (p=0.037 n=20)

For #61716.

Change-Id: I342b611456c3569520d3c91c849d29eba325d87e
Reviewed-on: https://go-review.googlesource.com/c/go/+/502504
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rob Pike <r@golang.org>
6 months agomath/rand/v2: remove bias in ExpFloat64 and NormFloat64
Branden Brown [Sat, 5 Aug 2023 13:24:57 +0000 (09:24 -0400)]
math/rand/v2: remove bias in ExpFloat64 and NormFloat64

The original implementation of the ziggurat algorithm was designed for
32-bit random integer inputs. This necessitated reusing some low-order
bits for the slice selection and the random coordinate, which introduces
statistical bias. The result is that PractRand consistently fails the
math/rand normal and exponential sequences (transformed to uniform)
within 2 GB of variates.

This change adjusts the ziggurat procedures to use 63-bit random inputs,
so that there is no need to reuse bits between the slice and coordinate.
This is sufficient for the normal sequence to survive to 256 GB of
PractRand testing.

An alternative technique is to recalculate the ziggurats to use 1024
rather than 128 or 256 slices to make full use of 64-bit inputs. This
improves the survival of the normal sequence to far beyond 256 GB and
additionally provides a 6% performance improvement due to the improved
rejection procedure efficiency. However, doing so increases the total
size of the ziggurat tables from 4.5 kB to 48 kB.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 2703446c2e.amd64 │           e1bbe739fb.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.337n ± 1%    1.316n ± 2%        ~ (p=0.024 n=20)
GlobalInt64-32                 2.225n ± 2%    2.048n ± 1%   -7.93% (p=0.000 n=20)
GlobalInt64Parallel-32        0.1043n ± 2%   0.1037n ± 1%        ~ (p=0.587 n=20)
GlobalUint64-32                2.058n ± 1%    2.039n ± 2%        ~ (p=0.030 n=20)
GlobalUint64Parallel-32       0.1009n ± 1%   0.1013n ± 1%        ~ (p=0.984 n=20)
Int64-32                       1.719n ± 2%    1.692n ± 2%        ~ (p=0.085 n=20)
Uint64-32                      1.669n ± 1%    1.643n ± 2%        ~ (p=0.049 n=20)
GlobalIntN1000-32              3.321n ± 2%    3.287n ± 1%        ~ (p=0.298 n=20)
IntN1000-32                    2.479n ± 1%    2.678n ± 2%   +8.01% (p=0.000 n=20)
Int64N1000-32                  2.477n ± 1%    2.684n ± 2%   +8.38% (p=0.000 n=20)
Int64N1e8-32                   2.490n ± 1%    2.663n ± 2%   +6.99% (p=0.000 n=20)
Int64N1e9-32                   2.458n ± 1%    2.633n ± 1%   +7.12% (p=0.000 n=20)
Int64N2e9-32                   2.486n ± 2%    2.657n ± 1%   +6.90% (p=0.000 n=20)
Int64N1e18-32                  3.215n ± 2%    3.125n ± 2%   -2.78% (p=0.000 n=20)
Int64N2e18-32                  3.588n ± 2%    3.476n ± 1%   -3.15% (p=0.000 n=20)
Int64N4e18-32                  4.938n ± 2%    4.795n ± 1%   -2.91% (p=0.000 n=20)
Int32N1000-32                  2.673n ± 2%    2.485n ± 2%   -7.02% (p=0.000 n=20)
Int32N1e8-32                   2.631n ± 2%    2.457n ± 1%   -6.63% (p=0.000 n=20)
Int32N1e9-32                   2.628n ± 2%    2.452n ± 1%   -6.70% (p=0.000 n=20)
Int32N2e9-32                   2.684n ± 2%    2.453n ± 1%   -8.61% (p=0.000 n=20)
Float32-32                     2.240n ± 2%    2.254n ± 1%        ~ (p=0.878 n=20)
Float64-32                     2.253n ± 1%    2.262n ± 1%        ~ (p=0.963 n=20)
ExpFloat64-32                  3.677n ± 1%    3.777n ± 2%   +2.71% (p=0.004 n=20)
NormFloat64-32                 3.761n ± 1%    3.606n ± 1%   -4.15% (p=0.000 n=20)
Perm3-32                       33.55n ± 2%    33.12n ± 2%        ~ (p=0.402 n=20)
Perm30-32                      173.2n ± 1%    176.1n ± 1%   +1.67% (p=0.000 n=20)
Perm30ViaShuffle-32            115.9n ± 1%    109.3n ± 1%   -5.69% (p=0.000 n=20)
ShuffleOverhead-32             101.9n ± 1%    112.5n ± 1%  +10.35% (p=0.000 n=20)
Concurrent-32                  2.107n ± 6%    2.099n ± 0%        ~ (p=0.051 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 2703446c2e.arm64 │          e1bbe739fb.arm64           │
                       │      sec/op      │    sec/op     vs base               │
SourceUint64-8                2.275n ± 0%    2.290n ± 1%       ~ (p=0.044 n=20)
GlobalInt64-8                 2.154n ± 1%    2.180n ± 1%       ~ (p=0.068 n=20)
GlobalInt64Parallel-8        0.4298n ± 0%   0.4294n ± 0%       ~ (p=0.079 n=20)
GlobalUint64-8                2.160n ± 1%    2.170n ± 1%       ~ (p=0.129 n=20)
GlobalUint64Parallel-8       0.4286n ± 0%   0.4283n ± 0%       ~ (p=0.350 n=20)
Int64-8                       2.491n ± 1%    2.481n ± 1%       ~ (p=0.330 n=20)
Uint64-8                      2.458n ± 0%    2.464n ± 1%       ~ (p=0.351 n=20)
GlobalIntN1000-8              2.814n ± 2%    2.814n ± 0%       ~ (p=0.325 n=20)
IntN1000-8                    2.933n ± 0%    2.934n ± 2%       ~ (p=0.079 n=20)
Int64N1000-8                  2.962n ± 1%    2.957n ± 1%       ~ (p=0.259 n=20)
Int64N1e8-8                   2.960n ± 1%    2.935n ± 2%       ~ (p=0.276 n=20)
Int64N1e9-8                   2.935n ± 2%    2.935n ± 2%       ~ (p=0.984 n=20)
Int64N2e9-8                   2.934n ± 0%    2.933n ± 4%       ~ (p=0.463 n=20)
Int64N1e18-8                  3.777n ± 1%    3.781n ± 1%       ~ (p=0.516 n=20)
Int64N2e18-8                  4.359n ± 1%    4.362n ± 0%       ~ (p=0.256 n=20)
Int64N4e18-8                  6.536n ± 1%    6.576n ± 1%       ~ (p=0.224 n=20)
Int32N1000-8                  2.937n ± 0%    2.942n ± 2%       ~ (p=0.312 n=20)
Int32N1e8-8                   2.937n ± 1%    2.941n ± 1%       ~ (p=0.463 n=20)
Int32N1e9-8                   2.936n ± 0%    2.938n ± 2%       ~ (p=0.044 n=20)
Int32N2e9-8                   2.938n ± 2%    2.982n ± 2%       ~ (p=0.174 n=20)
Float32-8                     3.441n ± 0%    3.441n ± 0%       ~ (p=0.064 n=20)
Float64-8                     3.441n ± 0%    3.441n ± 0%       ~ (p=0.826 n=20)
ExpFloat64-8                  4.486n ± 0%    4.472n ± 0%  -0.31% (p=0.000 n=20)
NormFloat64-8                 4.721n ± 0%    4.716n ± 0%       ~ (p=0.051 n=20)
Perm3-8                       26.65n ± 0%    26.66n ± 0%       ~ (p=0.080 n=20)
Perm30-8                      143.2n ± 0%    143.3n ± 0%  +0.10% (p=0.000 n=20)
Perm30ViaShuffle-8            143.0n ± 0%    142.9n ± 0%       ~ (p=0.642 n=20)
ShuffleOverhead-8             120.6n ± 1%    121.1n ± 1%  +0.41% (p=0.010 n=20)
Concurrent-8                  2.399n ± 5%    2.379n ± 2%       ~ (p=0.365 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 2703446c2e.386 │           e1bbe739fb.386            │
                        │     sec/op     │    sec/op     vs base               │
SourceUint64-32             2.072n ±  2%    2.087n ± 1%       ~ (p=0.440 n=20)
GlobalInt64-32              3.546n ± 27%    3.538n ± 2%       ~ (p=0.101 n=20)
GlobalInt64Parallel-32     0.3211n ±  0%   0.3207n ± 1%       ~ (p=0.753 n=20)
GlobalUint64-32             3.522n ±  2%    3.543n ± 1%       ~ (p=0.071 n=20)
GlobalUint64Parallel-32    0.3172n ±  0%   0.3170n ± 0%       ~ (p=0.507 n=20)
Int64-32                    2.520n ±  2%    2.548n ± 1%       ~ (p=0.267 n=20)
Uint64-32                   2.581n ±  1%    2.565n ± 2%       ~ (p=0.143 n=20)
GlobalIntN1000-32           6.171n ±  1%    6.300n ± 1%       ~ (p=0.037 n=20)
IntN1000-32                 4.752n ±  2%    4.750n ± 0%       ~ (p=0.984 n=20)
Int64N1000-32               5.429n ±  1%    5.515n ± 2%       ~ (p=0.292 n=20)
Int64N1e8-32                5.469n ±  2%    5.527n ± 0%       ~ (p=0.013 n=20)
Int64N1e9-32                5.489n ±  2%    5.531n ± 2%       ~ (p=0.256 n=20)
Int64N2e9-32                5.492n ±  2%    5.514n ± 2%       ~ (p=0.606 n=20)
Int64N1e18-32               8.927n ±  1%    9.059n ± 1%       ~ (p=0.229 n=20)
Int64N2e18-32               9.622n ±  1%    9.594n ± 1%       ~ (p=0.703 n=20)
Int64N4e18-32               12.03n ±  1%    12.05n ± 2%       ~ (p=0.733 n=20)
Int32N1000-32               4.817n ±  1%    4.840n ± 2%       ~ (p=0.941 n=20)
Int32N1e8-32                4.801n ±  1%    4.832n ± 2%       ~ (p=0.228 n=20)
Int32N1e9-32                4.798n ±  1%    4.815n ± 2%       ~ (p=0.560 n=20)
Int32N2e9-32                4.840n ±  1%    4.813n ± 1%       ~ (p=0.015 n=20)
Float32-32                  10.51n ±  4%    10.90n ± 2%  +3.71% (p=0.007 n=20)
Float64-32                  20.33n ±  3%    20.32n ± 4%       ~ (p=0.566 n=20)
ExpFloat64-32               12.59n ±  2%    12.95n ± 3%  +2.86% (p=0.002 n=20)
NormFloat64-32              7.350n ±  2%    7.570n ± 1%  +2.99% (p=0.007 n=20)
Perm3-32                    39.29n ±  2%    37.80n ± 2%  -3.79% (p=0.000 n=20)
Perm30-32                   219.1n ±  2%    214.0n ± 1%  -2.33% (p=0.002 n=20)
Perm30ViaShuffle-32         189.8n ±  2%    188.7n ± 2%       ~ (p=0.147 n=20)
ShuffleOverhead-32          158.9n ±  2%    160.8n ± 1%       ~ (p=0.176 n=20)
Concurrent-32               3.306n ±  3%    3.288n ± 0%  -0.54% (p=0.005 n=20)

For #61716.

Change-Id: I4c5fe710b310dc075ae21c97d1805bcc20db5050
Reviewed-on: https://go-review.googlesource.com/c/go/+/516275
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
6 months agomath/rand/v2: optimize Float32, Float64
Russ Cox [Tue, 6 Jun 2023 14:49:19 +0000 (10:49 -0400)]
math/rand/v2: optimize Float32, Float64

We realized too late after Go 1 that float64(r.Uint64())/(1<<64)
is not a correct implementation: it occasionally rounds to 1.
The correct implementation is float64(r.Uint64()&(1<<53-1))/(1<<53)
but we couldn't change the implementation for compatibility, so we
changed it to retry only in the "round to 1" cases.

The change to v2 lets us update the algorithm to the simpler,
faster one.

Note that this implementation cannot generate 2⁻⁵⁴, nor 2⁻¹⁰⁰,
nor any of the other numbers between 0 and 2⁻⁵³. A slower algorithm
could shift some of the probability of generating these two boundary
values over to the values in between, but that would be much slower
and not necessarily be better. In particular, the current
implementation has the property that there are uniform gaps between
the possible returned floats, which might help stability. Also, the
result is often scaled and shifted, like Float64()*X+Y. Multiplying by
X>1 would open new gaps, and adding most Y would erase all the
distinctions that were introduced.

The only changes to benchmarks should be in Float32 and Float64.
The other changes remain a cautionary tale.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 4d84a369d1.amd64 │           2703446c2e.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.348n ± 2%    1.337n ± 1%        ~ (p=0.662 n=20)
GlobalInt64-32                 2.082n ± 2%    2.225n ± 2%   +6.87% (p=0.000 n=20)
GlobalInt64Parallel-32        0.1036n ± 1%   0.1043n ± 2%        ~ (p=0.171 n=20)
GlobalUint64-32                2.077n ± 2%    2.058n ± 1%        ~ (p=0.560 n=20)
GlobalUint64Parallel-32       0.1012n ± 1%   0.1009n ± 1%        ~ (p=0.995 n=20)
Int64-32                       1.750n ± 0%    1.719n ± 2%   -1.74% (p=0.000 n=20)
Uint64-32                      1.707n ± 2%    1.669n ± 1%   -2.20% (p=0.000 n=20)
GlobalIntN1000-32              3.192n ± 1%    3.321n ± 2%   +4.04% (p=0.000 n=20)
IntN1000-32                    2.462n ± 2%    2.479n ± 1%        ~ (p=0.417 n=20)
Int64N1000-32                  2.470n ± 1%    2.477n ± 1%        ~ (p=0.664 n=20)
Int64N1e8-32                   2.503n ± 2%    2.490n ± 1%        ~ (p=0.245 n=20)
Int64N1e9-32                   2.487n ± 1%    2.458n ± 1%        ~ (p=0.032 n=20)
Int64N2e9-32                   2.487n ± 1%    2.486n ± 2%        ~ (p=0.507 n=20)
Int64N1e18-32                  3.006n ± 2%    3.215n ± 2%   +6.94% (p=0.000 n=20)
Int64N2e18-32                  3.368n ± 1%    3.588n ± 2%   +6.55% (p=0.000 n=20)
Int64N4e18-32                  4.763n ± 1%    4.938n ± 2%   +3.69% (p=0.000 n=20)
Int32N1000-32                  2.403n ± 1%    2.673n ± 2%  +11.19% (p=0.000 n=20)
Int32N1e8-32                   2.405n ± 1%    2.631n ± 2%   +9.42% (p=0.000 n=20)
Int32N1e9-32                   2.402n ± 2%    2.628n ± 2%   +9.41% (p=0.000 n=20)
Int32N2e9-32                   2.384n ± 1%    2.684n ± 2%  +12.56% (p=0.000 n=20)
Float32-32                     2.641n ± 2%    2.240n ± 2%  -15.18% (p=0.000 n=20)
Float64-32                     2.483n ± 1%    2.253n ± 1%   -9.26% (p=0.000 n=20)
ExpFloat64-32                  3.486n ± 2%    3.677n ± 1%   +5.49% (p=0.000 n=20)
NormFloat64-32                 3.648n ± 1%    3.761n ± 1%   +3.11% (p=0.000 n=20)
Perm3-32                       33.04n ± 1%    33.55n ± 2%        ~ (p=0.180 n=20)
Perm30-32                      171.9n ± 1%    173.2n ± 1%        ~ (p=0.050 n=20)
Perm30ViaShuffle-32            100.3n ± 1%    115.9n ± 1%  +15.55% (p=0.000 n=20)
ShuffleOverhead-32             102.5n ± 1%    101.9n ± 1%        ~ (p=0.266 n=20)
Concurrent-32                  2.101n ± 0%    2.107n ± 6%        ~ (p=0.212 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 4d84a369d1.arm64 │          2703446c2e.arm64           │
                       │      sec/op      │    sec/op     vs base               │
SourceUint64-8                2.261n ± 1%    2.275n ± 0%       ~ (p=0.082 n=20)
GlobalInt64-8                 2.160n ± 1%    2.154n ± 1%       ~ (p=0.490 n=20)
GlobalInt64Parallel-8        0.4299n ± 0%   0.4298n ± 0%       ~ (p=0.663 n=20)
GlobalUint64-8                2.169n ± 1%    2.160n ± 1%       ~ (p=0.292 n=20)
GlobalUint64Parallel-8       0.4293n ± 1%   0.4286n ± 0%       ~ (p=0.155 n=20)
Int64-8                       2.473n ± 1%    2.491n ± 1%       ~ (p=0.317 n=20)
Uint64-8                      2.453n ± 1%    2.458n ± 0%       ~ (p=0.941 n=20)
GlobalIntN1000-8              2.814n ± 2%    2.814n ± 2%       ~ (p=0.972 n=20)
IntN1000-8                    2.933n ± 2%    2.933n ± 0%       ~ (p=0.287 n=20)
Int64N1000-8                  2.934n ± 2%    2.962n ± 1%       ~ (p=0.062 n=20)
Int64N1e8-8                   2.935n ± 2%    2.960n ± 1%       ~ (p=0.183 n=20)
Int64N1e9-8                   2.934n ± 2%    2.935n ± 2%       ~ (p=0.367 n=20)
Int64N2e9-8                   2.935n ± 2%    2.934n ± 0%       ~ (p=0.455 n=20)
Int64N1e18-8                  3.778n ± 1%    3.777n ± 1%       ~ (p=0.995 n=20)
Int64N2e18-8                  4.359n ± 1%    4.359n ± 1%       ~ (p=0.122 n=20)
Int64N4e18-8                  6.546n ± 1%    6.536n ± 1%       ~ (p=0.920 n=20)
Int32N1000-8                  2.940n ± 2%    2.937n ± 0%       ~ (p=0.149 n=20)
Int32N1e8-8                   2.937n ± 2%    2.937n ± 1%       ~ (p=0.620 n=20)
Int32N1e9-8                   2.938n ± 0%    2.936n ± 0%       ~ (p=0.046 n=20)
Int32N2e9-8                   2.938n ± 2%    2.938n ± 2%       ~ (p=0.455 n=20)
Float32-8                     3.486n ± 0%    3.441n ± 0%  -1.28% (p=0.000 n=20)
Float64-8                     3.480n ± 0%    3.441n ± 0%  -1.13% (p=0.000 n=20)
ExpFloat64-8                  4.533n ± 0%    4.486n ± 0%  -1.03% (p=0.000 n=20)
NormFloat64-8                 4.764n ± 0%    4.721n ± 0%  -0.90% (p=0.000 n=20)
Perm3-8                       26.66n ± 0%    26.65n ± 0%       ~ (p=0.019 n=20)
Perm30-8                      143.4n ± 0%    143.2n ± 0%  -0.17% (p=0.000 n=20)
Perm30ViaShuffle-8            142.9n ± 0%    143.0n ± 0%       ~ (p=0.522 n=20)
ShuffleOverhead-8             120.7n ± 0%    120.6n ± 1%       ~ (p=0.488 n=20)
Concurrent-8                  2.360n ± 2%    2.399n ± 5%       ~ (p=0.062 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 4d84a369d1.386 │            2703446c2e.386             │
                        │     sec/op     │    sec/op      vs base                │
SourceUint64-32              2.101n ± 2%    2.072n ±  2%        ~ (p=0.273 n=20)
GlobalInt64-32               3.518n ± 2%    3.546n ± 27%   +0.78% (p=0.007 n=20)
GlobalInt64Parallel-32      0.3206n ± 0%   0.3211n ±  0%        ~ (p=0.386 n=20)
GlobalUint64-32              3.538n ± 1%    3.522n ±  2%        ~ (p=0.331 n=20)
GlobalUint64Parallel-32     0.3231n ± 0%   0.3172n ±  0%   -1.84% (p=0.000 n=20)
Int64-32                     2.554n ± 2%    2.520n ±  2%        ~ (p=0.465 n=20)
Uint64-32                    2.575n ± 2%    2.581n ±  1%        ~ (p=0.213 n=20)
GlobalIntN1000-32            6.292n ± 1%    6.171n ±  1%        ~ (p=0.015 n=20)
IntN1000-32                  4.735n ± 1%    4.752n ±  2%        ~ (p=0.635 n=20)
Int64N1000-32                5.489n ± 2%    5.429n ±  1%        ~ (p=0.324 n=20)
Int64N1e8-32                 5.528n ± 2%    5.469n ±  2%        ~ (p=0.013 n=20)
Int64N1e9-32                 5.438n ± 2%    5.489n ±  2%        ~ (p=0.984 n=20)
Int64N2e9-32                 5.474n ± 1%    5.492n ±  2%        ~ (p=0.616 n=20)
Int64N1e18-32                9.053n ± 1%    8.927n ±  1%        ~ (p=0.037 n=20)
Int64N2e18-32                9.685n ± 2%    9.622n ±  1%        ~ (p=0.449 n=20)
Int64N4e18-32                12.18n ± 1%    12.03n ±  1%        ~ (p=0.013 n=20)
Int32N1000-32                4.862n ± 1%    4.817n ±  1%   -0.94% (p=0.002 n=20)
Int32N1e8-32                 4.758n ± 2%    4.801n ±  1%        ~ (p=0.597 n=20)
Int32N1e9-32                 4.772n ± 1%    4.798n ±  1%        ~ (p=0.774 n=20)
Int32N2e9-32                 4.847n ± 0%    4.840n ±  1%        ~ (p=0.867 n=20)
Float32-32                   22.18n ± 4%    10.51n ±  4%  -52.61% (p=0.000 n=20)
Float64-32                   21.21n ± 3%    20.33n ±  3%   -4.17% (p=0.000 n=20)
ExpFloat64-32                12.39n ± 2%    12.59n ±  2%        ~ (p=0.139 n=20)
NormFloat64-32               7.422n ± 1%    7.350n ±  2%        ~ (p=0.208 n=20)
Perm3-32                     38.00n ± 2%    39.29n ±  2%   +3.38% (p=0.000 n=20)
Perm30-32                    212.7n ± 1%    219.1n ±  2%   +3.03% (p=0.001 n=20)
Perm30ViaShuffle-32          187.5n ± 2%    189.8n ±  2%        ~ (p=0.457 n=20)
ShuffleOverhead-32           159.7n ± 1%    158.9n ±  2%        ~ (p=0.920 n=20)
Concurrent-32                3.470n ± 0%    3.306n ±  3%   -4.71% (p=0.000 n=20)

For #61716.

Change-Id: I1933f1f9efd7e6e832d83e7fa5d84398f67d41f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/502503
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: add, optimize N, UintN, Uint32N, Uint64N
Russ Cox [Tue, 6 Jun 2023 13:43:20 +0000 (09:43 -0400)]
math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N

Now that we can break the value stream, we can take advantage
of better algorithms that have been suggested since the original
code was written.

Also optimizes IntN, Int32N, Int64N, Perm (indirectly).

All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now
return the same values given a Source and parameter n, so that
for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10))
are completely interchangeable.

Int64N4e18 gets slower but that is a near worst case for
the algorithm and is extremely unlikely in practice.

32-bit Int32N variants got slower too, by 15-30%, in exchange
for speeding up everything on 64-bit systems and consistency
across the N functions.

Also rename previously missed benchmark
GlobalInt63Parallel to GlobalInt64Parallel.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.amd64 │            4d84a369d1.amd64            │
                        │      sec/op      │    sec/op     vs base                  │
SourceUint64-32                1.335n ± 1%    1.348n ± 2%        ~ (p=0.335 n=20)
GlobalInt64-32                 2.046n ± 1%    2.082n ± 2%        ~ (p=0.310 n=20)
GlobalInt63Parallel-32        0.1037n ± 1%
GlobalInt64Parallel-32                       0.1036n ± 1%
GlobalUint64-32                2.075n ± 0%    2.077n ± 2%        ~ (p=0.228 n=20)
GlobalUint64Parallel-32       0.1013n ± 1%   0.1012n ± 1%        ~ (p=0.878 n=20)
Int64-32                       1.726n ± 2%    1.750n ± 0%   +1.39% (p=0.000 n=20)
Uint64-32                      1.673n ± 1%    1.707n ± 2%   +2.03% (p=0.002 n=20)
GlobalIntN1000-32              3.895n ± 2%    3.192n ± 1%  -18.05% (p=0.000 n=20)
IntN1000-32                    3.403n ± 1%    2.462n ± 2%  -27.65% (p=0.000 n=20)
Int64N1000-32                  3.053n ± 2%    2.470n ± 1%  -19.11% (p=0.000 n=20)
Int64N1e8-32                   2.718n ± 1%    2.503n ± 2%   -7.91% (p=0.000 n=20)
Int64N1e9-32                   2.712n ± 1%    2.487n ± 1%   -8.31% (p=0.000 n=20)
Int64N2e9-32                   2.690n ± 1%    2.487n ± 1%   -7.57% (p=0.000 n=20)
Int64N1e18-32                  3.084n ± 2%    3.006n ± 2%   -2.53% (p=0.000 n=20)
Int64N2e18-32                  4.026n ± 1%    3.368n ± 1%  -16.33% (p=0.000 n=20)
Int64N4e18-32                  4.049n ± 2%    4.763n ± 1%  +17.62% (p=0.000 n=20)
Int32N1000-32                  2.730n ± 0%    2.403n ± 1%  -11.94% (p=0.000 n=20)
Int32N1e8-32                   2.916n ± 2%    2.405n ± 1%  -17.53% (p=0.000 n=20)
Int32N1e9-32                   3.375n ± 1%    2.402n ± 2%  -28.83% (p=0.000 n=20)
Int32N2e9-32                   3.292n ± 1%    2.384n ± 1%  -27.58% (p=0.000 n=20)
Float32-32                     2.673n ± 1%    2.641n ± 2%        ~ (p=0.147 n=20)
Float64-32                     2.485n ± 1%    2.483n ± 1%        ~ (p=0.804 n=20)
ExpFloat64-32                  3.577n ± 2%    3.486n ± 2%   -2.57% (p=0.000 n=20)
NormFloat64-32                 3.797n ± 2%    3.648n ± 1%   -3.92% (p=0.000 n=20)
Perm3-32                       35.79n ± 2%    33.04n ± 1%   -7.68% (p=0.000 n=20)
Perm30-32                      205.1n ± 1%    171.9n ± 1%  -16.14% (p=0.000 n=20)
Perm30ViaShuffle-32            111.2n ± 2%    100.3n ± 1%   -9.76% (p=0.000 n=20)
ShuffleOverhead-32             100.5n ± 2%    102.5n ± 1%   +1.99% (p=0.007 n=20)
Concurrent-32                  2.188n ± 5%    2.101n ± 0%        ~ (p=0.013 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 11ad9fdddc.arm64 │            4d84a369d1.arm64            │
                       │      sec/op      │    sec/op     vs base                  │
SourceUint64-8                2.272n ± 1%    2.261n ± 1%        ~ (p=0.172 n=20)
GlobalInt64-8                 2.155n ± 1%    2.160n ± 1%        ~ (p=0.482 n=20)
GlobalInt63Parallel-8        0.4352n ± 0%
GlobalInt64Parallel-8                       0.4299n ± 0%
GlobalUint64-8                2.173n ± 1%    2.169n ± 1%        ~ (p=0.262 n=20)
GlobalUint64Parallel-8       0.4340n ± 0%   0.4293n ± 1%   -1.08% (p=0.000 n=20)
Int64-8                       2.544n ± 1%    2.473n ± 1%   -2.83% (p=0.000 n=20)
Uint64-8                      2.552n ± 1%    2.453n ± 1%   -3.90% (p=0.000 n=20)
GlobalIntN1000-8              3.856n ± 0%    2.814n ± 2%  -27.02% (p=0.000 n=20)
IntN1000-8                    3.820n ± 0%    2.933n ± 2%  -23.22% (p=0.000 n=20)
Int64N1000-8                  3.219n ± 2%    2.934n ± 2%   -8.85% (p=0.000 n=20)
Int64N1e8-8                   3.221n ± 2%    2.935n ± 2%   -8.91% (p=0.000 n=20)
Int64N1e9-8                   3.276n ± 2%    2.934n ± 2%  -10.44% (p=0.000 n=20)
Int64N2e9-8                   3.217n ± 0%    2.935n ± 2%   -8.78% (p=0.000 n=20)
Int64N1e18-8                  3.502n ± 2%    3.778n ± 1%   +7.91% (p=0.000 n=20)
Int64N2e18-8                  4.968n ± 1%    4.359n ± 1%  -12.26% (p=0.000 n=20)
Int64N4e18-8                  4.963n ± 0%    6.546n ± 1%  +31.92% (p=0.000 n=20)
Int32N1000-8                  3.189n ± 1%    2.940n ± 2%   -7.81% (p=0.000 n=20)
Int32N1e8-8                   3.514n ± 1%    2.937n ± 2%  -16.41% (p=0.000 n=20)
Int32N1e9-8                   4.133n ± 0%    2.938n ± 0%  -28.91% (p=0.000 n=20)
Int32N2e9-8                   4.137n ± 0%    2.938n ± 2%  -28.97% (p=0.000 n=20)
Float32-8                     3.468n ± 1%    3.486n ± 0%   +0.52% (p=0.000 n=20)
Float64-8                     3.478n ± 0%    3.480n ± 0%        ~ (p=0.063 n=20)
ExpFloat64-8                  4.563n ± 0%    4.533n ± 0%   -0.67% (p=0.000 n=20)
NormFloat64-8                 4.768n ± 0%    4.764n ± 0%   -0.07% (p=0.001 n=20)
Perm3-8                       28.94n ± 0%    26.66n ± 0%   -7.88% (p=0.000 n=20)
Perm30-8                      175.9n ± 0%    143.4n ± 0%  -18.50% (p=0.000 n=20)
Perm30ViaShuffle-8            152.6n ± 1%    142.9n ± 0%   -6.29% (p=0.000 n=20)
ShuffleOverhead-8             119.6n ± 1%    120.7n ± 0%   +0.96% (p=0.000 n=20)
Concurrent-8                  2.452n ± 3%    2.360n ± 2%   -3.73% (p=0.007 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.386 │             4d84a369d1.386             │
                        │     sec/op     │    sec/op     vs base                  │
SourceUint64-32              2.091n ± 1%    2.101n ± 2%        ~ (p=0.672 n=20)
GlobalInt64-32               3.514n ± 2%    3.518n ± 2%        ~ (p=0.723 n=20)
GlobalInt63Parallel-32      0.3197n ± 0%
GlobalInt64Parallel-32                     0.3206n ± 0%
GlobalUint64-32              3.542n ± 1%    3.538n ± 1%        ~ (p=0.304 n=20)
GlobalUint64Parallel-32     0.3218n ± 0%   0.3231n ± 0%        ~ (p=0.071 n=20)
Int64-32                     2.552n ± 2%    2.554n ± 2%        ~ (p=0.693 n=20)
Uint64-32                    2.566n ± 1%    2.575n ± 2%        ~ (p=0.606 n=20)
GlobalIntN1000-32            5.965n ± 2%    6.292n ± 1%   +5.46% (p=0.000 n=20)
IntN1000-32                  4.652n ± 1%    4.735n ± 1%   +1.77% (p=0.000 n=20)
Int64N1000-32               14.485n ± 1%    5.489n ± 2%  -62.11% (p=0.000 n=20)
Int64N1e8-32                14.675n ± 1%    5.528n ± 2%  -62.33% (p=0.000 n=20)
Int64N1e9-32                16.805n ± 2%    5.438n ± 2%  -67.64% (p=0.000 n=20)
Int64N2e9-32                14.515n ± 1%    5.474n ± 1%  -62.28% (p=0.000 n=20)
Int64N1e18-32               16.165n ± 1%    9.053n ± 1%  -44.00% (p=0.000 n=20)
Int64N2e18-32               17.945n ± 2%    9.685n ± 2%  -46.03% (p=0.000 n=20)
Int64N4e18-32                18.35n ± 2%    12.18n ± 1%  -33.62% (p=0.000 n=20)
Int32N1000-32                3.608n ± 1%    4.862n ± 1%  +34.77% (p=0.000 n=20)
Int32N1e8-32                 3.767n ± 1%    4.758n ± 2%  +26.31% (p=0.000 n=20)
Int32N1e9-32                 4.130n ± 2%    4.772n ± 1%  +15.54% (p=0.000 n=20)
Int32N2e9-32                 4.206n ± 1%    4.847n ± 0%  +15.24% (p=0.000 n=20)
Float32-32                   22.18n ± 4%    22.18n ± 4%        ~ (p=0.195 n=20)
Float64-32                   20.75n ± 4%    21.21n ± 3%        ~ (p=0.394 n=20)
ExpFloat64-32                12.58n ± 3%    12.39n ± 2%        ~ (p=0.032 n=20)
NormFloat64-32               7.920n ± 3%    7.422n ± 1%   -6.29% (p=0.000 n=20)
Perm3-32                     40.27n ± 1%    38.00n ± 2%   -5.65% (p=0.000 n=20)
Perm30-32                    213.2n ± 2%    212.7n ± 1%        ~ (p=0.995 n=20)
Perm30ViaShuffle-32          164.2n ± 2%    187.5n ± 2%  +14.22% (p=0.000 n=20)
ShuffleOverhead-32           134.7n ± 2%    159.7n ± 1%  +18.52% (p=0.000 n=20)
Concurrent-32                3.301n ± 2%    3.470n ± 0%   +5.10% (p=0.000 n=20)

For #61716.

Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/502500
Reviewed-by: Rob Pike <r@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: change Source to use uint64
Russ Cox [Tue, 6 Jun 2023 13:16:34 +0000 (09:16 -0400)]
math/rand/v2: change Source to use uint64

This should make Uint64-using functions faster and leave
other things alone. It is a mystery why so much got faster.
A good cautionary tale not to read too much into minor
jitter in the benchmarks.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.amd64 │           11ad9fdddc.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.555n ± 1%    1.335n ± 1%  -14.15% (p=0.000 n=20)
GlobalInt64-32                 2.071n ± 1%    2.046n ± 1%        ~ (p=0.016 n=20)
GlobalInt63Parallel-32        0.1023n ± 1%   0.1037n ± 1%   +1.37% (p=0.002 n=20)
GlobalUint64-32                5.193n ± 1%    2.075n ± 0%  -60.06% (p=0.000 n=20)
GlobalUint64Parallel-32       0.2341n ± 0%   0.1013n ± 1%  -56.74% (p=0.000 n=20)
Int64-32                       2.056n ± 2%    1.726n ± 2%  -16.10% (p=0.000 n=20)
Uint64-32                      2.077n ± 2%    1.673n ± 1%  -19.46% (p=0.000 n=20)
GlobalIntN1000-32              4.077n ± 2%    3.895n ± 2%   -4.45% (p=0.000 n=20)
IntN1000-32                    3.476n ± 2%    3.403n ± 1%   -2.10% (p=0.000 n=20)
Int64N1000-32                  3.059n ± 1%    3.053n ± 2%        ~ (p=0.131 n=20)
Int64N1e8-32                   2.942n ± 1%    2.718n ± 1%   -7.60% (p=0.000 n=20)
Int64N1e9-32                   2.932n ± 1%    2.712n ± 1%   -7.50% (p=0.000 n=20)
Int64N2e9-32                   2.925n ± 1%    2.690n ± 1%   -8.03% (p=0.000 n=20)
Int64N1e18-32                  3.116n ± 1%    3.084n ± 2%        ~ (p=0.425 n=20)
Int64N2e18-32                  4.067n ± 1%    4.026n ± 1%   -1.02% (p=0.007 n=20)
Int64N4e18-32                  4.054n ± 1%    4.049n ± 2%        ~ (p=0.204 n=20)
Int32N1000-32                  2.951n ± 1%    2.730n ± 0%   -7.49% (p=0.000 n=20)
Int32N1e8-32                   3.102n ± 1%    2.916n ± 2%   -6.03% (p=0.000 n=20)
Int32N1e9-32                   3.535n ± 1%    3.375n ± 1%   -4.54% (p=0.000 n=20)
Int32N2e9-32                   3.514n ± 1%    3.292n ± 1%   -6.30% (p=0.000 n=20)
Float32-32                     2.760n ± 1%    2.673n ± 1%   -3.13% (p=0.000 n=20)
Float64-32                     2.284n ± 1%    2.485n ± 1%   +8.80% (p=0.000 n=20)
ExpFloat64-32                  3.757n ± 1%    3.577n ± 2%   -4.78% (p=0.000 n=20)
NormFloat64-32                 3.837n ± 1%    3.797n ± 2%        ~ (p=0.204 n=20)
Perm3-32                       35.23n ± 2%    35.79n ± 2%        ~ (p=0.298 n=20)
Perm30-32                      208.8n ± 1%    205.1n ± 1%   -1.82% (p=0.000 n=20)
Perm30ViaShuffle-32            111.7n ± 1%    111.2n ± 2%        ~ (p=0.273 n=20)
ShuffleOverhead-32             101.1n ± 1%    100.5n ± 2%        ~ (p=0.878 n=20)
Concurrent-32                  2.108n ± 7%    2.188n ± 5%        ~ (p=0.417 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                       │ 220860f76f.arm64 │           11ad9fdddc.arm64           │
                       │      sec/op      │    sec/op     vs base                │
SourceUint64-8                2.316n ± 1%    2.272n ± 1%   -1.86% (p=0.000 n=20)
GlobalInt64-8                 2.183n ± 1%    2.155n ± 1%        ~ (p=0.122 n=20)
GlobalInt63Parallel-8        0.4331n ± 0%   0.4352n ± 0%   +0.48% (p=0.000 n=20)
GlobalUint64-8                4.377n ± 2%    2.173n ± 1%  -50.35% (p=0.000 n=20)
GlobalUint64Parallel-8       0.9237n ± 0%   0.4340n ± 0%  -53.02% (p=0.000 n=20)
Int64-8                       2.538n ± 1%    2.544n ± 1%        ~ (p=0.189 n=20)
Uint64-8                      2.604n ± 1%    2.552n ± 1%   -1.98% (p=0.000 n=20)
GlobalIntN1000-8              3.857n ± 2%    3.856n ± 0%        ~ (p=0.051 n=20)
IntN1000-8                    3.822n ± 2%    3.820n ± 0%   -0.05% (p=0.001 n=20)
Int64N1000-8                  3.318n ± 0%    3.219n ± 2%   -2.98% (p=0.000 n=20)
Int64N1e8-8                   3.349n ± 1%    3.221n ± 2%   -3.79% (p=0.000 n=20)
Int64N1e9-8                   3.317n ± 2%    3.276n ± 2%   -1.24% (p=0.001 n=20)
Int64N2e9-8                   3.317n ± 2%    3.217n ± 0%   -3.01% (p=0.000 n=20)
Int64N1e18-8                  3.542n ± 1%    3.502n ± 2%   -1.16% (p=0.001 n=20)
Int64N2e18-8                  5.087n ± 0%    4.968n ± 1%   -2.33% (p=0.000 n=20)
Int64N4e18-8                  5.084n ± 0%    4.963n ± 0%   -2.39% (p=0.000 n=20)
Int32N1000-8                  3.208n ± 2%    3.189n ± 1%   -0.58% (p=0.001 n=20)
Int32N1e8-8                   3.610n ± 1%    3.514n ± 1%   -2.67% (p=0.000 n=20)
Int32N1e9-8                   4.235n ± 0%    4.133n ± 0%   -2.40% (p=0.000 n=20)
Int32N2e9-8                   4.229n ± 1%    4.137n ± 0%   -2.19% (p=0.000 n=20)
Float32-8                     3.468n ± 0%    3.468n ± 1%        ~ (p=0.350 n=20)
Float64-8                     3.447n ± 0%    3.478n ± 0%   +0.90% (p=0.000 n=20)
ExpFloat64-8                  4.567n ± 0%    4.563n ± 0%   -0.10% (p=0.002 n=20)
NormFloat64-8                 4.821n ± 0%    4.768n ± 0%   -1.09% (p=0.000 n=20)
Perm3-8                       28.89n ± 0%    28.94n ± 0%   +0.17% (p=0.000 n=20)
Perm30-8                      175.7n ± 0%    175.9n ± 0%   +0.14% (p=0.000 n=20)
Perm30ViaShuffle-8            153.5n ± 0%    152.6n ± 1%        ~ (p=0.010 n=20)
ShuffleOverhead-8             119.8n ± 1%    119.6n ± 1%        ~ (p=0.147 n=20)
Concurrent-8                  2.433n ± 3%    2.452n ± 3%        ~ (p=0.616 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.386 │            11ad9fdddc.386            │
                        │     sec/op     │    sec/op     vs base                │
SourceUint64-32             2.370n ±  1%    2.091n ± 1%  -11.75% (p=0.000 n=20)
GlobalInt64-32              3.569n ±  1%    3.514n ± 2%   -1.56% (p=0.000 n=20)
GlobalInt63Parallel-32     0.3221n ±  1%   0.3197n ± 0%   -0.76% (p=0.000 n=20)
GlobalUint64-32             8.797n ± 10%    3.542n ± 1%  -59.74% (p=0.000 n=20)
GlobalUint64Parallel-32    0.6351n ±  0%   0.3218n ± 0%  -49.33% (p=0.000 n=20)
Int64-32                    2.612n ±  2%    2.552n ± 2%   -2.30% (p=0.000 n=20)
Uint64-32                   3.350n ±  1%    2.566n ± 1%  -23.42% (p=0.000 n=20)
GlobalIntN1000-32           5.892n ±  1%    5.965n ± 2%        ~ (p=0.082 n=20)
IntN1000-32                 4.546n ±  1%    4.652n ± 1%   +2.33% (p=0.000 n=20)
Int64N1000-32               14.59n ±  1%    14.48n ± 1%        ~ (p=0.652 n=20)
Int64N1e8-32                14.76n ±  2%    14.67n ± 1%        ~ (p=0.836 n=20)
Int64N1e9-32                16.57n ±  1%    16.80n ± 2%        ~ (p=0.016 n=20)
Int64N2e9-32                14.54n ±  1%    14.52n ± 1%        ~ (p=0.533 n=20)
Int64N1e18-32               16.14n ±  1%    16.16n ± 1%        ~ (p=0.606 n=20)
Int64N2e18-32               18.10n ±  1%    17.95n ± 2%        ~ (p=0.062 n=20)
Int64N4e18-32               18.65n ±  1%    18.35n ± 2%   -1.61% (p=0.010 n=20)
Int32N1000-32               3.560n ±  1%    3.608n ± 1%   +1.33% (p=0.001 n=20)
Int32N1e8-32                3.770n ±  2%    3.767n ± 1%        ~ (p=0.155 n=20)
Int32N1e9-32                4.098n ±  0%    4.130n ± 2%        ~ (p=0.016 n=20)
Int32N2e9-32                4.179n ±  1%    4.206n ± 1%        ~ (p=0.011 n=20)
Float32-32                  21.18n ±  4%    22.18n ± 4%   +4.70% (p=0.003 n=20)
Float64-32                  20.60n ±  2%    20.75n ± 4%   +0.73% (p=0.000 n=20)
ExpFloat64-32               13.07n ±  0%    12.58n ± 3%   -3.82% (p=0.000 n=20)
NormFloat64-32              7.738n ±  2%    7.920n ± 3%        ~ (p=0.066 n=20)
Perm3-32                    36.73n ±  1%    40.27n ± 1%   +9.65% (p=0.000 n=20)
Perm30-32                   211.9n ±  1%    213.2n ± 2%        ~ (p=0.262 n=20)
Perm30ViaShuffle-32         165.2n ±  1%    164.2n ± 2%        ~ (p=0.029 n=20)
ShuffleOverhead-32          133.9n ±  1%    134.7n ± 2%        ~ (p=0.551 n=20)
Concurrent-32               3.287n ±  2%    3.301n ± 2%        ~ (p=0.330 n=20)

For #61716.

Change-Id: I8d2f73f87dd3603a0c2ff069988938e0957b6904
Reviewed-on: https://go-review.googlesource.com/c/go/+/502499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
6 months agocmd/compile: optimize right shifts of int32 on riscv64
Ubuntu [Fri, 22 Sep 2023 13:14:25 +0000 (13:14 +0000)]
cmd/compile: optimize right shifts of int32 on riscv64

The compiler is currently sign extending 32 bit signed integers to
64 bits before right shifting them using a 64 bit shift instruction.
There's no need to do this as RISC-V has instructions for right
shifting 32 bit signed values (sraw and sraiw) which sign extend
the result of the shift to 64 bits.  Change the compiler so that
it uses sraw and sraiw for shifts of signed 32 bit integers reducing
in most cases the number of instructions needed to perform the shift.

Here are some examples of code sequences that are changed by this
patch:

int32(a) >> 2

  before:

    sll     x5,x10,0x20
    sra     x10,x5,0x22

  after:

    sraw    x10,x10,0x2

int32(v) >> int(s)

  before:

    sext.w  x5,x10
    sltiu   x6,x11,64
    add     x6,x6,-1
    or      x6,x11,x6
    sra     x10,x5,x6

  after:

    sltiu   x5,x11,32
    add     x5,x5,-1
    or      x5,x11,x5
    sraw    x10,x10,x5

int32(v) >> (int(s) & 31)

  before:

    sext.w  x5,x10
    and     x6,x11,63
    sra     x10,x5,x6

after:

    and     x5,x11,31
    sraw    x10,x10,x5

int32(100) >> int(a)

  before:

    bltz    x10,<target address calls runtime.panicshift>
    sltiu   x5,x10,64
    add     x5,x5,-1
    or      x5,x10,x5
    li      x6,100
    sra     x10,x6,x5

  after:

    bltz    x10,<target address calls runtime.panicshift>
    sltiu   x5,x10,32
    add     x5,x5,-1
    or      x5,x10,x5
    li      x6,100
    sraw    x10,x6,x5

int32(v) >> (int(s) & 63)

  before:

    sext.w  x5,x10
    and     x6,x11,63
    sra     x10,x5,x6

  after:

    and     x5,x11,63
    sltiu   x6,x5,32
    add     x6,x6,-1
    or      x5,x5,x6
    sraw    x10,x10,x5

In most cases we eliminate one instruction.  In the case where
we shift a int32 constant by a variable the number of instructions
generated is identical.  A sra is simply replaced by a sraw.  In the
unusual case where we shift right by a variable anded with a constant
> 31 but < 64, we generate two additional instructions.  As this is
an unusual case we do not try to optimize for it.

Some improvements can be seen in some of the existing benchmarks,
notably in the utf8 package which performs right shifts of runes
which are signed 32 bit integers.

                      |  utf8-old   |              utf8-new            |
                      |   sec/op    |   sec/op     vs base             |
EncodeASCIIRune-4       17.68n ± 0%   17.67n ± 0%       ~ (p=0.312 n=10)
EncodeJapaneseRune-4    35.34n ± 0%   34.53n ± 1%  -2.31% (p=0.000 n=10)
AppendASCIIRune-4       3.213n ± 0%   3.213n ± 0%       ~ (p=0.318 n=10)
AppendJapaneseRune-4    36.14n ± 0%   35.35n ± 0%  -2.19% (p=0.000 n=10)
DecodeASCIIRune-4       28.11n ± 0%   27.36n ± 0%  -2.69% (p=0.000 n=10)
DecodeJapaneseRune-4    38.55n ± 0%   38.58n ± 0%       ~ (p=0.612 n=10)

Change-Id: I60a91cbede9ce65597571c7b7dd9943eeb8d3cc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/535115
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Reviewed-by: David Chase <drchase@google.com>
6 months agomath/rand/v2: update benchmarks
Russ Cox [Tue, 6 Jun 2023 16:44:46 +0000 (12:44 -0400)]
math/rand/v2: update benchmarks

Change the benchmarks to use the result of the calls,
as I found that in certain cases inlining resulted in
discarding part of the computation in the benchmark loop.
Add various benchmarks that will be relevant in future CLs.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.amd64 │
                        │      sec/op      │
SourceUint64-32                1.555n ± 1%
GlobalInt64-32                 2.071n ± 1%
GlobalInt63Parallel-32        0.1023n ± 1%
GlobalUint64-32                5.193n ± 1%
GlobalUint64Parallel-32       0.2341n ± 0%
Int64-32                       2.056n ± 2%
Uint64-32                      2.077n ± 2%
GlobalIntN1000-32              4.077n ± 2%
IntN1000-32                    3.476n ± 2%
Int64N1000-32                  3.059n ± 1%
Int64N1e8-32                   2.942n ± 1%
Int64N1e9-32                   2.932n ± 1%
Int64N2e9-32                   2.925n ± 1%
Int64N1e18-32                  3.116n ± 1%
Int64N2e18-32                  4.067n ± 1%
Int64N4e18-32                  4.054n ± 1%
Int32N1000-32                  2.951n ± 1%
Int32N1e8-32                   3.102n ± 1%
Int32N1e9-32                   3.535n ± 1%
Int32N2e9-32                   3.514n ± 1%
Float32-32                     2.760n ± 1%
Float64-32                     2.284n ± 1%
ExpFloat64-32                  3.757n ± 1%
NormFloat64-32                 3.837n ± 1%
Perm3-32                       35.23n ± 2%
Perm30-32                      208.8n ± 1%
Perm30ViaShuffle-32            111.7n ± 1%
ShuffleOverhead-32             101.1n ± 1%
Concurrent-32                  2.108n ± 7%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 220860f76f.arm64 │
                       │      sec/op      │
SourceUint64-8                2.316n ± 1%
GlobalInt64-8                 2.183n ± 1%
GlobalInt63Parallel-8        0.4331n ± 0%
GlobalUint64-8                4.377n ± 2%
GlobalUint64Parallel-8       0.9237n ± 0%
Int64-8                       2.538n ± 1%
Uint64-8                      2.604n ± 1%
GlobalIntN1000-8              3.857n ± 2%
IntN1000-8                    3.822n ± 2%
Int64N1000-8                  3.318n ± 0%
Int64N1e8-8                   3.349n ± 1%
Int64N1e9-8                   3.317n ± 2%
Int64N2e9-8                   3.317n ± 2%
Int64N1e18-8                  3.542n ± 1%
Int64N2e18-8                  5.087n ± 0%
Int64N4e18-8                  5.084n ± 0%
Int32N1000-8                  3.208n ± 2%
Int32N1e8-8                   3.610n ± 1%
Int32N1e9-8                   4.235n ± 0%
Int32N2e9-8                   4.229n ± 1%
Float32-8                     3.468n ± 0%
Float64-8                     3.447n ± 0%
ExpFloat64-8                  4.567n ± 0%
NormFloat64-8                 4.821n ± 0%
Perm3-8                       28.89n ± 0%
Perm30-8                      175.7n ± 0%
Perm30ViaShuffle-8            153.5n ± 0%
ShuffleOverhead-8             119.8n ± 1%
Concurrent-8                  2.433n ± 3%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.386 │
                        │     sec/op     │
SourceUint64-32             2.370n ±  1%
GlobalInt64-32              3.569n ±  1%
GlobalInt63Parallel-32     0.3221n ±  1%
GlobalUint64-32             8.797n ± 10%
GlobalUint64Parallel-32    0.6351n ±  0%
Int64-32                    2.612n ±  2%
Uint64-32                   3.350n ±  1%
GlobalIntN1000-32           5.892n ±  1%
IntN1000-32                 4.546n ±  1%
Int64N1000-32               14.59n ±  1%
Int64N1e8-32                14.76n ±  2%
Int64N1e9-32                16.57n ±  1%
Int64N2e9-32                14.54n ±  1%
Int64N1e18-32               16.14n ±  1%
Int64N2e18-32               18.10n ±  1%
Int64N4e18-32               18.65n ±  1%
Int32N1000-32               3.560n ±  1%
Int32N1e8-32                3.770n ±  2%
Int32N1e9-32                4.098n ±  0%
Int32N2e9-32                4.179n ±  1%
Float32-32                  21.18n ±  4%
Float64-32                  20.60n ±  2%
ExpFloat64-32               13.07n ±  0%
NormFloat64-32              7.738n ±  2%
Perm3-32                    36.73n ±  1%
Perm30-32                   211.9n ±  1%
Perm30ViaShuffle-32         165.2n ±  1%
ShuffleOverhead-32          133.9n ±  1%
Concurrent-32               3.287n ±  2%

For #61716.

Change-Id: I2f0938eae4b7bf736a8cd899a99783e731bf2179
Reviewed-on: https://go-review.googlesource.com/c/go/+/502496
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: remove Rand.Seed
Russ Cox [Tue, 6 Jun 2023 13:13:57 +0000 (09:13 -0400)]
math/rand/v2: remove Rand.Seed

Removing Rand.Seed lets us remove lockedSource as well,
along with the ambiguity in globalRand about which source
to use.

For #61716.

Change-Id: Ibe150520dd1e7dd87165eacaebe9f0c2daeaedfd
Reviewed-on: https://go-review.googlesource.com/c/go/+/502498
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>

6 months agomath/rand/v2: clean up regression test
Russ Cox [Sun, 6 Aug 2023 04:06:18 +0000 (00:06 -0400)]
math/rand/v2: clean up regression test

Add more test cases.
Replace -printgolden with -update,
which rewrites the files for us.

For #61716.

Change-Id: I7c4c900ee896042429135a21971a56ebe16b6a66
Reviewed-on: https://go-review.googlesource.com/c/go/+/516858
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: remove Read
Russ Cox [Tue, 6 Jun 2023 12:53:54 +0000 (08:53 -0400)]
math/rand/v2: remove Read

In math/rand, Read is deprecated. Remove in v2.
People should use crypto/rand if they need long strings.

For #61716.

Change-Id: Ib254b7e1844616e96db60a3a7abb572b0dcb1583
Reviewed-on: https://go-review.googlesource.com/c/go/+/502497
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: rename various functions
Russ Cox [Sun, 6 Aug 2023 03:49:01 +0000 (23:49 -0400)]
math/rand/v2: rename various functions

Int31 -> Int32
Int31n -> Int32N
Int63 -> Int64
Int63n -> Int64N
Intn -> IntN

The 31 and 63 are pedantic and confusing: the functions should
be named for the type they return, same as all the others.

The lower-case n is inconsistent with Go's usual CamelCase
and especially problematic because we plan to add 'func N'.
Capitalize the n.

For #61716.

Change-Id: Idb1a005a82f353677450d47fb612ade7a41fde69
Reviewed-on: https://go-review.googlesource.com/c/go/+/516857
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agomath/rand/v2: start of new API
Russ Cox [Tue, 6 Jun 2023 12:46:45 +0000 (08:46 -0400)]
math/rand/v2: start of new API

This is the beginning of the math/rand/v2 package from proposal #61716.
Start by copying old API. This CL copies math/rand/* to math/rand/v2
and updates references to math/rand to add v2 throughout.
Later CLs will make the v2 changes.

For #61716.

Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
6 months agocmd/compile: rework TestPGOHash to not rebuild dependencies
Cherry Mui [Tue, 17 Oct 2023 02:36:57 +0000 (22:36 -0400)]
cmd/compile: rework TestPGOHash to not rebuild dependencies

TestPGOHash may rebuild dependencies as we pass -trimpath to the
go command. This CL makes it pass -trimpath compiler flag to only
the current package instead, as we only need the current package
to have a stable source file path.

Also refactor buildPGOInliningTest to only take compiler flags,
not go flags, to avoid accidental rebuild.

Should fix #63733.

Change-Id: Iec6c4e90cf659790e21083ee2e697f518234c5b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/535915
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agointernal/testenv: use cmd.Environ in CleanCmdEnv
Cherry Mui [Fri, 27 Oct 2023 16:30:53 +0000 (12:30 -0400)]
internal/testenv: use cmd.Environ in CleanCmdEnv

In CleanCmdEnv, use cmd.Environ instead of os.Environ, so it
sets the PWD environment variable if cmd.Dir is set. This ensures
the child process sees a canonical path for its working directory.

Change-Id: Ia769552a488dc909eaf6bb7d21937adba06d1072
Reviewed-on: https://go-review.googlesource.com/c/go/+/538215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
6 months agobytes,internal/bytealg: eliminate HashStrBytes,HashStrRevBytes using …
Jes Cok [Fri, 27 Oct 2023 15:01:35 +0000 (15:01 +0000)]
bytes,internal/bytealg: eliminate HashStrBytes,HashStrRevBytes using …

…generics

The logic of HashStrBytes, HashStrRevBytes and HashStr, HashStrRev,
are exactly the same, except that the types are different.

Since the bootstrap toolchain is bumped to 1.20, we can eliminate them
by using generics.

Change-Id: I4336b1cab494ba963f09646c169b45f6b1ee62e3
GitHub-Last-Rev: b11a2bf9476d54bed4bd18a3f9269b5c95a66d67
GitHub-Pull-Request: golang/go#63766
Reviewed-on: https://go-review.googlesource.com/c/go/+/538175
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

6 months agoruntime: print a stack trace at "morestack on g0"
Cherry Mui [Mon, 28 Aug 2023 18:57:29 +0000 (14:57 -0400)]
runtime: print a stack trace at "morestack on g0"

Error like "morestack on g0" is one of the errors that is very
hard to debug, because often it doesn't print a useful stack trace.
The runtime doesn't directly print a stack trace because it is
a bad stack state to call print. Sometimes the SIGABRT may trigger
a traceback, but sometimes not especially in a cgo binary. Even if
it triggers a traceback it often does not include the stack trace
of the bad stack.

This CL makes it explicitly print a stack trace and throw. The
idea is to have some space as an "emergency" crash stack. When the
stack is in a really bad state, we switch to the crash stack and
do a traceback.

Currently only implemented on AMD64 and ARM64.

TODO: also handle errors like "morestack on gsignal" and bad
systemstack. Also handle other architectures.

Change-Id: Ibfc397202f2bb0737c5cbe99f2763de83301c1c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/419435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>