]> Cypherpunks.ru repositories - gostls13.git/commit
math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N
authorRuss Cox <rsc@golang.org>
Tue, 6 Jun 2023 13:43:20 +0000 (09:43 -0400)
committerGopher Robot <gobot@golang.org>
Mon, 30 Oct 2023 17:08:37 +0000 (17:08 +0000)
commitc26658784622e486c828c58403f0a58941d1e856
tree3adadf78c3976db2a5abcfb79554ca028cf01c7a
parentc7dddb02d334e25d4cb7c227f2eaeaffd0d88a23
math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N

Now that we can break the value stream, we can take advantage
of better algorithms that have been suggested since the original
code was written.

Also optimizes IntN, Int32N, Int64N, Perm (indirectly).

All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now
return the same values given a Source and parameter n, so that
for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10))
are completely interchangeable.

Int64N4e18 gets slower but that is a near worst case for
the algorithm and is extremely unlikely in practice.

32-bit Int32N variants got slower too, by 15-30%, in exchange
for speeding up everything on 64-bit systems and consistency
across the N functions.

Also rename previously missed benchmark
GlobalInt63Parallel to GlobalInt64Parallel.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.amd64 │            4d84a369d1.amd64            │
                        │      sec/op      │    sec/op     vs base                  │
SourceUint64-32                1.335n ± 1%    1.348n ± 2%        ~ (p=0.335 n=20)
GlobalInt64-32                 2.046n ± 1%    2.082n ± 2%        ~ (p=0.310 n=20)
GlobalInt63Parallel-32        0.1037n ± 1%
GlobalInt64Parallel-32                       0.1036n ± 1%
GlobalUint64-32                2.075n ± 0%    2.077n ± 2%        ~ (p=0.228 n=20)
GlobalUint64Parallel-32       0.1013n ± 1%   0.1012n ± 1%        ~ (p=0.878 n=20)
Int64-32                       1.726n ± 2%    1.750n ± 0%   +1.39% (p=0.000 n=20)
Uint64-32                      1.673n ± 1%    1.707n ± 2%   +2.03% (p=0.002 n=20)
GlobalIntN1000-32              3.895n ± 2%    3.192n ± 1%  -18.05% (p=0.000 n=20)
IntN1000-32                    3.403n ± 1%    2.462n ± 2%  -27.65% (p=0.000 n=20)
Int64N1000-32                  3.053n ± 2%    2.470n ± 1%  -19.11% (p=0.000 n=20)
Int64N1e8-32                   2.718n ± 1%    2.503n ± 2%   -7.91% (p=0.000 n=20)
Int64N1e9-32                   2.712n ± 1%    2.487n ± 1%   -8.31% (p=0.000 n=20)
Int64N2e9-32                   2.690n ± 1%    2.487n ± 1%   -7.57% (p=0.000 n=20)
Int64N1e18-32                  3.084n ± 2%    3.006n ± 2%   -2.53% (p=0.000 n=20)
Int64N2e18-32                  4.026n ± 1%    3.368n ± 1%  -16.33% (p=0.000 n=20)
Int64N4e18-32                  4.049n ± 2%    4.763n ± 1%  +17.62% (p=0.000 n=20)
Int32N1000-32                  2.730n ± 0%    2.403n ± 1%  -11.94% (p=0.000 n=20)
Int32N1e8-32                   2.916n ± 2%    2.405n ± 1%  -17.53% (p=0.000 n=20)
Int32N1e9-32                   3.375n ± 1%    2.402n ± 2%  -28.83% (p=0.000 n=20)
Int32N2e9-32                   3.292n ± 1%    2.384n ± 1%  -27.58% (p=0.000 n=20)
Float32-32                     2.673n ± 1%    2.641n ± 2%        ~ (p=0.147 n=20)
Float64-32                     2.485n ± 1%    2.483n ± 1%        ~ (p=0.804 n=20)
ExpFloat64-32                  3.577n ± 2%    3.486n ± 2%   -2.57% (p=0.000 n=20)
NormFloat64-32                 3.797n ± 2%    3.648n ± 1%   -3.92% (p=0.000 n=20)
Perm3-32                       35.79n ± 2%    33.04n ± 1%   -7.68% (p=0.000 n=20)
Perm30-32                      205.1n ± 1%    171.9n ± 1%  -16.14% (p=0.000 n=20)
Perm30ViaShuffle-32            111.2n ± 2%    100.3n ± 1%   -9.76% (p=0.000 n=20)
ShuffleOverhead-32             100.5n ± 2%    102.5n ± 1%   +1.99% (p=0.007 n=20)
Concurrent-32                  2.188n ± 5%    2.101n ± 0%        ~ (p=0.013 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 11ad9fdddc.arm64 │            4d84a369d1.arm64            │
                       │      sec/op      │    sec/op     vs base                  │
SourceUint64-8                2.272n ± 1%    2.261n ± 1%        ~ (p=0.172 n=20)
GlobalInt64-8                 2.155n ± 1%    2.160n ± 1%        ~ (p=0.482 n=20)
GlobalInt63Parallel-8        0.4352n ± 0%
GlobalInt64Parallel-8                       0.4299n ± 0%
GlobalUint64-8                2.173n ± 1%    2.169n ± 1%        ~ (p=0.262 n=20)
GlobalUint64Parallel-8       0.4340n ± 0%   0.4293n ± 1%   -1.08% (p=0.000 n=20)
Int64-8                       2.544n ± 1%    2.473n ± 1%   -2.83% (p=0.000 n=20)
Uint64-8                      2.552n ± 1%    2.453n ± 1%   -3.90% (p=0.000 n=20)
GlobalIntN1000-8              3.856n ± 0%    2.814n ± 2%  -27.02% (p=0.000 n=20)
IntN1000-8                    3.820n ± 0%    2.933n ± 2%  -23.22% (p=0.000 n=20)
Int64N1000-8                  3.219n ± 2%    2.934n ± 2%   -8.85% (p=0.000 n=20)
Int64N1e8-8                   3.221n ± 2%    2.935n ± 2%   -8.91% (p=0.000 n=20)
Int64N1e9-8                   3.276n ± 2%    2.934n ± 2%  -10.44% (p=0.000 n=20)
Int64N2e9-8                   3.217n ± 0%    2.935n ± 2%   -8.78% (p=0.000 n=20)
Int64N1e18-8                  3.502n ± 2%    3.778n ± 1%   +7.91% (p=0.000 n=20)
Int64N2e18-8                  4.968n ± 1%    4.359n ± 1%  -12.26% (p=0.000 n=20)
Int64N4e18-8                  4.963n ± 0%    6.546n ± 1%  +31.92% (p=0.000 n=20)
Int32N1000-8                  3.189n ± 1%    2.940n ± 2%   -7.81% (p=0.000 n=20)
Int32N1e8-8                   3.514n ± 1%    2.937n ± 2%  -16.41% (p=0.000 n=20)
Int32N1e9-8                   4.133n ± 0%    2.938n ± 0%  -28.91% (p=0.000 n=20)
Int32N2e9-8                   4.137n ± 0%    2.938n ± 2%  -28.97% (p=0.000 n=20)
Float32-8                     3.468n ± 1%    3.486n ± 0%   +0.52% (p=0.000 n=20)
Float64-8                     3.478n ± 0%    3.480n ± 0%        ~ (p=0.063 n=20)
ExpFloat64-8                  4.563n ± 0%    4.533n ± 0%   -0.67% (p=0.000 n=20)
NormFloat64-8                 4.768n ± 0%    4.764n ± 0%   -0.07% (p=0.001 n=20)
Perm3-8                       28.94n ± 0%    26.66n ± 0%   -7.88% (p=0.000 n=20)
Perm30-8                      175.9n ± 0%    143.4n ± 0%  -18.50% (p=0.000 n=20)
Perm30ViaShuffle-8            152.6n ± 1%    142.9n ± 0%   -6.29% (p=0.000 n=20)
ShuffleOverhead-8             119.6n ± 1%    120.7n ± 0%   +0.96% (p=0.000 n=20)
Concurrent-8                  2.452n ± 3%    2.360n ± 2%   -3.73% (p=0.007 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.386 │             4d84a369d1.386             │
                        │     sec/op     │    sec/op     vs base                  │
SourceUint64-32              2.091n ± 1%    2.101n ± 2%        ~ (p=0.672 n=20)
GlobalInt64-32               3.514n ± 2%    3.518n ± 2%        ~ (p=0.723 n=20)
GlobalInt63Parallel-32      0.3197n ± 0%
GlobalInt64Parallel-32                     0.3206n ± 0%
GlobalUint64-32              3.542n ± 1%    3.538n ± 1%        ~ (p=0.304 n=20)
GlobalUint64Parallel-32     0.3218n ± 0%   0.3231n ± 0%        ~ (p=0.071 n=20)
Int64-32                     2.552n ± 2%    2.554n ± 2%        ~ (p=0.693 n=20)
Uint64-32                    2.566n ± 1%    2.575n ± 2%        ~ (p=0.606 n=20)
GlobalIntN1000-32            5.965n ± 2%    6.292n ± 1%   +5.46% (p=0.000 n=20)
IntN1000-32                  4.652n ± 1%    4.735n ± 1%   +1.77% (p=0.000 n=20)
Int64N1000-32               14.485n ± 1%    5.489n ± 2%  -62.11% (p=0.000 n=20)
Int64N1e8-32                14.675n ± 1%    5.528n ± 2%  -62.33% (p=0.000 n=20)
Int64N1e9-32                16.805n ± 2%    5.438n ± 2%  -67.64% (p=0.000 n=20)
Int64N2e9-32                14.515n ± 1%    5.474n ± 1%  -62.28% (p=0.000 n=20)
Int64N1e18-32               16.165n ± 1%    9.053n ± 1%  -44.00% (p=0.000 n=20)
Int64N2e18-32               17.945n ± 2%    9.685n ± 2%  -46.03% (p=0.000 n=20)
Int64N4e18-32                18.35n ± 2%    12.18n ± 1%  -33.62% (p=0.000 n=20)
Int32N1000-32                3.608n ± 1%    4.862n ± 1%  +34.77% (p=0.000 n=20)
Int32N1e8-32                 3.767n ± 1%    4.758n ± 2%  +26.31% (p=0.000 n=20)
Int32N1e9-32                 4.130n ± 2%    4.772n ± 1%  +15.54% (p=0.000 n=20)
Int32N2e9-32                 4.206n ± 1%    4.847n ± 0%  +15.24% (p=0.000 n=20)
Float32-32                   22.18n ± 4%    22.18n ± 4%        ~ (p=0.195 n=20)
Float64-32                   20.75n ± 4%    21.21n ± 3%        ~ (p=0.394 n=20)
ExpFloat64-32                12.58n ± 3%    12.39n ± 2%        ~ (p=0.032 n=20)
NormFloat64-32               7.920n ± 3%    7.422n ± 1%   -6.29% (p=0.000 n=20)
Perm3-32                     40.27n ± 1%    38.00n ± 2%   -5.65% (p=0.000 n=20)
Perm30-32                    213.2n ± 2%    212.7n ± 1%        ~ (p=0.995 n=20)
Perm30ViaShuffle-32          164.2n ± 2%    187.5n ± 2%  +14.22% (p=0.000 n=20)
ShuffleOverhead-32           134.7n ± 2%    159.7n ± 1%  +18.52% (p=0.000 n=20)
Concurrent-32                3.301n ± 2%    3.470n ± 0%   +5.10% (p=0.000 n=20)

For #61716.

Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/502500
Reviewed-by: Rob Pike <r@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
api/next/61716.txt
src/math/rand/v2/example_test.go
src/math/rand/v2/export_test.go
src/math/rand/v2/rand.go
src/math/rand/v2/rand_test.go
src/math/rand/v2/regress_test.go