]> Cypherpunks.ru repositories - gostls13.git/commitdiff
hash/crc64: use slicing by 8 when the size is greater or equal than 2k
authorruinan <ruinan.sun@arm.com>
Tue, 18 Oct 2022 01:32:41 +0000 (09:32 +0800)
committerGopher Robot <gobot@golang.org>
Thu, 27 Oct 2022 16:52:41 +0000 (16:52 +0000)
In the previous, we will only use the slicing by 8 look up table when
the data size is greater than 16k, but the threshold seems to be too
large. We may lose some performance for the small data size.
In this CL, we change the threshold to 2k, it shows that the performance
is improved greatly when the data size is 2k ~ 16k.

We did some tests for the Random2K ~ Random16K between various
cores(mostly on x86 and arm architecture). Here are the benchmark and
some results:

1. The benchmark for testing:

func BenchmarkRandom(b *testing.B) {
poly := 0x58993511
b.Run("Random256", func(b *testing.B) {
bench(b, uint64(poly), 256)
})
b.Run("Random512", func(b *testing.B) {
bench(b, uint64(poly), 512)
})
b.Run("Random1KB", func(b *testing.B) {
bench(b, uint64(poly), 1<<10)
})
b.Run("Random2KB", func(b *testing.B) {
bench(b, uint64(poly), 2<<10)
})
b.Run("Random4KB", func(b *testing.B) {
bench(b, uint64(poly), 4<<10)
})
b.Run("Random8KB", func(b *testing.B) {
bench(b, uint64(poly), 8<<10)
})
b.Run("Random16KB", func(b *testing.B) {
bench(b, uint64(poly), 16<<10)
})
}

2. Some results:

Apple silicon M1:
Benchmark                old         new         delta
Random/Random2KB-10      362MB/s     801MB/s    +121.41%
Random/Random4KB-10      360MB/s    1083MB/s    +200.93%
Random/Random8KB-10      359MB/s    1309MB/s    +264.88%
Random/Random16KB-10     358MB/s    1466MB/s    +309.79%

Neoverse N1:
Benchmark                old         new         delta
Random/Random2KB-160     397MB/s     493MB/s     +24.23%
Random/Random4KB-160     397MB/s     742MB/s     +86.86%
Random/Random8KB-160     398MB/s     995MB/s    +150.12%
Random/Random16KB-160    398MB/s    1196MB/s    +200.58%

Silver 4116:
Benchmark                old         new         delta
Random/Random2KB-48      252MB/s     418MB/s     +65.79%
Random/Random4KB-48      253MB/s     621MB/s    +145.72%
Random/Random8KB-48      254MB/s     796MB/s    +213.07%
Random/Random16KB-48     258MB/s     929MB/s    +260.46%

EPYC 7251:
Benchmark                old         new         delta
Random/Random2KB-32      255MB/s     380MB/s     +48.88%
Random/Random4KB-32      255MB/s     561MB/s    +119.73%
Random/Random8KB-32      255MB/s     738MB/s    +189.18%
Random/Random16KB-32     255MB/s     877MB/s    +243.80%

Change-Id: Ib7b4f6826c3edd6f315cac8057d52b6da252a652
Reviewed-on: https://go-review.googlesource.com/c/go/+/445475
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>

src/hash/crc64/crc64.go

index 063c63c6a3b092eee809334822e316c14b79e466..26b2573c8e62730c474e718825c47402b1fe3de9 100644 (file)
@@ -163,7 +163,9 @@ func update(crc uint64, tab *Table, p []byte) uint64 {
                } else if *tab == slicing8TableISO[0] {
                        helperTable = slicing8TableISO
                        // For smaller sizes creating extended table takes too much time
-               } else if len(p) > 16384 {
+               } else if len(p) >= 2048 {
+                       // According to the tests between various x86 and arm CPUs, 2k is a reasonable
+                       // threshold for now. This may change in the future.
                        helperTable = makeSlicingBy8Table(tab)
                } else {
                        break