[release-branch.go1.21] runtime: don't eagerly collapse hugepages
Author:     Michael Anthony Knyszek <mknyszek@google.com>
AuthorDate: Fri, 29 Sep 2023 19:16:38 +0000 (19:16 +0000)
Commit:     Gopher Robot <gobot@golang.org>
CommitDate: Thu, 12 Oct 2023 23:27:00 +0000 (23:27 +0000)
This has caused performance issues in production environments.

MADV_COLLAPSE can go into direct reclaim, but we call it with the heap
lock held. This means the whole process can stall fairly quickly if just
one allocating goroutine ends up in the madvise call, because every
other goroutine that needs the heap lock blocks until
madvise(MADV_COLLAPSE) returns. A similar issue occurred with
madvise(MADV_HUGEPAGE), because that could go into direct reclaim on any
page fault for MADV_HUGEPAGE-marked memory.
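For a sense of what that call looks like, here is a minimal user-space
sketch (assuming golang.org/x/sys/unix exposes MADV_COLLAPSE on Linux,
as recent versions do; the mapping size and error handling are
illustrative, and this is not the runtime's internal sysHugePageCollapse):

    // Minimal sketch: ask the kernel to collapse an anonymous mapping into
    // a huge page with MADV_COLLAPSE. Illustrative only.
    package main

    import (
        "fmt"

        "golang.org/x/sys/unix"
    )

    func main() {
        const size = 2 << 20 // one 2 MiB huge page worth of memory (assumed THP size)

        // Map and touch the memory first: MADV_COLLAPSE requires that some
        // part of the range being collapsed is already paged in.
        mem, err := unix.Mmap(-1, 0, size,
            unix.PROT_READ|unix.PROT_WRITE,
            unix.MAP_ANON|unix.MAP_PRIVATE)
        if err != nil {
            panic(err)
        }
        defer unix.Munmap(mem)
        mem[0] = 1

        // The request is best-effort, but it can still enter direct reclaim
        // and take a long time to return, which is the stall described
        // above when the caller holds the heap lock.
        if err := unix.Madvise(mem, unix.MADV_COLLAPSE); err != nil {
            fmt.Println("collapse not performed:", err)
        }
    }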

My understanding was that the calls to madvise(MADV_COLLAPSE) were
fairly rare, and that its "best-effort" nature would keep it from going
into direct reclaim often, but this was wrong. It tends to be fairly
heavyweight even when it doesn't end up in direct reclaim, and it's
almost certainly not worth it.

Disable it until further notice and let the kernel fully dictate
hugepage policy. The updated scavenger policy is still more hugepage
friendly by delaying scavenging until hugepages are no longer densely
packed, so we don't lose all that much.
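As a rough sketch of that policy (simplified, with hypothetical names
and threshold; the real logic lives in scavengeIndex in
src/runtime/mgcscavenge.go), the idea is to skip chunks that are still
densely packed, since releasing their few free pages would likely split
a kernel-provided huge page for little memory savings:

    // Simplified sketch of density-gated scavenging. The names, threshold,
    // and chunk size are illustrative, not the runtime's actual code.
    package scavsketch

    const (
        pagesPerChunk  = 512  // pages tracked per chunk (illustrative)
        denseThreshold = 0.96 // occupancy above which a chunk counts as dense
    )

    type chunkStats struct {
        inUsePages int // pages currently allocated in the chunk
    }

    // shouldScavenge reports whether releasing this chunk's free pages back
    // to the OS is worthwhile. Densely packed chunks are left alone so any
    // huge pages backing them stay intact.
    func shouldScavenge(c chunkStats) bool {
        occupancy := float64(c.inUsePages) / pagesPerChunk
        return occupancy < denseThreshold
    }

The hunk below in scavengeIndex.alloc is the other half of that
bookkeeping: once a chunk becomes dense, sc.setHugePage() marks it as
likely huge-page backed; the change here only drops the additional
sysHugePageCollapse call at that point.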

The Sweet benchmarks show a minimal difference. A couple of less
realistic benchmarks seem to slow down a bit; they might just be getting
unlucky with what the kernel decides to back with a huge page. Other
benchmarks, on the other hand, improve. Overall, it's a wash.

name                  old time/op            new time/op            delta
BiogoIgor                        13.1s ± 1%             13.2s ± 2%    ~     (p=0.182 n=9+10)
BiogoKrishna                     12.0s ± 1%             12.1s ± 1%  +1.23%  (p=0.002 n=9+10)
BleveIndexBatch100               4.51s ± 4%             4.56s ± 3%    ~     (p=0.393 n=10+10)
EtcdPut                         20.2ms ± 4%            19.8ms ± 2%    ~     (p=0.079 n=10+9)
EtcdSTM                          109ms ± 3%             111ms ± 3%  +1.63%  (p=0.035 n=10+10)
GoBuildKubelet                   31.2s ± 1%             31.3s ± 1%    ~     (p=0.780 n=9+10)
GoBuildKubeletLink               7.77s ± 0%             7.81s ± 2%    ~     (p=0.237 n=8+10)
GoBuildIstioctl                  31.8s ± 1%             31.7s ± 0%    ~     (p=0.136 n=9+9)
GoBuildIstioctlLink              7.88s ± 1%             7.89s ± 1%    ~     (p=0.720 n=9+10)
GoBuildFrontend                  11.7s ± 1%             11.8s ± 1%    ~     (p=0.278 n=10+9)
GoBuildFrontendLink              1.15s ± 4%             1.15s ± 5%    ~     (p=0.387 n=9+9)
GopherLuaKNucleotide             19.7s ± 1%             20.6s ± 0%  +4.48%  (p=0.000 n=10+10)
MarkdownRenderXHTML              194ms ± 3%             196ms ± 3%    ~     (p=0.356 n=9+10)
Tile38QueryLoad                  633µs ± 2%             629µs ± 2%    ~     (p=0.075 n=10+10)

name                  old average-RSS-bytes  new average-RSS-bytes  delta
BiogoIgor                       69.2MB ± 3%            68.4MB ± 1%    ~     (p=0.190 n=10+10)
BiogoKrishna                    4.40GB ± 0%            4.40GB ± 0%    ~     (p=0.605 n=9+9)
BleveIndexBatch100               195MB ± 3%             195MB ± 2%    ~     (p=0.853 n=10+10)
EtcdPut                          107MB ± 4%             108MB ± 3%    ~     (p=0.190 n=10+10)
EtcdSTM                         91.6MB ± 5%            92.6MB ± 4%    ~     (p=0.481 n=10+10)
GoBuildKubelet                  2.26GB ± 1%            2.28GB ± 1%  +1.22%  (p=0.000 n=10+10)
GoBuildIstioctl                 1.53GB ± 0%            1.53GB ± 0%  +0.21%  (p=0.017 n=9+10)
GoBuildFrontend                  556MB ± 1%             554MB ± 2%    ~     (p=0.497 n=9+10)
GopherLuaKNucleotide            39.0MB ± 3%            39.0MB ± 1%    ~     (p=1.000 n=10+8)
MarkdownRenderXHTML             21.2MB ± 2%            21.4MB ± 3%    ~     (p=0.190 n=10+10)
Tile38QueryLoad                 5.99GB ± 2%            6.02GB ± 0%    ~     (p=0.243 n=10+9)

name                  old peak-RSS-bytes     new peak-RSS-bytes     delta
BiogoIgor                       90.2MB ± 4%            89.2MB ± 2%    ~     (p=0.143 n=10+10)
BiogoKrishna                    4.49GB ± 0%            4.49GB ± 0%    ~     (p=0.190 n=10+10)
BleveIndexBatch100               283MB ± 8%             274MB ± 6%    ~     (p=0.075 n=10+10)
EtcdPut                          147MB ± 4%             149MB ± 2%  +1.55%  (p=0.034 n=10+8)
EtcdSTM                          117MB ± 5%             117MB ± 4%    ~     (p=0.905 n=9+10)
GopherLuaKNucleotide            44.9MB ± 1%            44.6MB ± 1%    ~     (p=0.083 n=8+8)
MarkdownRenderXHTML             22.0MB ± 8%            22.1MB ± 9%    ~     (p=0.436 n=10+10)
Tile38QueryLoad                 6.24GB ± 2%            6.29GB ± 2%    ~     (p=0.218 n=10+10)

name                  old peak-VM-bytes      new peak-VM-bytes      delta
BiogoIgor                       1.33GB ± 0%            1.33GB ± 0%    ~     (p=0.504 n=10+9)
BiogoKrishna                    5.77GB ± 0%            5.77GB ± 0%    ~     (p=1.000 n=10+9)
BleveIndexBatch100              3.53GB ± 0%            3.53GB ± 0%    ~     (p=0.642 n=10+10)
EtcdPut                         12.1GB ± 0%            12.1GB ± 0%    ~     (p=0.564 n=10+10)
EtcdSTM                         12.1GB ± 0%            12.1GB ± 0%    ~     (p=0.633 n=10+10)
GopherLuaKNucleotide            1.26GB ± 0%            1.26GB ± 0%    ~     (p=0.297 n=9+10)
MarkdownRenderXHTML             1.26GB ± 0%            1.26GB ± 0%    ~     (p=0.069 n=10+10)
Tile38QueryLoad                 7.47GB ± 2%            7.53GB ± 2%    ~     (p=0.280 n=10+10)

name                  old p50-latency-ns     new p50-latency-ns     delta
EtcdPut                          19.8M ± 5%             19.3M ± 3%  -2.74%  (p=0.043 n=10+9)
EtcdSTM                          81.4M ± 4%             83.4M ± 4%  +2.46%  (p=0.029 n=10+10)
Tile38QueryLoad                   241k ± 1%              240k ± 1%    ~     (p=0.393 n=10+10)

name                  old p90-latency-ns     new p90-latency-ns     delta
EtcdPut                          30.4M ± 5%             30.6M ± 5%    ~     (p=0.971 n=10+10)
EtcdSTM                           222M ± 3%              226M ± 4%    ~     (p=0.063 n=10+10)
Tile38QueryLoad                   687k ± 2%              691k ± 1%    ~     (p=0.173 n=10+8)

name                  old p99-latency-ns     new p99-latency-ns     delta
EtcdPut                          42.3M ±10%             41.4M ± 7%    ~     (p=0.353 n=10+10)
EtcdSTM                           486M ± 7%              487M ± 4%    ~     (p=0.579 n=10+10)
Tile38QueryLoad                  6.43M ± 2%             6.37M ± 3%    ~     (p=0.280 n=10+10)

name                  old ops/s              new ops/s              delta
EtcdPut                          48.6k ± 3%             49.5k ± 2%    ~     (p=0.065 n=10+9)
EtcdSTM                          9.09k ± 2%             8.95k ± 3%  -1.56%  (p=0.045 n=10+10)
Tile38QueryLoad                  28.4k ± 1%             28.6k ± 1%  +0.87%  (p=0.016 n=9+10)

Fixes #63335.
For #63334.
Related to #61718 and #59960.

Change-Id: If84c5a8685825d43c912a71418f2597e44e867e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/531816
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 595deec3dda8e81d514389efdbb4ee2bc38dcabe)
Reviewed-on: https://go-review.googlesource.com/c/go/+/532255
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>

diff --git a/src/runtime/mgcscavenge.go b/src/runtime/mgcscavenge.go
index 4c6d6be4f04b754412fdd2337537d1b957c47bd7..659ca8df2ee9bf59251178bcff208ebc45a1bf6a 100644
--- a/src/runtime/mgcscavenge.go
+++ b/src/runtime/mgcscavenge.go
@@ -1145,21 +1145,11 @@ func (s *scavengeIndex) alloc(ci chunkIdx, npages uint) {
                // Mark that we're considering this chunk as backed by huge pages.
                sc.setHugePage()
 
-               // Collapse dense chunks into huge pages and mark that
-               // we did that, but only if we're not allocating to
-               // use the entire chunk. If we're allocating an entire chunk,
-               // this is likely part of a much bigger allocation. For
-               // instance, if the caller is allocating a 1 GiB slice of bytes, we
-               // don't want to go and manually collapse all those pages; we want
-               // them to be demand-paged. If the caller is actually going to use
-               // all that memory, it'll naturally get backed by huge pages later.
-               //
-               // This also avoids having sysHugePageCollapse fail. On Linux,
-               // the call requires that some part of the huge page being collapsed
-               // is already paged in.
-               if !s.test && npages < pallocChunkPages {
-                       sysHugePageCollapse(unsafe.Pointer(chunkBase(ci)), pallocChunkBytes)
-               }
+               // TODO(mknyszek): Consider eagerly backing memory with huge pages
+               // here. In the past we've attempted to use sysHugePageCollapse
+               // (which uses MADV_COLLAPSE on Linux, and is unsupported elsewhere)
+               // for this purpose, but that caused performance issues in production
+               // environments.
        }
        s.chunks[ci].store(sc)
 }