[release-branch.go1.21] runtime: don't eagerly collapse hugepages
author Michael Anthony Knyszek <mknyszek@google.com>
Fri, 29 Sep 2023 19:16:38 +0000 (19:16 +0000)
committer Gopher Robot <gobot@golang.org>
Thu, 12 Oct 2023 23:27:00 +0000 (23:27 +0000)
commit ef6993f3270c95e1158b11d9fc1a047528b34da6
tree a2e36af713cc2687db08744f7460f759b043c844
parent bae01521f3ab27979b454f2ecc77ff9403965957

Eagerly collapsing hugepages has caused performance issues in production
environments.

MADV_COLLAPSE can go into direct reclaim, but we call it with the heap
lock held. This means that the process could end up stalled fairly
quickly if just one allocating goroutine ends up in the madvise call, at
least until the madvise(MADV_COLLAPSE) call returns. A similar issue
occurred with madvise(MADV_HUGEPAGE), because that could go into direct
reclaim on any page fault for MADV_HUGEPAGE-marked memory.

My understanding was that the calls to madvise(MADV_COLLAPSE) were
fairly rare, and its "best-effort" nature prevented it from going into
direct reclaim often, but this was wrong. It tends to be fairly
heavyweight even when it doesn't end up in direct reclaim, and it's
almost certainly not worth it.

Disable it until further notice and let the kernel fully dictate
hugepage policy. The updated scavenger policy is still more hugepage
friendly by delaying scavenging until hugepages are no longer densely
packed, so we don't lose all that much.
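
As a rough sketch of what "delaying scavenging until hugepages are no
longer densely packed" means, the policy can be thought of as an occupancy
check per region. Everything below is an assumption made for illustration:
the chunk type, the page count per chunk, and the density threshold are
hypothetical and are not taken from the runtime source.

```go
package main

import "fmt"

const (
	// pagesPerChunk is a hypothetical number of runtime pages tracked per chunk.
	pagesPerChunk = 512
	// denseFraction is a hypothetical occupancy above which a chunk is
	// considered densely packed and therefore left alone.
	denseFraction = 0.95
)

// chunk is a hypothetical record of how many pages of a region are in use.
type chunk struct {
	inUse int
}

// denselyPacked reports whether the chunk's occupancy meets the threshold.
func (c chunk) denselyPacked() bool {
	return float64(c.inUse) >= denseFraction*pagesPerChunk
}

// shouldScavenge reports whether free pages in the chunk may be returned
// to the OS. Dense chunks are skipped so that returning a few free pages
// does not break up memory that could stay backed by huge pages.
func shouldScavenge(c chunk) bool {
	return !c.denselyPacked()
}

func main() {
	for _, c := range []chunk{{inUse: 500}, {inUse: 100}} {
		fmt.Printf("inUse=%d scavenge=%v\n", c.inUse, shouldScavenge(c))
	}
}
```

With these assumed constants, a chunk with 500 of 512 pages in use is
skipped, while a chunk with 100 pages in use is eligible for scavenging.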

The Sweet benchmarks show a minimal difference. A couple of less
realistic benchmarks seem to slow down a bit; they might just be getting
unlucky with what the kernel decides to back with a huge page. Some
benchmarks, on the other hand, improve. Overall, it's a wash.

name                  old time/op            new time/op            delta
BiogoIgor                        13.1s ± 1%             13.2s ± 2%    ~     (p=0.182 n=9+10)
BiogoKrishna                     12.0s ± 1%             12.1s ± 1%  +1.23%  (p=0.002 n=9+10)
BleveIndexBatch100               4.51s ± 4%             4.56s ± 3%    ~     (p=0.393 n=10+10)
EtcdPut                         20.2ms ± 4%            19.8ms ± 2%    ~     (p=0.079 n=10+9)
EtcdSTM                          109ms ± 3%             111ms ± 3%  +1.63%  (p=0.035 n=10+10)
GoBuildKubelet                   31.2s ± 1%             31.3s ± 1%    ~     (p=0.780 n=9+10)
GoBuildKubeletLink               7.77s ± 0%             7.81s ± 2%    ~     (p=0.237 n=8+10)
GoBuildIstioctl                  31.8s ± 1%             31.7s ± 0%    ~     (p=0.136 n=9+9)
GoBuildIstioctlLink              7.88s ± 1%             7.89s ± 1%    ~     (p=0.720 n=9+10)
GoBuildFrontend                  11.7s ± 1%             11.8s ± 1%    ~     (p=0.278 n=10+9)
GoBuildFrontendLink              1.15s ± 4%             1.15s ± 5%    ~     (p=0.387 n=9+9)
GopherLuaKNucleotide             19.7s ± 1%             20.6s ± 0%  +4.48%  (p=0.000 n=10+10)
MarkdownRenderXHTML              194ms ± 3%             196ms ± 3%    ~     (p=0.356 n=9+10)
Tile38QueryLoad                  633µs ± 2%             629µs ± 2%    ~     (p=0.075 n=10+10)

name                  old average-RSS-bytes  new average-RSS-bytes  delta
BiogoIgor                       69.2MB ± 3%            68.4MB ± 1%    ~     (p=0.190 n=10+10)
BiogoKrishna                    4.40GB ± 0%            4.40GB ± 0%    ~     (p=0.605 n=9+9)
BleveIndexBatch100               195MB ± 3%             195MB ± 2%    ~     (p=0.853 n=10+10)
EtcdPut                          107MB ± 4%             108MB ± 3%    ~     (p=0.190 n=10+10)
EtcdSTM                         91.6MB ± 5%            92.6MB ± 4%    ~     (p=0.481 n=10+10)
GoBuildKubelet                  2.26GB ± 1%            2.28GB ± 1%  +1.22%  (p=0.000 n=10+10)
GoBuildIstioctl                 1.53GB ± 0%            1.53GB ± 0%  +0.21%  (p=0.017 n=9+10)
GoBuildFrontend                  556MB ± 1%             554MB ± 2%    ~     (p=0.497 n=9+10)
GopherLuaKNucleotide            39.0MB ± 3%            39.0MB ± 1%    ~     (p=1.000 n=10+8)
MarkdownRenderXHTML             21.2MB ± 2%            21.4MB ± 3%    ~     (p=0.190 n=10+10)
Tile38QueryLoad                 5.99GB ± 2%            6.02GB ± 0%    ~     (p=0.243 n=10+9)

name                  old peak-RSS-bytes     new peak-RSS-bytes     delta
BiogoIgor                       90.2MB ± 4%            89.2MB ± 2%    ~     (p=0.143 n=10+10)
BiogoKrishna                    4.49GB ± 0%            4.49GB ± 0%    ~     (p=0.190 n=10+10)
BleveIndexBatch100               283MB ± 8%             274MB ± 6%    ~     (p=0.075 n=10+10)
EtcdPut                          147MB ± 4%             149MB ± 2%  +1.55%  (p=0.034 n=10+8)
EtcdSTM                          117MB ± 5%             117MB ± 4%    ~     (p=0.905 n=9+10)
GopherLuaKNucleotide            44.9MB ± 1%            44.6MB ± 1%    ~     (p=0.083 n=8+8)
MarkdownRenderXHTML             22.0MB ± 8%            22.1MB ± 9%    ~     (p=0.436 n=10+10)
Tile38QueryLoad                 6.24GB ± 2%            6.29GB ± 2%    ~     (p=0.218 n=10+10)

name                  old peak-VM-bytes      new peak-VM-bytes      delta
BiogoIgor                       1.33GB ± 0%            1.33GB ± 0%    ~     (p=0.504 n=10+9)
BiogoKrishna                    5.77GB ± 0%            5.77GB ± 0%    ~     (p=1.000 n=10+9)
BleveIndexBatch100              3.53GB ± 0%            3.53GB ± 0%    ~     (p=0.642 n=10+10)
EtcdPut                         12.1GB ± 0%            12.1GB ± 0%    ~     (p=0.564 n=10+10)
EtcdSTM                         12.1GB ± 0%            12.1GB ± 0%    ~     (p=0.633 n=10+10)
GopherLuaKNucleotide            1.26GB ± 0%            1.26GB ± 0%    ~     (p=0.297 n=9+10)
MarkdownRenderXHTML             1.26GB ± 0%            1.26GB ± 0%    ~     (p=0.069 n=10+10)
Tile38QueryLoad                 7.47GB ± 2%            7.53GB ± 2%    ~     (p=0.280 n=10+10)

name                  old p50-latency-ns     new p50-latency-ns     delta
EtcdPut                          19.8M ± 5%             19.3M ± 3%  -2.74%  (p=0.043 n=10+9)
EtcdSTM                          81.4M ± 4%             83.4M ± 4%  +2.46%  (p=0.029 n=10+10)
Tile38QueryLoad                   241k ± 1%              240k ± 1%    ~     (p=0.393 n=10+10)

name                  old p90-latency-ns     new p90-latency-ns     delta
EtcdPut                          30.4M ± 5%             30.6M ± 5%    ~     (p=0.971 n=10+10)
EtcdSTM                           222M ± 3%              226M ± 4%    ~     (p=0.063 n=10+10)
Tile38QueryLoad                   687k ± 2%              691k ± 1%    ~     (p=0.173 n=10+8)

name                  old p99-latency-ns     new p99-latency-ns     delta
EtcdPut                          42.3M ±10%             41.4M ± 7%    ~     (p=0.353 n=10+10)
EtcdSTM                           486M ± 7%              487M ± 4%    ~     (p=0.579 n=10+10)
Tile38QueryLoad                  6.43M ± 2%             6.37M ± 3%    ~     (p=0.280 n=10+10)

name                  old ops/s              new ops/s              delta
EtcdPut                          48.6k ± 3%             49.5k ± 2%    ~     (p=0.065 n=10+9)
EtcdSTM                          9.09k ± 2%             8.95k ± 3%  -1.56%  (p=0.045 n=10+10)
Tile38QueryLoad                  28.4k ± 1%             28.6k ± 1%  +0.87%  (p=0.016 n=9+10)

Fixes #63335.
For #63334.
Related to #61718 and #59960.

Change-Id: If84c5a8685825d43c912a71418f2597e44e867e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/531816
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 595deec3dda8e81d514389efdbb4ee2bc38dcabe)
Reviewed-on: https://go-review.googlesource.com/c/go/+/532255
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
src/runtime/mgcscavenge.go