sort: optimize symMerge performance for blocks with one element

Use direct binary insertion instead of recursive calls to symMerge
when one of the blocks has only one element.
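
The idea can be sketched as follows. This is a minimal illustration on a
plain []int, not the code from this change: the names insertFirst and
insertLast are hypothetical, and the real symMerge works through the
sort.Interface methods Less and Swap. Both helpers locate the insertion
point with a binary search and then move the single element into place
with adjacent swaps, keeping equal elements in their original order.

```go
package main

import "fmt"

// insertFirst handles the case where the left block is the single element
// data[a] (so m == a+1) and data[m:b] is already sorted. It finds the lowest
// index i in [m, b) with data[i] >= data[a], then moves data[a] to i-1.
// Hypothetical helper for illustration only.
func insertFirst(data []int, a, m, b int) {
	i, j := m, b
	for i < j {
		h := int(uint(i+j) >> 1) // midpoint without overflow
		if data[h] < data[a] {
			i = h + 1
		} else {
			j = h
		}
	}
	// Bubble data[a] forward past every element smaller than it.
	for k := a; k < i-1; k++ {
		data[k], data[k+1] = data[k+1], data[k]
	}
}

// insertLast handles the case where the right block is the single element
// data[m] (so b == m+1) and data[a:m] is already sorted. It finds the lowest
// index i in [a, m) with data[i] > data[m], then moves data[m] to i.
// Hypothetical helper for illustration only.
func insertLast(data []int, a, m, b int) {
	i, j := a, m
	for i < j {
		h := int(uint(i+j) >> 1)
		if !(data[m] < data[h]) {
			i = h + 1
		} else {
			j = h
		}
	}
	// Bubble data[m] backward past every element greater than it.
	for k := m; k > i; k-- {
		data[k], data[k-1] = data[k-1], data[k]
	}
}

func main() {
	left := []int{5, 1, 3, 7, 9} // single element 5, then a sorted block
	insertFirst(left, 0, 1, len(left))
	fmt.Println(left) // [1 3 5 7 9]

	right := []int{1, 3, 7, 9, 5} // sorted block, then single element 5
	insertLast(right, 0, 4, len(right))
	fmt.Println(right) // [1 3 5 7 9]
}
```

Each helper does O(log n) comparisons and at most n-1 swaps, with no
recursion, which is the shortcut the description above refers to.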
benchmark                   old ns/op      new ns/op      delta
BenchmarkStableString1K     421999         397629         -5.77%
BenchmarkStableInt1K        123422         120592         -2.29%
BenchmarkStableInt64K       9629094        9620200        -0.09%
BenchmarkStable1e2          123089         120209         -2.34%
BenchmarkStable1e4          39505228       36870029       -6.67%
BenchmarkStable1e6          8196612367     7630840157     -6.90%
Change-Id: I49905a909e8595cfa05920ccf9aa00a8f3036110
Reviewed-on: https://go-review.googlesource.com/2219
Reviewed-by: Robert Griesemer <gri@golang.org>