Comments (2)
it seems that collection LSCQ use function cacheRemap16Byte to remap index used by SCQ, to do best effort to avoid false-sharing, maybe we could refer to folly UnboundedQueue's index, to remap index with less instructions.
Looks like they are not the same function.
package main
const (
scqsize = 128
cacheLineSize = 64
)
func cacheRemap16ByteA(index uint64) uint64 {
const cacheLineSize = cacheLineSize / 2
rawIndex := index & uint64(scqsize-1)
cacheLineNum := (rawIndex) % (scqsize / uint64(cacheLineSize))
cacheLineIdx := rawIndex / (scqsize / uint64(cacheLineSize))
return cacheLineNum*uint64(cacheLineSize) + cacheLineIdx
}
func cacheRemap16ByteB(index uint64) uint64 {
return (index * 5) & (scqsize - 1)
}
func main() {
for i := uint64(0); i < 10; i++ {
println(cacheRemap16ByteA(i), cacheRemap16ByteB(i))
}
}
Output:
0 0
32 5
64 10
96 15
1 20
33 25
65 30
97 35
2 40
34 45
We can see that 0 32 64 96
is NOT in the same cache line, but 0 5 10 15
is in the same cache line (false sharing).
however, i see no noticeable change in performance via BenchmarkDefault, perhaps because the function is not on the critical path.
The cacheRemap
is an improvement only for multiple-cores(+20%
), for single-core, it is a performance degradation(-15%
). You can run the benchmark with different cores(and disable or enable the cacheRemap
).
from gopkg.
Looks like they are not the same function.
Yes, versionB provides another way to remap index.
We can see that
0 32 64 96
is NOT in the same cache line, but0 5 10 15
is in the same cache line (false sharing).
It seems that the result of cacheRemap16Byte
is used to index [scqsize]scqNodePointer
, and sizeof(scqNodePointer) == 16 B
, so I think 4 * 16 B
gap is enough to make ring[i]
and ring[i+5]
in different cacheline. Besides, versionB is a general method, you can change 5
to any other proper odd number.
The cacheRemap is an improvement only for multiple-cores(+20%), for single-core, it is a performance degradation(-15%). You can run the benchmark with different cores(and disable or enable the cacheRemap).
Yes I know, but the meaning of 「no noticeable change in performance via BenchmarkDefault」 is between cacheRemap16ByteA
and cacheRemap16ByteB
, not between single-core and multi-core for cacheRemap16ByteB
.
Supplement to the above: develop...shadw3002:gopkg:feat/improve-remap
The properties that remap should have
One-to-one map
visited := make([]bool, scqsize)
for i := uint64(0); i < scqsize; i++ {
visited[cacheRemap16Byte(i)] = true
}
for i := uint64(0); i < scqsize; i++ {
if !visited[i] {
panic("unexpected")
}
}
for scqsize = 2^n
, and k = 2*m + 1
, define remap(i)=k*i % scqsize
we can prove that: for any abs(i_1 - i_2) < scqsize
, if i_1 != i_2
, than remap(i_1) != remap(i_2)
.
prove:
- if remap(i_1) == remap(i_2)
- than (2*m + 1) * abs(i_1 - i_2) mod 2^n == 0
- than abs(i_1 - i_2) == x * 2^n >= 2^n == scqsize, which is opposite to
abs(i_1 - i_2) < scqsize
Shuffle index to access different cacheline
just simply let k = cachelineSize / entrySize + 1, for example 5.
var (
pre = -1
preline = -1
nowline = -1
)
for i := 0; i < scqsize+1; i++ {
line := int(cacheRemap16Byte(uint64(0)) / 4)
nowline = line
if preline == nowline {
fmt.Printf("%d %d is in the same line.", pre, i)
}
pre = i
preline = nowline
}
Benchmark
Because only a few instructions are reduced, I did not find the performance difference in the function benchPair
that can best reflect the performance difference in theory.
goos: darwin
goarch: amd64
pkg: github.com/bytedance/gopkg/collection/lscq
cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
Default/Pair/LSCQ-12 45.09n ± 1% 46.64n ± 1% +3.44% (p=0.000 n=100)
However, I ran the benchmark 100 times separately and compared using the benchmarkstat tool, but I even found performance regression, so this improvement may not be necessary
from gopkg.
Related Issues (20)
- how to pass variable to gopool function? HOT 1
- GoPool's performance in comparison to other goroutine pool implementations HOT 12
- gc tunner not Use in not web application
- Can 'gopool' support the extended use case of 'errgroup'? HOT 1
- gopool的RegisterPool使用不了 HOT 1
- util/gctuner: test fail randomly HOT 2
- Broken change of golang.org/x/sys causes build failed in go1.16 HOT 2
- Go version: bump minimum version to 1.17
- 如何动态修改熔断器配置 HOT 1
- missing comments of fastrand.Float32 / fastrand.Float64 methods
- Reporting a vulnerability
- closed
- asynccache的OOM问题
- circuitbreaker Panel 接口添加 State 方法
- logger 中default 的 calldepth = 3 会导致输出Log 无法定位实际输出
- 使用zset(skiplist)时,发现其remove,incr, updateScore比想象中的慢非常多 HOT 1
- gopool中的f func()能否新增可入参的 func()类型? HOT 2
- skipmap在32位系统下运行crash HOT 2
- Test HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gopkg.