Comments (10)
See https://allegro.tech/2016/03/writing-fast-cache-service-in-go.html#omitting-garbage-collector.
from pebble.
https://blog.dgraph.io/post/caching-in-go/ seems relevant, too:
Meanwhile, we came across Caffeine, a Java library used by Cassandra, Finagle and other database systems. It uses TinyLFU, a highly efficient cache admission policy and uses various techniques to scale and perform well as the number of threads and cores grow, while providing a close to optimal hit ratio. You can read more about how it works in this article.
Caffeine meets all the five requirements I mentioned at the beginning, so we are looking into building a Go equivalent of Caffeine, something which can serve our needs and potentially fill the gap of a concurrent, performant, memory-bounded cache in the Go language. Let us know if you’d like to contribute or if you have something similar built already!
from pebble.
Definitely relevant. In my explorations of caching libraries I had also come across Caffeine. Go has nothing similar. It does have go-tinylfu, a Go implementation of the TinyLFU algorithm. It isn't usable as-is, but it could provide inspiration. The buffering of read and write operations in Caffeine is a super interesting idea and looks to be key to its performance: instead of performing LRU/LFU manipulations synchronously, they are buffered and applied in batches.
There doesn't seem to be a good concurrent hash table in Go. Sharding the cache (or even just the hash table) and protecting each shard with a lock is a possibility. The Non-Blocking Hash Table from Azul (Java) looks super interesting. I highly recommend that paper for how it uses a state machine approach to reason about correctness. There is also a video presentation of those slides.
from pebble.
A valuable feature of a cache is to load (compute) the value atomically on a miss, and have other callers block waiting. This is sometimes referred to as memoization. It avoids cache stampedes that overwhelm the underlying resource due to duplicated work.
A lock-free hash table is nifty, but doesn't offer this ability. You definitely want lock-free reads, but fine-grained locking for writes is probably better. Java's rewritten ConcurrentHashMap does this, and users have a lot of flexibility thanks to its various compute methods. Otherwise you have to emulate it yourself using lock striping, which is probably okay and removes any marginal gains from lock-free writes. Since Go lacks a good concurrent hashmap, the choice is probably whichever design is easier to implement.
from pebble.
@ben-manes That is a good point about cache stampedes, though I'm pretty sure RocksDB doesn't attempt to avoid them. I'm struggling to think of a workload that would exhibit a cache stampede problem that would last for more than a second on a storage engine like Pebble / RocksDB. That could just be a failure of imagination, though.
from pebble.
You're probably right. From an application's point of view, a self-populating cache is quite valuable. At the low level it may not be...
from pebble.
I did some experimentation with the Go GC today to see what impact the cache size would have on GC times if we left the cache in Go managed memory.
My experiment was a toy program which allocated a cache of 1 TB composed of Y fixed size blocks. I varied the block size in order to simulate different cache sizes, operating under the assumption that the impact of a large cache is the number of pointers that need to be chased. In order to ensure that GC was running regularly, another goroutine sat in a loop allocating memory.
blocks   gc-cpu(ms)
64k      10
128k     14
256k     16
512k     20
1024k    31
2048k    54
4096k    110
Note that these are not pause times as the Go GC is concurrent and its stop-the-world phase is quite short. The above reflects the tax of a large cache on the CPU available to the program.
CockroachDB uses 32KB blocks. 512k 32KB blocks would fill a 16GB cache, and the CPU tax would be <1% (I'm assuming the total CPU time available is 1s * num-CPUs). This is the best-case scenario, as there is only one pointer per block and GC throughput is governed by the number of pointers that need to be traversed. A real cache would also need a hash table to find blocks, as well as some sort of LRU structure for eviction. So figure 3 pointers per cached block, perhaps more.
This all seems to be reasonable overhead. I'm going to do some more experimentation and make this somewhat more realistic, but for now I'm thinking that we can leave the cache in Go managed memory.
from pebble.
There is early work being done in dgraph-io/ristretto by @karlmcguire. So far that's exploring tinylfu and concurrency, but they also want to look at allocation later. Might be worth collaborating in the future as both efforts are exploratory modes right now.
from pebble.
@ben-manes Yep. I'm keeping my eye on ristretto. I don't have any cycles to contribute to it at this time, though.
from pebble.
As shown above, a large Go heap is not a significant problem in and of itself. The problem is significant churn in the heap. This shows up clearly in profiles of read-only workloads where the working set is larger than the cache size. In particular, the slice allocation in sstable.Reader.readBlock shows up as a hot spot for GC assist activity. I think there is something relatively straightforward to do here: keep a small cache of recently deallocated cache blocks and try to reuse that memory for the next allocation.
In Pebble (and in RocksDB), the data block sizes are fairly uniform, clustering around the configured block size. The final data block in each sstable has a wider size range, falling randomly in the range [0, blockSize]. The index blocks are a bit different, with their sizes varying more widely. My current thinking is to target the 99% case of data blocks that are close to the configured block size and attempt to reuse memory for those blocks. Non-standard-size blocks would be GC'd as normal.
from pebble.