Comments (12)
You probably need to install the snappy lib/packages on your machine.
from grocksdb.
Thanks @aureliar8. I just needed to specify the path separately for each of the folders. It worked then.
@aureliar8 @yihuang @kingster
I am facing one problem. I am using RocksDB for clustering of strings for similarity/dedupe purposes. When I start the clustering process, memory consumption is 0, but as the process proceeds, memory increases slowly and steadily until it reaches the maximum limit, at which point the OS kills the process.
RocksDB Details.zip
You can see the logs and profiling details in this zip file. Can you suggest anything to resolve this issue?
What's the max limit value in your case?
According to the Go profiles you sent, your pure Go code seems to make a lot of short-lived allocations (high alloc_space, low inuse_space), so it shouldn't negatively impact the memory footprint of the process.
This indicates that most of the memory footprint comes from cgo code that the Go profiler can't observe, so probably RocksDB itself.
I can see in the RocksDB logs that you use an LRU block cache with a capacity of 3 GB.
I can't say whether this is a correct value, but it can explain a memory increase of 3 GB after the start of the process.
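For reference, the block cache capacity is fixed when the cache object is created. A minimal sketch of setting it explicitly with grocksdb (the 256 MiB figure and the path are illustrative assumptions, not recommendations):

```go
package main

import "github.com/linxGnu/grocksdb"

func main() {
	// An explicit LRU block cache; its capacity is the knob behind the
	// "Block cache LRUCache ... capacity" line in the RocksDB logs.
	cache := grocksdb.NewLRUCache(256 << 20) // 256 MiB, illustrative only

	bbto := grocksdb.NewDefaultBlockBasedTableOptions()
	bbto.SetBlockCache(cache)

	opts := grocksdb.NewDefaultOptions()
	opts.SetCreateIfMissing(true)
	opts.SetBlockBasedTableFactory(bbto)

	db, err := grocksdb.OpenDb(opts, "/tmp/example-db") // hypothetical path
	if err != nil {
		panic(err)
	}
	defer db.Close()
}
```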
Maybe there are lots of SST files; some amount of memory is needed for each opened SST file. You can set the max open files option.
Thanks @aureliar8 @yihuang
- What's the max limit value in your case?
Around 62 GB of memory is free. One process can use 100% of the available memory.
- I am using the below-mentioned configuration to create RocksDB:
bbto := grocksdb.NewDefaultBlockBasedTableOptions()
// TODO: check out the value for LRUCache and options
bbto.SetBlockCache(grocksdb.NewLRUCache(31457280)) // 30 MiB
filter := grocksdb.NewBloomFilter(10)
bbto.SetFilterPolicy(filter)
opts := grocksdb.NewDefaultOptions()
opts.SetBlockBasedTableFactory(bbto)
opts.SetCreateIfMissing(true)
opts.EnableBlobFiles(true)
opts.EnableBlobGC(true)
opts.IncreaseParallelism(4)
opts.SetMaxWriteBufferNumber(4)
opts.SetMinWriteBufferNumberToMerge(1)
opts.SetRecycleLogFileNum(4)
opts.SetWriteBufferSize(134217728) // 128 MiB
opts.SetWritableFileMaxBufferSize(0)
opts.CompactionReadaheadSize(2097152) // 2 MiB
opts.SetMaxBackgroundJobs(2)
opts.SetMaxTotalWalSize(1073741824) // 1 GiB
opts.SetBlobCompactionReadaheadSize(2097152)
opts.SetDbLogDir(dataDir + "/" + name)
opts.SetInfoLogLevel(grocksdb.InfoInfoLogLevel)
opts.SetStatsDumpPeriodSec(180)
opts.EnableStatistics()
opts.SetLevelCompactionDynamicLevelBytes(false)
opts.SetMaxOpenFiles(5)
- Also, I forgot to mention that we are creating around 200 tables with this configuration:
for i := 0; i <= 99; i++ {
    idx := strconv.Itoa(i) // table index as string
    db, err := NewRocksDB(BasePath+"/table_"+idx, "Pentagram")
    PentagramDB[i] = db
    db1, err := NewRocksDB(BasePath+"/table_"+idx, "Cluster")
    ClusterDB[i] = db1
}
Your comments will be insightful if you can recommend optimal values for these options, considering 62 GB of free memory.
- In the documentation, it is mentioned that:
This fork contains no defer in codebase (my side project requires as less overhead as possible). This introduces a loose convention of how/when to free c-mem, thus breaking the rule of [tecbot/gorocksdb](https://github.com/tecbot/gorocksdb).
Is this in any way affecting memory consumption? If yes, what will be the alternative to this?
I find it hard to believe that this Go code creates the RocksDB instance that generated the logs you previously sent.
In the Go code:
bbto.SetBlockCache(grocksdb.NewLRUCache(31457280)) // 30MiB
In the RocksDB logs:
Block cache LRUCache@0x24cd180#2984523 capacity: 3.00 GB ...
If each RocksDB instance is indeed holding a cache of 3.00 GB, then the total memory needed by these LRU caches alone is 200 * 3 GB = 600 GB.
You could try rerunning your experiment with a single RocksDB instance and see where the memory usage stops. Then you'll need to get that figure low enough that it can be multiplied by 200.
Alternatively, you can change the architecture of your code a bit by having fewer RocksDB instances. The column family feature
might help you create disjoint "tables" in a single RocksDB instance.
Or, I think it's also possible to make these 200 RocksDB instances share their resources (caches, buffers), but you'd have to look at the documentation.
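Sharing a single block cache across instances might look like the sketch below; one cache object passed to every instance caps the total block-cache memory at one capacity instead of capacity times instance count. The paths, sizes, and instance count are illustrative assumptions:

```go
package main

import (
	"fmt"

	"github.com/linxGnu/grocksdb"
)

func main() {
	// One cache object shared by every instance.
	shared := grocksdb.NewLRUCache(512 << 20) // 512 MiB total, illustrative

	dbs := make([]*grocksdb.DB, 0, 4)
	for i := 0; i < 4; i++ {
		bbto := grocksdb.NewDefaultBlockBasedTableOptions()
		bbto.SetBlockCache(shared) // same cache for all instances

		opts := grocksdb.NewDefaultOptions()
		opts.SetCreateIfMissing(true)
		opts.SetBlockBasedTableFactory(bbto)

		db, err := grocksdb.OpenDb(opts, fmt.Sprintf("/tmp/shared-cache-db-%d", i))
		if err != nil {
			panic(err)
		}
		dbs = append(dbs, db)
	}
	for _, db := range dbs {
		db.Close()
	}
}
```

Write buffers are separate from the block cache, so memtable memory still scales with the number of instances; only block-cache memory is pooled here.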
@aureliar8 Based on the comments received from your side, I changed the configuration. The logs I shared previously had different configs, as mentioned in the RocksDB log file.
Also, I am setting this in the read options:
ro := grocksdb.NewDefaultReadOptions()
ro.SetFillCache(false)
- In the documentation, it is mentioned that [...]
Is this in any way affecting memory consumption? If yes, what will be the alternative to this?
This should have no significant impact.
I have experimented with 5 different approaches for only one table (one table creates two RocksDB instances), which has a number of records:
- Part 1: Flush after all records are processed. Quick, but memory also increases rapidly.
- Part 2: Flush after every 1000 records processed. Quick, but memory also increases rapidly.
- Part 3: Flush after every insert and after every 1000 records. Too slow.
- Part 4: Flush after every insert. Slow, and memory increases steadily.
- Part 5: No manual flush. Quick, but memory also increases rapidly.
You can see the logs in attached file.
RocksDB Details.zip
RocksDB configuration:
bbto := grocksdb.NewDefaultBlockBasedTableOptions()
// TODO: check out the value for LRUCache and options
bbto.SetBlockCache(grocksdb.NewLRUCache(31457280)) // 30 MiB
filter := grocksdb.NewBloomFilter(10)
bbto.SetFilterPolicy(filter)
opts := grocksdb.NewDefaultOptions()
opts.SetBlockBasedTableFactory(bbto)
opts.SetCreateIfMissing(true)
opts.EnableBlobFiles(true)
opts.EnableBlobGC(true)
opts.IncreaseParallelism(4)
opts.SetMaxWriteBufferNumber(4)
opts.SetMinWriteBufferNumberToMerge(1)
opts.SetRecycleLogFileNum(4)
opts.SetWriteBufferSize(64 << 20) // 64 MiB
opts.SetWritableFileMaxBufferSize(0)
opts.CompactionReadaheadSize(2097152) // 2 MiB
opts.SetMaxBackgroundJobs(2)
opts.SetMaxTotalWalSize(1073741824) // 1 GiB
opts.SetBlobCompactionReadaheadSize(2097152)
opts.SetDbLogDir(dataDir + "/" + name)
opts.SetInfoLogLevel(grocksdb.InfoInfoLogLevel)
opts.SetStatsDumpPeriodSec(180)
opts.EnableStatistics()
opts.SetLevelCompactionDynamicLevelBytes(false)
opts.SetMaxOpenFiles(5)
We are also experiencing a suspected memory leak with our RocksDB-based app; we haven't investigated deeply yet, though.
Please use column families instead of creating 200 RocksDB instances (two per loop iteration) like this:
for i := 0; i <= 99; i++ {
    idx := strconv.Itoa(i) // table index as string
    db, err := NewRocksDB(BasePath+"/table_"+idx, "Pentagram")
    PentagramDB[i] = db
    db1, err := NewRocksDB(BasePath+"/table_"+idx, "Cluster")
    ClusterDB[i] = db1
}
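A sketch of what the column-family route could look like with grocksdb: one instance, one column family per logical table. The family names, the path, and reusing one Options value for every family are illustrative assumptions, not the project's actual code.

```go
package main

import (
	"fmt"

	"github.com/linxGnu/grocksdb"
)

func main() {
	opts := grocksdb.NewDefaultOptions()
	opts.SetCreateIfMissing(true)
	opts.SetCreateIfMissingColumnFamilies(true)

	// One "pentagram" and one "cluster" family per table index, all in a
	// single RocksDB instance. The "default" family must always be listed.
	cfNames := []string{"default"}
	for i := 0; i <= 99; i++ {
		cfNames = append(cfNames,
			fmt.Sprintf("pentagram_%02d", i),
			fmt.Sprintf("cluster_%02d", i))
	}
	cfOpts := make([]*grocksdb.Options, len(cfNames))
	for i := range cfOpts {
		cfOpts[i] = opts // same options for every family in this sketch
	}

	db, cfs, err := grocksdb.OpenDbColumnFamilies(opts, "/tmp/clusters-db", cfNames, cfOpts)
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Write through a column family handle instead of a separate DB.
	wo := grocksdb.NewDefaultWriteOptions()
	defer wo.Destroy()
	if err := db.PutCF(wo, cfs[1], []byte("key"), []byte("value")); err != nil {
		panic(err)
	}
}
```

All families then share the instance's block cache, WAL, and background jobs, which is the main memory win over 200 separate instances.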