itsvikramagr / spark-benchmark
Structured streaming benchmark utils
Hi,
I wanted to test the RocksDB StateStore implementation and check whether it really avoids out-of-memory (OOM) exceptions.
Before I explain my issue, some background: I compiled Spark 3.2.0 and am running it on Manjaro Linux. I wrote the test application in Java and run it with spark-submit. My machine has 2 physical cores (4 logical cores) and 16 GB of RAM.
My test application is the simple word count sample from the Structured Streaming Programming Guide. I also wrote a server in Python that sends many random words to whichever client connects to its listening socket.
My problem is that when I send the words to the Spark application (my word count app), it throws an OOM exception with both RocksDBStateStore and HDFSStateStore. What is the problem? Am I making a mistake in how I run the application?
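For anyone who wants to reproduce the load, a word generator of the kind described above could look roughly like the following. This is only a sketch: the original server was written in Python and is not shown in this post, so the class name `RandomWordServer`, the port 9999, and the word and line sizes are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical stand-in for the Python word server mentioned in the post.
public class RandomWordServer {

    // Builds one line of space-separated random lowercase words.
    static String randomLine(Random rng, int wordsPerLine, int wordLength) {
        StringBuilder sb = new StringBuilder();
        for (int w = 0; w < wordsPerLine; w++) {
            if (w > 0) {
                sb.append(' ');
            }
            for (int c = 0; c < wordLength; c++) {
                sb.append((char) ('a' + rng.nextInt(26)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        int port = 9999; // assumed port; use whatever the word count app connects to
        Random rng = new Random();
        try (ServerSocket server = new ServerSocket(port);
             Socket client = server.accept(); // wait for a single client
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
            // checkError() flushes the writer and reports true once the client has gone away.
            while (!out.checkError()) {
                out.print(randomLine(rng, 10, 8));
                out.print('\n');
            }
        }
    }
}
```

With 8-letter random words almost every word is a new key, so the aggregation state keeps growing for as long as the stream runs. A smaller alphabet or shorter words would bound the number of distinct keys.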
Config of SparkSession
SparkSession spark = SparkSession
    .builder()
    .appName("JavaStructuredNetworkWordCount")
    .config("spark.sql.streaming.stateStore.providerClass",
        "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
    .config("spark.local.dir", "/home/username/sparkTemp")
    .config("spark.executor.memory", "15g")
    .config("spark.driver.memory", "15g")
    .config("spark.memory.offHeap.enabled", true)
    .config("spark.memory.offHeap.use", true)
    .config("spark.memory.offHeap.size", "50g")
    .config("spark.executor.memoryOverhead", "50g")
    .config("spark.sql.shuffle.partitions", 8)
    .config("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", false)
    .getOrCreate();
Execution command
/path/to/spark-submit --master local[*] --deploy-mode client --class org.example.Test4 --name Run /path/to/Test4-1.0-SNAPSHOT.jar --driver-memory 15g --executor-memory 15g
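One way to check whether the memory settings above actually reach the JVM is to print the maximum heap size and the received arguments from inside the application. The following is a minimal sketch, assuming a hypothetical helper class `HeapCheck` that is not part of the original application; it relies only on `Runtime.getRuntime().maxMemory()` from the JDK.

```java
// Hypothetical diagnostic helper; call HeapCheck.main(args) from the application's main().
public class HeapCheck {

    // Maximum heap the current JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("JVM max heap: " + maxHeapMiB() + " MiB");
        // Any flags that were handed to the application instead of spark-submit show up here.
        System.out.println("Application args: " + String.join(" ", args));
    }
}
```

Note that spark-submit treats everything after the application jar as arguments to the application's `main` method, so the position of `--driver-memory` and `--executor-memory` in the command above is worth double-checking. If the reported heap is far below 15 GiB, the memory options are probably not being applied.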
Thanks for your help.