Comments (4)
You might be interested in my implementation of LZ4 in kafka4net for some hints:
https://github.com/vchekan/kafka4net/blob/master/src/Compression/Lz4KafkaStream.cs
Things like the bug in Kafka's checksum implementation can take a lot of time to debug:
https://issues.apache.org/jira/browse/KAFKA-3160
Another piece of advice: you might want to invest in a Java cross-validation framework, like this:
https://github.com/vchekan/kafka4net/blob/master/tools/binary-console/src/main/scala/com/ntent/kafka/main.scala
where I generate Kafka messages using the Java driver and use them as the gold standard, with different compression types and buffer sizes. As an additional bonus, I gain confidence that my implementation works with the Java consumer.
from kafunk.
Hey, thanks for the pointer. I wasn't aware of the LZ4 library and was considering implementing the compression from scratch. Has your experience with the library been good?
Regarding the cross-validation framework - good call, I think this would be useful.
I remember we used compression in production for many years, but I do not recall which one it was, Snappy or LZ4.
I test compression plus my frame implementation here (every buffer size from 1 byte to 256 KB, random content):
https://github.com/vchekan/kafka4net/blob/master/tests/CompressionTests.cs
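For anyone who wants the shape of that test without reading the C#, here is a minimal Python sketch of the same idea. It is not the kafka4net code: `boundary_sizes` and `failing_sizes` are hypothetical names, zlib stands in for LZ4 only because it ships with Python, and the sketch samples sizes around powers of two instead of walking every size up to 256 KB, to keep it fast.

```python
import random
import zlib


def boundary_sizes(max_size=256 * 1024):
    """Sizes just below, at, and just above each power of two, within 1..max_size."""
    sizes = set()
    power = 1
    while power <= max_size:
        for candidate in (power - 1, power, power + 1):
            if 1 <= candidate <= max_size:
                sizes.add(candidate)
        power *= 2
    return sorted(sizes)


def failing_sizes(compress, decompress, sizes, seed=0):
    """Return every size whose random payload does not survive a round trip."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    failures = []
    for size in sizes:
        # Random content: worst case for a compressor and free of accidental patterns.
        payload = rng.getrandbits(size * 8).to_bytes(size, "little")
        if decompress(compress(payload)) != payload:
            failures.append(size)
    return failures


if __name__ == "__main__":
    # zlib is only a stand-in codec; pass your own compress/decompress pair here.
    print(failing_sizes(zlib.compress, zlib.decompress, boundary_sizes()))
```

Passing the codec in as a pair of functions means the same loop can be pointed at a framed LZ4 implementation once one exists. Replacing `boundary_sizes()` with `range(1, 256 * 1024 + 1)` gives the exhaustive version at the cost of run time.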
Here I run a Java compatibility test for the gzip, LZ4, and Snappy codecs. The idea is to invoke Java to generate messages with random content. C# creates a text file with the desired message sizes; Java generates messages of those lengths, publishes them to Kafka, and writes a text file with the hash codes of the generated messages. C# then reads Java's hashes, consumes the messages, and compares each message's hash to the one generated by Java.
https://github.com/vchekan/kafka4net/blob/master/tests/RecoveryTest.cs#L1967
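The hash handshake described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual test: `digest`, `write_hashes`, and `find_mismatches` are hypothetical names, SHA-256 is an arbitrary choice of hash, there is no Kafka in the loop, and messages are assumed to be consumed in the order they were produced (for example from a single partition).

```python
import hashlib


def digest(payload: bytes) -> str:
    """Hash one message body; both sides must agree on the algorithm."""
    return hashlib.sha256(payload).hexdigest()


def write_hashes(messages, path):
    """Producer side: record one hash per generated message, in publish order."""
    with open(path, "w", encoding="ascii") as f:
        for message in messages:
            f.write(digest(message) + "\n")


def find_mismatches(consumed, path):
    """Consumer side: return indexes where consumed messages differ from the record."""
    with open(path, encoding="ascii") as f:
        expected = [line.strip() for line in f if line.strip()]
    actual = [digest(message) for message in consumed]
    mismatches = [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
    # Missing or extra messages count as mismatches too.
    shorter, longer = sorted((len(expected), len(actual)))
    mismatches.extend(range(shorter, longer))
    return mismatches
```

Exchanging hashes rather than payloads keeps the file small even for large messages, and the two sides only need to agree on a text format, which is what makes it easy to cross language boundaries.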
Looks like there is a lot of interest in getting LZ4 for Kafunk at Jet now. Should be prioritizing this work soon.