
Comments (40)

m1ome avatar m1ome commented on May 22, 2024 5

Starting to build up benchmarks.
https://github.com/m1ome/tile38-benchmark

Mostly this was related to issue #130, to test speed. But I think I can refactor it to work with the entire system.


tidwall avatar tidwall commented on May 22, 2024 2

Hi Maciej,

I could use some help on identifying the priority benchmarks or metric that should be measured, and perhaps which tools might be best for the job.

Things I want to benchmark:

  • The general SET, GET, DEL CRUD commands, which should be O(log n).
  • The geospatial search commands such as INTERSECTS, NEARBY, and WITHIN, which should also be O(log n).
  • The geofence commands SETHOOK and FENCE. These may be more difficult to measure because much depends on the limitations of the server and network. Also, the recent changes that add persistent HTTP/2 and GRPC are still very new.
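
For reference, the rough shapes of those commands in tile38-cli look like the following (a sketch only; the coordinates, keys, and ids are placeholders, and the exact syntax is in the Tile38 command docs):

$ tile38-cli
> SET fleet truck1 POINT 33.5123 -112.2693
> GET fleet truck1
> DEL fleet truck1
> NEARBY fleet POINT 33.5123 -112.2693 6000
> WITHIN fleet BOUNDS 33.462 -112.268 33.491 -112.245
> INTERSECTS fleet BOUNDS 33.462 -112.268 33.491 -112.245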

What do you think would be most valuable?


m1ome avatar m1ome commented on May 22, 2024 2

@tidwall I will look at redis-benchmark; maybe we can build the same kind of tool, written in Go, for Tile38. The main thing will be concurrent access and measurement. The first benchmarking tool is just a rallying point before moving on to a more specific one.


m1ome avatar m1ome commented on May 22, 2024 2

@Lars-Meijer stop defending him. He told me he would look at the benchmark code a month ago.
Sadness


tidwall avatar tidwall commented on May 22, 2024 2

I pushed an update to the master branch which includes a packaged tile38-benchmark tool.

  • Uses the recommended Redis protocol.
  • Tests common commands: PING, GET, SET, and NEARBY.
  • Modeled after redis-benchmark. Same inputs and output.
  • Supports pipelining, quiet mode, and CSV output.

You will now likely see performance on par with Lettuce.

Make sure to check out the --help menu and https://redis.io/topics/benchmarks


In case you want to test the performance of the tool itself, you can run it against a Redis instance like so:

$ tile38-benchmark -p 6379 -t SET,GET --redis

And compare it to:

$ redis-benchmark -p 6379 -t SET,GET

Or with pipelining:

$ tile38-benchmark -p 6379 -t SET,GET -P 10 --redis
$ redis-benchmark -p 6379 -t SET,GET -P 10
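
Benchmarking Tile38 itself is the same idea, just pointed at the Tile38 port (9851 by default), for example:

$ tile38-benchmark -p 9851 -t PING,SET,GET,NEARBY -P 10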


johnsonc avatar johnsonc commented on May 22, 2024 2


tidwall avatar tidwall commented on May 22, 2024 1

Thanks for the kind words.

Fence benchmarks are something I've been working on, but they're currently past due.

The performance currently depends greatly on whether you are using SETHOOK or just a standard FENCE.

The notification and networking framework for Tile38 has been undergoing some big changes, including persistent HTTP/2 and GRPC connections.

I'll fast-track this request and try to get something posted soon.

Thanks!


tidwall avatar tidwall commented on May 22, 2024 1

@Lars-Meijer

Is someone still working on benchmarks? It may be useful to look into the Redis benchmark tool as mentioned earlier.

Yes, it's a work in progress but @m1ome is maintaining the brand new tile38-benchmark project. I haven't spent much time with it yet. It's built in Go and is focused specifically on Tile38 commands.

The redis-benchmark is quite good. I use it all the time for my other projects (Redcon, SummitDB, kvnode). Though it would require some modifications to support Tile38 commands, I can see it being used as a base.

And speaking about performance, is there any interest in maintaining redis-gis, and do you have any idea how it stacks up against Tile38, especially in adding data to it (SET)?

Sorry, I'm not maintaining redis-gis at the moment and I don't have much information in the way of benchmarks. I would suspect that it could be a little faster for general purpose GSET/GGET/GSEARCH commands, because it's written in C and uses the Redis command pipeline. But it's woefully out of date in terms of features vs Tile38, and it hasn't been stress tested in a production environment.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024 1

Thanks for the quick response. I ran the benchmarks and they seem to indicate that Tile38 can process many more requests than what I saw originally; looks very promising.


tidwall avatar tidwall commented on May 22, 2024 1

Your explanation is on track.

Tile38 is a static executable and the entire executable is loaded into memory (~4 MB). As more points are added, the memory usage per point will level out.

Tile38, which uses the Go runtime, hangs on to memory for as long as possible. The memory will eventually be released, but when this happens depends on factors like how often the Go memory allocator reuses the memory pages, the ratio of unused/used memory in the Tile38 process, and how much system memory is available.

The GC command is expensive and I don't recommend using it except after a mass insert of points or for debugging and diagnostics.


tidwall avatar tidwall commented on May 22, 2024 1

When running the benchmark with 1 client, performance seems to be higher than when running 50 parallel clients; this seems a little counterintuitive to me?

Seems counterintuitive to me too. I'll investigate and get back to you.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024 1

@tidwall that's why I asked you to look at the test sources :)

The benchmarks seem to indicate the same kind of result (or slightly worse) as reading the AOF, so I am assuming they are mostly correct. But the weird thing is that I cannot get nearly the same performance when using Lettuce in Java. It might be that I am doing something wrong; I will continue to look into it.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024 1

The local loopback interface should be faster than over the network. The near 20% <= 0 milliseconds vs 0.5% <= 1 milliseconds hints to me that the network may be the bottleneck.

Could be, I don't have the capacity to look into the network at this moment.
I am able to get the desired performance by starting multiple Tile38 instances and balancing load between them.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024 1

I've tested some more and the results of the benchmarks seem very stable. They seem to indicate performance very well and match the results I get with my own tests. They would also be good for testing performance between versions.

As for the performance of the append-only file, I would call the difference significant. It might be worthwhile to look into methods to increase the performance of the append-only file, or to make it configurable to enable and disable it. Kafka is disk backed and has a write-up on its persistence at https://kafka.apache.org/documentation.html#persistence, although I am not sure how useful this is.

Benchmarks over the network have very similar results, even slightly better in some cases, though that might also be some noise from background programs and a Chrome tab that was open during the previous benchmarks. It seems that there was a network bottleneck during earlier tests.

I've attached the benchmark results for the tests over network (gigabit ethernet).

--- 192.168.2.13 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99009ms
rtt min/avg/max/mdev = 0.253/0.366/0.419/0.033 ms

network-noappend-multicore.txt
network-noappend-singlecore.txt
network-normal-multicore.txt
network-normal-singlecore.txt
summary-network-noappend-multicore.txt
summary-network-noappend-singlecore.txt
summary-network-normal-multicore.txt
summary-network-normal-singlecore.txt


tidwall avatar tidwall commented on May 22, 2024

This is also mentioned in an older issue #27.


literadix avatar literadix commented on May 22, 2024

Josh!

Thank you very much for your quick help. How can I help you? Is there anything I could do?

Thank you very much,

Maciej


johnsonc avatar johnsonc commented on May 22, 2024

The performance currently depends greatly on if you are using SETHOOK or just a standard FENCE.

So which do you think is better?
Btw, we're seriously thinking of using Tile38, but we can't find any perf benchmarks on the internet, so it makes it a bit tricky to get approval. Please do let us know if/how we can help.
Thanks.


tidwall avatar tidwall commented on May 22, 2024

@johnsonc

So which do you think is better?

I think SETHOOK is the way to go in most cases, because it allows for a dedicated queue/notification server that Tile38 connects to directly (rather than a middle-tier custom solution that connects to Tile38).

The gap in performance between SETHOOK and a standard FENCE has closed significantly over the past couple of releases.

There's support for GRPC, Redis PubSub, and HTTP. GRPC is the quickest, followed by Redis, and then HTTP.

  • HTTP performance depends greatly on the endpoint: is it HTTP/1 or HTTP/2? Is there a proxy in the middle? What about keep-alives, etc.?
  • Redis PubSub is what it is: always fast and darn good at delivering notifications.
  • GRPC is super quick and allows for a custom solution at the endpoint.
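
As a rough sketch of how a hook gets registered against those endpoints (the hook name, addresses, and coordinates below are made up; the exact endpoint formats are in the Tile38 docs):

> SETHOOK warehouse http://10.0.20.78/endpoint NEARBY fleet FENCE POINT 33.5123 -112.2693 500
> SETHOOK warehouse grpc://10.0.20.78:5678 NEARBY fleet FENCE POINT 33.5123 -112.2693 500
> SETHOOK warehouse redis://10.0.20.78:6379/geofences NEARBY fleet FENCE POINT 33.5123 -112.2693 500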

There's also support for Disque, which is a really great message queue system, but the Disque project is still in beta.

We're seriously thinking of using Tile38 but we can't find any perf benchmarks on the internet so it makes it a bit tricky to approve of

I totally understand. Benchmarks around geofencing are something Tile38 is sorely missing. But I can say that there have been tremendous strides made with regard to the performance of geofence delivery, and Tile38 in general, since this issue was first opened. We're adding a couple of new protocols soon (Kafka and MQTT) and I'm hopeful that we can get some public numbers out around the same time.

Please do let us know if/how we can help

Feel free to share your experience, whether that's with benchmarking, testing, or implementation.

Thanks a bunch for considering Tile38!


johnsonc avatar johnsonc commented on May 22, 2024

I think SETHOOK is the way to go in most cases, because it allows for a dedicated queue/notification server that Tile38 connects to directly (rather than a middle-tier custom solution that connects to Tile38).

Thank you!

I totally understand. Benchmarks around geofencing are something Tile38 is sorely missing.

:(
It gets really tough to convince the folks who compare and contrast frameworks to make a choice for a production stack. Oh well..
Right now, we are comparing ElasticSearch with Tile38 and we've come up with a few metrics that would be crucial for the service we're considering:
scalability/clustering, throughput, fault tolerance, reliability, and monitoring (logging and reporting).

Feel free to share your experience, whether that's with benchmarking, testing, or implementation.

I'd be happy to! It might take us a bit, but I'll try to get back.
Thanks for this amazing work!


tidwall avatar tidwall commented on May 22, 2024

@m1ome

Thanks for getting a head start on this. Before you go too far down the road with a benchmark tool, we should discuss the strategy around what is being benchmarked, and the tool itself, in more detail.

It looks like the project you are building is based on go test --bench, which is great for simple unit-test benchmarks but maybe not so great for benchmarking operations that go over the network. I think the output of go test --bench may be too simplistic, and the need for a Go environment may hurt reproducibility for non-dev users.

It's going to be most important that the benchmarking tool represents real world networks. As a foundation it needs to support:

  • Executing over the network
  • Pipelining
  • Concurrent clients
  • Customizable payload sizes.

The output should include something like:

====== SET ======
  1000000 requests completed in 13.86 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

99.76% <= 1 milliseconds
99.98% <= 2 milliseconds
100.00% <= 3 milliseconds
72144.87 requests per second

With this kind of data we can start to produce graphs that present information around operation speed and network latency.
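
As a very rough illustration of those foundations (concurrent clients, pipelining, a configurable request count), a minimal Go sketch might look like the following. This is not the tile38-benchmark implementation; the redigo client, the port, and all constants are assumptions:

package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"

	"github.com/gomodule/redigo/redis"
)

const (
	addr     = "127.0.0.1:9851" // assumed Tile38 default address
	requests = 100000
	clients  = 50
	pipeline = 10
)

func main() {
	perClient := requests / clients
	start := time.Now()

	var wg sync.WaitGroup
	for c := 0; c < clients; c++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			conn, err := redis.Dial("tcp", addr)
			if err != nil {
				panic(err)
			}
			defer conn.Close()

			for sent := 0; sent < perClient; sent += pipeline {
				// Queue a batch of SET commands, then flush and read every reply.
				for i := 0; i < pipeline; i++ {
					lat := 50.7 + rand.Float64()*2.8 // roughly The Netherlands
					lon := 3.3 + rand.Float64()*3.9
					conn.Send("SET", "fleet", fmt.Sprintf("id:%d", rand.Intn(1000000)), "POINT", lat, lon)
				}
				conn.Flush()
				for i := 0; i < pipeline; i++ {
					if _, err := conn.Receive(); err != nil {
						panic(err)
					}
				}
			}
		}()
	}
	wg.Wait()

	elapsed := time.Since(start).Seconds()
	fmt.Printf("%.2f requests per second\n", float64(requests)/elapsed)
}

A real tool would also record per-request latencies in order to produce the percentile lines shown above.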

There's a good write up on benchmarking on the Redis website.


literadix avatar literadix commented on May 22, 2024


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

Is someone still working on benchmarks? It may be useful to look into the Redis benchmark tool as mentioned earlier. source

And speaking about performance, is there any interest in maintaining redis-gis, and do you have any idea how it stacks up against Tile38, especially in adding data to it (SET)? @tidwall


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

I did some measurements on the memory usage of Tile38 by placing a number of points, each with a random 15-digit key, within roughly the bounding box of The Netherlands. I used the SERVER command to check the memory usage, then ran GC and measured again.

Number of points    Usage/point    Total         U/p after GC
10,000              416 B          3.97 MiB      250 B
50,000              251 B          12.01 MiB     190 B
100,000             306 B          29.01 MiB     182 B
250,000             226 B          53.97 MiB     178 B
500,000             177 B          84.611 MiB    176 B
1,000,000           271 B          258.63 MiB    175 B

My initial guess to explain the difference between the usage/point and the usage/point after GC was that Tile38 allocates more memory than it needs (to avoid having to allocate memory very often), and after GC the memory that is not used would be freed. Initially the usage per point seems higher, probably because the server itself takes up a few MB of RAM.

Is this explanation correct, or is there something else going on?
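
For reference, a minimal sketch of that measurement procedure in tile38-cli (the key name, id, and coordinates are placeholders):

$ tile38-cli
> SET points 383619403275381 POINT 52.1326 5.2913
> SERVER
> GC
> SERVER

where the SET is repeated once per random point, SERVER reports the memory usage before and after, and GC forces a garbage collection.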


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

@m1ome @tidwall
When running the benchmark with 1 client, performance seems to be higher than when running 50 parallel clients; this seems a little counterintuitive to me?

====== GET(POINT) ======
100000 requests completed in 1.557656048s
1 parallel clients
keep alive: true

100.00% <= 1 milliseconds
<1%(1) <= 5 milliseconds

64199.03 requests per second

====== GET(POINT) ======
100000 requests completed in 2.327297588s
50 parallel clients
keep alive: true

99.96% <= 1 milliseconds
<1%(23) <= 2 milliseconds
<1%(9) <= 3 milliseconds
<1%(4) <= 4 milliseconds
<1%(1) <= 5 milliseconds
<1%(1) <= 6 milliseconds
<1%(1) <= 7 milliseconds
<1%(1) <= 11 milliseconds
<1%(1) <= 13 milliseconds
<1%(1) <= 15 milliseconds

42968.29 requests per second

When I do a (simple) benchmark with Lettuce in Java, I cannot get more than 20k SETs per second on the same machine. Maybe @m1ome can elaborate a little on how the benchmark works?

When I use my Java program with Lettuce and insert 1,000,000 points into a clean database, the rate is around 20K/s; when I then restart the database it performs around 90K sets/s. Does the AOF reader do something clever, or is there something going wrong with the connection?

Any help would be greatly appreciated


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

Seems counterintuitive to me too. I'll investigate and get back to you.

It might be that the benchmarks are not correct; I've also seen the benchmark report increased performance on a system that was under heavy load (though I have not looked extensively into that result). My own tests seem to indicate that Tile38 is core bound, and reserving a core for Tile38 and running everything else on a different core definitely helps performance.


m1ome avatar m1ome commented on May 22, 2024

@tidwall that's why I asked you to look at the test sources :)


tidwall avatar tidwall commented on May 22, 2024

@Lars-Meijer Tile38 utilizes all cores if possible. It uses a single read-write locker sync.RWMutex that is shared across all connections. Read operations such as GET and NEARBY should be very quick. Write operations are slower and block connections in order to fill data structures and write to the AOF.

Loading from the AOF will likely be faster than benchmarking over the network (assuming the read performance of your SSD is better than your network). It also runs some optimized code which executes prior to the server starting, without any locking, and does not have the burden of writing anything to disk.

Regarding multi vs. single core: you may want to try playing with the GOMAXPROCS environment variable. For example, running

$ GOMAXPROCS=1 tile38-server

which will force Tile38 to run on only one core. Perhaps this is a good thing in some cases.

@m1ome I'm looking at benchmarking options today. Thanks for your patience.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

Loading from the AOF will likely be faster than benchmarking over the network (assuming the read performance of your SSD is better than your network). It also runs some optimized code which executes prior to the server starting, without any locking, and does not have the burden of writing anything to disk.

I figured this, but I don't think it explains the slow performance in my tests in comparison to the benchmark (3x slower than the benchmarks, if they are correct, and 4x slower than reading from the AOF). I also find it unlikely that 20k updates per second saturate a gigabit ethernet network.

Is there an option to disable the AOF? And how much of an impact does it actually have, since there is absolutely no use for it in my use case.

which will force Tile38 to run on only one core. Perhaps this is a good thing in some cases.

I used the taskset command (http://manpages.ubuntu.com/manpages/wily/man1/taskset.1.html) to lock it to one core, and locked other high-load processes to the other cores. It seems to perform a lot better when I do this.
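
For example, a sketch of that setup (the core numbers and the client process name are illustrative):

$ taskset -c 0 tile38-server
$ taskset -c 1-3 my-benchmark-client

which pins the Tile38 server to core 0 and the other high-load process to cores 1-3.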


tidwall avatar tidwall commented on May 22, 2024

Is there an option to disable the AOF and how much of an impact does it actually have, since there is absolutely no use for it in my use case.

There isn't an option to disable the AOF because it is required for core operations.

I just pushed a custom build to the memoptz branch that provides a --appendonly no flag. This will disable the AOF so you can test performance on your side. It's likely to break some stuff like Leader/Follower syncing, but standard commands like GET/SET/NEARBY will work.
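
With that build, disabling the AOF is just:

$ tile38-server --appendonly no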


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

I just pushed a custom build to the memoptz branch that provides a --appendonly no flag. This will disable the AOF so you can test performance on your side. It's likely to break some stuff like Leader/Follower syncing, but standard commands like GET/SET/NEARBY will work.

Thanks; first impressions are that it makes quite a bit of difference. I will test further tomorrow.
Edit: It should be noted that the server I am testing on has an HDD, not an SSD, so this might amplify the benefits.


tidwall avatar tidwall commented on May 22, 2024

@Lars-Meijer Sounds good. I look forward to the results.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

I've merged master into the memoptz branch and tested with the benchmark tool, with Tile38 running on 1 core and the AOF on an SSD:

====== SET (point) ======
100000 requests completed in 4.47 seconds
50 parallel clients
82 bytes payload
keep alive: 1

24.39% <= 0 milliseconds
// A lot more
100.00% <= 24 milliseconds
22392.72 requests per second

Without the AOF on my PC (SSD):

====== SET (point) ======
100000 requests completed in 3.34 seconds
50 parallel clients
82 bytes payload
keep alive: 1

32.85% <= 0 milliseconds
//a lot more
100.00% <= 23 milliseconds
29911.44 requests per second

With the Tile38 server on 2 cores it seems to be about 2k/s better; this is indeed very close to the performance I am seeing with Lettuce. Thanks very much for your quick support 👍


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

These are some more results from my not-so-fast server (a VM on 4 old Xeon cores (vCPUs @ 2.6 GHz), locked to 1 core, with 8 GB of DDR3 RAM @ 1333 MHz), with the AOF disabled.

When I run the benchmark on a different machine:
====== SET (point) ======
100000 requests completed in 10.29 seconds
50 parallel clients
82 bytes payload
keep alive: 1

0.51% <= 1 milliseconds
//More
100.00% <= 48 milliseconds
9714.58 requests per second

When the benchmark is run directly on the server:
====== SET (point) ======
100000 requests completed in 5.43 seconds
50 parallel clients
82 bytes payload
keep alive: 1

19.60% <= 0 milliseconds
// More
100.00% <= 44 milliseconds
18409.89 requests per second

It might be network overhead, but the server is on the same network as I am and the ping latency is about 1 ms.


tidwall avatar tidwall commented on May 22, 2024

@Lars-Meijer Thanks for sharing your results.

The local loopback interface should be faster than over the network. The near 20% <= 0 milliseconds vs 0.5% <= 1 milliseconds hints to me that the network may be the bottleneck.


Lars-Meijer avatar Lars-Meijer commented on May 22, 2024

For anyone else interested, on my slightly better desktop machine at home I was able to get the following results.

Test system:
Intel Core i5, 4 cores @ 3.3 GHz, turbo boost disabled
16 GB RAM (1600 MHz)
Ubuntu Desktop 16.10
250 GB Samsung Enterprise SSD (Similar to Samsung 830)

The benchmarks were run on the same machine as the Tile38 server; I might do some benchmarks over the network later :). Performance can probably be increased a tiny bit by running a lightweight server install instead of a full desktop.

All benchmarks had the following parameters:
100000 requests
50 parallel clients
keep alive: 1

noappend-multicore.txt
noappend-singlecore.txt
normal-multicore.txt
normal-singlecore.txt
summary-noappend-multicore.txt
summary-noappend-singlecore.txt
summary-normal-multicore.txt
summary-normal-singlecore.txt

The summary files provide a good starting point if you are just interested in the numbers. They provide the percentage done in less than 1 ms, the latency for 100% completion, and the insert rate.


tidwall avatar tidwall commented on May 22, 2024

@Lars-Meijer I'm actually pretty surprised by how large the difference between append and noappend is: about 20%. Thanks for providing these benchmarks.


tidwall avatar tidwall commented on May 22, 2024

@johnsonc Thanks. :)


sfroment avatar sfroment commented on May 22, 2024

Hello,

I don't know if this is the right place to put my question, but I believe so.
I'm starting to use Tile38 and I was wondering if there is a way to have multiple leaders, to be able to SET across multiple instances?

Thanks.


tidwall avatar tidwall commented on May 22, 2024

Hi @sfroment,

I'm starting to use Tile38 and I was wondering if there is a way to have multiple leaders, to be able to SET across multiple instances?

Sorry, but Tile38 only supports one leader at a time for a single collection.
If you need to scale to multiple leaders, then you'll have to geographically shard your collections. For example, one leader could hold a collection for the eastern United States and another for the western United States, etc.
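
A minimal sketch of what that client-side routing could look like (the instance addresses and the longitude split are made-up examples):

package main

import "fmt"

// leaderFor picks which Tile38 leader should receive a write for a point,
// using a simple longitude split; the addresses are placeholders.
func leaderFor(lon float64) string {
	if lon < -100.0 {
		return "tile38-west:9851" // western United States
	}
	return "tile38-east:9851" // eastern United States
}

func main() {
	fmt.Println(leaderFor(-112.27)) // -> tile38-west:9851
	fmt.Println(leaderFor(-77.04))  // -> tile38-east:9851
}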


tidwall avatar tidwall commented on May 22, 2024

I'm closing this issue because it's pretty old, and there have been many enhancements and optimizations over the past few years.

