Giter VIP home page Giter VIP logo

Comments (2)

beepy0 avatar beepy0 commented on September 28, 2024
  • Write notes

from thesis.

beepy0 avatar beepy0 commented on September 28, 2024

Went through different iterations while optimizing the algorithm.

  • Simple with a conditional and a modulo to cap the hashed result
  • Simple without conditional but with modulo
  • Simple without conditional and with modulo substitute - truncation function

Having a modulo operation in the update loop proved to significantly reduce the speed. Timed results show a difference of about 6-7 fold increase of update-time due to the added modulo. Since the value space (number of buckets) that the H3 function is hashing to is assumed to be practically always smaller than the whole theoretical value space of H3, we must have a way to restrict the resulting values to the number of buckets. The solution to that is discussed in the third approach from the list above.

The difference in execution speed of the conditional didn't prove significant in the initial testing. The timed sketch-update execution times across 40+ runs were averaged and then compared to the ones with the conditional, with improvements being around 3% *[1]. I will need to re-test using a distribution from the whole range of unsigned int as it was pointed to me that this then should have a more significant impact on performance. *[2]

An alternative to the modulo was proposed by my advisor Martin Kiefer [REF], a so-called truncation method where the bits of the resulting unsigned integer are capped so that only the n most right bits are kept. These n bits give a range of values that correspond to the number of buckets. There are two cases for that:

  • 1 If the buckets size is a power of 2, then n is ceil(log2(buckets size)), as there are enough bit combinations in an unsigned int of size n to fit all of them.

  • 2 If the buckets size is not a power of 2, we also use the same n, but then have to subtract buckets size from each resulting value, making the end result always lower than buckets size since for these types of sizes, you cannot fit buckets size twice into ceil(log2(buckets size)).
    It is also clear that this method presents another conditional, which could be mitigated by the following code *[3]:

  • 1. Re-run the tests on the xeon machine to see difference in conditional removal with a numbers-range around the zero.

  • 2. Test the conditional removal on the xeon machine with a numbers-range of the whole uint

  • 3. Include a snippet of the code

  • Try to see if runtime improves if you shift the truncation in the H3 block

  • Do more testing on the xeon machine, and with more widespread data

  • Try to further optimize the code

from thesis.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.