see: <a href="https://github.com/typelevel/cats-collections/blob/master/core/src/main/

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Sorry for the late reply <a class="user-mention notranslate" data-hovercard-type="user

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Closed via <a class="issue-link js-issue-link" data-error-text="Failed to load title"

use fast immutable bitset for better bloom filter about algebird HOT 6 CLOSED

johnynek commented on June 12, 2024

use fast immutable bitset for better bloom filter

from algebird.

Comments (6)

anish749 commented on June 12, 2024

I had tried out this implementation from cats and have the code in this branch
It does not have for persistence or moving the old implementation to algebird.legacy

It is indeed much faster.

The benchmark results are here. I did this a few months back, and I would need to check some of the finer details again.


// Immutable with Array copy
[info] Benchmark                                               (falsePositiveRate)  (nbrOfElements)   Mode  Cnt    Score     Error  Units
[info] BloomFilterCreateBenchmark.createBloomFilter                          0.001            10000  thrpt    3  171.513 ± 281.008  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                          0.001            50000  thrpt    3   11.661 ± 175.592  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                           0.01            10000  thrpt    3   36.594 ±  87.057  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                           0.01            50000  thrpt    3   25.702 ± 259.042  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                            0.1            10000  thrpt    3  409.528 ± 540.620  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                            0.1            50000  thrpt    3   76.085 ± 264.660  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                0.001            10000  thrpt    3    3.059 ±   7.600  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                0.001            50000  thrpt    3    0.164 ±   0.177  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                 0.01            10000  thrpt    3    7.157 ±   9.560  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                 0.01            50000  thrpt    3    0.341 ±   0.235  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                  0.1            10000  thrpt    3   26.621 ±  40.213  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                  0.1            50000  thrpt    3    1.200 ±   1.451  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                 0.001            10000  thrpt    3    3.738 ±   6.267  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                 0.001            50000  thrpt    3    0.159 ±   0.150  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                  0.01            10000  thrpt    3    7.476 ±  16.579  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                  0.01            50000  thrpt    3    0.335 ±   0.721  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                   0.1            10000  thrpt    3   24.621 ±  44.696  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                   0.1            50000  thrpt    3    1.295 ±   0.839  ops/s


// Cats Immutable
[info] Benchmark                                               (falsePositiveRate)  (nbrOfElements)   Mode  Cnt    Score     Error  Units
[info] BloomFilterCreateBenchmark.createBloomFilter                          0.001            10000  thrpt    3   44.939 ± 227.744  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                          0.001            50000  thrpt    3    5.687 ±  19.081  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                           0.01            10000  thrpt    3   78.949 ± 154.880  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                           0.01            50000  thrpt    3   11.778 ±   8.267  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                            0.1            10000  thrpt    3  189.331 ±  52.266  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                            0.1            50000  thrpt    3   22.157 ±  20.995  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                0.001            10000  thrpt    3   53.450 ±  55.043  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                0.001            50000  thrpt    3    7.222 ±  15.144  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                 0.01            10000  thrpt    3   72.325 ± 100.853  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                 0.01            50000  thrpt    3   10.576 ±  26.579  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                  0.1            10000  thrpt    3  157.641 ± 351.298  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterAggregator                  0.1            50000  thrpt    3   18.624 ±  11.941  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                 0.001            10000  thrpt    3   58.734 ±  57.416  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                 0.001            50000  thrpt    3    8.304 ±   2.001  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                  0.01            10000  thrpt    3   78.388 ±  98.764  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                  0.01            50000  thrpt    3    5.545 ±  75.888  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                   0.1            10000  thrpt    3  183.115 ± 128.414  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilterUsingFold                   0.1            50000  thrpt    3   20.643 ±  52.748  ops/s

from algebird.

regadas commented on June 12, 2024

I'm looking a bit into this ... I'll make a draft PR so we can talk about it and I'll post some numbers as well.

@johnynek you want to keep the old impl in legacy right?

from algebird.

johnynek commented on June 12, 2024

@regadas Ideally, I really hate to leave people high-and-dry.... So, the most compatibility we can retain the better.

That said, the above numbers seem ambiguous: like, using the immutable bitset was only a win for some cases. It would be nice if we could tweak things so that we always have a win.

We can copy the implementation into this repo, I think, especially if we might want to try some optimizations to improve performance.

One thing that we didn't do with old implementations is try to get the size information at the type-level. I think that is a really good idea since it makes sure the bitsizes are aligned and serializations don't need to write out the size. So, that is something worth exploring.

The two ways to do this are to make a module:

abstract class SizedBloomFilter(bits: Int, hashes: Int) {
  sealed abstract class Bloom ...
 
  implicit val monoid: Monoid[Bloom] = ...
}

val sizedBloom = new SizedBloom(1024, 5)
import sizedBloom.Bloom
// now we can use Bloom here and have a path dependent type

or alternatively

sealed abstract class HasSize[A] {
  def sizeOf: Int
}
object HasSize {
    def apply[A](s: Int): HasSize[A] = new HasSize[A] { def sizeOf = s }
   
    sealed trait Size1
    implicit val size1HasSize: HashSize[Size1] = apply(1)
   
  // how we can do:
    sealed trait Bloom[Bits, Hashes] {
      def size(implicit b: HasSize[Bits]) = b.sizeOf
      ...
    }
}

from algebird.

regadas commented on June 12, 2024

Sorry for the late reply @johnynek!

So, the most compatibility we can retain the better.

Totally agree, I moved the current impl into a legacy package. It's also in our best interest to keep since we can run benchmarks against it 😄

the above numbers seem ambiguous: like, using the immutable bitset was only a win for some cases. It would be nice if we could tweak things so that we always have a win.

In the numbers below I think i managed to get it on par with the legacy impl in the cases where it didn't perform that well. Even with this we could potentially go forward with since the BloomFilter is simpler and there's only one BitSet impl. Tested also @anish749 createBloomFilterUsingFold and createBloomFilterAggregator and saw the same improvement (I can add them as well later).

I had to cheat in on case ... sumOption to be fast enough I had to mutate the same BitSet (tbh I think it's ok in this case since it's localized).

I believe there's some other minor things we can optimize where and there I'll need to play around a little bit more. I spotted yours and @non work on improving some of them and there's also some new ones like def ++(a: Array[Int]): BitSet that I was thinking in adding but it's already there (I think I'll just copy it as well 😄 )

[info] Benchmark                                                    (falsePositiveRate)  (nbrOfElements)   Mode  Cnt          Score          Error  Units
[info] BloomFilterCreateBenchmark.createBloomFilter                               0.001              100  thrpt    3      12434.779 ±      357.553  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                               0.001             1000  thrpt    3       1183.123 ±       50.355  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                               0.001            10000  thrpt    3        108.720 ±        2.832  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                                0.01              100  thrpt    3      17978.982 ±      760.818  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                                0.01             1000  thrpt    3       1705.537 ±       44.473  ops/s
[info] BloomFilterCreateBenchmark.createBloomFilter                                0.01            10000  thrpt    3        152.526 ±       28.693  ops/s
[info] BloomFilterDistanceBenchmark.distanceOfDenseVsDense                          N/A              N/A  thrpt    3   41105793.914 ±  5011634.200  ops/s
[info] BloomFilterDistanceBenchmark.distanceOfEmptyVsDense                          N/A              N/A  thrpt    3    7843706.236 ±   468487.379  ops/s
[info] BloomFilterDistanceBenchmark.distanceOfEmptyVsEmpty                          N/A              N/A  thrpt    3  285827731.116 ± 31228610.287  ops/s
[info] BloomFilterDistanceBenchmark.distanceOfEmptyVsSparse                         N/A              N/A  thrpt    3    7900630.518 ±   643684.348  ops/s
[info] BloomFilterDistanceBenchmark.distanceOfSparseVsDense                         N/A              N/A  thrpt    3    1264613.293 ±   185974.448  ops/s
[info] BloomFilterDistanceBenchmark.distanceOfSparseVsSparse                        N/A              N/A  thrpt    3    1208178.559 ±  1140412.698  ops/s
[info] BloomFilterLegacyCreateBenchmark.createBloomFilter                         0.001              100  thrpt    3      11678.810 ±      542.307  ops/s
[info] BloomFilterLegacyCreateBenchmark.createBloomFilter                         0.001             1000  thrpt    3       1166.464 ±       64.033  ops/s
[info] BloomFilterLegacyCreateBenchmark.createBloomFilter                         0.001            10000  thrpt    3        113.420 ±        2.142  ops/s
[info] BloomFilterLegacyCreateBenchmark.createBloomFilter                          0.01              100  thrpt    3      16004.431 ±     2822.803  ops/s
[info] BloomFilterLegacyCreateBenchmark.createBloomFilter                          0.01             1000  thrpt    3       1615.137 ±      234.551  ops/s
[info] BloomFilterLegacyCreateBenchmark.createBloomFilter                          0.01            10000  thrpt    3        160.307 ±       30.518  ops/s
[info] BloomFilterLegacyDistanceBenchmark.distanceOfDenseVsDense                    N/A              N/A  thrpt    3    2725811.932 ±    99564.870  ops/s
[info] BloomFilterLegacyDistanceBenchmark.distanceOfEmptyVsDense                    N/A              N/A  thrpt    3   10708004.851 ±   434568.616  ops/s
[info] BloomFilterLegacyDistanceBenchmark.distanceOfEmptyVsEmpty                    N/A              N/A  thrpt    3  282776646.316 ± 24966839.395  ops/s
[info] BloomFilterLegacyDistanceBenchmark.distanceOfEmptyVsSparse                   N/A              N/A  thrpt    3      60006.340 ±     2724.431  ops/s
[info] BloomFilterLegacyDistanceBenchmark.distanceOfSparseVsDense                   N/A              N/A  thrpt    3      42009.918 ±     2578.536  ops/s
[info] BloomFilterLegacyDistanceBenchmark.distanceOfSparseVsSparse                  N/A              N/A  thrpt    3    5827613.935 ±  2155471.021  ops/s
[info] BloomFilterLegacyQueryBenchmark.queryBloomFilter                           0.001              100  thrpt    3    2791972.056 ±   319797.588  ops/s
[info] BloomFilterLegacyQueryBenchmark.queryBloomFilter                           0.001             1000  thrpt    3    2738206.605 ±   357713.894  ops/s
[info] BloomFilterLegacyQueryBenchmark.queryBloomFilter                           0.001            10000  thrpt    3    2659377.826 ±  2544150.099  ops/s
[info] BloomFilterLegacyQueryBenchmark.queryBloomFilter                            0.01              100  thrpt    3    3877831.480 ±  1068871.706  ops/s
[info] BloomFilterLegacyQueryBenchmark.queryBloomFilter                            0.01             1000  thrpt    3    3860338.377 ±   327106.629  ops/s
[info] BloomFilterLegacyQueryBenchmark.queryBloomFilter                            0.01            10000  thrpt    3    3880580.772 ±   294212.699  ops/s
[info] BloomFilterQueryBenchmark.queryBloomFilter                                 0.001              100  thrpt    3    2781139.309 ±   810642.587  ops/s
[info] BloomFilterQueryBenchmark.queryBloomFilter                                 0.001             1000  thrpt    3    2625591.510 ±   539291.673  ops/s
[info] BloomFilterQueryBenchmark.queryBloomFilter                                 0.001            10000  thrpt    3    2624722.080 ±   405730.511  ops/s
[info] BloomFilterQueryBenchmark.queryBloomFilter                                  0.01              100  thrpt    3    3973993.973 ±   120529.588  ops/s
[info] BloomFilterQueryBenchmark.queryBloomFilter                                  0.01             1000  thrpt    3    3595605.220 ±   239115.046  ops/s
[info] BloomFilterQueryBenchmark.queryBloomFilter                                  0.01            10000  thrpt    3    3349576.523 ±  1495983.818  ops/s

from algebird.

regadas commented on June 12, 2024

@johnynek I was looking at this and I didn't actually commented the sized bloom filter;

bitsizes are aligned and serializations don't need to write out the size

These are very good points! it's easy to make a mistake and yeah improving on serialization is always a big plus. I guess we should really explore the idea, but before that I just wanted to see how far we could take the new BitSet and see if we could actually replace the existing one.

from algebird.

regadas commented on June 12, 2024

Closed via #840

from algebird.

use fast immutable bitset for better bloom filter about algebird HOT 6 CLOSED

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent