Giter VIP home page Giter VIP logo

sudhir01 / bloom-filter Goto Github PK

View Code? Open in Web Editor NEW

This project forked from amccurry/bloom-filter

0.0 2.0 0.0 116 KB

This implementation of Bloom Filter is written in java, and uses a murmur hash. The class is a java generic class that requires a ToBytes object for converting your keys into byte arrays. However if you already have byte array keys that you wish to use them with the bloom filter. There are methods that except and test byte arrays with offsets and lengths so that you can reuse your byte array buffers.

bloom-filter's Introduction

Bloom Filter

This implementation of Bloom Filter is written in java, and uses a murmur hash.  The class 
is a java generic class that requires a ToBytes object for converting your keys into byte 
arrays.  However if you already have byte array keys that you wish to use them with the bloom 
filter.  There are methods that except and test byte arrays with offsets and lengths so 
that you can reuse your byte array buffers.

Thread Safety

By default this bloom filter is thread safe through the use of AtomicLongArray.
However if your implementation does not require thread safety you may relax this feature in 
the constructor.  Note: In testing I haven't seen any performance difference with using the 
AtomicLongArray over the standard long array implementation.

Below is a simple example of how to use this class.


import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import com.nearinfinity.bloomfilter.BloomFilter;
import com.nearinfinity.bloomfilter.ToBytes;

public class Readme {

	public static void main(String[] args) throws IOException, ClassNotFoundException {
		
		BloomFilter<String> bloomFilter = new BloomFilter<String>(0.001, 100, new ToBytes<String>() {
			private static final long serialVersionUID = -2257818636984044019L;
			@Override
			public byte[] toBytes(String key) {
				return key.getBytes();
			}
		});
		
		List<String> knownValues = new ArrayList<String>();
		
		for (int i = 0; i < 100; i++) {
			String key = UUID.randomUUID().toString();
			knownValues.add(key);
			bloomFilter.add(key);
		}
		
		for (String key : knownValues) {
			if (!bloomFilter.test(key)) {
				throw new RuntimeException("False Negative!");
			}
		}
		
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		ObjectOutputStream outputStream = new ObjectOutputStream(baos);
		outputStream.writeObject(bloomFilter);
		outputStream.close();
		
		ObjectInputStream inputStream = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()));
		BloomFilter<String> newBloomFilter = (BloomFilter<String>) inputStream.readObject();
		inputStream.close();
		
		for (String key : knownValues) {
			if (!newBloomFilter.test(key)) {
				throw new RuntimeException("False Negative!");
			}
		}
		
		int falsePositive = 0;
		int sampleSize = 100000;
		for (int i = 0; i < sampleSize; i++) {
			if (newBloomFilter.test(UUID.randomUUID().toString())) {
				falsePositive++;
			}
		}
		
		System.out.println("[" + 
				falsePositive +"] false positives out of [" + 
				sampleSize + "] while using [" + 
				newBloomFilter.getMemorySize() + "] bytes of memory");
	}

}

bloom-filter's People

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.