An autoscaling Bloom filter with an ultra-low memory footprint for PHP. Ok Bloomer employs a layered filtering strategy [1] that allows it to expand while maintaining an upper bound on the false positive rate. Each layer is composed of a bitmap that remembers the hash signatures of the items inserted so far. If an item gets caught in the filter, then it has probably been seen before. However, if an item passes through the filter, then it has definitely never been seen before. Bloom filters find uses in caching systems, stream deduplication, DNA sequence counting, and more.
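The layered strategy described above can be sketched in plain PHP. This is a toy for illustration only, not Ok Bloomer's implementation: the class name `ToyLayeredFilter`, the fixed per-layer item capacity, the `crc32`-based hashing, and the sparse array standing in for a packed bitmap are all assumptions made to keep the example short. A real scalable filter would also tighten the error rate of each new layer, which this sketch does not attempt.

```php
<?php

declare(strict_types=1);

// Toy layered Bloom filter for illustration only; not Ok Bloomer's implementation.
// The class name, capacity rule, and hashing scheme are all assumptions.
final class ToyLayeredFilter
{
    /** @var array<int, array<int, true>> One sparse bitmap per layer. */
    private array $layers = [[]];

    /** Number of items inserted into the newest layer. */
    private int $count = 0;

    private int $numHashes;

    private int $sliceSize;

    private int $capacity;

    public function __construct(int $numHashes = 4, int $layerSize = 1024, int $capacity = 64)
    {
        $this->numHashes = $numHashes;
        $this->sliceSize = intdiv($layerSize, $numHashes); // each hash owns one slice
        $this->capacity = $capacity;
    }

    public function numLayers(): int
    {
        return count($this->layers);
    }

    public function exists(string $item): bool
    {
        $offsets = $this->offsets($item);

        foreach ($this->layers as $bitmap) {
            $hits = 0;

            foreach ($offsets as $offset) {
                $hits += isset($bitmap[$offset]) ? 1 : 0;
            }

            if ($hits === $this->numHashes) {
                return true; // every slice matched in this layer: probably seen before
            }
        }

        return false; // at least one bit unset in every layer: never inserted
    }

    public function insert(string $item): void
    {
        if ($this->count >= $this->capacity) {
            $this->layers[] = []; // newest layer is full, so stack a fresh one
            $this->count = 0;
        }

        $last = count($this->layers) - 1;

        foreach ($this->offsets($item) as $offset) {
            $this->layers[$last][$offset] = true;
        }

        ++$this->count;
    }

    public function existsOrInsert(string $item): bool
    {
        if ($this->exists($item)) {
            return true;
        }

        $this->insert($item);

        return false;
    }

    /** @return int[] One bit offset per slice. */
    private function offsets(string $item): array
    {
        $offsets = [];

        for ($i = 0; $i < $this->numHashes; ++$i) {
            // Salt the item with the slice index to get a different hash per slice.
            $hash = crc32($i . ':' . $item) & 0x7fffffff;
            $offsets[] = $i * $this->sliceSize + $hash % $this->sliceSize;
        }

        return $offsets;
    }
}

$filter = new ToyLayeredFilter(4, 1024, 2);

foreach (['a', 'b', 'c'] as $item) {
    $filter->insert($item);
}

var_dump($filter->exists('a')); // bool(true): inserted items are never missed
var_dump($filter->numLayers()); // int(2): the third insert overflowed the first layer
```

Lookups have to consult every layer, which is the price of growing without rehashing: older layers are never rewritten, so items inserted early remain findable after the filter expands.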
- Ultra-low memory footprint
- Autoscaling works on streaming data
- Bounded maximum false positive rate
- Open-source and free to use commercially
Install into your project using Composer:
$ composer require scienide/okbloomer
- Requires PHP 7.4 or above
`BloomFilter` is a probabilistic data structure that estimates whether a given item has occurred before, while keeping the false positive rate below a configurable maximum.
# | Name | Default | Type | Description |
---|---|---|---|---|
1 | maxFalsePositiveRate | 0.01 | float | The false positive rate to remain below. |
2 | numHashes | 4 | int, null | The number of hash functions used, i.e. the number of slices per layer. Set to null for auto. |
3 | layerSize | 32000000 | int | The size of each layer of the filter in bits. |
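
As a rough guide to how these parameters interact, the textbook approximation for a single layer split into equal slices can be written out. This is a back-of-the-envelope sketch rather than the calculation Ok Bloomer necessarily performs internally, and the symbols $m$, $k$, $n$, and $p$ are introduced here only for illustration. With $m$ the layer size in bits, $k$ the number of hash functions (one slice of $m/k$ bits each), and $n$ items inserted into the layer, the false positive rate of that layer is approximately

$$
p \approx \left(1 - e^{-kn/m}\right)^{k}
$$

Solving for the number of items a layer can hold before reaching a target rate $p$ gives

$$
n \approx -\frac{m}{k}\,\ln\!\left(1 - p^{1/k}\right)
$$

Plugging in the defaults ($m = 32{,}000{,}000$, $k = 4$, $p = 0.01$) yields $p^{1/k} \approx 0.316$ and $n \approx -8{,}000{,}000 \times \ln(0.684) \approx 3.04 \times 10^{6}$. A single default-sized layer considered in isolation would therefore hold roughly three million items before crossing 1%. How the error budget is divided among layers is not described here, so the point at which a new layer is actually added may differ.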
use OkBloomer\BloomFilter;
$filter = new BloomFilter(0.01, 4, 32000000);
$filter->insert('foo');
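// echo prints booleans as "1" or "", so export them as literals instead.
echo var_export($filter->exists('foo'), true), PHP_EOL;
echo var_export($filter->existsOrInsert('bar'), true), PHP_EOL;
echo var_export($filter->exists('bar'), true), PHP_EOL;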
true
false
true
To run the unit tests:
$ composer test
To run static code analysis:
$ composer analyze
To run the benchmarks:
$ composer benchmark
- [1] P. S. Almeida et al. (2007). Scalable Bloom Filters.