Releases: Cydhra/vers
1.5.1 - Wavelet Matrix Serde Support
What's Changed
- remove redundant `bits_per_element`, reducing `WaveletMatrix` struct size by @somethingelseentirely in #12
- add serde support to `WaveletMatrix`
New Contributors
- @somethingelseentirely made their first contribution in #11
1.5.0
New Features
- Vers now implements a `WaveletMatrix` to encode arbitrary alphabets. The implementation supports alphabets exceeding 64 bits, but its API works with primitive integers all the same.
- The Wavelet Matrix supports rank and select queries, predecessor and successor queries, statistical queries (like range-max, range-min, range-median, range-select-k, ...), and reconstruction of values, all in `O(k)`, where `k` is the alphabet bit size.
- The Wavelet Matrix supports several iterators over its encoded sequence, including a sorted iterator.
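To illustrate why these queries cost `O(k)`, here is a minimal toy wavelet matrix. It is a sketch for explanation only and not the Vers implementation: `ToyWaveletMatrix`, `descend`, `access`, and `rank` are hypothetical names, rank on each level uses a plain prefix-count table rather than a succinct rank structure, and symbols are assumed to fit in at most 64 bits.

```rust
/// Toy wavelet matrix (illustrative only, not the vers API).
struct ToyWaveletMatrix {
    /// One bit level per symbol bit, most significant bit first.
    levels: Vec<Vec<bool>>,
    /// ones_before[l][i] = number of set bits in levels[l][..i].
    ones_before: Vec<Vec<usize>>,
    /// Number of zero bits on each level.
    zeros: Vec<usize>,
}

impl ToyWaveletMatrix {
    /// Builds the matrix for symbols of `bits` bits (assumes `bits <= 64`).
    fn new(seq: &[u64], bits: u32) -> Self {
        let mut cur = seq.to_vec();
        let mut levels = Vec::new();
        let mut ones_before = Vec::new();
        let mut zeros = Vec::new();
        for shift in (0..bits).rev() {
            let level: Vec<bool> = cur.iter().map(|&v| (v >> shift) & 1 == 1).collect();
            // Prefix counts of set bits give constant-time rank on this level.
            let mut prefix = Vec::with_capacity(level.len() + 1);
            let mut ones = 0usize;
            prefix.push(ones);
            for &b in &level {
                ones += b as usize;
                prefix.push(ones);
            }
            // Stable partition: symbols with a 0 bit first, then those with a 1 bit.
            let (mut lo, hi): (Vec<u64>, Vec<u64>) =
                cur.iter().copied().partition(|&v| (v >> shift) & 1 == 0);
            zeros.push(lo.len());
            lo.extend(hi);
            cur = lo;
            levels.push(level);
            ones_before.push(prefix);
        }
        Self { levels, ones_before, zeros }
    }

    /// Position reached on the next level when following `bit` from position `i`.
    fn descend(&self, level: usize, i: usize, bit: bool) -> usize {
        let ones = self.ones_before[level][i];
        if bit { self.zeros[level] + ones } else { i - ones }
    }

    /// Reconstructs the symbol at index `i`, visiting each level once.
    fn access(&self, mut i: usize) -> u64 {
        let mut value = 0;
        for level in 0..self.levels.len() {
            let bit = self.levels[level][i];
            value = (value << 1) | bit as u64;
            i = self.descend(level, i, bit);
        }
        value
    }

    /// Counts occurrences of `symbol` among the first `i` elements.
    fn rank(&self, symbol: u64, i: usize) -> usize {
        let (mut start, mut end) = (0, i);
        let k = self.levels.len();
        for level in 0..k {
            let bit = (symbol >> (k - 1 - level)) & 1 == 1;
            start = self.descend(level, start, bit);
            end = self.descend(level, end, bit);
        }
        end - start
    }
}
```

Both `access` and `rank` perform one step per level, so their cost grows with the number of bits per symbol rather than with the sequence length.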
Constructors, Iterators, and Convenience
New Features
- `RsVec` implements several comparison methods (`sparse_equals`, `full_equals`), as well as the `PartialEq` trait, which dynamically switches between those implementations according to my benchmark results.
- `BitVec` and `RsVec` implement new constructors. Take a look at docs.rs to see what's possible now!
- `BitVec`, `BinaryRmq`, and `FastRmq` can be created from u64 iterators.
- Several constructors are also available as `From<T>` implementations now
- Added the `simd` crate feature, which gates some (unsafe) vectorization in `RsVec` and a vectorized alternative iterator (`RsVec::bit_set_iter0`, `RsVec::bit_set_iter1`)
- `RsVec::iter1` and `RsVec::iter0` now also have an owning version: `RsVec::into_iter1`
- `RsVec::iter1` and `RsVec::iter0` are now double-ended iterators
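As a rough picture of what a double-ended iterator over set-bit indices looks like, here is a naive sketch over a slice of 64-bit limbs. `OnesIter` is a hypothetical type invented for this example. It scans bit by bit and assumes the least significant bit of a limb comes first, whereas the iterators in Vers are described as using the select data structure.

```rust
/// Naive iterator over the indices of set bits in a slice of 64-bit limbs.
struct OnesIter<'a> {
    limbs: &'a [u64],
    /// Next candidate index when iterating from the front.
    front: usize,
    /// One past the next candidate index when iterating from the back.
    back: usize,
}

impl<'a> OnesIter<'a> {
    fn new(limbs: &'a [u64]) -> Self {
        Self { limbs, front: 0, back: limbs.len() * 64 }
    }

    fn is_set(&self, i: usize) -> bool {
        (self.limbs[i / 64] >> (i % 64)) & 1 == 1
    }
}

impl<'a> Iterator for OnesIter<'a> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        while self.front < self.back {
            let i = self.front;
            self.front += 1;
            if self.is_set(i) {
                return Some(i);
            }
        }
        None
    }
}

impl<'a> DoubleEndedIterator for OnesIter<'a> {
    fn next_back(&mut self) -> Option<usize> {
        // Both ends share the range [front, back), so they never cross.
        while self.front < self.back {
            self.back -= 1;
            if self.is_set(self.back) {
                return Some(self.back);
            }
        }
        None
    }
}
```

Implementing `DoubleEndedIterator` is what makes adapters such as `.rev()` available on the iterator.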
Improvements
- Added several examples to the documentation of `BitVec`
- Iterating an Elias-Fano Vector is about 20% faster now
- FastRMQ is 3-4% faster on vectors that fit in the L3 cache
Fixes
- `BitVec::get_bits` and `RsVec::get_bits` could fail when querying 64 bits from an index not aligned to the vector limbs
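The unaligned 64-bit case is easy to get wrong because the result spans two limbs and the usual `(1 << len) - 1` mask overflows at `len == 64`. The following is a minimal sketch of such a read, not the crate's code: `read_bits` is a hypothetical helper, and the bit order (least significant bit of each limb first) is an assumption.

```rust
/// Reads `len` bits (0..=64) starting at bit index `pos` from a slice of limbs.
/// Returns `None` if the request is too long or out of bounds.
fn read_bits(limbs: &[u64], pos: usize, len: usize) -> Option<u64> {
    if len > 64 || pos.checked_add(len)? > limbs.len() * 64 {
        return None;
    }
    if len == 0 {
        return Some(0);
    }
    let limb = pos / 64;
    let offset = pos % 64;
    let mut value = limbs[limb] >> offset;
    if offset + len > 64 {
        // Only reached when offset > 0, so the shift amount is in 1..=63.
        value |= limbs[limb + 1] << (64 - offset);
    }
    // `1u64 << 64` would overflow, so the full-width mask is a special case.
    let mask = if len == 64 { u64::MAX } else { (1u64 << len) - 1 };
    Some(value & mask)
}
```

The two special cases, `len == 0` and `len == 64`, are handled explicitly so that no shift amount ever reaches 64.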
Another bug in select iterators removed by simplification
The select iterators failed under certain conditions because they attempted to select bits from the limb that contained the last returned element, even if that limb did not contain any more set bits. This happened because the iterator forgot to account for popcounts in limbs preceding the current limb. The iterator no longer keeps track of which limb contained the last selected bit, since searching the limb every time is essentially free due to caching effects.
Searching it every time forces the iterator to keep track of all bits, even if they are spread across multiple limbs in a block.
Bugfix in select-1 iterator
Fixed Issue #6
- the select1 iterator (`iter1`) could fail on bit vectors with more than one super-block because the number of ones in super-blocks was calculated wrongly while iterating
Replaced extant intrinsics with util function
Hotfix patch
- Fix: The pdep fallback wasn't implemented in FastRMQ, which meant the crate would not compile outside of x86_64 targets.
- Fix: The documentation still stated that the intrinsics were forcibly enabled (which they were due to the previous bug, but now they aren't).
- General touch-up of the documentation
Fallback implementations for non-x86 platforms
This release provides fallbacks for the pdep intrinsic, which removes the necessity for the BMI2 feature.
Obviously, the crate is pretty slow without BMI2, but it can now theoretically be used on all platforms.
I also included a small benchmark for space overhead, and I am happy to report that Vers crushes its competitors in this regard.
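For readers unfamiliar with `pdep` (parallel bit deposit), here is a sketch of a portable software version together with a common use in succinct data structures: finding the k-th set bit of a word. `pdep_fallback` and `select_in_word` are hypothetical names for this example. Whether the fallback in Vers works like this is an assumption.

```rust
/// Software equivalent of the BMI2 `pdep` instruction: deposits the low bits
/// of `src` into the positions of the set bits of `mask`, lowest first.
fn pdep_fallback(mut src: u64, mut mask: u64) -> u64 {
    let mut result = 0u64;
    while mask != 0 {
        // Isolate the lowest set bit of the mask.
        let lowest = mask & mask.wrapping_neg();
        if src & 1 == 1 {
            result |= lowest;
        }
        src >>= 1;
        // Clear the lowest set bit of the mask.
        mask &= mask - 1;
    }
    result
}

/// Index of the `k`-th set bit (0-based) in `word`.
/// Returns 64 if `word` has `k` or fewer set bits.
fn select_in_word(word: u64, k: u32) -> u32 {
    match 1u64.checked_shl(k) {
        Some(bit) => pdep_fallback(bit, word).trailing_zeros(),
        None => 64,
    }
}
```

The loop runs once per set bit in the mask, which is why a software version is much slower than the single hardware instruction.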
Changed parameters for docs.rs
This release changes nothing except for removing a space from an argument given to docs.rs for building documentation, because docs.rs is apparently terrible at parsing CLI args.
Added new API to vers::BitVec
Changes
- added `mask_*` functions to `vers::BitVector` which apply lazily-evaluated masks onto a bitvector.
- added `apply_mask_*` functions to `vers::BitVector` to update the bit vector in-place.
- added `count_ones` and `count_zeros` functions to `vers::BitVector` and masked bit vectors.
- added `set_bit` function to `vers::BitVector`.
- added `Eq` and `Hash` traits to `vers::BitVector`. It's probably not a great idea to use large bit vectors as keys to a hashmap.
- added iterators over 1 and 0 to `vers::RsVec` to iterate quickly over (the indices of) ones and zeros, exploiting the select data structure.
Fixes
- `get_bits` in `vers::BitVector` could panic when queried for 0 bits.
- added compiler flags for docs.rs so the documentation should no longer fail building
For developers
- disabled `simd` for `rsdict` in benchmarks, since it depends on `packed_simd` which is currently not compiling
Fixed undefined behavior and updated documentation
- Due to the conservative use of `unsafe`, the library technically produced undefined behavior on target platforms that do not support `POPCNT` or `BMI2`. This has been fixed by failing compilation if either is not present
- Documentation stated that some behavior was undefined when it was just unpredictable but well-defined. Those issues have been fixed
- Fixed a bug that would lead to undefined behavior in release mode and crash in debug mode, where a shift operation with a modulus of 64 was possible
Technically, none of these changes should be breaking, since they only affect undefined behavior
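Assuming the shift bug concerned shift amounts that can reach 64, the sketch below shows one way to guard such a shift in safe Rust. `low_mask` is a hypothetical helper and not part of the crate. A plain `1u64 << len` with `len == 64` panics in debug builds, and in release builds without overflow checks the shift amount is wrapped instead.

```rust
/// Builds a mask with the lowest `len` bits set.
/// Any `len >= 64` yields a full mask instead of overflowing the shift.
fn low_mask(len: u32) -> u64 {
    // `checked_shl` returns `None` when the shift amount is 64 or more.
    match 1u64.checked_shl(len) {
        Some(bit) => bit - 1,
        None => u64::MAX,
    }
}
```

For comparison, `1u64.wrapping_shl(64)` evaluates to `1` because the shift amount wraps to 0, which is rarely the intended result.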
Also, I added `sucds` as a new crate in comparison benchmarks, which is faster in general, but much worse in worst-cases.