Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #402 from unum-cloud/main-dev
* Improve: Swift test for issue #399 (#400) * Fix: Integer overflow in aligned-alloc Fixes ClickHouse/ClickHouse#61780 Co-authored-by: Antonio Andelic <[email protected]> * Make: Disable Windows NPM builds Relates to the #377 and the comment: #377 (comment) This temporarily disables the failing CI pipeline to generate and update docs. * Fix: Going beyond level 0 in clustering * Improve: Error handling in `index_dense_gt` This commit drops `std::vector` dependency, making compilation time shorter and error handling universal across abstraction layers. * Improve: Remove `std::function` calls * Improve: Remove `std::thread` from `index_dense_gt` * Improve: `std::vector` -> `buffer_gt` in plugins * Add: `usearch_change_threads_search` * Fix: `index_dense_t::make(path)` * Fix: Exhastive Search In the past, if we got "too lucky" traversing the graph, we could exit early before accumulating K top matches, even if the index had more than K entries. This patch changes that behavior, making output more predicatable. * Fix: Replacement leaves isolated nodes This patch addresses the issue #399, originally observed in the Swift layer. Reimplementing it in C++ helped locate the issue and lead to refactoring the `update` procedure in the lowest-layer `index_gt`. Now, `add` and `update` share less code. The `add` is one branch shorter (not that it would be noticeable), and `update` brings additional logic to avoid spilling `updated_slot` into top-results and consequently introducing self-loops. Closes #399 * Fix: Misc warnings & compilation issues * Fix: Misc warnings & compilation issues * Fix: Detect `ring_gt` being full Relates to #355 * Fix: Re-init `available_threads_` after load Both `view` and `load` would `reset` the thread contexts. After that, the very first `search` and `add` would fail, as no thread-local contexts are initialized. It would require a `reserve` call with a non-zero second arcgument to define the number of concurrent threads, for which the queues & buffers need to be allocated. That design is counter-intuitive, so this patch re-inits the same number of threads as before the `load` & `view` or one, if none existed. * Fix: `uint32_t` to `uint40_t` cast (#404) Co-authored-by: Ash Vardanian <[email protected]> * Docs: Mention `b1` in `README.md` Co-authored-by: Adolfo Garcia <[email protected]> * Docs: Cover new users * Improve: Updates stability & catch bug * Fix: Dereferencing `member_iterator_t` * Add: Java `get` API (#407) * Fix: Compilation with `uint40_t` keys * Add: `AutoClosable` using `c_destroy` for Java (#408) * Fix: Rare deadlock on tiny collections * Improve: `enable_key_lookups=false` memory usage As noted by Robert Schulze, we can avoid populating `slot_lookup_` during insertions, if `enable_key_lookups` is not set. This would lead to lower memory consumption for large indexes of tiny vectors, particularly common in GIS. Co-authored-by: Robert Schulze <[email protected]> * Fix: Preset `enable_key_lookups=true` in C * Fix: `std::is_pod` deprecated in C++20 * Fix: Unused type aliases * Improve: Avoid `#pragma region` pre GCC 13 (#386) * Do not use #pragma region if not supported by the compiler * `pragma region` supported by GCC 13+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85487 --------- Co-authored-by: Mikhail Bautin <[email protected]> Co-authored-by: Ash Vardanian <[email protected]> * Fix: Capacity checks in C# tests * Docs: Add doc-strings in C# * Make: Verbose C# logging in debug builds * Docs: Describe serialization in GoLang * Make: Drop Java & JS API references Compiling and introspecting docstrings in Sphinx is extremely flaky. It's safer to simply describe the usage patterns in the `README.md`. * Add: Load from path in Java (#410) * Improve: Avoid sorting on small "refines" * Docs: Cover JIT-compiled Python examples * Fix: `index.copy()` trying to `memcpy(*, NULL, *)` * Docs: UB in C++ Example (#415) Co-authored-by: Ash Vardanian <[email protected]> * Improve: Catch UB with tests * Docs: Rearrange * Fix: Reserving contexts post-reload * Improve: Detect more failures in tests * Improve: Log failing lines * Fix: Clamp before down-casting to `i8_t` (#422) Previously we were down-casting floats to the target type (e.g. int8_t), and then clamping to [-100, 100] range. This means that e.g. 129 would be cast to -127 and then converted to -100, in stead of becoming 100 The fix does clamping first, and then casts the resulting number (which is guaranteed to be in range [-100, 100], due to clamping) from source type to target int8_t. Given the clamping, this will never overflow. --------- Co-authored-by: Ash Vardanian <[email protected]> * Fix: Concurrency bug in high-K search When calling `index_dense_gt`, the thread lock was not propagating with the `search_result_t`. That is a an error-prone API. When too many threads are running in parallel (ideally, more than physical CPU cores) another thread may start reusing the `context_t` before the original caller finishes exporting entries with `dump_to`. This solution is backwards compatible and passes the tests. * Make: Manually bump version to 2.13.0 We can't yet rely on the SemVer tool semantic-release/release-notes-generator#633 (comment) * Improve: Attempt to implement batch-metrics * Add: Batch-capable metrics * Add: `misaligned_ptr_gt` comparators * Improve: Separate `error_t` constructors This makes debugging easier, as it's simpler to trace where the error message is being set. * Improve: Support `enum` slots Tracing implicit conversions of `std::uint32_t` and other primitive types isn't always easy in concurrent apps. This commit adds support for `enum` types to be used for safer implementation of `index_gt` specializations. * Improve: Ranges-V3 compatibility * Add: Preliminary support for batch metrics * Add: Batch-parallel refinements * Add: `MANIFEST.in` for `py.typed` (#425) Adding type annotation for Python native modules solves the `Skipping analyzing "usearch.index" module` warning due to `missing library stubs or py.typed marker`. Closes #424 * Fix: Clear cast buffer before bitwise ORs for `b1x8_t` (#428) When converting floating point arrays to binary, we use bitwise OR operations to set the relevant bits in the output buffer to 1. We do nothing if the bit is zero, so we assume that the bit is zero to start with. The `memset` statement makes sure this assumption holds. * Fix: `esm` duplicate import bug in `jest` (#420) Closes #418 Closes #426 * Fix: build.gradle deprecations * Fix: ESM build support (#433) Closes #426 Relates to #420 * Fix: `capacity()` assertion in Rust (#436) Closes #432 * Fix: Computing `stats(i).max_edges` * Add: Returning `computed_distances_in_refines` In high-connectivity graphs, the number of distance computations can be dominated by the number of "refine" heuristic computations performed by the core structure. The extended `add_result_t` now includes both: - `computed_distances_in_refines` - `computed_distances_in_reverse_refines` This commit also extends the documentation. * Add: Profiling attributes for `index_gt` * Fix: Preserve thread limits after `fork()` * Improve: Benchmarks self-recall support and `.bbin` This patch adds support for partial datasets, without ground truth neighborhood data. It also adds support for `.bbin` binary, `.hbin` half-precision, and `.dbin` double-precision input vector files/ * Fix: Printing progress between exit * Docs: Fix spelling * Fix: Wolfram bindings (#437) Co-authored-by: Ash Vardanian <[email protected]> * Fix: Pre-reserve enough threads for C users This indirectly fixes the crash in C# layer * Revert: Parallel metrics * Fix: Updating the `entry_slot_` node * Improve: Enable single-threaded update tests * Make: Bump version * Fix: `flat_hash_multi_set_gt::reset` double-free * Fix: Not enough memory in `fork()` * Make: Bump to SimSIMD v5 * Fix: Missing SimSIMD v5 capability names * Improve: Detecting copy & move issues * Fix: Compilation w. explicit `template class` * Improve: Bypass UBSan NULL dereferencing warning * Improve: Minimize alignment issues * Make: Disable `bf16` on MacOS * Make: Link to GitHub repo * Fix: Conditional `call_key` compilation in MSVC * Fix: Unary minus on unsigned distances * Make: Disable `bf16` in JS * Fix: Compatibility with pre-v2.10 Closes #423 * Improve: Test wrong number of dimensions in Rust (#413) Closes #412 --------- Co-authored-by: Julius Brummack <[email protected]> * Fix: Handle wrong dimensionality in Rust * Fix: Overwriting `SIMSIMD_DYNAMIC_DISPATCH` * Make: Upgrade Java version * Fix: Spelling mistakes * Fix: `sprintf` deprecated * Fix: `-Wpass-failed=transform-warning` * Fix: Memory pinning on `Add` in C# * Make: Specify Java distribution * Fix: Pin memory in gets (C#) * Make: Skip `PersistAndRestore` in CI on MacOS * Make: Upgrade Docker action This fixes a GitHub CI warning about the deprecated NodeJS version. * Fix: `view_from_buffer` is unsafe in Rust Closes #453 Co-authored-by: Andrew Dirksen <[email protected]>" * Fix: `view_from_buffer` is unsafe in Rust Closes #453 Co-authored-by: Andrew Dirksen <[email protected]> * Make: Upgrade SimSIMD * Docs: Index header has no capacity The lack of capacity data is intended. Reserving memory is a non-persistent operation by nature, and we shouldn't save that metadata on the disk. Closes #452 Co-authored-by: Christopher Yim <[email protected]> * Fix: Aggressive neighborhood checks on updates * Fix: `update()` bug detect with non-POD keys This bug was tough to spot. Apple Clang was the only one that caught it. The `-O0` flags were explicitly added to expose more symbols for debugging. More `uint40_t` tests were added. * Fix: Align vector type w index in C# (#456) Co-authored-by: Ash Vardanian <[email protected]> * Make: Versioning in pre-release CI * Make: Switch from SemanticRelease to TinySemVer --------- Co-authored-by: Jaysen Marais <[email protected]> Co-authored-by: Antonio Andelic <[email protected]> Co-authored-by: Narek Galstyan <[email protected]> Co-authored-by: Adolfo Garcia <[email protected]> Co-authored-by: Trevor McCulloch <[email protected]> Co-authored-by: Robert Schulze <[email protected]> Co-authored-by: Mikhail Bautin <[email protected]> Co-authored-by: Mikhail Bautin <[email protected]> Co-authored-by: SheldonFung <[email protected]> Co-authored-by: James Braza <[email protected]> Co-authored-by: cinchen <[email protected]> Co-authored-by: Mark Reed <[email protected]> Co-authored-by: John <[email protected]> Co-authored-by: batracos <[email protected]> Co-authored-by: Ziyang Hu <[email protected]> Co-authored-by: Julius Brummack <[email protected]> Co-authored-by: Julius Brummack <[email protected]> Co-authored-by: Andrew Dirksen <[email protected]> Co-authored-by: Christopher Yim <[email protected]> Co-authored-by: Britt <[email protected]>
- Loading branch information