Releases · cvxgrp/pymde

23 Feb 02:22

akshayka

v0.3.0

9d6bcac

0.3.0 Latest

This release dramatically decreases the time it takes PyMDE to construct embeddings by shipping custom Rust implementations of approximate and exact k-nearest neighbor algorithms. It also introduces a new API for choosing between approximate and exact neighbor construction.

Extremely fast k-nearest neighbor preprocessing, written in Rust.

Computing local-structure-preserving embeddings (such as t-SNE, UMAP, and LargeVis) requires constructing a k-nearest neighbor graph from the original data matrix. Like the UMAP package, previous versions of PyMDE used pynndescent to construct this graph. This release drops the dependency on pynndescent in favor of a custom Rust implementation.

Extremely fast approximate k-nearest neighbors, written in Rust.

This release includes a custom implementation of the nn-descent algorithm for the Euclidean metric, written in Rust. This has a few benefits:

Dramatically smaller "time-to-first embedding". Computing $k=15$ nearest neighbors for the MNIST dataset (70,000 vectors, 784 dimensions) takes less than 1 second on a modern MacBook Pro, while pynndescent takes approximately one minute due to Numba's JIT.
Competitive with warmed pynndescent. Our Rust implementation is competitive with burned-in or warm pynndescent (post JIT), with comparable execution time and recall on simple benchmarks.
Easy to install. The elimination of Numba as a transitive dependency means that PyMDE is now easy to install on all major platforms, Python 3.10+, with no restrictions on NumPy version.

Performance is obtained by custom SIMD kernels, an auto-vectorization-friendly fallback, and rayon for parallelism.
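The recall mentioned above is the fraction of true nearest neighbors that an approximate method recovers. As a minimal sketch of how it could be measured (the function name `knn_recall` and the toy index arrays are illustrative assumptions, not part of PyMDE):

```python
import numpy as np


def knn_recall(approx_idx, exact_idx):
    """Fraction of exact neighbors that also appear in the approximate result.

    Both arguments are (n_points, k) integer arrays of neighbor indices.
    Order within a row does not matter.
    """
    approx_idx = np.asarray(approx_idx)
    exact_idx = np.asarray(exact_idx)
    if approx_idx.shape != exact_idx.shape:
        raise ValueError("index arrays must have the same shape")
    # Count, row by row, how many true neighbors were recovered.
    hits = sum(
        len(set(a.tolist()) & set(e.tolist()))
        for a, e in zip(approx_idx, exact_idx)
    )
    return hits / exact_idx.size


# Toy example: 3 points, k = 2. The second row misses one true neighbor.
exact = np.array([[1, 2], [0, 2], [0, 1]])
approx = np.array([[2, 1], [0, 1], [0, 1]])
print(knn_recall(approx, exact))
```

A recall of 1.0 means the approximate graph matches the exact one; values slightly below 1.0 are typical for nn-descent style methods.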

Extremely fast exact k-nearest neighbor calculation, written in Rust.

This release also includes a custom Rust implementation for exact k-nearest neighbor calculations.

Extremely fast. On modern machines, the exact algorithm is even faster than nn-descent (both our own and pynndescent). Performance is obtained by heavily exploiting parallelism and via BLAS (Accelerate on macOS, OpenBLAS on Linux and Windows).
Competitive with faiss. Our implementation is competitive with faiss, but without the dependency on OpenMP (which makes it difficult to install faiss alongside PyTorch).
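To illustrate why BLAS helps with exact search, here is a rough NumPy sketch of brute-force Euclidean kNN in which the dominant cost is a single matrix product. It is not PyMDE's Rust implementation; the function name `exact_knn` and the toy data are assumptions made for illustration.

```python
import numpy as np


def exact_knn(X, k):
    """Brute-force Euclidean k-nearest neighbors (excluding each point itself).

    Returns (indices, distances), each of shape (n, k), sorted by distance.
    """
    X = np.asarray(X, dtype=np.float64)
    n = X.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    sq_norms = np.einsum("ij,ij->i", X, X)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 <x_i, x_j>.
    # The product X @ X.T is dispatched to BLAS by NumPy.
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)   # clip tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    # Select the k smallest entries per row, then sort only those k.
    part = np.argpartition(d2, k - 1, axis=1)[:, :k]
    part_d2 = np.take_along_axis(d2, part, axis=1)
    order = np.argsort(part_d2, axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    dist = np.sqrt(np.take_along_axis(part_d2, order, axis=1))
    return idx, dist


X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
idx, dist = exact_knn(X, k=2)
print(idx)
print(dist)
```

This sketch materializes the full n-by-n distance matrix, so it only suits small inputs; a practical implementation would process the data in blocks and in parallel.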

Choosing the k-nearest neighbor algorithm

pymde.preserve_neighbors now takes an optional argument, knn_method, which controls whether an approximate or exact kNN algorithm is used. In practice, the performance of the embedding is typically the same either way. When the argument is omitted, PyMDE chooses the method based on a simple heuristic.

New interactive examples

PyMDE now ships with interactive marimo notebook examples that bring embeddings to life: you can select points in matplotlib plots and automatically retrieve the corresponding original data in Python for downstream analysis. See our example notebooks at https://pymde.org/examples.

Full Changelog: v0.2.3...v0.3.0

01 Jul 22:34

akshayka

v0.2.0

336dc32

v0.2.0

What's Changed

fix: don't initialize eigsh with vector in null(L); break: numpy >= 2.0 by @akshayka in #85

Full Changelog: v0.1.18...v0.2.0

Contributors

akshayka

21 Nov 17:34

akshayka

v0.1.18

073251e

v0.1.18

This release includes fixes to make pymde.plot code compatible with
newer versions of matplotlib.

18 Nov 20:32

akshayka

v0.1.17

7b926a0

v0.1.17

Make installation on Apple Silicon easier.

Build arm64 wheels, and better build isolation.

17 Nov 23:53

akshayka

v0.1.16

bf7d0d5

v0.1.16

Fixes an issue in the SVD computation that is performed when the standardization constraint is used.

06 Oct 16:44

akshayka

v0.1.13

c5589e5

v0.1.13

Adds a function pymde.seed() for controlling randomness.

PyMDE's internal random state can be set by passing an integer seed
to this function (e.g., pymde.seed(0)). This is useful when
exact reproducibility is required.

See https://pymde.org/getting_started/index.html#reproducibility

21 Jun 17:21

akshayka

v0.1.12

5606e7f

v0.1.12

addresses some deprecation warnings raised when using torch 1.9.0
includes some plotting fixes
