The latest version of umap-learn is now out. Version 0.5 includes some major new features, including ParametricUMAP, DensMAP, AlignedUMAP, model composition, and model updating. Thank you to everyone who contributed! 1/14
ParametricUMAP uses a neural network to learn a UMAP embedding. This allows for a number of significant advantages. 2/14
ParametricUMAP provides extremely fast embedding of new data (comparable to PCA if you use a GPU), UMAP-based autoencoders, and powerful semi-supervised learning, particularly in low-label regimes. 3/14
Special thanks to Hyunghoon Cho and team for their contribution of DensMAP based on their paper. 6/14
AlignedUMAP allows sequences of different UMAP embeddings to be aligned with each other according to relations among the datasets. This can be particularly useful for situations such as time-evolving data. 7/14
UMAP also now allows use of an “update” method to generate a new model updated with additional data. 10/14
UMAP's approximate nearest neighbour search has now moved entirely to a dedicated ANN library, PyNNDescent (github.com/lmcinnes/pynnd…). In turn, PyNNDescent has seen significant development: it is faster, multithreaded, and supports new metrics such as Wasserstein distance. 11/14
Finally, a large number of bug fixes, plotting improvements, and performance improvements were contributed as well. 12/14
Thank you to *everyone* who contributed, including those who helped in improving the documentation. 13/14
PyNNDescent, an approximate nearest neighbor search library, got a major update recently. Index construction is now multicore by default. Querying is now much faster -- competitive with some of the fastest ANN libraries around.
(1/4)
Performance is particularly strong for higher accuracy (>90%) queries.
(2/4)
The library also comes equipped with a Transformer class that fits in with the new KNeighborsTransformer in scikit-learn (scikit-learn.org/stable/modules…), allowing you to speed up various sklearn models and pipelines.
(3/4)