Transform method: why is it using hash #637
aarondbaron started this conversation in General
In the transform method, there is this code:
Why is this code here at all? If one were checking how the transform method performs compared to fit_transform, a simple strategy would be to apply transform to the original training samples; doing that reveals that the transform method actually takes quite a long time to compute. Having this hash check does not seem right, and its purpose here is unclear.
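For context, the check being asked about appears to be a cache: the model stores a hash of the training data at fit time, and if transform later receives that exact data, it returns the stored embedding without recomputing anything. A minimal sketch of that pattern (the class, attribute names, and toy "embedding" here are illustrative assumptions, not umap-learn's actual implementation):

```python
import hashlib
import pickle

class Embedder:
    """Toy model illustrating a cached-transform pattern (illustrative only)."""

    def fit_transform(self, X):
        # Remember a hash of the training data alongside its embedding.
        self._input_hash = hashlib.sha256(pickle.dumps(X)).hexdigest()
        self._embedding = [[sum(row)] for row in X]  # stand-in for the real embedding
        return self._embedding

    def transform(self, X):
        # If transform() is called on the exact training data, the hash matches
        # and the stored embedding is returned immediately -- so timing
        # transform() on the training set would measure nothing.
        if hashlib.sha256(pickle.dumps(X)).hexdigest() == self._input_hash:
            return self._embedding
        return [[sum(row)] for row in X]  # stand-in for the expensive path
```

Under this reading, benchmarking transform against fit_transform on the training samples only exercises the cached branch, which may be why the check seems confusing.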
More generally, the transform method doesn't seem all that useful when trying to scale to larger datasets. It appears that, similar to the approach van der Maaten described for t-SNE, the best way to implement a transform is to train a neural network to learn the embedding space, which also seems to be what Parametric UMAP does. Is there any point in having transform at all if it isn't faster than simply calling fit_transform? Should it be deprecated in favor of Parametric UMAP?