Check if the dimension reduction techniques can also be split in "train" and "inference" steps #10

Open
taxe10 opened this issue Mar 13, 2024 · 1 comment

taxe10 (Member) commented Mar 13, 2024

When data is actively being collected, we may not want to "fit" the dimension reduction technique again, but instead reuse a previous fit to run "inference" on the new data. Is this possible with PCA and UMAP?

runboj (Contributor) commented Mar 13, 2024

After looking at the documentation, yes, both algorithms have an inference capability.

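For PCA, we can fit once with fit() and then call transform() on new data (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html). Note that fit_transform() would refit the model on the new data, so it should not be used for inference.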

For UMAP, we can call transform() on a previously fitted model (e.g. test_embedding = trans.transform(X_test)) to embed new data (https://umap-learn.readthedocs.io/en/latest/transform.html).
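
To make the train/inference split concrete, here is a minimal sketch for the PCA case. The synthetic arrays, the variable names, and the use of pickle to persist the fitted model are assumptions for illustration only, not something this repository already does.

```python
import pickle

import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-ins for real data (assumption: 10 features per sample).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))  # data collected earlier
X_new = rng.normal(size=(5, 10))      # data arriving during collection

# "Train" step: fit once on the existing data.
pca = PCA(n_components=2).fit(X_train)

# Persist the fitted model; an in-memory blob here, a file in practice.
blob = pickle.dumps(pca)

# "Inference" step: reload the fitted model and project new data
# without refitting.
pca_loaded = pickle.loads(blob)
Z_new = pca_loaded.transform(X_new)

# For contrast: fit_transform() refits on X_new, so the result lives in
# a different projection and is centered on X_new's own mean.
Z_refit = PCA(n_components=2).fit_transform(X_new)
```

According to the umap-learn page linked above, a fitted UMAP model is used the same way: fit on the training data, then call transform() on the new data. Whether pickling is the right persistence mechanism for our workflow is a separate question.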

taxe10 pushed a commit that referenced this issue Apr 4, 2024
Fixed running outside containers