Sparse Input Data #35

Open
3bst0r opened this issue Aug 31, 2016 · 2 comments

Comments

@3bst0r

3bst0r commented Aug 31, 2016

Is there a way to input sparse data? I suspect this is not straightforward, because there is no standard way to store sparse matrices in a text file, i.e. Python probably does it differently from MATLAB (I did not check, though).


OT: I just watched a video of you presenting t-SNE at Google and I want to compliment you on your explanation skills. Very clear and understandable.

@lvdmaaten
Owner

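This is not implemented right now. Both MATLAB and SciPy support compressed sparse matrices (MATLAB stores them in compressed sparse column form; SciPy offers both compressed sparse row and column formats), so it would be possible to add this.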
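For readers unfamiliar with the format, a compressed sparse row (CSR) matrix boils down to three flat arrays, which is what would make it easy to exchange between tools. The following is only a minimal pure-Python sketch of that layout; the helper names `dense_to_csr` and `csr_row` are hypothetical and are not part of this project or of any library:

```python
def dense_to_csr(rows):
    """Convert a list of equal-length rows into CSR arrays.

    Returns (data, indices, indptr):
      data    - the non-zero values, row by row
      indices - the column index of each value in `data`
      indptr  - row i occupies data[indptr[i]:indptr[i + 1]]
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        # Mark where the next row starts in `data`.
        indptr.append(len(data))
    return data, indices, indptr


def csr_row(data, indices, indptr, i, n_cols):
    """Rebuild dense row i from the CSR arrays."""
    row = [0] * n_cols
    for k in range(indptr[i], indptr[i + 1]):
        row[indices[k]] = data[k]
    return row


# Toy matrix for illustration only.
dense = [
    [0, 0, 3, 0],
    [1, 0, 0, 2],
    [0, 0, 0, 0],
]
data, indices, indptr = dense_to_csr(dense)
```

Only the non-zero entries are stored, so an all-zero row costs a single repeated entry in `indptr`.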

For now, a potential way to circumvent this would be to do some kind of (logistic) PCA preprocessing in Matlab / Python and use the reduced data as input into t-SNE. (This is assuming that the full matrix is too big to keep in memory.)

@larahronn

If I'm not mistaken, truncated SVD is also a good way to preprocess sparse data, and scikit-learn's TruncatedSVD works directly on scipy.sparse matrices.

http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html
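As a rough sketch of that preprocessing step, the snippet below reduces a sparse matrix to a small dense one with `TruncatedSVD`. The random matrix, its shape, and the choice of 50 components are assumptions made purely for illustration; real data and a suitable number of components would have to be substituted:

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Synthetic sparse data standing in for the real input (assumption):
# 200 samples, 1000 features, about 1% non-zero entries.
X = sparse.random(200, 1000, density=0.01, format="csr", random_state=0)

# Project onto 50 components; X is never converted to a dense matrix.
svd = TruncatedSVD(n_components=50, random_state=0)
X_reduced = svd.fit_transform(X)  # dense NumPy array, one row per sample
```

The resulting `X_reduced` is an ordinary dense array, so it can be saved or passed on in whatever dense format the t-SNE implementation expects.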

(I am brand new to the field, and this is my first ever comment on GitHub, so I am breaking the ice here ;))
