feat: fashion mnist and non robustness filtering #41

Open · wants to merge 2 commits into base: main

Conversation

flekschas-ozette
Collaborator

This PR fixes one small bug (when the categorical labels are not of type string) and filters out non-robust points from the metric scatter. The latter lets the user keep non-robust points in the categorical scatter (drawn as dark gray points) without the metric values of those points affecting the metric scatter.

Most importantly, this PR adds a useful notebook showing how to compare different embedding methods (with and without existing label information) using Fashion MNIST data.

Also check that `self._data[_LABEL_COLUMN][0]` is a string before checking for `+`
@flekschas-ozette flekschas-ozette requested a review from manzt June 20, 2023 21:38
@flekschas
Collaborator

@manzt shall we still merge this? The string check that runs before testing whether the label is a marker could still be useful, I think.

@manzt
Collaborator

manzt commented Sep 25, 2023

Good question! I think the changes to the src are good, but do you still want to add the notebook?

@manzt
Collaborator

manzt commented Sep 25, 2023

Actually, wasn't this implemented in #43?

@flekschas-ozette
Collaborator Author

Not this part:

self.has_markers = (
    isinstance(self._data[_LABEL_COLUMN][0], str)
    and "+" in self._data[_LABEL_COLUMN][0]
)

We currently assume that the label column is a string, but it doesn't have to be. Ints can also define categories (as in the clustering case, where the labels are just cluster ints).
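
To illustrate why the `isinstance` guard matters, here is a minimal standalone sketch. The helper names `has_marker_labels` and `unguarded_check` and the example labels are hypothetical and not taken from this PR; the sketch only mirrors the shape of the check above on plain Python values.

```python
def has_marker_labels(first_label):
    """Guarded check: True only for a string label that contains '+'.

    Hypothetical helper mirroring the shape of the check in this PR.
    """
    return isinstance(first_label, str) and "+" in first_label


def unguarded_check(first_label):
    """Unguarded variant that assumes the label is always a string."""
    return "+" in first_label


# A marker-style string label (made-up example) passes the check.
print(has_marker_labels("CD4+CD8-"))

# An integer cluster label is simply reported as "no markers".
print(has_marker_labels(3))

# Without the guard, the membership test raises for integer labels.
try:
    unguarded_check(3)
except TypeError as err:
    print(f"unguarded check fails: {err}")
```

Because `and` short-circuits, the `"+" in ...` test is never evaluated for non-string labels, so integer categories fall through to `False` instead of raising.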

I think having this notebook could be useful in general? Like when someone finds this tool but doesn't have labels yet? What do you think?

@manzt manzt force-pushed the main branch 5 times, most recently from 2406222 to acad765 Compare October 17, 2024 06:04
3 participants