This repository has been archived by the owner on Aug 27, 2024. It is now read-only.

LabelBinarizer fails when training set is missing some labels #605

Open
olliethomas opened this issue Apr 29, 2022 · 1 comment
Labels
bug (Something isn't working), code-change (An actual code change.)

Comments

@olliethomas
Member

The LabelBinarizer class takes a dataset whose label column has exactly two distinct values that are not already 0 and 1, and maps the minimum value to 0 and the maximum value to 1. It stores the original values so that the mapping can be inverted later.
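
A rough sketch of the behaviour described above, using plain Python lists rather than EthicML's dataset types. The function names `binarize` and `invert` are made up for illustration and are not the library's API:

```python
def binarize(labels):
    """Map the smaller of two distinct label values to 0 and the larger to 1."""
    distinct = sorted(set(labels))
    # Same idea as the nunique() == 2 check: exactly two label values are required.
    assert len(distinct) == 2, "expected exactly two distinct labels"
    low, high = distinct
    # Return the original values as well, so the mapping can be undone later.
    return [0 if value == low else 1 for value in labels], (low, high)


def invert(binary_labels, original_values):
    """Undo binarize() using the stored (low, high) pair."""
    low, high = original_values
    return [high if value == 1 else low for value in binary_labels]


encoded, seen = binarize([1, 2, 2, 1])
restored = invert(encoded, seen)
```

Here `encoded` holds the 0/1 labels and `restored` matches the input again.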

This breaks when the dataset is very small and happens to contain only one of the two labels, because of the following check:
assert dataset.y[y_col].nunique() == 2
I don't know the best way to deal with this. Maybe the labels could be provided as an argument?

@olliethomas olliethomas added the bug label Apr 29, 2022
@tmke8
Member

tmke8 commented Apr 29, 2022

Yeah, just optionally providing the labels sounds like a good plan.
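
One way the optional-labels idea could look, as a minimal sketch only. The class `BinarizerSketch`, its `labels` argument, and the method names are all hypothetical, and it works on plain sequences instead of EthicML datasets:

```python
from typing import List, Optional, Sequence, Tuple


class BinarizerSketch:
    """Hypothetical stand-in, not EthicML's LabelBinarizer."""

    def __init__(self, labels: Optional[Tuple[int, int]] = None) -> None:
        # `labels` is an assumed optional argument: the two expected label values.
        if labels is not None and len(set(labels)) != 2:
            raise ValueError("labels must contain exactly two distinct values")
        self.labels = labels
        self.min_val: Optional[int] = None
        self.max_val: Optional[int] = None

    def binarize(self, y: Sequence[int]) -> List[int]:
        observed = set(y)
        if self.labels is not None:
            # Labels were given up front, so a label missing from y is fine.
            unexpected = observed - set(self.labels)
            if unexpected:
                raise ValueError(f"unexpected labels: {sorted(unexpected)}")
            self.min_val, self.max_val = min(self.labels), max(self.labels)
        else:
            # Fall back to inferring the labels, which needs both to be present.
            if len(observed) != 2:
                raise ValueError("need exactly two distinct labels to infer them")
            self.min_val, self.max_val = min(observed), max(observed)
        return [0 if value == self.min_val else 1 for value in y]

    def invert(self, y: Sequence[int]) -> List[int]:
        if self.min_val is None or self.max_val is None:
            raise RuntimeError("binarize() must be called before invert()")
        return [self.max_val if value == 1 else self.min_val for value in y]


# A tiny training set that only contains the label 2.
binarizer = BinarizerSketch(labels=(1, 2))
encoded = binarizer.binarize([2, 2, 2])
restored = binarizer.invert(encoded)
```

With the labels passed in, the single-label case no longer fails, and leaving `labels` unset keeps the current strict behaviour.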

@tmke8 tmke8 added this to the EthicML 1.0 milestone May 30, 2022
@tmke8 tmke8 added the code-change label May 30, 2022
@tmke8 tmke8 changed the title from "LabelBinarizer" to "LabelBinarizer fails when training set is missing some labels" Jun 12, 2022
@tmke8 tmke8 modified the milestones: EthicML 1.0, EthicML 2.0 Sep 28, 2022