Training batches for imbalanced datasets #151

Open
antriksh63 opened this issue Jun 11, 2018 · 3 comments

Comments
@antriksh63

I have an imbalanced dataset with around 100 entries of the positive class and 4,000 entries of the negative class. One way to create the training batches would be to take 100 positive and 100 negative entries and then let the code proceed as normal; however, this has a high chance of overfitting.
I think a better option is to keep an equal number of positive and negative entries in each training batch (batch_size/2 of each class). How can I do this?
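A minimal sketch of one way to build such balanced batches, assuming plain NumPy arrays `X` and `y` with labels 0/1 (the names and the sampling scheme here are illustrative, not something this repository provides):

```python
import numpy as np

def balanced_batches(X, y, batch_size, rng=None):
    """Yield batches containing batch_size // 2 positive and batch_size // 2 negative examples.

    The minority (positive) class is sampled with replacement so every batch
    stays balanced even though there are far fewer positives than negatives.
    """
    rng = rng or np.random.default_rng()
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    half = batch_size // 2

    # One "epoch" here means one pass over the majority (negative) class.
    neg_order = rng.permutation(neg_idx)
    for start in range(0, len(neg_order) - half + 1, half):
        neg_batch = neg_order[start:start + half]
        pos_batch = rng.choice(pos_idx, size=half, replace=True)
        batch = np.concatenate([pos_batch, neg_batch])
        rng.shuffle(batch)
        yield X[batch], y[batch]
```

If you are training with PyTorch, `torch.utils.data.WeightedRandomSampler` with per-class weights achieves a similar effect without a custom loop, though it balances in expectation rather than guaranteeing an exact 50/50 split per batch.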

@self-ms

self-ms commented Nov 13, 2023

If your data is already embedded and labels are available, you can use the following repository:
https://github.com/ms-unlimit/Transformer-Based-Machine-Learning-Framework

@xuqiangxq

xuqiangxq commented Nov 13, 2023 via email

@liangtingStduy

liangtingStduy commented Nov 13, 2023 via email
