Custom classification model design #185

Closed
2 of 3 tasks
MinhxNguyen7 opened this issue May 1, 2024 · 1 comment
Labels: help wanted, Perception

Comments


MinhxNguyen7 commented May 1, 2024

Description

Design of the custom classification model, second shot in the new pipeline (#179). This model will classify the shape, character, shape color, and character color in one go, given the shape crop.

I'd like help with working through the ideas to design the model.
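
One way to read "in one go" is a single feature extractor feeding four classification heads, trained on the sum of the four per-head losses. The sketch below shows only that loss and the per-head prediction step in plain Python, with no framework. The head names, the equal default weights, and the class counts in the usage note are assumptions for illustration, not decisions made in this issue.

```python
import math

# Hypothetical head names; the real label sets are not specified in this issue.
HEADS = ("shape", "character", "shape_color", "character_color")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class under softmax(logits)."""
    return -math.log(softmax(logits)[target_index])


def joint_loss(logits_by_head, targets_by_head, weights=None):
    """Weighted sum of the per-head cross-entropy losses for one crop."""
    weights = weights or {head: 1.0 for head in HEADS}
    return sum(
        weights[head] * cross_entropy(logits_by_head[head], targets_by_head[head])
        for head in HEADS
    )


def predict(logits_by_head):
    """Pick the highest-scoring class index independently for each head."""
    return {
        head: max(range(len(logits)), key=logits.__getitem__)
        for head, logits in logits_by_head.items()
    }
```

With all-zero logits and two classes per head, `joint_loss` is four times `log(2)`, which is the loss of a model that guesses uniformly. Per-head weights are one possible knob if one attribute (for example, character) turns out to be much harder than the others.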

Design

Considerations

  • The model should be fast, ideally faster than YOLOv8n-det.
  • A speed similar to YOLOv8n-det is acceptable, considering we currently process at ~2 fps and only ~1 fps should be necessary.
  • Our datasets are not big and not that diverse. My hypothesis for our low real-world generalization performance is that our targets are too similar, perfect, and high-contrast compared to real-world data.
  • Our target crops will be of somewhat different sizes, and we should think about resizing them in a "good" way.
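
One candidate for a "good" resize is letterboxing: scale the crop so its longer side matches the model input size, then pad the shorter side evenly so the aspect ratio is preserved. This is a suggestion, not something decided in this issue. The sketch below only computes the geometry; the function name and the default input size of 128 are assumptions.

```python
def letterbox_geometry(width, height, target=128):
    """Return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).

    The crop is scaled so its longer side equals `target`, and the shorter
    side is padded evenly up to a `target` x `target` square.
    """
    if width <= 0 or height <= 0:
        raise ValueError("crop must have positive width and height")
    scale = target / max(width, height)
    # Clamp to [1, target] to guard against rounding at the extremes.
    new_w = min(target, max(1, round(width * scale)))
    new_h = min(target, max(1, round(height * scale)))
    pad_x = target - new_w
    pad_y = target - new_h
    pad_left, pad_top = pad_x // 2, pad_y // 2
    return new_w, new_h, pad_left, pad_top, pad_x - pad_left, pad_y - pad_top
```

The returned sizes and paddings can be passed to whatever image library does the actual resize and border fill. Compared with stretching every crop to a square, this keeps character strokes and shape outlines undistorted, at the cost of some padded pixels.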

Ideas

  • YOLOv8n-det should be considered.
  • Use an autoencoder trained on diverse, unlabeled data (can be pre-trained or in-house). This should help with feature extraction and mitigate some of the problems with our smaller datasets.
    • If we train our own, we only need to care about the area inside the mask.
  • We need to augment our data manually since YOLO isn't helping. Since it's our own model, we should be able to do on-line augmentation with something like albumentations.
  • Maybe collect more IRL data that we can use in validation. We probably don't have enough to train on it.
  • Some traditional CV pre-processing.
    • Sharpening and contrast enhancement. Could be useful, but, in theory, the model should be able to learn this easily. That said, these steps are simple enough that they may be worth adding if they make training easier.
  • Force the first shot to return square, slightly enlarged bounding boxes to maximize the information passed to this model and decrease the chance that we accidentally crop out part of the target.
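
The last idea above (square, slightly enlarged boxes) can also be applied as a post-processing step on the first-shot output rather than inside the detector. The sketch below is one possible version under assumptions: boxes are `(x1, y1, x2, y2)` in pixels, the function name is hypothetical, and the 10% margin per side is an arbitrary starting value.

```python
def square_enlarged_box(x1, y1, x2, y2, image_w, image_h, margin=0.1):
    """Expand a box to a square with extra margin, kept inside the image.

    `margin` is the fraction of the longer side added on each side.
    Returns (left, top, right, bottom) as floats.
    """
    side = max(x2 - x1, y2 - y1) * (1.0 + 2.0 * margin)
    # A square cannot be larger than the shorter image edge.
    side = min(side, image_w, image_h)
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Shift the square back inside the image instead of shrinking it.
    left = min(max(cx - side / 2.0, 0.0), image_w - side)
    top = min(max(cy - side / 2.0, 0.0), image_h - side)
    return left, top, left + side, top + side
```

Shifting rather than clipping keeps every crop square, so targets near the image border do not end up with a different aspect ratio from the rest.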

Action Items

  • Merge our datasets from different sources to make them more diverse (Merge YOLO datasets #177).
  • Maybe collect more IRL data. Even without labeling, we could use it in an unsupervised manner.
  • Implement ResNet-50 as a baseline.
@MinhxNguyen7 MinhxNguyen7 added the help wanted and Perception labels May 1, 2024
@MinhxNguyen7 MinhxNguyen7 self-assigned this May 1, 2024
@MinhxNguyen7
Contributor Author

The path forward has been decided: ResNet.
