Does the SAM2 implementation need any code modifications to handle different input resolutions? #10

GoldenFishes · 2024-12-02T02:15:18Z

Thank you so much for your great work! I would like to know if I need to modify the SAM2 code to handle different input resolutions. Which parts of the code should I modify? Also, what is the actual speed difference for different resolutions?

heyoeyo · 2024-12-02T12:15:59Z

Thanks for checking out the repo!

For the original SAM2 code, the video predictor already supports resolution changes without any code modifications; you just need to change the image_size config parameter. The image predictor does require some (small) changes to make the bb_feat_sizes parameter dynamic. There's a fuller description in issue #257.
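As a rough sketch of what "dynamic" could look like: at the default 1024px input, the image predictor's hard-coded feature sizes are 256, 128 and 64 per side, which corresponds to backbone strides of 4, 8 and 16. Assuming those strides hold at other resolutions, the sizes can be derived from image_size instead of being hard-coded. The helper name `compute_bb_feat_sizes` is hypothetical and not part of the SAM2 code; where exactly the result gets assigned depends on the predictor version you're using.

```python
def compute_bb_feat_sizes(image_size, strides=(4, 8, 16)):
    """Derive per-level backbone feature-map sizes from the input resolution.

    Hypothetical helper (not part of SAM2). Assumes square inputs and
    backbone strides of 4, 8 and 16, which match the 256/128/64 sizes
    hard-coded for the default 1024px input.
    """
    if any(image_size % s != 0 for s in strides):
        raise ValueError(f"image_size={image_size} must be divisible by {strides}")
    return [(image_size // s, image_size // s) for s in strides]


print(compute_bb_feat_sizes(1024))  # [(256, 256), (128, 128), (64, 64)]
print(compute_bb_feat_sizes(512))   # [(128, 128), (64, 64), (32, 32)]
```

The divisibility check is there because a resolution that isn't a multiple of the largest stride would give feature maps that don't line up across levels.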

There's about a 4x speedup going from 1024px down to 512px. Unfortunately, the SAM v2 models don't handle the resolution change very well, so using intermediate resolutions (e.g. 768px) to get a better speed/accuracy tradeoff usually isn't an option.
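The roughly 4x figure lines up with a back-of-envelope count of image tokens, since halving the side length quarters the number of tokens. The sketch below is only that arithmetic, not a benchmark: it assumes a stride of 16 at the lowest-resolution feature map and that runtime scales roughly linearly with token count, both of which are simplifications. The function name `token_count` is made up for illustration.

```python
def token_count(image_size, stride=16):
    """Number of image tokens for a square input (assumed stride of 16)."""
    side = image_size // stride
    return side * side


base = token_count(1024)
for size in (1024, 768, 512):
    n = token_count(size)
    # Ratio of token counts relative to 1024px; a proxy for speedup, not a measurement
    print(f"{size}px: {n} tokens, ~{base / n:.2f}x fewer than 1024px")
```

By the same arithmetic, 768px would only give a bit under 2x, which is part of why the accuracy drop at intermediate sizes makes them hard to justify.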
