thoughts on training with other in-domain datasets, before fine-tuning on actual dataset #1577
Unanswered
roybenhayun asked this question in Q&A
Replies: 1 comment 2 replies
-
Hey @roybenhayun, that could definitely help -- I always recommend setting up the train/test pipeline with the limited labels you currently have so you can understand the baseline performance first, before trying more complicated things like domain transfer. Depending on your task, the imagery, and what you expect the model to do, you might find that a few labels are actually fine :)
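A minimal sketch of that "baseline first" suggestion, assuming a hypothetical LightningDataModule (LimitedLabelsDataModule) that wraps the small labeled set; hyperparameters are illustrative only:

from pytorch_lightning import Trainer
from torchgeo.trainers import SemanticSegmentationTask

# hypothetical datamodule wrapping the limited labels, split into train/val/test
datamodule = LimitedLabelsDataModule(batch_size=8, num_workers=4)

task = SemanticSegmentationTask(
    model="unet",
    backbone="resnext50_32x4d",
    weights=True,   # in recent torchgeo releases, True requests ImageNet-pretrained backbone weights
    in_channels=3,  # assumption: RGB imagery
    num_classes=2,  # assumption: binary segmentation
)

trainer = Trainer(max_epochs=50)
trainer.fit(task, datamodule=datamodule)
trainer.test(task, datamodule=datamodule)  # baseline metrics on the held-out split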
-
Hi,
There is an interesting statement in section 4.3, "Dataset benchmarks", of the TorchGeo paper.
We are currently training a semantic segmentation model for which we will have only a limited number of labels, so we have to think about how to compensate for the lack of a significant amount of ground-truth data.
At the time of writing this post, our training has two steps, an ImageNet-pretrained backbone followed by fine-tuning on our own labels:
from torchgeo.trainers import SemanticSegmentationTask

task = SemanticSegmentationTask(
    model="unet",
    backbone="resnext50_32x4d",
    weights="imagenet",  # depending on the torchgeo version, weights=True may be the expected way to request ImageNet weights
    ...
)
We hope to get more labels in the future.
For now, we would be happy to understand whether we could do another "intermediate" training stage on a similar in-domain dataset.
The statement in the article could mean that this is possible, and even recommended. The steps would then be: start from an ImageNet-pretrained backbone, run intermediate training on a similar in-domain dataset, and finally fine-tune on our own labels.
Open questions: could the intermediate dataset be any other remote-sensing semantic segmentation dataset (e.g., buildings)? Could some datasets actually hurt training (e.g., a very different geographic region)? What counts as similar enough, or would any remote-sensing data do? We are still thinking about the statement in general and how to implement it methodically.
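To make the staged idea concrete, here is a rough sketch under stated assumptions: InDomainDataModule and OurDataModule are hypothetical datamodules, the hyperparameters are illustrative, and the state-dict key names depend on the torchgeo/segmentation-models-pytorch versions (inspect task.state_dict().keys() before adapting the filter):

import torch
from pytorch_lightning import Trainer
from torchgeo.trainers import SemanticSegmentationTask

# stages 1 + 2: ImageNet-pretrained backbone, then intermediate training on a
# similar in-domain remote-sensing dataset (hypothetical InDomainDataModule)
intermediate_task = SemanticSegmentationTask(
    model="unet",
    backbone="resnext50_32x4d",
    weights=True,   # in recent torchgeo releases, True requests ImageNet-pretrained backbone weights
    in_channels=3,  # assumption: RGB imagery
    num_classes=5,  # assumption: classes of the intermediate dataset
)
Trainer(max_epochs=20).fit(intermediate_task, datamodule=InDomainDataModule())
torch.save(intermediate_task.state_dict(), "intermediate.pt")

# stage 3: fine-tune on our own limited labels (hypothetical OurDataModule),
# reusing everything except the segmentation head, whose shape depends on the
# number of classes
finetune_task = SemanticSegmentationTask(
    model="unet",
    backbone="resnext50_32x4d",
    in_channels=3,
    num_classes=2,  # assumption: our own classes
)
state = torch.load("intermediate.pt")
transferable = {k: v for k, v in state.items() if k.startswith("model.") and "segmentation_head" not in k}
finetune_task.load_state_dict(transferable, strict=False)
Trainer(max_epochs=50).fit(finetune_task, datamodule=OurDataModule())

Dropping the segmentation head and loading with strict=False is what lets the intermediate dataset and our own labels have different class counts; only the encoder/decoder weights are carried over.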
Any suggestions, advice, tips, and questions would be welcome!
Thanks in advance.