dataset for fsns experiment #98

runfengxu · 2020-08-01T01:26:45Z

When I converted the image data from tfrecord format to jpg format, I found that each jpg file is actually 4 square images concatenated together. The FileBasedDataset does nothing about that, and I don't see FSNSLocalizationNet doing separate localization for these 4 images. How should I understand this?

if self.uses_original_data:
    # handle each individual view as increase in batch size
    batch_size, num_channels, height, width = images.shape
    images = F.reshape(images, (batch_size, num_channels, height, 4, -1))
    images = F.transpose(images, (0, 3, 1, 2, 4))
    images = F.reshape(images, (batch_size * 4, num_channels, height, width // 4))

Does it treat the 4 different images as an additional dimension for the localization?

Bartzi · 2020-08-03T08:52:01Z

Yes, FSNS is organized in such a way that one sample actually consists of 4 views.
The code snippet you refer to handles this case. If the flag uses_original_data is set to True, the incoming batch with a shape of (batch_size, 3, 150, 600) (height 150 pixels and width 600 pixels) is reorganized into a batch with shape (batch_size * 4, 3, 150, 150). We basically convert each image into 4 images and handle them independently. Later, they are fused together again.
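
To make the reshaping concrete, here is a minimal sketch of the same three steps written with plain NumPy instead of Chainer. The function name split_views and the num_views parameter are hypothetical and only used for illustration; the repository performs this inline with F.reshape and F.transpose as quoted above.

```python
import numpy as np


def split_views(images, num_views=4):
    """Turn side-by-side views along the width axis into separate batch entries.

    Hypothetical helper for illustration; mirrors the quoted reshape/transpose steps.
    """
    batch_size, num_channels, height, width = images.shape
    # split the width axis into (num_views, width // num_views)
    images = images.reshape(batch_size, num_channels, height, num_views, -1)
    # move the view axis right behind the batch axis
    images = images.transpose(0, 3, 1, 2, 4)
    # merge batch and view axes, so each view becomes its own batch entry
    return images.reshape(
        batch_size * num_views, num_channels, height, width // num_views
    )


# dummy batch of 2 FSNS-sized images: 3 channels, 150 pixels high, 600 pixels wide
images = np.arange(2 * 3 * 150 * 600, dtype=np.float32).reshape(2, 3, 150, 600)
views = split_views(images)
print(views.shape)  # (8, 3, 150, 150)
```

With this layout, view k of sample b ends up at index b * 4 + k of the new batch and equals the slice images[b, :, :, k * 150:(k + 1) * 150] of the original image.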
