dataset for fsns experiment #98

Open
runfengxu opened this issue Aug 1, 2020 · 1 comment

runfengxu commented Aug 1, 2020

When I convert the image data from TFRecord format to JPG, I found that each JPG file is actually 4 square images concatenated together. The FileBasedDataset does nothing about that, and I don't see FSNSLocalizationNet doing separate localization for these 4 images. How should I understand this?

if self.uses_original_data:
    # handle each individual view as increase in batch size
    batch_size, num_channels, height, width = images.shape
    images = F.reshape(images, (batch_size, num_channels, height, 4, -1))
    images = F.transpose(images, (0, 3, 1, 2, 4))
    images = F.reshape(images, (batch_size * 4, num_channels, height, width // 4))

Does it treat the 4 different images as an additional dimension for the localization?

Owner

Bartzi commented Aug 3, 2020

Yes, FSNS is organized in such a way that one sample actually consists of 4 views.
The code snippet you refer to handles this case. If the flag uses_original_data is set to True, the incoming batch with a shape of (batch_size, 3, 150, 600) (height 150 pixels and width 600 pixels) is reorganized into a batch with the shape (batch_size * 4, 3, 150, 150); for a single image that is (4, 3, 150, 150). We basically convert one image into 4 images and handle them independently. Later, they are fused together again.
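
To make the reshape easier to follow, here is a minimal sketch of the same three steps. It uses NumPy instead of chainer.functions so it runs on its own, and the helper name split_views and the dummy batch are made up for illustration; they are not part of the repository.

```python
import numpy as np


def split_views(images, num_views=4):
    """Turn (batch, channels, height, width) into (batch * num_views, channels, height, width // num_views).

    Hypothetical helper that mirrors the reshape/transpose/reshape from the snippet above.
    """
    batch_size, num_channels, height, width = images.shape
    view_width = width // num_views
    # cut the width axis into num_views chunks of view_width pixels each
    images = images.reshape(batch_size, num_channels, height, num_views, view_width)
    # move the view axis next to the batch axis
    images = images.transpose(0, 3, 1, 2, 4)
    # merge batch and view axes, so each view becomes its own batch entry
    return images.reshape(batch_size * num_views, num_channels, height, view_width)


# dummy batch of 2 FSNS-sized images (3 channels, 150 high, 600 wide)
batch = np.arange(2 * 3 * 150 * 600, dtype=np.float32).reshape(2, 3, 150, 600)
views = split_views(batch)
print(views.shape)

# view k of sample b ends up at index b * 4 + k and covers columns k*150 .. (k+1)*150
assert np.array_equal(views[5], batch[1, :, :, 150:300])
```

So the 4 views are not treated as an extra dimension inside the localization network. In this sketch they simply become 4 consecutive entries along the batch axis, in left-to-right order.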
