
Mosaic Transform #6534

Open

abhi-glitchhg wants to merge 16 commits into main from mosaic
Conversation

abhi-glitchhg
Contributor

Part of #6323

@datumbox
Contributor

@abhi-glitchhg Just checking with you to see if you got stuck anywhere. :) Let me know if you face any issues.

@abhi-glitchhg
Contributor Author

abhi-glitchhg commented Sep 15, 2022

Hey @datumbox, thanks for checking in on me! 🤗 I was a bit busy for some time.

I have gone through the mosaic implementation and have understood it.

I have a basic implementation locally. Hopefully, by this weekend, I will clean up and update this PR.
Thanks,
Abhijit :)

@abhi-glitchhg
Contributor Author

Still WIP

@abhi-glitchhg abhi-glitchhg marked this pull request as ready for review September 20, 2022 08:53
@abhi-glitchhg abhi-glitchhg marked this pull request as draft September 20, 2022 09:22
@abhi-glitchhg
Contributor Author

First of all, I apologize for the inactivity on this PR. I'll be more regular from now on.

I have used the Penn-Fudan Pedestrian dataset to check the implementation. Download the dataset.
I have tested the implementation with the following code. To create an image tensor of shape B*4*C*H*W, I have used a for loop; there might be a more efficient way to do this (a possible alternative is sketched after the code).

import os

import numpy as np
import torch
from PIL import Image

from torchvision import utils
from torchvision.prototype import transforms, datapoints
from torchvision.prototype.transforms import functional as F

from references.detection.transforms import Mosaic


class PennFudanDataset(torch.utils.data.Dataset):
    def __init__(self, root, transforms):
        self.root = root
        self.transforms = transforms
        # load all image files, sorting them to
        # ensure that they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root, "PNGImages"))))
        self.masks = list(sorted(os.listdir(os.path.join(root, "PedMasks"))))

    def __getitem__(self, idx):
        # load images and masks
        img_path = os.path.join(self.root, "PNGImages", self.imgs[idx])
        mask_path = os.path.join(self.root, "PedMasks", self.masks[idx])
        img = Image.open(img_path).convert("RGB")
        img = F.pil_to_tensor(img)
        # note that we haven't converted the mask to RGB,
        # because each color corresponds to a different instance
        # with 0 being background
        mask = Image.open(mask_path)
        # convert the PIL Image into a numpy array
        mask = np.array(mask)
        # instances are encoded as different colors
        obj_ids = np.unique(mask)
        # first id is the background, so remove it
        obj_ids = obj_ids[1:]

        # split the color-encoded mask into a set
        # of binary masks
        masks = mask == obj_ids[:, None, None]

        # get bounding box coordinates for each mask
        num_objs = len(obj_ids)
        boxes = []
        for i in range(num_objs):
            pos = np.where(masks[i])
            xmin = np.min(pos[1])
            xmax = np.max(pos[1])
            ymin = np.min(pos[0])
            ymax = np.max(pos[0])
            boxes.append([xmin, ymin, xmax, ymax])

        # convert everything into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)
        # there is only one class
        labels = torch.ones((num_objs,), dtype=torch.int64)
        masks = torch.as_tensor(masks, dtype=torch.uint8)

        #image_id = torch.tensor([idx])
        #area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        # suppose all instances are not crowd
        #iscrowd = torch.zeros((num_objs,), dtype=torch.int64)

        img = datapoints.Image(img)
        boxes = datapoints.BoundingBox(boxes, format=datapoints.BoundingBoxFormat.XYXY, spatial_size=F.get_spatial_size(img))
        labels = datapoints.Label(labels)
        if self.transforms is not None:
            img, boxes, labels = self.transforms(img, boxes, labels)

        return img, boxes, labels

    def __len__(self):
        return len(self.imgs)


def collate_fn(batch):
    return tuple(zip(*batch))

dataset = PennFudanDataset(root="./../PennFudanPed", transforms=transforms.Resize((350, 324)))  # change the root parameter according to your directory structure

data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=4, shuffle=True, num_workers=1, collate_fn=collate_fn)

B = 16  # batch size: number of groups of 4 images
counter = 0

batched_images = []
batched_boxes = []
batched_labels = []
for image, boxes, labels in data_loader:
    image = torch.stack(image)
    boxes = list(boxes)
    labels = [*labels[0], *labels[1], *labels[2], *labels[3]]
    batched_images.append(image)
    batched_boxes.append(boxes)
    batched_labels.append(labels)
    counter += 1
    if counter >= B:
        break

batched_images = torch.stack(batched_images)
mosaic = Mosaic()
output = mosaic(batched_images, batched_boxes, batched_labels)

for i in range(B):
    viz = utils.draw_bounding_boxes(F.to_image_tensor(output[0][i]), boxes=output[1][i])
    F.to_pil_image(viz).show()
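
The "more efficient way" mentioned above might look something like the untested sketch below (not part of this PR; big_loader, stacked, grouped_boxes, and grouped_labels are hypothetical names): draw all 4 * B samples in a single DataLoader batch and regroup them, which avoids the outer Python loop. It assumes every image has the same size, which holds here because of the Resize transform.

# Untested sketch of an alternative to the batching loop above (hypothetical names).
big_loader = torch.utils.data.DataLoader(
    dataset, batch_size=4 * B, shuffle=True, num_workers=1, collate_fn=collate_fn)
images, boxes, labels = next(iter(big_loader))       # tuples of length 4 * B
stacked = torch.stack(images)                        # (4 * B, C, H, W)
stacked = stacked.reshape(B, 4, *stacked.shape[1:])  # (B, 4, C, H, W)
grouped_boxes = [list(boxes[4 * i:4 * (i + 1)]) for i in range(B)]
grouped_labels = [[lbl for lab in labels[4 * i:4 * (i + 1)] for lbl in lab] for i in range(B)]
# output = mosaic(stacked, grouped_boxes, grouped_labels)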

Comment on lines +602 to +604
super().__init__()
self.min_frac = min_frac
self.max_frac = max_frac
Contributor Author


Here we need to check that the min_frac and max_frac arguments are between 0 and 1.
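
A minimal sketch of such a check (illustrative only, not the code in this PR; the exact error message is open for discussion):

# Hypothetical validation for the constructor shown above; a sketch, not the PR's code.
if not (0.0 <= min_frac <= 1.0) or not (0.0 <= max_frac <= 1.0):
    raise ValueError(f"min_frac and max_frac must be in [0, 1], got {min_frac} and {max_frac}.")

We might also want to enforce min_frac <= max_frac here.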

@abhi-glitchhg abhi-glitchhg marked this pull request as ready for review January 26, 2023 10:42
@oke-aditya
Contributor

Aah, we need to review this. Well, I will try my best to find time to review this 😄 as well as understand how this works :)

@abhi-glitchhg
Contributor Author

abhi-glitchhg commented Feb 14, 2023

Aah, we need to review this. Well, I will try my best to find time to review this 😄 as well as understand how this works :)

Yeah, sure! Let me know if something is not clear.

@byronyi

byronyi commented May 31, 2023

Gentle ping for any updates.

@abhi-glitchhg abhi-glitchhg deleted the mosaic branch July 12, 2024 15:12
@abhi-glitchhg abhi-glitchhg restored the mosaic branch July 12, 2024 15:12
@abhi-glitchhg abhi-glitchhg reopened this Jul 12, 2024