My implementation of the ICLR 2022 paper "LSeg" by Boyi Li et al. Includes a dense prediction transformer written from scratch and uses the CLIP model.

TAOGenna/pytorch-language-driven-semantic-segmentation

Language driven semantic segmentation

My implementation of the "LSeg: Language-driven Semantic Segmentation" paper by Boyi Li et al.

Architecture

A dense prediction transformer (DPT) with a modified head produces an embedding for every pixel, and the CLIP model encodes a set of words. The two sets of embeddings are then combined in a shared multimodal latent space (the orange tensor in the image), and the result is compared against the ground-truth labels of an annotated image.
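As a rough illustration of how per-pixel and per-word embeddings can be combined and compared with ground-truth labels, here is a minimal NumPy sketch. It is not the code from this repo: the function names, the cosine normalization, the `temperature` value, and the use of a plain cross-entropy loss are all assumptions made for the example.

```python
import numpy as np


def pixel_word_logits(pixel_emb, word_emb, temperature=0.07):
    """Score every pixel against every word (hypothetical helper).

    pixel_emb: (H, W, C) array, one embedding per pixel (e.g. from the DPT head).
    word_emb:  (N, C) array, one embedding per label word (e.g. from CLIP).
    Returns an (H, W, N) array of scaled cosine similarities.
    """
    # Assumption: both embeddings are L2-normalized before the dot product.
    p = pixel_emb / np.linalg.norm(pixel_emb, axis=-1, keepdims=True)
    w = word_emb / np.linalg.norm(word_emb, axis=-1, keepdims=True)
    return np.einsum("hwc,nc->hwn", p, w) / temperature


def cross_entropy(logits, labels):
    """Mean per-pixel cross-entropy between (H, W, N) logits and (H, W) integer labels."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return float(-picked.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pixels = rng.normal(size=(4, 6, 8))      # stand-in for DPT output
    words = rng.normal(size=(3, 8))          # stand-in for CLIP text embeddings
    labels = rng.integers(0, 3, size=(4, 6))  # stand-in for an annotated image
    logits = pixel_word_logits(pixels, words)
    print(logits.shape)                      # (4, 6, 3)
    print(cross_entropy(logits, labels))
```

Taking the `argmax` over the last axis of the score tensor gives a predicted word index per pixel, which is what makes the label set swappable at inference time.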

Results

Dataset

We train our model only on the ADE20K and COCOPanoptic datasets. We use MSeg-API to download and relabel them. I recommend following its instructions step by step, with a few modifications:

  • Clone mseg-api into the main directory of this repo.
  • In the scripts in the mseg-api/download_scripts folder, comment out the parts that handle the other datasets.
  • Download the datasets into data/ (a directory at the root of this repo). You do this by setting MSEG_DST_DIR.

Once everything is downloaded, we use the mseg-semantic utils to interact with the data and create the dataloader.

  • I needed to change ade20k_images_dir = "data/mseg_dataset/ADE20K/" to ade20k_images_dir = "data/mseg_dataset/ADE20K/ADEChallengeData2016/" in Lseg/utils/util.py; otherwise an error shows up.
  • Use test_data_utils.ipynb to check that the images are fetched correctly. In my case, for the COCO dataset, neither the train2017 nor the val2017 folder was inside data/COCOPanoptic/images/, so I had to create the folder myself and put both inside.
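Before opening the notebook, a quick check that the folders mentioned above exist can save a confusing error later. The sketch below is a hypothetical helper, not part of this repo or of mseg-api. It only tests the paths named in this README, and the layout that mseg-api actually produces on your machine may differ.

```python
from pathlib import Path

# Directories named in this README (assumed layout, relative to the repo root).
EXPECTED_DIRS = (
    "data/mseg_dataset/ADE20K/ADEChallengeData2016",
    "data/COCOPanoptic/images/train2017",
    "data/COCOPanoptic/images/val2017",
)


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected dataset directories were found.")
```

Run it from the repo root. Any path it lists either needs to be created by hand, as described above for train2017 and val2017, or indicates that the download step did not finish.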

Acknowledgements
