This is an unofficial implementation of the paper LEDNet.
This repo contains the complete model architecture for LEDNet. You can also view the soon-to-be-released official implementation at this repo. I have tried to replicate the model as faithfully as possible from the paper, and have drawn inspiration from the ENet model for details the paper leaves unspecified. Hope you will find this model useful 😄
Currently this repo contains the directory `model`, which holds the complete model architecture (training code is not included yet). To use it in your own segmentation task, just place the `model` directory in your working directory. After that:
```python
from model import return_model
...
# The line below initializes the LEDNet model for 128*128 images
seg_model = return_model(input_nc=3, output_nc=22, netG='lednet_128')
# input_nc and output_nc can be set according to your data
```

The model architecture has already been tested. I will soon add the training and testing procedures for the VOC segmentation task and for the Cityscapes dataset.
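To illustrate the contract of the factory call above — `input_nc` image channels in, `output_nc` per-class score channels out, at the same spatial resolution — here is a purely illustrative stand-in (assuming PyTorch; `return_model_sketch` and `DummySegNet` are invented for this example and are not the repo's actual code):

```python
import torch
import torch.nn as nn

class DummySegNet(nn.Module):
    """Stand-in network with the same input/output contract as the snippet above."""
    def __init__(self, input_nc: int, output_nc: int):
        super().__init__()
        # A 1x1 conv maps input channels to one score map per class,
        # preserving spatial resolution
        self.net = nn.Conv2d(input_nc, output_nc, kernel_size=1)

    def forward(self, x):
        return self.net(x)

def return_model_sketch(input_nc: int, output_nc: int, netG: str) -> nn.Module:
    # Dispatch on netG, as the real factory presumably does (assumption)
    if netG in ('lednet_128', 'lednet_256'):
        return DummySegNet(input_nc, output_nc)
    raise ValueError(f'unknown netG: {netG}')

model = return_model_sketch(input_nc=3, output_nc=22, netG='lednet_128')
x = torch.randn(1, 3, 128, 128)   # one 128*128 RGB image
y = model(x)
print(y.shape)                    # (1, 22, 128, 128): one score map per class
```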
Although most of the design follows the original paper directly, a few deviations are worth noting:
- As specified in the original ENet paper, I have used `PReLU` activation in the encoder and `ReLU` in the decoder. However, bias terms are kept throughout the network.
- In every downsampling block, the activation is applied after the concatenation of the two parallel operations.
- `BatchNorm2d` is used in every `SSnbt` module, as the results were not that great in its absence.
- `Dropout2d` is applied in every `SSnbt` module after the concatenation of its left and right branches.
- Most importantly, for the upsampling at the end and in the `APN` module, I have used bilinear interpolation. I initially tried `ConvTranspose2d`, but it led to very poor results and checkerboard artifacts in the final outputs.
- This model works for 128*128 and 256*256 images, in contrast to the input size used in the original paper.
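To make the points above concrete, here is a minimal sketch (assuming PyTorch; `SSnbtSketch`, `DownsampleSketch`, the exact layer ordering, and the dropout probability are my own illustrative choices, not the repo's actual code) of an SS-nbt-style block with `BatchNorm2d` inside and `Dropout2d` after the branch concatenation, plus an ENet-style downsampling block that applies its activation after the concatenation of its two parallel paths:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Standard channel shuffle: interleave channel groups of a (N, C, H, W) tensor
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

class SSnbtSketch(nn.Module):
    """Illustrative split-shuffle-non-bottleneck block (layer ordering is an assumption)."""
    def __init__(self, channels: int, dropout_p: float = 0.1):
        super().__init__()
        half = channels // 2
        # Factorized 3x1 / 1x3 convolutions on each half; bias kept, per the notes above
        self.left = nn.Sequential(
            nn.Conv2d(half, half, (3, 1), padding=(1, 0), bias=True),
            nn.ReLU(inplace=True),
            nn.Conv2d(half, half, (1, 3), padding=(0, 1), bias=True),
            nn.BatchNorm2d(half),   # BatchNorm2d inside the SSnbt module
            nn.ReLU(inplace=True),
        )
        self.right = nn.Sequential(
            nn.Conv2d(half, half, (1, 3), padding=(0, 1), bias=True),
            nn.ReLU(inplace=True),
            nn.Conv2d(half, half, (3, 1), padding=(1, 0), bias=True),
            nn.BatchNorm2d(half),
            nn.ReLU(inplace=True),
        )
        # Dropout2d applied after the left and right branches are concatenated
        self.drop = nn.Dropout2d(dropout_p)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        l, r = x.chunk(2, dim=1)                        # split channels in half
        out = torch.cat([self.left(l), self.right(r)], dim=1)
        out = self.drop(out)
        out = F.relu(out + x)                           # residual connection
        return channel_shuffle(out, groups=2)

class DownsampleSketch(nn.Module):
    """ENet-style downsampler: stride-2 conv in parallel with max-pool, concatenated.

    Note the activation is applied AFTER the concatenation, per the notes above.
    """
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch - in_ch, 3, stride=2, padding=1, bias=True)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.PReLU()   # PReLU in the encoder, per the notes above

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.cat([self.conv(x), self.pool(x)], dim=1)
        return self.act(self.bn(out))
```

An `SSnbtSketch(64)` preserves the `(N, 64, H, W)` shape of its input, while `DownsampleSketch(64, 128)` halves the spatial resolution and doubles the channels.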