An automatic Grad-CAM script for visualizing model results
If the transformer you apply it to is Swin-like (no class token) or ViT-like (has a class token), the output tensor may have a shape like [Batch, 49, 768]. In that case, adapt your model with the following steps to avoid a RuntimeError:

    class XXXFormer(nn.Module):
        def __init__(self, ...):
            super().__init__()
            .....
            self.avgpool = nn.AdaptiveAvgPool1d(1)  # this is essential

        def forward(self, x):
            x = self.forward_features(x)  # suppose the output is [Batch, 49, 768]
            x = self.avgpool(x.transpose(1, 2))  # [Batch, 49, 768] --> [Batch, 768, 49] --> [Batch, 768, 1]
            x = torch.flatten(x, 1)  # [Batch, 768]

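
To make the shape bookkeeping concrete, here is a minimal sketch of what the pooling step computes, written in plain Python with nested lists standing in for tensors. The function name `pool_tokens` and the toy sizes are assumptions made for illustration; in the real model the same result comes from `nn.AdaptiveAvgPool1d(1)` applied to the transposed tensor, followed by `torch.flatten`.

```python
# Plain-Python stand-in for avgpool(x.transpose(1, 2)) followed by flatten(x, 1).
# Nested lists play the role of a [Batch, Tokens, Channels] tensor.

def pool_tokens(x):
    """Average over the token axis: [Batch, Tokens, Channels] -> [Batch, Channels]."""
    pooled = []
    for sample in x:                   # sample: [Tokens, Channels]
        channels = list(zip(*sample))  # transpose -> [Channels, Tokens]
        pooled.append([sum(c) / len(c) for c in channels])  # mean over tokens
    return pooled


# Toy input with Batch=1, Tokens=2, Channels=3 (the text above uses 49 and 768).
tokens = [[[1.0, 2.0, 3.0],
           [3.0, 4.0, 5.0]]]
features = pool_tokens(tokens)
print(features)  # [[2.0, 3.0, 4.0]]
```

Each sample ends up with one value per channel, so every token contributes equally to the pooled feature, whether or not one of the tokens is a class token.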
Find your last transformer block and select a LayerNorm() attribute as your target layer. If the block has more than one LayerNorm() attribute, you can put them all in a list or select just one of them.
Your target layer may look like this:

    # choose one LayerNorm() attribute as your target layer
    target_Layer1 = [vit.block[-1].norm1]
    target_Layer2 = [vit.block[-1].norm2]
    # or stack them all
    target_Layer3 = [vit.block[-1].norm1, vit.block[-1].norm2]

The reason is shown in the picture.
- Automatic_Swim_variant_CAM.py
- Automatic_ViT_variant_CAM.py
The two .py files above are the main Python scripts you need to run. Just set the path to your images and run them. The scripts accept these arguments:

    parser.add_argument('--path', default='./image', help='path to the image(s)')
    parser.add_argument('--method', default='all',
                        help='Grad-CAM method to use; can be a specific one (default: all)')
    parser.add_argument('--aug_smooth', default=True, choices=[True, False],
                        help='Apply test-time augmentation to smooth the CAM')
    parser.add_argument('--use_cuda', default=True, choices=[True, False],
                        help='Whether to use the GPU for computation')
    parser.add_argument(
        '--eigen_smooth',
        default=False, choices=[True, False],
        help='Reduce noise by taking the first principal component '
             'of cam_weights*activations')
    parser.add_argument('--modelname', default="ViT-B-16", help='Any name you want')

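
One caveat if the boolean flags are declared exactly as listed: with `choices=[True, False]` and no `type`, argparse compares the raw command-line string against the choices, so passing `--aug_smooth False` is rejected as an invalid choice and only the defaults are usable. A minimal sketch of one common workaround follows. The `str2bool` helper is hypothetical and not part of the scripts, and only two of the flags are shown.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: turn 'true'/'false'-style strings into a bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ('true', 't', 'yes', 'y', '1'):
        return True
    if text in ('false', 'f', 'no', 'n', '0'):
        return False
    raise argparse.ArgumentTypeError(f'expected a boolean, got {value!r}')


parser = argparse.ArgumentParser()
parser.add_argument('--aug_smooth', type=str2bool, default=True,
                    help='Apply test-time augmentation to smooth the CAM')
parser.add_argument('--use_cuda', type=str2bool, default=True,
                    help='Whether to use the GPU for computation')

# An explicit argument list is parsed here so the sketch runs without a shell.
args = parser.parse_args(['--aug_smooth', 'False'])
print(args.aug_smooth, args.use_cuda)  # False True
```

With a converter like this, `--use_cuda False` on the command line would turn the GPU off rather than raising an error.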
Method |
---|
CrossFormer (ICLR 2022) |
Vision Transformer (ICLR 2021) |