Fanghc95
Changed files:
README.md | +25 -21
curve/1.jpg | -21.9 KB
curve/2.jpg | -55 KB
curve/FDDB.png | 83.7 KB
curve/FDDB_DiscROC.png | -6.37 KB
curve/Widerface.jpg | 221 KB
curve/o_1.png | -82.8 KB
curve/o_2.png | -81.8 KB
curve/o_3.png | -89.3 KB
curve/r_1.png | -81.4 KB
curve/r_2.png | -81 KB
curve/r_3.png | -86.7 KB
data/config.py | +34 -3
data/data_augment.py | +4
detect.py | +16 -10
layers/modules/multibox_loss.py | +2 -2
model_best.pth.tar | -3.65 MB
models/mobilev1.py -> models/net.py | +22 -16
@@ -1,31 +1,30 @@
 # RetinaFace in PyTorch
 
-A [PyTorch](https://pytorch.org/) implementation of [RetinaFace: Single-stage Dense Face Localisation in the Wild](https://arxiv.org/abs/1905.00641). Model size only 1.7M, when Retinaface use mobilenet0.25 as backbone net. The official code in Mxnet can be found [here](https://github.com/deepinsight/insightface/tree/master/RetinaFace).
+A [PyTorch](https://pytorch.org/) implementation of [RetinaFace: Single-stage Dense Face Localisation in the Wild](https://arxiv.org/abs/1905.00641). The model size is only 1.7M when RetinaFace uses mobilenet0.25 as the backbone network. We also provide resnet50 as a backbone network for better results. The official MXNet code can be found [here](https://github.com/deepinsight/insightface/tree/master/RetinaFace).
 
 ## WiderFace Val Performance in single scale When using Resnet50 as backbone net.
 | Style | easy | medium | hard |
 |:-|:-:|:-:|:-:|
-| Pytorch (same parameter with Mxnet) | 94.47 % | 93.54% | 89.21% |
-| Pytorch (original image scale) | 95.55 % | 94.09% | 84.05% |
+| Pytorch (same parameters as Mxnet) | 94.82% | 93.84% | 89.60% |
+| Pytorch (original image scale) | 95.48% | 94.04% | 84.43% |
 | Mxnet | 94.86% | 93.87% | 88.33% |
 | Mxnet(original image scale) | 94.97% | 93.89% | 82.27% |
 
-ps: The resnet50-based demo will be updated recently.
-
 ## WiderFace Val Performance in single scale When using Mobilenet0.25 as backbone net.
 | Style | easy | medium | hard |
 |:-|:-:|:-:|:-:|
-| Pytorch (same parameter with Mxnet) | 86.85 % | 85.84% | 79.69% |
-| Pytorch (original image scale) | 90.58 % | 87.94% | 73.96% |
+| Pytorch (same parameters as Mxnet) | 88.67% | 87.09% | 80.99% |
+| Pytorch (original image scale) | 90.70% | 88.16% | 73.82% |
 | Mxnet | 88.72% | 86.97% | 79.19% |
 | Mxnet(original image scale) | 89.58% | 87.11% | 69.12% |
-<p align="center"><img src="curve/r_3.png" width="640"\></p>
+<p align="center"><img src="curve/Widerface.jpg" width="640"\></p>
 
-## FDDB Performance When using Mobilenet0.25 as backbone net.
-| Dataset | performance |
+## FDDB Performance.
+| FDDB(pytorch) | performance |
 |:-|:-:|
-| FDDB(pytorch) | 97.93% |
-<p align="center"><img src="curve/FDDB_DiscROC.png" width="640"\></p>
+| Mobilenet0.25 | 98.64% |
+| Resnet50 | 99.22% |
+<p align="center"><img src="curve/FDDB.png" width="640"\></p>
 
 ### Contents
 - [Installation](#installation)
@@ -62,24 +61,31 @@ ps: wider_val.txt only include val file names but not label information.
 ##### Data1
 We also provide the organized dataset we used as in the above directory structure.
 
-Link: from [baidu cloud](https://pan.baidu.com/s/1jIp9t30oYivrAvrgUgIoLQ) Password: ruck
+Link: [Google Drive](https://drive.google.com/open?id=11UGV3nbVv1x9IC--_tK3Uxf7hA6rlbsS) or [Baidu Cloud](https://pan.baidu.com/s/1jIp9t30oYivrAvrgUgIoLQ) (password: ruck)
 
 ## Training
-We trained Mobilenet0.25 on imagenet dataset and get 46.75%  in top 1. We use it as pretrain model  which has been put in repository named ``model_best.pth.tar``.
-1. Before training, you can check the mobilenet*0.25 network configuration (e.g. batch_size, min_sizes and steps etc..) in ``data/config.py and train.py``.
+We provide resnet50 and mobilenet0.25 as backbone networks.
+We trained Mobilenet0.25 on the ImageNet dataset and got 46.58% top-1 accuracy. If you do not wish to train the model, we also provide trained models. The pretrained model and the trained models are available on [Google Drive](https://drive.google.com/open?id=1oZRSG0ZegbVkVwUd8wUIQx8W7yfZ_ki1) and [Baidu Cloud](https://pan.baidu.com/s/12h97Fy1RYuqMMIV-RpzdPg) (password: fstq). Place the model files as follows:
+```Shell
+  ./weights/
+      mobilenet0.25_Final.pth
+      mobilenetV1X0.25_pretrain.tar
+      Resnet50_Final.pth
+```
+1. Before training, check the network configuration (e.g. batch_size, min_sizes, and steps) in ``data/config.py`` and ``train.py``.
 
 2. Train the model using WIDER FACE:
   ```Shell
-  python train.py
+  CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --network resnet50  # or:
+  CUDA_VISIBLE_DEVICES=0 python train.py --network mobile0.25
   ```
 
-If you do not wish to train the model, we also provide trained model in `./weights/Final_Retinaface.pth`.
 
 ## Evaluation
 ### Evaluation widerface val
 1. Generate txt file
 ```Shell
-python test_widerface.py --trained_model weight_file
+python test_widerface.py --trained_model weight_file --network resnet50  # or: --network mobile0.25
 ```
 2. Evaluate txt results. Demo come from [Here](https://github.com/wondervictor/WiderFace-Evaluation)
 ```Shell
@@ -97,14 +103,12 @@ python evaluation.py
 
 2. Evaluate the trained model using:
 ```Shell
-python test.py --dataset FDDB
+python test_fddb.py --trained_model weight_file --network resnet50  # or: --network mobile0.25
 ```
 
 3. Download [eval_tool](https://bitbucket.org/marcopede/face-eval) to evaluate the performance.
 
-## RetinaFace-MobileNet0.25
 <p align="center"><img src="curve/1.jpg" width="640"\></p>
-<p align="center"><img src="curve/2.jpg" width="640"\></p>
 
 ## References
 - [FaceBoxes](https://github.com/zisianw/FaceBoxes.PyTorch)
 
@@ -1,11 +1,42 @@
 # config.py
 
-cfg = {
-    'name': 'Retinaface',
+cfg_mnet = {
+    'name': 'mobilenet0.25',
     'min_sizes': [[16, 32], [64, 128], [256, 512]],
     'steps': [8, 16, 32],
     'variance': [0.1, 0.2],
     'clip': False,
     'loc_weight': 2.0,
-    'gpu_train': True
+    'gpu_train': True,
+    'batch_size': 32,
+    'ngpu': 1,
+    'epoch': 250,
+    'decay1': 190,
+    'decay2': 220,
+    'image_size': 640,
+    'pretrain': True,
+    'return_layers': {'stage1': 1, 'stage2': 2, 'stage3': 3},
+    'in_channel': 32,
+    'out_channel': 64
 }
+
+cfg_re50 = {
+    'name': 'Resnet50',
+    'min_sizes': [[16, 32], [64, 128], [256, 512]],
+    'steps': [8, 16, 32],
+    'variance': [0.1, 0.2],
+    'clip': False,
+    'loc_weight': 2.0,
+    'gpu_train': True,
+    'batch_size': 24,
+    'ngpu': 4,
+    'epoch': 100,
+    'decay1': 70,
+    'decay2': 90,
+    'image_size': 840,
+    'pretrain': True,
+    'return_layers': {'layer2': 1, 'layer3': 2, 'layer4': 3},
+    'in_channel': 256,
+    'out_channel': 256
+}
+
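As a rough illustration of how `min_sizes`, `steps`, and `image_size` interact in the two configurations above, the sketch below counts the prior boxes each one would produce. It assumes the prior-box generator places one anchor per entry of `min_sizes` at every cell of a square feature map with side `ceil(image_size / step)`. That rule and the helper name `count_priors` are assumptions made for illustration and are not part of this diff.

```python
from math import ceil

# Only the fields needed for this sketch, copied from cfg_mnet and cfg_re50.
cfg_mnet = {
    'min_sizes': [[16, 32], [64, 128], [256, 512]],
    'steps': [8, 16, 32],
    'image_size': 640,
}
cfg_re50 = {
    'min_sizes': [[16, 32], [64, 128], [256, 512]],
    'steps': [8, 16, 32],
    'image_size': 840,
}


def count_priors(cfg):
    """Hypothetical helper: count anchors over all pyramid levels.

    Assumes a square input and a feature map of side ceil(image_size / step).
    """
    total = 0
    for sizes, step in zip(cfg['min_sizes'], cfg['steps']):
        side = ceil(cfg['image_size'] / step)
        total += side * side * len(sizes)
    return total


print(count_priors(cfg_mnet))  # 16800
print(count_priors(cfg_re50))  # 29126
```

Under these assumptions the larger `image_size` of `cfg_re50` produces roughly 1.7 times as many priors per image, which is one reason its `batch_size` may need to be smaller.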
@@ -9,10 +9,14 @@ def _crop(image, boxes, labels, landm, img_dim):
     pad_image_flag = True
 
     for _ in range(250):
+        """
         if random.uniform(0, 1) <= 0.2:
             scale = 1.0
         else:
             scale = random.uniform(0.3, 1.0)
+        """
+        PRE_SCALES = [0.3, 0.45, 0.6, 0.8, 1.0]
+        scale = random.choice(PRE_SCALES)
         short_side = min(width, height)
         w = int(scale * short_side)
         h = w
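The change above replaces the continuous scale draw with a choice from five preset scales. A minimal standalone sketch of that sampling step follows; the function name `sample_crop_side` and the `rng` parameter are hypothetical and only serve to make the snippet testable outside `_crop`.

```python
import random

# Preset crop scales, as introduced in the hunk above.
PRE_SCALES = [0.3, 0.45, 0.6, 0.8, 1.0]


def sample_crop_side(width, height, rng=random):
    """Pick a square crop side as a preset fraction of the image's short side."""
    scale = rng.choice(PRE_SCALES)
    short_side = min(width, height)
    return int(scale * short_side)


# Hypothetical usage: a 1024x768 image can only yield five distinct crop sides.
rng = random.Random(0)
sides = {sample_crop_side(1024, 768, rng) for _ in range(200)}
print(sorted(sides))
```

Every value printed is one of 230, 345, 460, 614, or 768, whereas the previous `random.uniform(0.3, 1.0)` draw could produce any side length between 230 and 768.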
 
@@ -4,7 +4,7 @@
 import torch
 import torch.backends.cudnn as cudnn
 import numpy as np
-from data import cfg
+from data import cfg_mnet, cfg_re50
 from layers.functions.prior_box import PriorBox
 from utils.nms.py_cpu_nms import py_cpu_nms
 import cv2
@@ -14,12 +14,13 @@
 
 parser = argparse.ArgumentParser(description='Retinaface')
 
-parser.add_argument('-m', '--trained_model', default='./weights/Final_Retinaface.pth',
+parser.add_argument('-m', '--trained_model', default='./weights/Resnet50_Final.pth',
                     type=str, help='Trained state_dict file path to open')
+parser.add_argument('--network', default='resnet50', help='Backbone network mobile0.25 or resnet50')
 parser.add_argument('--cpu', action="store_true", default=False, help='Use cpu inference')
-parser.add_argument('--confidence_threshold', default=0.05, type=float, help='confidence_threshold')
+parser.add_argument('--confidence_threshold', default=0.02, type=float, help='confidence_threshold')
 parser.add_argument('--top_k', default=5000, type=int, help='top_k')
-parser.add_argument('--nms_threshold', default=0.3, type=float, help='nms_threshold')
+parser.add_argument('--nms_threshold', default=0.4, type=float, help='nms_threshold')
 parser.add_argument('--keep_top_k', default=750, type=int, help='keep_top_k')
 parser.add_argument('-s', '--save_image', action="store_true", default=True, help='show detection results')
 parser.add_argument('--vis_thres', default=0.6, type=float, help='visualization_threshold')
@@ -64,8 +65,13 @@ def load_model(model, pretrained_path, load_to_cpu):
 
 if __name__ == '__main__':
     torch.set_grad_enabled(False)
+    cfg = None
+    if args.network == "mobile0.25":
+        cfg = cfg_mnet
+    elif args.network == "resnet50":
+        cfg = cfg_re50
     # net and model
-    net = RetinaFace(phase="test")
+    net = RetinaFace(cfg=cfg, phase = 'test')
     net = load_model(net, args.trained_model, args.cpu)
     net.eval()
     print('Finished loading model!')
@@ -150,11 +156,11 @@ def load_model(model, pretrained_path, load_to_cpu):
                             cv2.FONT_HERSHEY_DUPLEX, 0.5, (255, 255, 255))
 
                 # landms
-                cv2.circle(img_raw, (b[5], b[6]), 4, (0, 0, 255), 4)
-                cv2.circle(img_raw, (b[7], b[8]), 4, (0, 255, 255), 4)
-                cv2.circle(img_raw, (b[9], b[10]), 4, (255, 0, 255), 4)
-                cv2.circle(img_raw, (b[11], b[12]), 4, (0, 255, 0), 4)
-                cv2.circle(img_raw, (b[13], b[14]), 4, (255, 0, 0), 4)
+                cv2.circle(img_raw, (b[5], b[6]), 1, (0, 0, 255), 4)
+                cv2.circle(img_raw, (b[7], b[8]), 1, (0, 255, 255), 4)
+                cv2.circle(img_raw, (b[9], b[10]), 1, (255, 0, 255), 4)
+                cv2.circle(img_raw, (b[11], b[12]), 1, (0, 255, 0), 4)
+                cv2.circle(img_raw, (b[13], b[14]), 1, (255, 0, 0), 4)
             # save image
 
             name = "test.jpg"
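In the `__main__` block above, `cfg` stays `None` when `--network` is neither `mobile0.25` nor `resnet50`. One possible way to reject such values early is argparse's `choices` parameter combined with a lookup table. The sketch below is an assumption about how this could be done, not what the diff does; `NETWORK_CONFIGS` and `parse_network` are hypothetical names, and the config dictionaries are reduced to stand-ins.

```python
import argparse

# Stand-ins for the dictionaries in data/config.py; only 'name' is kept here.
cfg_mnet = {'name': 'mobilenet0.25'}
cfg_re50 = {'name': 'Resnet50'}

# Hypothetical lookup table from the --network value to its config.
NETWORK_CONFIGS = {'mobile0.25': cfg_mnet, 'resnet50': cfg_re50}


def parse_network(argv=None):
    """Parse --network and return the matching config dictionary."""
    parser = argparse.ArgumentParser(description='Retinaface')
    parser.add_argument('--network', default='resnet50',
                        choices=sorted(NETWORK_CONFIGS),
                        help='Backbone network mobile0.25 or resnet50')
    args = parser.parse_args(argv)
    return NETWORK_CONFIGS[args.network]


print(parse_network(['--network', 'mobile0.25'])['name'])  # mobilenet0.25
print(parse_network([])['name'])                           # Resnet50
```

With `choices` set, an unsupported value such as `--network vgg` makes argparse print a usage error and exit, so the model is never built with `cfg=None`.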
 
@@ -3,8 +3,8 @@
 import torch.nn.functional as F
 from torch.autograd import Variable
 from utils.box_utils import match, log_sum_exp
-from data import cfg
-GPU = cfg['gpu_train']
+from data import cfg_mnet
+GPU = cfg_mnet['gpu_train']
 
 class MultiBoxLoss(nn.Module):
     """SSD Weighted Loss Function
 
@@ -6,11 +6,11 @@
 import torch.nn.functional as F
 from torch.autograd import Variable
 
-def conv_bn(inp, oup, stride = 1):
+def conv_bn(inp, oup, stride = 1, leaky = 0):
     return nn.Sequential(
         nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
         nn.BatchNorm2d(oup),
-        nn.ReLU(inplace=True)
+        nn.LeakyReLU(negative_slope=leaky, inplace=True)
     )
 
 def conv_bn_no_relu(inp, oup, stride):
@@ -19,34 +19,37 @@ def conv_bn_no_relu(inp, oup, stride):
         nn.BatchNorm2d(oup),
     )
 
-def conv_bn1X1(inp, oup, stride):
+def conv_bn1X1(inp, oup, stride, leaky=0):
     return nn.Sequential(
         nn.Conv2d(inp, oup, 1, stride, padding=0, bias=False),
         nn.BatchNorm2d(oup),
-        nn.ReLU(inplace=True)
+        nn.LeakyReLU(negative_slope=leaky, inplace=True)
     )
 
-def conv_dw(inp, oup, stride):
+def conv_dw(inp, oup, stride, leaky=0.1):
     return nn.Sequential(
         nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=False),
         nn.BatchNorm2d(inp),
-        nn.ReLU(inplace=True),
+        nn.LeakyReLU(negative_slope=leaky, inplace=True),
 
         nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
         nn.BatchNorm2d(oup),
-        nn.ReLU(inplace=True),
+        nn.LeakyReLU(negative_slope=leaky, inplace=True),
     )
 
 class SSH(nn.Module):
     def __init__(self, in_channel, out_channel):
         super(SSH, self).__init__()
         assert out_channel % 4 == 0
+        leaky = 0
+        if (out_channel <= 64):
+            leaky = 0.1
         self.conv3X3 = conv_bn_no_relu(in_channel, out_channel//2, stride=1)
 
-        self.conv5X5_1 = conv_bn(in_channel, out_channel//4, stride=1)
+        self.conv5X5_1 = conv_bn(in_channel, out_channel//4, stride=1, leaky = leaky)
         self.conv5X5_2 = conv_bn_no_relu(out_channel//4, out_channel//4, stride=1)
 
-        self.conv7X7_2 = conv_bn(out_channel//4, out_channel//4, stride=1)
+        self.conv7X7_2 = conv_bn(out_channel//4, out_channel//4, stride=1, leaky = leaky)
         self.conv7x7_3 = conv_bn_no_relu(out_channel//4, out_channel//4, stride=1)
 
     def forward(self, input):
@@ -65,15 +68,18 @@ def forward(self, input):
 class FPN(nn.Module):
     def __init__(self,in_channels_list,out_channels):
         super(FPN,self).__init__()
-        self.output1 = conv_bn1X1(in_channels_list[0], out_channels, stride = 1)
-        self.output2 = conv_bn1X1(in_channels_list[1], out_channels, stride = 1)
-        self.output3 = conv_bn1X1(in_channels_list[2], out_channels, stride = 1)
+        leaky = 0
+        if (out_channels <= 64):
+            leaky = 0.1
+        self.output1 = conv_bn1X1(in_channels_list[0], out_channels, stride = 1, leaky = leaky)
+        self.output2 = conv_bn1X1(in_channels_list[1], out_channels, stride = 1, leaky = leaky)
+        self.output3 = conv_bn1X1(in_channels_list[2], out_channels, stride = 1, leaky = leaky)
 
-        self.merge1 = conv_bn(out_channels, out_channels)
-        self.merge2 = conv_bn(out_channels, out_channels)
+        self.merge1 = conv_bn(out_channels, out_channels, leaky = leaky)
+        self.merge2 = conv_bn(out_channels, out_channels, leaky = leaky)
 
     def forward(self, input):
-        names = list(input.keys())
+        # names = list(input.keys())
         input = list(input.values())
 
         output1 = self.output1(input[0])
@@ -97,7 +103,7 @@ class MobileNetV1(nn.Module):
     def __init__(self):
         super(MobileNetV1, self).__init__()
         self.stage1 = nn.Sequential(
-            conv_bn(3, 8, 2),    # 3
+            conv_bn(3, 8, 2, leaky = 0.1),    # 3
             conv_dw(8, 16, 1),   # 7
             conv_dw(16, 32, 2),  # 11
             conv_dw(32, 32, 1),  # 19