hyungkwonko
diff --git a/LICENSE b/LICENSE
-21
diff --git a/README.md b/README.md
+38 -19
diff --git a/README_landmark.md b/README_landmark.md
+50
diff --git a/datasets/sg2.py b/datasets/sg2.py
+4 -4
diff --git a/datasets/vggface2_sg2.py b/datasets/vggface2_sg2.py
+5 -5
diff --git a/docs/sgf_result.jpg b/docs/sgf_result.jpg
-30.7 KB
@@ -4,9 +4,7 @@ This is an unofficial implementation of the paper ["Surrogate Gradient Field for
 
 ![sgf_result](./docs/sgf_result.jpg)
 
-### (Jul. 16, 2021) Current issues in the result (TODO, working on)
-- the ID of face changes
-    - how to fix? Will add more supervision (binary attributes)
+The authors leveraged diverse labels (e.g., age, gender, smile) obtained from the [MS Face API](https://azure.microsoft.com/en-us/services/cognitive-services/face/). In this experiment, I used only pose values as soft labels (0.0 ~ 1.0) for my own research. Empirically, this gives a smoother transition than manipulation learned from hard labels. I believe that adding more labels, as the authors did in their work, would make the transition more robust (e.g., the identity and other characteristics of the input image would be preserved while manipulating it).
 
 
 ## Requirements
@@ -35,7 +33,7 @@ pip install numba
 ## Run SGF
 To see the manipulation result:
 ```
-python sgf.py --G_path 'path/to/generator.pkl' --SE_path 'path/to/se.pth' --AUX_path 'path/to/aux.pth' --save_result 1
+python sgf_pose.py --G_path 'path/to/generator.pkl' --SE_path 'path/to/se.pth' --AUX_path 'path/to/aux.pth' --save_result 1
 ```
 
 ---
@@ -52,7 +50,7 @@ python generate.py --outdir=data/test/images --seeds=100500,101000 --resize 256
 ```
 
 ### Step 2: Label images [`c`]
-- Label images using Azure Face API / open source Face landmark detection algorithm
+- Label images using Azure Face API / open source Face landmark detection algorithm to infer pose (yaw, roll, pitch)
 ```
 python face_align.py --indir train
 python face_align.py --indir val
@@ -64,29 +62,42 @@ python face_align.py --indir test
 python face_align.py --indir test --plot 1
 ```
 
+- Next, infer the face pose values (e.g., yaw, roll, pitch)
+```
+python pose_estimation.py --image_dir data/train/
+python pose_estimation.py --image_dir data/val/
+python pose_estimation.py --image_dir data/test/
+```
+
+- If you want to see the pose result
+```
+python pose_estimation.py --image_dir data/test/ --save_img 1
+```
 
 ### Step 3: Fine-tune Squeeze and Excitation Network using images [`x`] and labels [`c`]
 - Used is SE ResNet 50 pretrained on VGG Face2 dataset
 ```
-python finetune.py --pretrained_path 'path/to/model.pkl'
-python finetune.py --mode test --model_path 'path/to/model.pth'
+python finetune_pose.py
+python finetune_pose.py --mode test --model_path path/to/model.pth
 ```
 
+
 ### Step 4: Train Auxiliary (FC-layer) Network [`mapping: (z, c) -> z`]
 - 6 FC layers for Z space, and 15 layers for W space
 - AdaIN is used to mix features (`z` and `c`) in the same way as StyleGAN v1
 - Refer to Appendix B in the paper
 
 ```
-python fc_layer.py --ckpt_dir 'path/to/save_dir'
-python fc_layer.py --mode test --ckpt_dir 'path/to/save_dir' --ckpt_fname 'filename.pth'
+python fc_layer_pose.py --ckpt_dir 'path/to/save_dir'
+python fc_layer_pose.py --mode test --ckpt_dir 'path/to/save_dir' --ckpt_fname 'filename.pth'
 ```
 
+
 ### Step 5: Calculate gradient in the surrogate gradient field and update [`z`]
 - Refer to Algo 1 in the original paper
 - Manipulate C to suit your purpose
 ```
-python sgf.py --G_path 'path/to/generator.pkl' --SE_path 'path/to/se.pth' --AUX_path 'path/to/aux.pth' --save_result 1
+python sgf_pose.py --G_path 'path/to/generator.pkl' --SE_path 'path/to/se.pth' --AUX_path 'path/to/aux.pth' --save_result 1
 ```
 
 
@@ -96,12 +107,20 @@ Many thanks to the first author of the original paper, [Minjun Li](https://minju
 ## References
 - [Li, M., Jin, Y., & Zhu, H. (2021). Surrogate Gradient Field for Latent Space Manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision ](https://arxiv.org/abs/2104.09065)
 
-Also, the implementaion is based on many works:
-- [Face Alignment](https://arxiv.org/abs/1703.07332)
-    - [Official code](https://github.com/1adrianb/face-alignment)
-- [StyleGAN2](https://arxiv.org/abs/1912.04958)
-    - [Official code](https://github.com/NVlabs/stylegan2-ada-pytorch)
-- [SENet](https://arxiv.org/abs/1709.01507?spm=a2c41.13233144.0.0) & [VGG Face2 dataset](https://arxiv.org/abs/1710.08092)
-    - [Official code](https://github.com/ox-vgg/vgg_face2)
-    - [Pytorch code](https://github.com/cydonia999/VGGFace2-pytorch)
-- [AdaIN](https://arxiv.org/abs/1703.06868)
+
+## Credits
+
+**StyleGAN2-ADA:**  
+https://github.com/NVlabs/stylegan2-ada-pytorch  
+Copyright (c) 2021, NVIDIA Corporation  
+NVIDIA Source Code License https://github.com/NVlabs/stylegan2-ada-pytorch/blob/main/LICENSE.txt   
+
+**Face Alignment:**  
+https://github.com/1adrianb/face-alignment  
+Copyright (c) 2017, Adrian Bulat  
+License (BSD 3-Clause) https://github.com/1adrianb/face-alignment/blob/master/LICENSE  
+
+**VGG Face2 Dataset & Squeeze and Excitation Network:**  
+https://github.com/cydonia999/VGGFace2-pytorch  
+Copyright (c) 2018 cydonia  
+License (MIT) https://github.com/cydonia999/VGGFace2-pytorch/blob/master/LICENSE  
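
As a rough illustration of the soft pose labels (0.0 ~ 1.0) mentioned in the README changes above, the sketch below linearly maps yaw, pitch, and roll angles to [0, 1]. The function name `soft_pose_label` and the ±90 degree range are assumptions made for this example; they are not taken from `pose_estimation.py`, whose output format is not shown in this diff.

```python
def soft_pose_label(angle_deg, max_abs_deg=90.0):
    """Map an angle in degrees to a soft label in [0, 1].

    Assumption: angles lie roughly in [-max_abs_deg, max_abs_deg];
    values outside that range are clipped to the nearest bound.
    """
    clipped = max(-max_abs_deg, min(max_abs_deg, angle_deg))
    return (clipped + max_abs_deg) / (2.0 * max_abs_deg)


# Hypothetical pose estimate for one image, in degrees.
pose = {"yaw": 45.0, "pitch": 0.0, "roll": -90.0}
labels = {name: soft_pose_label(value) for name, value in pose.items()}
print(labels)
```

A fixed range keeps the label meaning stable across datasets. Scaling by the observed minimum and maximum of the training labels is the alternative, and it adapts to the data at the cost of dataset-dependent label values.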
@@ -0,0 +1,50 @@
+# To train on landmark labels
+
+
+### Step 1: Sample image generation using StyleGAN2 [`x`]
+- Generate 100K sample images using StyleGAN2 to train SENet
+```
+python generate.py --outdir=data/train/images --seeds=0,100000 --resize 256
+python generate.py --outdir=data/val/images --seeds=100000,100500 --resize 256
+python generate.py --outdir=data/test/images --seeds=100500,101000 --resize 256
+```
+
+### Step 2: Label images [`c`]
+- Label images using Azure Face API / open source Face landmark detection algorithm
+```
+python face_align.py --indir train
+python face_align.py --indir val
+python face_align.py --indir test
+```
+
+- If you want to see the landmark result
+```
+python face_align.py --indir test --plot 1
+```
+
+
+### Step 3: Fine-tune Squeeze and Excitation Network using images [`x`] and labels [`c`]
+- Uses SE ResNet 50 pretrained on the VGG Face2 dataset
+```
+python finetune.py --pretrained_path 'path/to/model.pkl'
+python finetune.py --mode test --model_path 'path/to/model.pth'
+```
+
+
+### Step 4: Train Auxiliary (FC-layer) Network [`mapping: (z, c) -> z`]
+- 6 FC layers for Z space, and 15 layers for W space
+- AdaIN is used to mix features (`z` and `c`) in the same way as StyleGAN v1
+- Refer to Appendix B in the paper
+
+```
+python fc_layer.py --ckpt_dir 'path/to/save_dir'
+python fc_layer.py --mode test --ckpt_dir 'path/to/save_dir' --ckpt_fname 'filename.pth'
+```
+
+
+### Step 5: Calculate gradient in the surrogate gradient field and update [`z`]
+- Refer to Algo 1 in the original paper
+- Manipulate C to suit your purpose
+```
+python sgf.py --G_path 'path/to/generator.pkl' --SE_path 'path/to/se.pth' --AUX_path 'path/to/aux.pth' --save_result 1
+```
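
Step 5 above can be pictured with a toy example. This is only a sketch of one reading of "calculate gradient in the surrogate gradient field and update `z`": estimate how the auxiliary mapping `F(z, c)` changes with `c`, then step `z` in that direction, scaled by the gap between the target and current attribute. It does not reproduce Algorithm 1 of the paper or the code in `sgf.py`. The functions `classifier`, `aux_mapping`, `surrogate_gradient`, and `manipulate` are hypothetical stand-ins for the trained networks, and the gradient is estimated with finite differences rather than autograd.

```python
def classifier(z):
    """Toy stand-in for C(G(z)): the attribute is just the first coordinate."""
    return z[0]


def aux_mapping(z, c):
    """Toy stand-in for the auxiliary mapping F(z, c) -> z.

    It sets the attribute coordinate to c and keeps the rest of z.
    """
    return [c] + list(z[1:])


def surrogate_gradient(z, c, eps=1e-4):
    """Central finite-difference estimate of dF(z, c)/dc."""
    plus = aux_mapping(z, c + eps)
    minus = aux_mapping(z, c - eps)
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


def manipulate(z, c_target, step=0.5, n_steps=20):
    """Move z toward the target attribute along the surrogate gradient."""
    z = list(z)
    for _ in range(n_steps):
        c = classifier(z)
        grad = surrogate_gradient(z, c)
        # Step size shrinks as the current attribute approaches the target.
        z = [zi + step * gi * (c_target - c) for zi, gi in zip(z, grad)]
    return z


z_final = manipulate([0.2, -1.3], c_target=0.8)
print(z_final)
```

In this toy setup only the attribute coordinate moves toward the target while the second coordinate is left untouched. That is the behaviour the real pipeline aims for: change the labelled attribute and preserve the rest of the image.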
@@ -6,7 +6,7 @@
 
 class StyleGAN2_Data(datasets.ImageFolder):
 
-    def __init__(self, root='data', split='train', latent_dim=512):
+    def __init__(self, root='data', split='train', lname='landmarks', latent_dim=512):
         super(StyleGAN2_Data, self).__init__(root)
 
         assert os.path.exists(root), "root: {} not found.".format(root)
@@ -20,14 +20,14 @@ def __init__(self, root='data', split='train', latent_dim=512):
 
         # self.labels = np.load(os.path.join(root, split, 'npy', 'landmarks.npy'))
         if split == 'train' or split == 'train_all':
-            self.labels_original = np.load(os.path.join(root, split, 'npy', 'landmarks.npy'))
+            self.labels_original = np.load(os.path.join(root, split, 'npy', f'{lname}.npy'))
             self.labels = self.scale_label(self.labels_original)
 
         elif split == 'val' or 'test':
-            self.labels_original = np.load(os.path.join(root, 'train_all', 'npy', 'landmarks.npy'))
+            self.labels_original = np.load(os.path.join(root, 'train_all', 'npy', f'{lname}.npy'))
             self.scale_label(self.labels_original)
 
-            self.labels_original = np.load(os.path.join(root, split, 'npy', 'landmarks.npy'))
+            self.labels_original = np.load(os.path.join(root, split, 'npy', f'{lname}.npy'))
             self.labels = self.scale_val_label(self.labels_original)
         else:
             raise ValueError(f"split was not set correctly split = ['train', 'val', 'test'] not {split}")
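
The dataset classes in this diff appear to fit the label scaling on training labels and reuse it for the `val` and `test` splits. Below is a minimal, dependency-free sketch of that pattern. The helpers `fit_min_max`, `apply_min_max`, and `load_labels` are hypothetical and stand in for scikit-learn's `MinMaxScaler`, which the repository code uses. The sketch also selects the split with a membership test: in Python, an expression of the form `split == 'val' or 'test'` is always truthy, so an `else` branch placed after it can never run.

```python
def fit_min_max(rows):
    """Return per-column (min, max) computed from the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, stats, feature_range=(0.0, 1.0)):
    """Scale rows into feature_range using previously fitted stats."""
    low, high = feature_range
    scaled = []
    for row in rows:
        out = []
        for value, (col_min, col_max) in zip(row, stats):
            span = col_max - col_min
            # Guard against constant columns to avoid division by zero.
            unit = (value - col_min) / span if span else 0.0
            out.append(low + unit * (high - low))
        scaled.append(out)
    return scaled


def load_labels(split, train_rows, split_rows):
    """Fit on training labels, then scale the labels of the requested split."""
    stats = fit_min_max(train_rows)
    if split == 'train':
        return apply_min_max(train_rows, stats)
    elif split in ('val', 'test'):  # membership test, not `== 'val' or 'test'`
        return apply_min_max(split_rows, stats)
    raise ValueError(f"split must be 'train', 'val', or 'test', not {split!r}")


train = [[-30.0, 0.0], [30.0, 10.0]]  # made-up (yaw, pitch) rows
val = [[0.0, 5.0], [45.0, 10.0]]
print(load_labels('val', train, val))
```

Validation values outside the training range map outside `[0, 1]` here. Whether to clip them is a design choice left open in this sketch.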
 
@@ -8,32 +8,32 @@
 
 class StyleGAN2_Data(datasets.ImageFolder):
 
-    def __init__(self, root='data/', split='train', transform=None, scale_size=-1):
+    def __init__(self, root='data/', split='train', lname='landmarks', transform=None, scale_size=-1):
         super(StyleGAN2_Data, self).__init__(root)
 
         assert os.path.exists(root), "root: {} not found.".format(root)
 
         self.root = os.path.join(root, split)
         self.split = split
         self.transform = transform
-        self.scaler = MinMaxScaler(feature_range = (-1, 1))
+        self.scaler = MinMaxScaler(feature_range = (0, 1))
         self.scale_size = scale_size
 
         if split == 'train':
-            self.labels_original = np.load(os.path.join(root, 'train', 'npy', 'landmarks.npy'))
+            self.labels_original = np.load(os.path.join(root, 'train', 'npy', f'{lname}.npy'))
             if scale_size > 0:
                 self.labels = self.scale_label(self.labels_original / scale_size * INPUT_SIZE)
             else:
                 self.labels = self.scale_label(self.labels_original)
 
         elif split == 'val' or 'test':
-            self.labels_original = np.load(os.path.join(root, 'train', 'npy', 'landmarks.npy'))
+            self.labels_original = np.load(os.path.join(root, 'train', 'npy', f'{lname}.npy'))
             if scale_size > 0:
                 self.scale_label(self.labels_original / scale_size * INPUT_SIZE)
             else:
                 self.scale_label(self.labels_original)
 
-            self.labels_original = np.load(os.path.join(root, split, 'npy', 'landmarks.npy'))
+            self.labels_original = np.load(os.path.join(root, split, 'npy', f'{lname}.npy'))
             if scale_size > 0:
                 self.labels = self.scale_val_label(self.labels_original / scale_size * INPUT_SIZE)
             else: