Commit 8b9d176

Author: 孙啸寒 (committed)
update code and README
1 parent d9386d6 commit 8b9d176

File tree: 23 files changed, +177 −113 lines


README.md

+15
## This repo contains several computer vision examples.

- Example 1: Inferring the 3D coordinates of a point on a plane
- Example 2: Image alignment using a feature-point-matching method (recommended) and a line-detection method.

## Install

- Download
```bash
git clone [email protected]:Zju-George/ComputerVisionExamples.git
```
- Keep up to date
```bash
git pull
```

example1-infer3D/README.md

+9-6
@@ -1,4 +1,4 @@
-## Example 1: Estimating the 3D coordinates of a point on a plane
+## Inferring the 3D coordinates of a point on a plane
 
 ### Problem description
 
@@ -25,8 +25,8 @@
 
 ### Offline steps
 
-1. Use Zhang Zhengyou's calibration method to obtain the **camera intrinsics**, including the focal length and the distortion coefficients
-   1. Print the checkerboard image below, lay it as **flat** as possible on a plane, then take a few photos with the camera from different positions. <img src="https://github.com/Zju-George/3DReconstructionExample/raw/main/assets/checkerboard.png" alt="HMI" width="433" height="305" align="bottom" />
+1. **Camera intrinsics calibration**
+   1. Print the checkerboard image below, lay it as **flat** as possible on a plane, then take a few photos with the camera from different positions. <img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example1-infer3D/assets/checkerboard.png" alt="HMI" width="433" height="305" align="bottom" />
 
    2. Put the captured jpg images under `assets/`.
   3. Go to the `src/` directory and run `python calibration.py`
@@ -37,14 +37,14 @@
   The first return value `ret` holds the reprojection error of the calibration; the smaller it is, the more accurate the calibration. In general `ret` should not exceed 2; if it is above 5, one of the steps above most likely went wrong. The second return value `mtx` is the camera projection matrix, and the third return value `dist` holds the camera's distortion coefficients.
 
 
-2. Prepare the data needed by the PnP (perspective-n-point) algorithm. PnP solves for the **camera extrinsics**: the transform of the camera coordinate frame relative to the model coordinate frame. In particular, this transform decomposes into a translation vector and a rotation vector, and these two vectors are the extrinsics we are after.
+2. **Camera extrinsics calibration**. The PnP (perspective-n-point) algorithm solves for the **camera extrinsics**: the transform of the camera coordinate frame relative to the model coordinate frame. In particular, this transform decomposes into a translation vector and a rotation vector, and these two vectors are the extrinsics we are after.
   1. **Fix the camera position.** (**Important**: if the camera position changes, step 2 must be redone from scratch!)
   2. Place a number of marker points (**at least 4**) in the 3D scene that can be located precisely (both their 3D coordinates and their pixel coordinates are needed). Note: under the hood, PnP is a **least-squares optimization**, so in principle the more markers, the more accurate the computed extrinsics.
   3. Measure the 3D coordinates of the markers as accurately as possible. In the image below, for example, the origin is placed at the lower-left corner of the window and a right-handed coordinate system is set up; the 3D coordinates of points 1-6 are measured and recorded.
-   <img src="https://github.com/Zju-George/3DReconstructionExample/raw/main/assets/image.jpg" alt="HMI" width="640" height="480" align="bottom" />
+   <img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example1-infer3D/assets/image.jpg" alt="HMI" width="640" height="480" align="bottom" />
   4. Take one image with the fixed camera and save it to `assets/pnp.png`.
   5. Go to the `src/` directory and run `python 2dmarker.py`. `2dmarker.py` opens a window and loads `assets/pnp.png`. Left-clicking a point in the image displays the pixel coordinates of the clicked position, as shown below; use this to record the pixel coordinates of the marker points above. To save the annotated image automatically, run `python 2dmarker.py --save` (saving is off by default).
-   <img src="https://github.com/Zju-George/3DReconstructionExample/raw/main/assets/2dmarker.png" alt="HMI" width="640" height="480" align="bottom" />
+   <img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example1-infer3D/assets/2dmarker.png" alt="HMI" width="640" height="480" align="bottom" />
 3. Fill the **camera intrinsics** from step 1 and the **coordinate data** from step 2 into `reconstruction.py`:
   ```python
   # original data to prepare
@@ -67,3 +67,6 @@
 1. Grab the current image from the camera.
 2. Locate the pixel coordinates **(u, v)** of the target point (e.g. a bright spot) in the image.
 3. After the computation, return that point's 3D coordinates **(x, y, 0)**.
+
+## TODO:
+In principle, the intrinsics and extrinsics calibration could be done in a single pass, but then the **world coordinate system used by the project** and the **world coordinate system of the checkerboard** would need to be **brought into correspondence**. In other words, the extrinsics calibration of step 2 above could be merged into step 1, removing the need to place props and take measurements by hand.
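To make the two offline stages concrete, here is a minimal sketch of how they fit together with OpenCV. The 9×6 inner-corner grid, the image glob, and the marker coordinates below are illustrative assumptions, not values taken from `calibration.py` or `reconstruction.py`:

```python
import glob
import cv2
import numpy as np

# -- Step 1 sketch: intrinsics via Zhang's method (assumes a 9x6 inner-corner board) --
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)  # board frame, unit squares

obj_points, img_points = [], []
for path in glob.glob('../assets/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern, None)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# ret is the reprojection error discussed above; mtx and dist are the intrinsics
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

# -- Step 2 sketch: extrinsics via PnP (marker coordinates below are placeholders) --
model_points = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0],
                         [1.2, 0.9, 0.0], [0.0, 0.9, 0.0]])  # measured 3D markers
image_points = np.array([[252.0, 257.0], [601.0, 255.0],
                         [598.0, 513.0], [255.0, 516.0]])    # clicked via 2dmarker.py
ok, rvec, tvec = cv2.solvePnP(model_points, image_points, mtx, dist)
```

Note that `cv2.calibrateCamera` already returns per-view `rvecs`/`tvecs` expressed in the checkerboard's frame, which is the observation behind the TODO above: if the project's world coordinate system were aligned with the board's, the separate PnP step would be unnecessary.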

example1-infer3D/src/reconstruction.py

+25-21
@@ -22,20 +22,27 @@ def __init__(self, camera_matrix=None, distortion_coeffs=None, model_points=None
 
 
 class Reconstruction(object):
-    def __init__(self, data=None, learning_rate=0.000002, thrsh=5, max_steps=100, image=None, draw=False):
+    def __init__(self, data=None, learning_rate=0.000002, thrsh=5, max_steps=100, draw=False):
         self.data = data
         self.learning_rate = learning_rate
         self.thrsh = thrsh
         self.max_steps = max_steps
-        self.image = image
         self.draw = draw
+        self.image = None
 
-        self.rotation_vector, self.translation_vector = self.solve_pnp()
+        self.rotation_vector, self.translation_vector = self.camera_extrinsics_calibration()
         self.target2D = np.zeros(2)
         self.coordinate3D = np.zeros(3)
         self.loss = 0.
 
-
+    def camera_extrinsics_calibration(self):
+        # TODO: could also try a checkerboard to simplify the extrinsics calibration
+        (success, rotation_vector, translation_vector) = cv2.solvePnP(self.data.model_points,
+            self.data.image_points, self.data.camera_matrix, self.data.distortion_coeffs, flags=8)
+        # Logger.debug(f'rotation_vector:\n {rotation_vector}')
+        # Logger.debug(f'translation_vector:\n {translation_vector}')
+        return rotation_vector, translation_vector
+
     def l2_loss(self, point1, point2):
         return (point1[0]-point2[0])**2 + (point1[1]-point2[1])**2
 
@@ -55,20 +62,17 @@ def compute_grad(self):
 
         return np.array([grad_x, grad_y])
 
-    def solve_pnp(self):
-        (success, rotation_vector, translation_vector) = cv2.solvePnP(self.data.model_points,
-            self.data.image_points, self.data.camera_matrix, self.data.distortion_coeffs, flags=8)
-        # Logger.debug(f'rotation_vector:\n {rotation_vector}')
-        # Logger.debug(f'translation_vector:\n {translation_vector}')
-        return rotation_vector, translation_vector
-
     def init_guess(self):
-        # TODO: bilinear interpolation to init coordinate3D, now just init as (0., 0., 0.)
+        # TODO: use bilinear interpolation to init coordinate3D; right now, for simplicity, just init as (0., 0., 0.)
         self.coordinate3D = np.array([0., 0., 0.])
 
         point2D, _ = cv2.projectPoints(self.coordinate3D,
            self.rotation_vector, self.translation_vector, self.data.camera_matrix, self.data.distortion_coeffs)
-        self.loss = self.l2_loss(point2D.reshape(-1), self.target2D)
+        point2D = point2D.reshape(-1)
+        if self.draw:
+            cv2.circle(self.image, (int(point2D[0]), int(point2D[1])), 2, (0, 255, 0), thickness=2)
+            cv2.imshow('image', self.image)
+        self.loss = self.l2_loss(point2D, self.target2D)
 
     def opt_step(self):
         grad = self.compute_grad()
@@ -84,8 +88,9 @@ def opt_step(self):
            cv2.imshow('image', self.image)
         return
 
-    def opt(self, target2D):
+    def opt(self, img, target2D):
         start = time.time()
+        self.image = img
         self.target2D = target2D
         self.init_guess()
         steps = 0
@@ -95,7 +100,7 @@
            if self.loss < self.thrsh:
                break
         end = time.time()
-        Logger.info(f'coordinate3D result: {self.coordinate3D}; loss: {self.loss}; optimization steps: {steps}; time cost: {end-start}s')
+        Logger.info(f'coordinate3D result: {self.coordinate3D}\n loss: {self.loss}\n optimization steps: {steps}\n time cost: {end-start}s')
         return self.coordinate3D
 
     def hangon(self):
@@ -117,12 +122,11 @@ def hangon(self):
 
 # init data and reconstruction object
 data = ReconstructionData(camera_matrix=camera_matrix, distortion_coeffs=distortion_coeffs, model_points=model_points, image_points=image_points)
-image = cv2.imread('../assets/pnp.png')
-reconstruction = Reconstruction(data=data, image=image, draw=args.draw)
+reconstruction = Reconstruction(data=data, draw=args.draw)
 
-# TODO: target2D should be got at runtime (on the fly)
+# TODO: target2D and image should be obtained at RUNTIME (e.g. cv2.VideoCapture(0))
 # TODO: 'while True' structure here to be done
+image = cv2.imread('../assets/pnp.png')
 target2D = np.array([252, 257], dtype='double')
-reconstruction.opt(target2D)
-reconstruction.hangon()
-
+reconstruction.opt(image, target2D)
+reconstruction.hangon()
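For context on the optimization this file drives: `opt` runs gradient descent on the reprojection loss of a candidate point constrained to the plane z = 0. The standalone sketch below restates that loop, assuming a central-difference numerical gradient; it illustrates the technique and is not the repo's `compute_grad` verbatim:

```python
import cv2
import numpy as np

def reproj_loss(xy, target2D, rvec, tvec, mtx, dist):
    # Project the candidate 3D point (x, y, 0) and return the squared pixel error
    point3D = np.array([xy[0], xy[1], 0.])
    point2D, _ = cv2.projectPoints(point3D, rvec, tvec, mtx, dist)
    point2D = point2D.reshape(-1)
    return (point2D[0] - target2D[0])**2 + (point2D[1] - target2D[1])**2

def infer3D(target2D, rvec, tvec, mtx, dist,
            lr=2e-6, thrsh=5, max_steps=100, eps=1e-3):
    xy = np.zeros(2)  # initial guess (0, 0), mirroring init_guess()
    for _ in range(max_steps):
        if reproj_loss(xy, target2D, rvec, tvec, mtx, dist) < thrsh:
            break
        # Central-difference estimate of d(loss)/d(x, y)
        grad = np.array([
            (reproj_loss(xy + np.array([eps, 0.]), target2D, rvec, tvec, mtx, dist)
             - reproj_loss(xy - np.array([eps, 0.]), target2D, rvec, tvec, mtx, dist)) / (2 * eps),
            (reproj_loss(xy + np.array([0., eps]), target2D, rvec, tvec, mtx, dist)
             - reproj_loss(xy - np.array([0., eps]), target2D, rvec, tvec, mtx, dist)) / (2 * eps),
        ])
        xy -= lr * grad  # plain gradient-descent step
    return np.array([xy[0], xy[1], 0.])
```

Because the loss compares the `cv2.projectPoints` output against the clicked pixel `target2D`, the loop converges to the plane point whose projection lands on the target, which is what `Reconstruction.opt` logs as `coordinate3D`.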
Binary file not shown.

example2-alignment/warp-perspective-method/main.py

-86
This file was deleted.

example2-imageAlignment/README.md

+21
## Problem description

- We have two images: one is the **reference image**, the other is the **scanned image**. For example, below left is the **reference image** and below right is the **scanned image**.

<p align="middle">
<img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example2-imageAlignment/assets/form.jpg" width="400"/>
<img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example2-imageAlignment/assets/scan.jpg" width="400"/>
</p>

- The aim is to align the scanned image to the reference image.

<p align="middle">
<img src="https://www.learnopencv.com/wp-content/uploads/2018/03/image-alignment-using-opencv.jpg">
</p>

## Install

```cmd
pip install -r requirements.txt
```
7 files renamed without changes.
## Usage

```text
usage: main.py [-h] [--draw] [--nomask] [--ref REF] [--scan SCAN]
```
- For example, you can run `main.py` from the command line:
```cmd
python main.py --draw --ref ../assets/form.jpg --scan ../assets/scan.jpg
```
- `--draw` draws the intermediate results.
- `--nomask` draws all feature-point matches; the default is to draw only the inlier matches.
- `--ref` is the path of the reference image.
- `--scan` is the path of the scanned image.
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--draw', action="store_true")
parser.add_argument('--nomask', action="store_true")
parser.add_argument('--ref', type=str, default='../assets/butterfly.png')
parser.add_argument('--scan', type=str, default='../assets/scan1.jpg')
args = parser.parse_args()

MAX_MATCHES = 1000
GOOD_MATCH_PERCENT = 0.6


def pltImshow(title, img):
    plt.title(title)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    plt.show()
    return


def alignImages(imgScan, imgRef):
    print(f'Alignment configs:\nMAX_MATCHES={MAX_MATCHES}\nGOOD_MATCH_PERCENT={GOOD_MATCH_PERCENT}')

    # Convert images to grayscale
    imgScanGray = cv2.cvtColor(imgScan, cv2.COLOR_BGR2GRAY)
    imgRefGray = cv2.cvtColor(imgRef, cv2.COLOR_BGR2GRAY)

    # Detect ORB features and compute descriptors
    orb = cv2.ORB_create(MAX_MATCHES)
    kpsScan, desScan = orb.detectAndCompute(imgScanGray, None)
    kpsRef, desRef = orb.detectAndCompute(imgRefGray, None)

    # Match features
    matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMINGLUT)
    matches = matcher.match(desScan, desRef, None)

    # Sort matches by score
    matches.sort(key=lambda x: x.distance, reverse=False)

    # Use only the top numGoodMatches matches
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]

    # Draw top matches
    imgMatches = cv2.drawMatches(imgScan, kpsScan, imgRef, kpsRef, matches, None)

    # Extract locations of good matches
    pointsScan = np.zeros((len(matches), 2), dtype=np.float32)
    pointsRef = np.zeros((len(matches), 2), dtype=np.float32)

    for i, match in enumerate(matches):
        pointsScan[i, :] = kpsScan[match.queryIdx].pt
        pointsRef[i, :] = kpsRef[match.trainIdx].pt

    # Find homography. Possible flags: cv2.RANSAC
    homographyMatrix, mask = cv2.findHomography(pointsScan, pointsRef, cv2.RHO)
    matchesMask = mask.ravel().tolist()

    # Draw only inlier matches
    if not args.nomask:
        imgMatches = cv2.drawMatches(imgScan, kpsScan, imgRef, kpsRef, matches, None, matchesMask=matchesMask)

    # Use the homography to warp the image; the target size is the reference image size
    height, width, channels = imgRef.shape
    imgAlign = cv2.warpPerspective(imgScan, homographyMatrix, (width, height))

    return imgAlign, homographyMatrix, imgMatches


if __name__ == '__main__':
    # Read reference image
    print(f'Reading reference image: {args.ref}')
    imgRef = cv2.imread(args.ref, cv2.IMREAD_COLOR)

    # Read scanned image to be aligned
    print(f'Reading image to align: {args.scan}')
    imgScan = cv2.imread(args.scan, cv2.IMREAD_COLOR)

    imgAlign, homographyMatrix, imgMatches = alignImages(imgScan, imgRef)
    # print("Estimated homography: \n", homographyMatrix)

    # Save aligned image to disk
    outFilename = "aligned.jpg"
    print("Saving aligned image: ", outFilename)
    cv2.imwrite(outFilename, imgAlign)

    # Stack imgAlign & imgRef
    if args.draw:
        pltImshow('Matches', imgMatches)
        pltImshow('Aligned & Reference', np.hstack((imgAlign, imgRef)))
File renamed without changes.
