Commit 8b9d176

Author: 孙啸寒 (committed)
update code and README
1 parent d9386d6 commit 8b9d176

File tree: 23 files changed, +177 −113 lines


README.md

+15
## This repo contains several computer vision examples.

- Example 1: Inferring the 3D coordinates of a point on a plane
- Example 2: Image alignment using a feature-point-matching method (recommended) and a line-detection method.

## Install

- Download
```bash
git clone [email protected]:Zju-George/ComputerVisionExamples.git
```
- Keep up to date
```bash
git pull
```

example1-infer3D/README.md

+9-6
@@ -1,4 +1,4 @@
-## Example 1: Estimating the 3D coordinates of a point on a plane
+## Inferring the 3D coordinates of a point on a plane
 
 ### Problem description
 
@@ -25,8 +25,8 @@
 
 ### Offline steps
 
-1. Use Zhang Zhengyou's calibration method to obtain the **camera intrinsics**, including the focal length and the distortion coefficients
-   1. Print the checkerboard image below, lay it as **flat** as possible on a plane, then take a few photos with the camera from different positions. <img src="https://github.com/Zju-George/3DReconstructionExample/raw/main/assets/checkerboard.png" alt="HMI" width="433" height="305" align="bottom" />
+1. **Camera intrinsics calibration**
+   1. Print the checkerboard image below, lay it as **flat** as possible on a plane, then take a few photos with the camera from different positions. <img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example1-infer3D/assets/checkerboard.png" alt="HMI" width="433" height="305" align="bottom" />
 
    2. Put the captured jpg images under `assets/`.
   3. Go to the `src/` directory and run `python calibration.py`
@@ -37,14 +37,14 @@
   The first return value `ret` holds the reprojection error of the calibration; the smaller it is, the more accurate the calibration. In general `ret` should not exceed 2; if it is above 5, one of the steps above most likely went wrong. The second return value `mtx` is the camera projection matrix, and the third return value `dist` holds the camera's distortion coefficients.
 
 
-2. Prepare the data needed by the PnP (perspective-n-point) algorithm. PnP solves for the **camera extrinsics**: the transform of the camera coordinate frame relative to the model coordinate frame. In particular, this transform decomposes into a translation vector and a rotation vector, and these two vectors are the extrinsics we are after.
+2. **Camera extrinsics calibration**. The PnP (perspective-n-point) algorithm solves for the **camera extrinsics**: the transform of the camera coordinate frame relative to the model coordinate frame. In particular, this transform decomposes into a translation vector and a rotation vector, and these two vectors are the extrinsics we are after.
   1. **Fix the camera position.** (**Important**: if the camera position changes, step 2 must be redone from scratch!)
   2. Place a number of marker points (**at least 4**) in the 3D scene that can be located precisely (both their 3D coordinates and their pixel coordinates are needed). Note: under the hood, PnP is a **least-squares optimization**, so in principle the more markers, the more accurate the computed extrinsics.
   3. Measure the 3D coordinates of the markers as accurately as possible. In the image below, for example, the origin is placed at the lower-left corner of the window and a right-handed coordinate system is set up; the 3D coordinates of points 1-6 are measured and recorded.
-   <img src="https://github.com/Zju-George/3DReconstructionExample/raw/main/assets/image.jpg" alt="HMI" width="640" height="480" align="bottom" />
+   <img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example1-infer3D/assets/image.jpg" alt="HMI" width="640" height="480" align="bottom" />
   4. Take one image with the fixed camera and save it to `assets/pnp.png`.
   5. Go to the `src/` directory and run `python 2dmarker.py`. `2dmarker.py` opens a window and loads `assets/pnp.png`. Left-clicking a point in the image displays the pixel coordinates of the clicked position, as shown below; use this to record the pixel coordinates of the marker points above. To save the annotated image automatically, run `python 2dmarker.py --save` (saving is off by default).
-   <img src="https://github.com/Zju-George/3DReconstructionExample/raw/main/assets/2dmarker.png" alt="HMI" width="640" height="480" align="bottom" />
+   <img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example1-infer3D/assets/2dmarker.png" alt="HMI" width="640" height="480" align="bottom" />
 3. Fill the **camera intrinsics** from step 1 and the **coordinate data** from step 2 into `reconstruction.py`:
   ```python
   # original data to prepare
@@ -67,3 +67,6 @@
 1. Grab the current image from the camera.
 2. Locate the pixel coordinates **(u, v)** of the target point (e.g. a bright spot) in the image.
 3. After the computation, return that point's 3D coordinates **(x, y, 0)**.
+
+## TODO:
+In principle, the intrinsics and extrinsics calibration could be done in a single pass, but then the **world coordinate system used by the project** and the **world coordinate system of the checkerboard** would need to be **brought into correspondence**. In other words, the extrinsics calibration of step 2 above could be merged into step 1, removing the need to place props and take measurements by hand.
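To make the two offline stages concrete, here is a minimal sketch of how they fit together with OpenCV. The 9×6 inner-corner grid, the image glob, and the marker coordinates below are illustrative assumptions, not values taken from `calibration.py` or `reconstruction.py`:

```python
import glob
import cv2
import numpy as np

# -- Step 1 sketch: intrinsics via Zhang's method (assumes a 9x6 inner-corner board) --
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)  # board frame, unit squares

obj_points, img_points = [], []
for path in glob.glob('../assets/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern, None)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# ret is the reprojection error discussed above; mtx and dist are the intrinsics
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

# -- Step 2 sketch: extrinsics via PnP (marker coordinates below are placeholders) --
model_points = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0],
                         [1.2, 0.9, 0.0], [0.0, 0.9, 0.0]])  # measured 3D markers
image_points = np.array([[252.0, 257.0], [601.0, 255.0],
                         [598.0, 513.0], [255.0, 516.0]])    # clicked via 2dmarker.py
ok, rvec, tvec = cv2.solvePnP(model_points, image_points, mtx, dist)
```

Note that `cv2.calibrateCamera` already returns per-view `rvecs`/`tvecs` expressed in the checkerboard's frame, which is the observation behind the TODO above: if the project's world coordinate system were aligned with the board's, the separate PnP step would be unnecessary.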

example1-infer3D/src/reconstruction.py

+25-21
@@ -22,20 +22,27 @@ def __init__(self, camera_matrix=None, distortion_coeffs=None, model_points=None
 
 
 class Reconstruction(object):
-    def __init__(self, data=None, learning_rate=0.000002, thrsh=5, max_steps=100, image=None, draw=False):
+    def __init__(self, data=None, learning_rate=0.000002, thrsh=5, max_steps=100, draw=False):
         self.data = data
         self.learning_rate = learning_rate
         self.thrsh = thrsh
         self.max_steps = max_steps
-        self.image = image
         self.draw = draw
+        self.image = None
 
-        self.rotation_vector, self.translation_vector = self.solve_pnp()
+        self.rotation_vector, self.translation_vector = self.camera_extrinsics_calibration()
         self.target2D = np.zeros(2)
         self.coordinate3D = np.zeros(3)
         self.loss = 0.
 
-
+    def camera_extrinsics_calibration(self):
+        # TODO: could also try a checkerboard to simplify the extrinsics calibration
+        (success, rotation_vector, translation_vector) = cv2.solvePnP(self.data.model_points,
+            self.data.image_points, self.data.camera_matrix, self.data.distortion_coeffs, flags=8)
+        # Logger.debug(f'rotation_vector:\n {rotation_vector}')
+        # Logger.debug(f'translation_vector:\n {translation_vector}')
+        return rotation_vector, translation_vector
+
     def l2_loss(self, point1, point2):
         return (point1[0]-point2[0])**2 + (point1[1]-point2[1])**2
 
@@ -55,20 +62,17 @@ def compute_grad(self):
 
         return np.array([grad_x, grad_y])
 
-    def solve_pnp(self):
-        (success, rotation_vector, translation_vector) = cv2.solvePnP(self.data.model_points,
-            self.data.image_points, self.data.camera_matrix, self.data.distortion_coeffs, flags=8)
-        # Logger.debug(f'rotation_vector:\n {rotation_vector}')
-        # Logger.debug(f'translation_vector:\n {translation_vector}')
-        return rotation_vector, translation_vector
-
     def init_guess(self):
-        # TODO: bilinear interpolation to init coordinate3D, now just init as (0., 0., 0.)
+        # TODO: use bilinear interpolation to init coordinate3D; right now, for simplicity, just init as (0., 0., 0.)
         self.coordinate3D = np.array([0., 0., 0.])
 
         point2D, _ = cv2.projectPoints(self.coordinate3D,
            self.rotation_vector, self.translation_vector, self.data.camera_matrix, self.data.distortion_coeffs)
-        self.loss = self.l2_loss(point2D.reshape(-1), self.target2D)
+        point2D = point2D.reshape(-1)
+        if self.draw:
+            cv2.circle(self.image, (int(point2D[0]), int(point2D[1])), 2, (0, 255, 0), thickness=2)
+            cv2.imshow('image', self.image)
+        self.loss = self.l2_loss(point2D, self.target2D)
 
     def opt_step(self):
         grad = self.compute_grad()
@@ -84,8 +88,9 @@ def opt_step(self):
            cv2.imshow('image', self.image)
         return
 
-    def opt(self, target2D):
+    def opt(self, img, target2D):
         start = time.time()
+        self.image = img
         self.target2D = target2D
         self.init_guess()
         steps = 0
@@ -95,7 +100,7 @@
            if self.loss < self.thrsh:
                break
         end = time.time()
-        Logger.info(f'coordinate3D result: {self.coordinate3D}; loss: {self.loss}; optimization steps: {steps}; time cost: {end-start}s')
+        Logger.info(f'coordinate3D result: {self.coordinate3D}\n loss: {self.loss}\n optimization steps: {steps}\n time cost: {end-start}s')
         return self.coordinate3D
 
     def hangon(self):
@@ -117,12 +122,11 @@ def hangon(self):
 
 # init data and reconstruction object
 data = ReconstructionData(camera_matrix=camera_matrix, distortion_coeffs=distortion_coeffs, model_points=model_points, image_points=image_points)
-image = cv2.imread('../assets/pnp.png')
-reconstruction = Reconstruction(data=data, image=image, draw=args.draw)
+reconstruction = Reconstruction(data=data, draw=args.draw)
 
-# TODO: target2D should be got at runtime (on the fly)
+# TODO: target2D and image should be obtained at RUNTIME (e.g. cv2.VideoCapture(0))
 # TODO: 'while True' structure here to be done
+image = cv2.imread('../assets/pnp.png')
 target2D = np.array([252, 257], dtype='double')
-reconstruction.opt(target2D)
-reconstruction.hangon()
-
+reconstruction.opt(image, target2D)
+reconstruction.hangon()
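For context on the optimization this file drives: `opt` runs gradient descent on the reprojection loss of a candidate point constrained to the plane z = 0. The standalone sketch below restates that loop, assuming a central-difference numerical gradient; it illustrates the technique and is not the repo's `compute_grad` verbatim:

```python
import cv2
import numpy as np

def reproj_loss(xy, target2D, rvec, tvec, mtx, dist):
    # Project the candidate 3D point (x, y, 0) and return the squared pixel error
    point3D = np.array([xy[0], xy[1], 0.])
    point2D, _ = cv2.projectPoints(point3D, rvec, tvec, mtx, dist)
    point2D = point2D.reshape(-1)
    return (point2D[0] - target2D[0])**2 + (point2D[1] - target2D[1])**2

def infer3D(target2D, rvec, tvec, mtx, dist,
            lr=2e-6, thrsh=5, max_steps=100, eps=1e-3):
    xy = np.zeros(2)  # initial guess (0, 0), mirroring init_guess()
    for _ in range(max_steps):
        if reproj_loss(xy, target2D, rvec, tvec, mtx, dist) < thrsh:
            break
        # Central-difference estimate of d(loss)/d(x, y)
        grad = np.array([
            (reproj_loss(xy + np.array([eps, 0.]), target2D, rvec, tvec, mtx, dist)
             - reproj_loss(xy - np.array([eps, 0.]), target2D, rvec, tvec, mtx, dist)) / (2 * eps),
            (reproj_loss(xy + np.array([0., eps]), target2D, rvec, tvec, mtx, dist)
             - reproj_loss(xy - np.array([0., eps]), target2D, rvec, tvec, mtx, dist)) / (2 * eps),
        ])
        xy -= lr * grad  # plain gradient-descent step
    return np.array([xy[0], xy[1], 0.])
```

Because the loss compares the `cv2.projectPoints` output against the clicked pixel `target2D`, the loop converges to the plane point whose projection lands on the target, which is what `Reconstruction.opt` logs as `coordinate3D`.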
Binary file not shown.

example2-alignment/warp-perspective-method/main.py

-86
This file was deleted.

example2-imageAlignment/README.md

+21
## Problem description

- We have two images: one is the **reference image**, the other is the **scanned image**. For example, below left is the **reference image** and below right is the **scanned image**.

<p align="middle">
<img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example2-imageAlignment/assets/form.jpg" width="400"/>
<img src="https://github.com/Zju-George/ComputerVisionExamples/raw/main/example2-imageAlignment/assets/scan.jpg" width="400"/>
</p>

- The aim is to align the scanned image to the reference image.

<p align="middle">
<img src="https://www.learnopencv.com/wp-content/uploads/2018/03/image-alignment-using-opencv.jpg">
</p>

## Install

```cmd
pip install -r requirements.txt
```
7 files renamed without changes.
## Usage

```text
usage: main.py [-h] [--draw] [--nomask] [--ref REF] [--scan SCAN]
```
- For example, you can run `main.py` from the command line:
```cmd
python main.py --draw --ref ../assets/form.jpg --scan ../assets/scan.jpg
```
- `--draw` draws the intermediate results.
- `--nomask` draws all feature-point matches; the default is to draw only the inlier matches.
- `--ref` is the path of the reference image.
- `--scan` is the path of the scanned image.
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--draw', action="store_true")
parser.add_argument('--nomask', action="store_true")
parser.add_argument('--ref', type=str, default='../assets/butterfly.png')
parser.add_argument('--scan', type=str, default='../assets/scan1.jpg')
args = parser.parse_args()

MAX_MATCHES = 1000
GOOD_MATCH_PERCENT = 0.6


def pltImshow(title, img):
    plt.title(title)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    plt.show()
    return


def alignImages(imgScan, imgRef):
    print(f'Alignment configs:\nMAX_MATCHES={MAX_MATCHES}\nGOOD_MATCH_PERCENT={GOOD_MATCH_PERCENT}')

    # Convert images to grayscale
    imgScanGray = cv2.cvtColor(imgScan, cv2.COLOR_BGR2GRAY)
    imgRefGray = cv2.cvtColor(imgRef, cv2.COLOR_BGR2GRAY)

    # Detect ORB features and compute descriptors
    orb = cv2.ORB_create(MAX_MATCHES)
    kpsScan, desScan = orb.detectAndCompute(imgScanGray, None)
    kpsRef, desRef = orb.detectAndCompute(imgRefGray, None)

    # Match features
    matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMINGLUT)
    matches = matcher.match(desScan, desRef, None)

    # Sort matches by score
    matches.sort(key=lambda x: x.distance, reverse=False)

    # Use only the top numGoodMatches matches
    numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
    matches = matches[:numGoodMatches]

    # Draw top matches
    imgMatches = cv2.drawMatches(imgScan, kpsScan, imgRef, kpsRef, matches, None)

    # Extract locations of good matches
    pointsScan = np.zeros((len(matches), 2), dtype=np.float32)
    pointsRef = np.zeros((len(matches), 2), dtype=np.float32)

    for i, match in enumerate(matches):
        pointsScan[i, :] = kpsScan[match.queryIdx].pt
        pointsRef[i, :] = kpsRef[match.trainIdx].pt

    # Find homography. Possible flags: cv2.RANSAC
    homographyMatrix, mask = cv2.findHomography(pointsScan, pointsRef, cv2.RHO)
    matchesMask = mask.ravel().tolist()

    # Draw only inlier matches
    if not args.nomask:
        imgMatches = cv2.drawMatches(imgScan, kpsScan, imgRef, kpsRef, matches, None, matchesMask=matchesMask)

    # Use the homography to warp the image; the target size is the reference image size
    height, width, channels = imgRef.shape
    imgAlign = cv2.warpPerspective(imgScan, homographyMatrix, (width, height))

    return imgAlign, homographyMatrix, imgMatches


if __name__ == '__main__':
    # Read reference image
    print(f'Reading reference image: {args.ref}')
    imgRef = cv2.imread(args.ref, cv2.IMREAD_COLOR)

    # Read scanned image to be aligned
    print(f'Reading image to align: {args.scan}')
    imgScan = cv2.imread(args.scan, cv2.IMREAD_COLOR)

    imgAlign, homographyMatrix, imgMatches = alignImages(imgScan, imgRef)
    # print("Estimated homography: \n", homographyMatrix)

    # Save aligned image to disk
    outFilename = "aligned.jpg"
    print("Saving aligned image: ", outFilename)
    cv2.imwrite(outFilename, imgAlign)

    # Stack imgAlign & imgRef
    if args.draw:
        pltImshow('Matches', imgMatches)
        pltImshow('Aligned & Reference', np.hstack((imgAlign, imgRef)))
File renamed without changes.
