...assification-Data-driven Approach, k-Nearest Neighbor, train_val_test splits.md

# Image Classification-Data-driven Approach, k-Nearest Neighbor, train_val_test splits

## Image classification

- challenges
    - viewpoint variation
    - scale variation
    - deformation
    - occlusion
    - illumination conditions
    - background clutter
    - intra-class variation
- data-driven approach
- the image classification pipeline
    - input
    - learning
        - training a classifier
        - learning a model
    - evaluation
## Nearest Neighbor Classifier

$$
d_1 (I_1, I_2) = \sum_{p} \left| I^p_1 - I^p_2 \right|
$$
```python
import numpy as np

class NearestNeighbor(object):
    def __init__(self):
        pass

    def train(self, X, y):
        """ X is N x D where each row is an example. Y is 1-dimension of size N """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """ X is N x D where each row is an example we wish to predict label for """
        num_test = X.shape[0]
        # let's make sure that the output type matches the input type
        Ypred = np.zeros(num_test, dtype = self.ytr.dtype)

        # loop over all test rows
        for i in range(num_test):
            # find the nearest training image to the i'th test image
            # using the L1 distance (sum of absolute value differences)
            distances = np.sum(np.abs(self.Xtr - X[i,:]), axis = 1)
            min_index = np.argmin(distances)  # get the index with smallest distance
            Ypred[i] = self.ytr[min_index]    # predict the label of the nearest example

        return Ypred
```
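
A toy usage sketch with random data (the array shapes here are arbitrary illustrations, not from the original notes):

```python
import numpy as np

Xtr = np.random.randn(50, 3072)            # hypothetical training set, 50 examples
ytr = np.random.randint(0, 10, size=50)    # hypothetical integer labels
Xte = np.random.randn(10, 3072)            # hypothetical test set

nn = NearestNeighbor()
nn.train(Xtr, ytr)
print(nn.predict(Xte))  # one predicted label per test row
```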

$$
d_2 (I_1, I_2) = \sqrt{\sum_{p} \left( I^p_1 - I^p_2 \right)^2}
$$

```python
distances = np.sqrt(np.sum(np.square(self.Xtr - X[i,:]), axis = 1))
```

## k-Nearest Neighbor Classifier

![[Pasted image 20241031202452.jpg]]
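
A minimal sketch of the k-NN extension, assuming the same `Xtr`/`ytr` training arrays as the class above (the helper name `knn_predict` is hypothetical): instead of copying the single nearest label, take a majority vote among the k closest training examples.

```python
import numpy as np

def knn_predict(Xtr, ytr, X, k=5):
    """ Predict a label for each row of X by majority vote among
    the k nearest training examples under the L1 distance. """
    num_test = X.shape[0]
    Ypred = np.zeros(num_test, dtype=ytr.dtype)
    for i in range(num_test):
        distances = np.sum(np.abs(Xtr - X[i, :]), axis=1)
        nearest = np.argsort(distances)[:k]   # indices of the k closest training rows
        labels, counts = np.unique(ytr[nearest], return_counts=True)
        Ypred[i] = labels[np.argmax(counts)]  # most common label among the neighbors
    return Ypred
```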

## Validation sets for Hyperparameter tuning

> Evaluate on the test set only a single time, at the very end.
> Split your training set into a training set and a validation set. Use the validation set to tune all hyperparameters. At the end, run a single time on the test set and report performance.

- cross-validation
- single validation split

![[Pasted image 20241031202849.png]]
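
A sketch of a cross-validation loop for tuning k, assuming the hypothetical `knn_predict` helper and `Xtr`/`ytr` arrays from above; a single validation split is the special case of holding out one fold once:

```python
import numpy as np

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 20, 50, 100]

X_folds = np.array_split(Xtr, num_folds)
y_folds = np.array_split(ytr, num_folds)

k_to_accuracies = {}
for k in k_choices:
    accs = []
    for fold in range(num_folds):
        X_val, y_val = X_folds[fold], y_folds[fold]
        X_train = np.concatenate(X_folds[:fold] + X_folds[fold + 1:])
        y_train = np.concatenate(y_folds[:fold] + y_folds[fold + 1:])
        y_pred = knn_predict(X_train, y_train, X_val, k=k)
        accs.append(np.mean(y_pred == y_val))  # accuracy on this fold
    k_to_accuracies[k] = accs

best_k = max(k_choices, key=lambda k: np.mean(k_to_accuracies[k]))
```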

## Pros and Cons of Nearest Neighbor classifier

- simple to implement and understand
- takes no time to train
- however, it pays a cost at test time

> As an aside, the computational complexity of the Nearest Neighbor classifier is an active area of research, and several **Approximate Nearest Neighbor** (ANN) algorithms and libraries exist that can accelerate the nearest neighbor lookup in a dataset (e.g. [FLANN](https://github.com/mariusmuja/flann)). These algorithms allow one to trade off the correctness of the nearest neighbor retrieval with its space/time complexity during retrieval, and usually rely on a pre-processing/indexing stage that involves building a kdtree, or running the k-means algorithm.

- $\displaystyle L_{2}$ is not sensitive enough: distances in raw pixel space track color and background far more than semantic content

> In particular, note that images that are nearby each other are much more a function of the general color distribution of the images, or the type of background rather than their semantic identity.

> [!note]+ Applying kNN in practice
> 1. Preprocess your data: Normalize the features in your data (e.g. one pixel in images) to have zero mean and unit variance. We will cover this in more detail in later sections, and chose not to cover data normalization in this section because pixels in images are usually homogeneous and do not exhibit widely different distributions, alleviating the need for data normalization.
> 2. If your data is very high-dimensional, consider using a dimensionality reduction technique such as PCA ([wiki ref](https://en.wikipedia.org/wiki/Principal_component_analysis), [CS229ref](http://cs229.stanford.edu/notes/cs229-notes10.pdf), [blog ref](https://web.archive.org/web/20150503165118/http://www.bigdataexaminer.com:80/understanding-dimensionality-reduction-principal-component-analysis-and-singular-value-decomposition/)), NCA ([wiki ref](https://en.wikipedia.org/wiki/Neighbourhood_components_analysis), [blog ref](https://kevinzakka.github.io/2020/02/10/nca/)), or even [Random Projections](https://scikit-learn.org/stable/modules/random_projection.html).
> 3. Split your training data randomly into train/val splits. As a rule of thumb, between 70-90% of your data usually goes to the train split. This setting depends on how many hyperparameters you have and how much of an influence you expect them to have. If there are many hyperparameters to estimate, you should err on the side of having a larger validation set to estimate them effectively. If you are concerned about the size of your validation data, it is best to split the training data into folds and perform cross-validation. If you can afford the computational budget it is always safer to go with cross-validation (the more folds the better, but more expensive).
> 4. Train and evaluate the kNN classifier on the validation data (for all folds, if doing cross-validation) for many choices of **k** (e.g. the more the better) and across different distance types (L1 and L2 are good candidates).
> 5. If your kNN classifier is running too long, consider using an Approximate Nearest Neighbor library (e.g. [FLANN](https://github.com/mariusmuja/flann)) to accelerate the retrieval (at cost of some accuracy).
> 6. Take note of the hyperparameters that gave the best results. There is a question of whether you should use the full training set with the best hyperparameters, since the optimal hyperparameters might change if you were to fold the validation data into your training set (since the size of the data would be larger). In practice it is cleaner to not use the validation data in the final classifier and consider it to be _burned_ on estimating the hyperparameters. Evaluate the best model on the test set. Report the test set accuracy and declare the result to be the performance of the kNN classifier on your data.
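
For step 1, a minimal sketch of the normalization, assuming hypothetical train/test matrices `Xtr` and `Xte` of shape (N, D); the statistics come from the training split only:

```python
import numpy as np

mean = Xtr.mean(axis=0)
std = Xtr.std(axis=0) + 1e-8   # guard against division by zero for constant features
Xtr_norm = (Xtr - mean) / std  # zero mean, unit variance per feature
Xte_norm = (Xte - mean) / std  # reuse the training statistics on the test data
```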

[[more about Machine Learing]]

docs/AI/CS231n/Linear classification-Support Vector Machine, Softmax.md

# Linear classification-Support Vector Machine, Softmax

## Linear Classification

$$
L_i = \sum_{j\neq y_i} \max(0, s_j - s_{y_i} + \Delta)
$$

$$
L = \frac{1}{N} \sum_i \sum_{j\neq y_i} \left[ \max(0, f(x_i; W)_j - f(x_i; W)_{y_i} + \Delta) \right] + \lambda \sum_k\sum_l W_{k,l}^2
$$
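
A vectorized numpy sketch of this full loss (the function name `svm_loss` and the shapes are assumptions for illustration: `X` is N x D, `W` is D x C, `y` holds N integer labels):

```python
import numpy as np

def svm_loss(W, X, y, delta=1.0, lam=1e-4):
    """ Multiclass SVM loss over a batch, plus L2 regularization. """
    N = X.shape[0]
    scores = X.dot(W)                                  # f(x_i; W) for all examples, shape (N, C)
    correct = scores[np.arange(N), y][:, None]         # score of the correct class, shape (N, 1)
    margins = np.maximum(0, scores - correct + delta)  # hinge margin for every class
    margins[np.arange(N), y] = 0                       # the sum skips j == y_i
    return margins.sum() / N + lam * np.sum(W * W)
```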

$$
L_i = -\log\left(\frac{e^{f_{y_i}}}{ \sum_j e^{f_j} }\right) \hspace{0.5in} \text{or equivalently} \hspace{0.5in} L_i = -f_{y_i} + \log\sum_j e^{f_j}
$$

$$
\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}
= \frac{Ce^{f_{y_i}}}{C\sum_j e^{f_j}}
= \frac{e^{f_{y_i} + \log C}}{\sum_j e^{f_j + \log C}}
$$
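
A common choice is $\log C = -\max_j f_j$, which shifts the scores so the largest is zero and keeps the exponentials from overflowing. A minimal numpy sketch under the same hypothetical shapes as the SVM sketch above (`softmax_loss` is an illustrative name):

```python
import numpy as np

def softmax_loss(W, X, y):
    """ Average cross-entropy loss with the max-subtraction stability trick. """
    N = X.shape[0]
    f = X.dot(W)                        # unnormalized scores, shape (N, C)
    f -= f.max(axis=1, keepdims=True)   # add log C = -max_j f_j; the loss is unchanged
    log_probs = f - np.log(np.exp(f).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(N), y].mean()
```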

![[Pasted image 20241031210509.png]]

docs/AI/CS231n/Python Numpy.md

## Python

### string

```python
s = "hello"
print(s.capitalize())  # Capitalize a string; prints "Hello"
print(s.upper())       # Convert a string to uppercase; prints "HELLO"
print(s.rjust(7))      # Right-justify a string, padding with spaces; prints "  hello"
print(s.center(7))     # Center a string, padding with spaces; prints " hello "
print(s.replace('l', '(ell)'))  # Replace all instances of one substring with another;
                                # prints "he(ell)(ell)o"
print(' world '.strip())  # Strip leading and trailing whitespace; prints "world"
```

### Containers

```python
animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line
```

List comprehensions:

```python
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares)  # Prints "[0, 4, 16]"
```

There are dictionary comprehensions as well.

Tuples can be used as dictionary keys and as elements of sets, but lists cannot.
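
For instance, a short illustration of both points:

```python
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)  # Prints "{0: 0, 2: 4, 4: 16}"

d = {(x, x + 1): x for x in range(3)}  # Dictionary with tuple keys
print(d[(1, 2)])  # Prints "1"
# d[[1, 2]] would raise a TypeError: lists are not hashable
```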

## Numpy

```python
import numpy as np

# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

print(a)  # prints "array([[ 1,  2,  3],
          #                [ 4,  5,  6],
          #                [ 7,  8,  9],
          #                [10, 11, 12]])"

# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print(a[np.arange(4), b])  # Prints "[ 1  6  7 11]"

# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10

print(a)  # prints "array([[11,  2,  3],
          #                [ 4,  5, 16],
          #                [17,  8,  9],
          #                [10, 21, 12]])"
```

```python
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print(bool_idx)  # Prints "[[False False]
                 #          [ True  True]
                 #          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print(a[a > 2])  # Prints "[3 4 5 6]"
```

```python
x = np.array([1, 2], dtype=np.int64)  # Force a particular datatype
print(x.dtype)  # Prints "int64"
```

```python
import numpy as np

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))
```

Broadcasting lets you avoid explicit loops:

```python
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"
```

## SciPy

## Matplotlib

# 2024 Week 44 Summary

October 28 - November 03

- Summary
    - CS231n
        - Watched lectures 01-04
        - Finished [Assignment1](https://github.com/WncFht/CS231n/tree/main/assignment1)
        - Took some [notes](../AI/CS231n/Python%20Numpy.md)
        - Mainly I am still not fluent with numpy and the like; vectorizing is still somewhat hard. But having done some EECS498 assignments before, it was manageable.
    - Spent the two weekend days on a physics competition
        - Got more practiced with LaTeX, Python, and making figures in draw.io
        - The paper is [here](351A.pdf)
- Plans
    - Midterm exams next week; review for them
    - Took on a small task for RM: recognize a digit at a fixed position in every frame of a video, then interpolate and integrate it
        - On November 4, used preprocessing + tesseract OCR
        - Here is the [first version](https://github.com/WncFht/Power-OCR-Video-Analyzer)
        - Plan to label some data and try training a small model
        - Well, I tried; the results were poor. A small CNN could not recognize anything.
    - In RM I was assigned to the automated ore-exchange vehicle group, which requires 6D pose recognition + robotic arm motion planning
        - A concrete plan and project proposal are due within the next week
        - Searching in two directions, depth cameras and laser cameras; need to read some papers
    - Continue studying CS231n
        - Plan to finish the notes and solutions for Assignment 1
        - Finish Assignment 2


# 2024 Week 45 Summary

November 04 - November 10

- Summary
    - CS231n
        - Reviewed the Assignment 1 code and organized its framework
        - Watched some of Hung-yi Lee's deep learning lectures
        - Revisited Dive into Deep Learning; the code still seems to call many high-level APIs; for example, backward is not implemented from scratch.
    - Midterm exams are over
    - Built a robot for the RM on-campus competition and implemented chassis motion (mecanum-wheel kinematics code + chassis assembly). Iterated through three versions of the bumper. Passed the mid-term check. (This really eats time, though I learned electrical control and SolidWorks.)
    - RM project
        - Surveyed models for 6DoF pose estimation. Ruled out all Transformer-based models (too many parameters and too compute-hungry to finish within 1 s on an Orin NX).
        - Roughly settled on FFB6D, which improves on Fei-Fei Li's earlier DenseFusion: a model that uses RGB and depth (D) information for feature extraction + fusion + voting.
        - Made a rough attempt at deployment, but ran into many environment problems.
    - Last week's **power recognition** task is done; in the end it uses preprocessing + tesseract OCR, but recognition of the red digits is not great (the video itself is quite blurry). Also learned a bit of PR along the way.
        - The code is [here](https://github.com/WncFht/Power-OCR-Video-Analyzer)
- Plans
    - Reproduce FFB6D and build a dataset.
        - Learn Docker and the like along the way.
    - Finish Assignment 2; last week's goal fell through :( (really out of time)
    - RM on-campus competition: finish the drawings by Wednesday and send them to the manufacturer for machining. Implement ball pick-up + ball-throwing.
    - Improve the power recognition task; the traditional method's accuracy doesn't seem sufficient.
    - Fix my sleep schedule!