Content Enhancement : KNN #14

Open
AM1CODES opened this issue Apr 3, 2021 · 2 comments

Comments

@AM1CODES
Member

AM1CODES commented Apr 3, 2021

The below-mentioned additions are required -

  • Math - We need to add a solved example to show the working of the algorithm. To be precise, a pen-and-paper example of the math behind the algorithm, with the help of images, like we have used for the other algorithms.
@sudharsansrini

sudharsansrini commented Aug 31, 2021

KNN - K Nearest Neighbors

The KNN algorithm is mainly used for classifying and predicting values in Machine Learning. We can use it for supervised (labeled) Machine Learning problems such as classification and regression.

KNN
K - the number of neighbors to consider
NN - Nearest Neighbors

Put simply, KNN finds the K nearest neighbors of a data point and assigns the point the class that occurs most often among those neighbors.

[Image: example of classifying a data point by its K nearest neighbors]
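As a rough illustration, here is a minimal from-scratch sketch of that idea in Python (the function name and sample data are made up for this example):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Euclidean distance from the new point to every training point
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbors' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1, 1], [1, 2], [5, 5], [6, 5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.5, 1.5]), k=3))  # -> 0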

We can choose the value of K by computing the mean error rate for each candidate K and plotting the results as a graph.
A good K is a value where the mean error rate is lowest.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# Mean error rate on the test set for each candidate K
error = []
for i in range(1, 40):
    model = KNeighborsClassifier(n_neighbors=i)
    model.fit(x_train, y_train)
    pred_i = model.predict(x_test)
    error.append(np.mean(pred_i != y_test))

plt.figure(figsize=(15, 15))
plt.plot(range(1, 40), error, color="red", linestyle="dashed",
         marker="o", markerfacecolor="blue", markersize=10)
plt.title('Error Rate vs. K Value')
plt.xlabel('K value')
plt.ylabel('Mean Error')
plt.show()
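One way to pick K from this curve programmatically (a small addition, assuming the error list from the snippet above) is:

best_k = int(np.argmin(error)) + 1  # +1 because the K values start at 1
print("Best K:", best_k)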

Finding K -> deciding how many neighbors we have to consider around the data point.

[Figure: mean error rate plotted against K value]

Choose the value of K with the lowest error rate.
After choosing the K value, you may have a question:
How does it find the K nearest points?

In KNN, the distance between data points is calculated using the Minkowski distance.

D(x, y) = (Σᵢ |xᵢ − yᵢ|^p)^(1/p)

Based on the Minkowski distance metric, we classify the data points.
p = 1: Manhattan distance
p = 2: Euclidean distance
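As a minimal sketch (NumPy only; the vectors here are made-up sample values), the Minkowski distance for different p values could be computed like this:

import numpy as np

def minkowski(x, y, p):
    # (sum of |x_i - y_i|^p) ^ (1/p)
    return np.sum(np.abs(x - y) ** p) ** (1 / p)

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])
print(minkowski(x, y, 1))  # Manhattan: 5.0
print(minkowski(x, y, 2))  # Euclidean: sqrt(13) ≈ 3.61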

Manhattan Distance

d(x, y) = Σᵢ |xᵢ − yᵢ|

This is the distance between real-valued vectors, computed as the sum of their absolute differences.
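Using the minkowski sketch above with p = 1 gives exactly this; equivalently, in NumPy: np.sum(np.abs(x - y)).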

Euclidean Distance

d(x, y) = √(Σᵢ (xᵢ − yᵢ)²)

The Euclidean distance is calculated as the square root of the sum of the squared differences between a new point x and an existing point y.
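This is the Minkowski sketch with p = 2; equivalently, in NumPy: np.sqrt(np.sum((x - y) ** 2)).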

Hamming Distance

d(x, y) = Σᵢ [xᵢ ≠ yᵢ]

It is used for categorical variables. If the value x and the value y are the same, the distance d is 0; otherwise, the distance is 1. Over a vector of categorical features, the contributions of the individual positions are summed.
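A quick sketch with made-up categorical values:

import numpy as np

x = np.array(["red", "green", "blue"])
y = np.array(["red", "blue", "blue"])
# Count the positions where the categorical values differ
print(np.sum(x != y))  # 1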

The above metrics are used to compute the distances and thereby find the K closest neighbors of a particular data point.
From the classes of those neighbors, we classify the data point with the majority class.
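Putting it together with the earlier scikit-learn snippet (assuming x_train, y_train, x_test, and best_k from above):

from sklearn.neighbors import KNeighborsClassifier

# Fit the final model with the chosen K and predict on the test set
final_model = KNeighborsClassifier(n_neighbors=best_k)
final_model.fit(x_train, y_train)
predictions = final_model.predict(x_test)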

Thank you!

@AM1CODES
Member Author

Hey @sudharsansrini, the content that you mentioned above looks really amazing, and we could incorporate it into the main platform. I am pretty sure that writing this took a good amount of time, and I request you to make a PR for the same, because otherwise it won't get counted as your contribution. Feel free to join our Discord if you need any help with how to contribute and such.
