Performing Principal Component Analysis (PCA) - Lab

Introduction

Now that you have a high-level overview of PCA, as well as some of the details of the algorithm itself, it's time to practice implementing PCA on your own using the NumPy package.

Objectives

You will be able to:

Implement PCA from scratch using NumPy

Import the data

Import the data stored in the file 'foodusa.csv' (set index_col=0)
Print the first five rows of the DataFrame

import pandas as pd
data = None
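As a rough illustration of what `index_col=0` does, the sketch below reads a tiny in-memory CSV. The column names and values are made up for this example and are not the contents of 'foodusa.csv'; `csv_text` and `toy` are hypothetical names.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file; values are invented for illustration only
csv_text = """City,Bread,Milk
A,1.0,2.0
B,3.0,4.0
C,5.0,9.0
"""

# index_col=0 uses the first column as the row index instead of a data column
toy = pd.read_csv(io.StringIO(csv_text), index_col=0)

# head() shows the first five rows (here, all three)
print(toy.head())
```

With the first column used as the index, only the numeric columns remain as data, which is convenient for the matrix operations that follow.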

Normalize the data

Next, normalize your data by subtracting each column's mean from that column (mean-centering).

data = None
data.head()
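One possible way to mean-center a DataFrame is sketched here on a small made-up table; `toy` and `centered` are hypothetical names, and the numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Small made-up table, used only to illustrate the operation
toy = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 60.0]})

# toy.mean() returns one mean per column; subtraction aligns on column labels,
# so each column has its own mean removed
centered = toy - toy.mean()

print(centered)
```

After this step every column should average to zero, which you can check with `centered.mean()`.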

Calculate the covariance matrix

The next step is to calculate the covariance matrix for your normalized data.

cov_mat = None
cov_mat
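For mean-centered data with n rows, the sample covariance matrix is the product of the transposed data with itself, divided by n - 1. The sketch below, on random synthetic data (an assumption, not the lab dataset), compares that formula with `np.cov`.

```python
import numpy as np

# Synthetic data: 6 observations (rows) of 3 features (columns)
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

# Mean-center each column
Xc = X - X.mean(axis=0)

# Sample covariance computed directly from the definition
cov_manual = Xc.T @ Xc / (Xc.shape[0] - 1)

# rowvar=False tells NumPy that columns are variables and rows are observations
cov_np = np.cov(Xc, rowvar=False)

print(np.allclose(cov_manual, cov_np))
```

If your data is a pandas DataFrame, its `.cov()` method computes the same sample covariance between columns.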

Calculate the eigenvectors

Next, calculate the eigenvectors and eigenvalues for your covariance matrix.

import numpy as np
eig_values, eig_vectors = None, None
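As a minimal sketch of the eigendecomposition step, the example below applies `np.linalg.eig` to a small symmetric matrix chosen for illustration (it is not the lab's covariance matrix) and checks the defining relation A v = λ v for each eigenpair.

```python
import numpy as np

# Small symmetric matrix standing in for a covariance matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and a matrix whose COLUMNS are the eigenvectors
vals, vecs = np.linalg.eig(A)

# Column i of vecs pairs with vals[i]; broadcasting scales each column by its eigenvalue
print(np.allclose(A @ vecs, vecs * vals))
```

Because a covariance matrix is symmetric, `np.linalg.eigh` is a reasonable alternative: it guarantees real results and returns the eigenvalues in ascending order.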

Sort the eigenvectors

Great! Now that you have the eigenvectors and their associated eigenvalues, sort the eigenvectors by their eigenvalues in descending order to determine the principal components!

# Get the index values of the sorted eigenvalues
e_indices = None

# Sort the eigenvectors (columns) by descending eigenvalue
eigenvectors_sorted = None
eigenvectors_sorted
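The sorting step can be sketched with `np.argsort`. The eigenvalues and the identity matrix of "eigenvectors" below are made-up stand-ins, chosen so the reordering is easy to see.

```python
import numpy as np

# Made-up eigenvalues, with identity columns standing in for eigenvectors
vals = np.array([0.5, 3.0, 1.2])
vecs = np.eye(3)

# argsort gives ascending order; reversing it gives descending order
order = np.argsort(vals)[::-1]

# Eigenvectors are stored as columns, so reorder the columns, not the rows
vecs_sorted = vecs[:, order]

print(order)
print(vecs_sorted)
```

Indexing the columns (rather than the rows) matters here, because each column is one eigenvector and must stay intact.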

Reprojecting the data

Finally, reproject the dataset onto the two eigenvectors with the largest eigenvalues, reducing it to 2 dimensions.
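The sketch below runs the projection on random synthetic data rather than the lab dataset. It uses `np.linalg.eigh` (suitable for symmetric matrices) as a swapped-in alternative to `np.linalg.eig`; all variable names are hypothetical.

```python
import numpy as np

# Synthetic correlated data: 50 observations of 4 features
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))

# Center, then compute the covariance matrix
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order
vals, vecs = np.linalg.eigh(cov)

# Keep the two eigenvectors (columns) with the largest eigenvalues
order = np.argsort(vals)[::-1]
W = vecs[:, order[:2]]

# Project the centered data onto those two directions
projected = Xc @ W

print(projected.shape)
```

A useful sanity check is that the covariance of `projected` is approximately diagonal, with the two largest eigenvalues on the diagonal.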

Summary

Well done! You've now coded PCA on your own using NumPy! With that, it's time to look at further applications of PCA.
