Implementing PCA and Dual PCA on CIFAR-10

Submitted by
Style Pass
2024-10-06 15:00:03

This post was adapted from a paper I originally wrote and extended for a school project. The full notebook can be found as a .ipynb file on my GitHub. The post assumes some background knowledge of linear algebra and eigenvalue decomposition. If you don’t have these prerequisites, I highly recommend watching 3Blue1Brown’s playlist on linear algebra.

Principal Component Analysis (PCA) is an important tool in machine learning and data science because it lets you compress data into far fewer dimensions. Later in this post, we’ll see images that originally live in 3072 dimensions represented in just two. Even so, the major patterns and trends are “summarized” into these features well enough that, with only two dimensions, we can still predict what is in an image at better-than-chance accuracy.

The objective of this notebook is to implement Principal Component Analysis (PCA) and Dual PCA on the CIFAR-10 dataset and compare their computational efficiency by measuring the time taken to compute the principal components.
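As a rough sketch of the comparison (this is not the notebook’s code, just an illustration with NumPy): standard PCA eigendecomposes the d × d covariance matrix, while Dual PCA eigendecomposes the n × n Gram matrix and recovers the same principal directions from its eigenvectors. Which is faster depends on whether you have more samples (n) or more features (d) — for CIFAR-10, d = 3072.

```python
import numpy as np

def pca(X, k):
    """Standard PCA: eigendecompose the d x d covariance matrix."""
    Xc = X - X.mean(axis=0)            # center the data, shape (n, d)
    cov = Xc.T @ Xc / Xc.shape[0]      # d x d covariance matrix
    vals, vecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    components = vecs[:, ::-1][:, :k]  # top-k principal directions
    return Xc @ components             # projected data, shape (n, k)

def dual_pca(X, k):
    """Dual PCA: eigendecompose the n x n Gram matrix instead."""
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    gram = Xc @ Xc.T / n               # n x n Gram matrix
    vals, vecs = np.linalg.eigh(gram)
    vals = vals[::-1][:k]              # top-k eigenvalues (same as covariance's)
    vecs = vecs[:, ::-1][:, :k]
    # Recover principal directions: V = X^T U / sqrt(n * lambda)
    components = Xc.T @ vecs / np.sqrt(n * vals)
    return Xc @ components
```

Both functions return the same projection (up to a sign flip per component), but the eigendecomposition costs roughly O(d³) versus O(n³), which is the efficiency trade-off the notebook measures.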