With ever-growing datasets and image resolutions, dimensionality reduction has become increasingly popular in recent years - and for good reason. Efficient dimensionality reduction lets us store more data, avoid overfitting (which occurs easily when dealing with millions of features, as is common with images), and train and run machine learning models with lower training time and inference latency. Singular Value Decomposition (SVD), and specifically its application to low-rank approximation, is a common technique for reducing the dimensionality of images - and a go-to for many practitioners due to its simplicity, ease of implementation, and clear mathematical interpretability. However, SVD is limited in that (1) it can only find linear transformations of the input dataset and (2) its dimension-reduction capabilities are limited. This project exposes the second weakness. Specifically, using a very basic dataset of hand images, it shows that even on simple tasks, SVD fails to offer dimension reductions on par with more sophisticated non-linear machine learning methods such as autoencoders. I show that rank-reduction methods require 7x as much space per image as an autoencoder. I then examine the additional overheads of using an autoencoder and identify use cases where SVD-based reduction remains preferable.
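For reference, the rank-reduction baseline discussed above can be sketched with a truncated SVD. This is a minimal illustration, not the project's actual pipeline; the function name and the toy 64x64 image are placeholders. By the Eckart-Young theorem, the truncated SVD gives the best rank-k approximation in the Frobenius norm, and storing it costs k*(m + n + 1) numbers instead of m*n for the full image:

```python
import numpy as np

def low_rank_approx(image, k):
    """Best rank-k approximation of a 2-D array via truncated SVD."""
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    # Keep only the k largest singular values/vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Toy example: a random 64x64 "image".
rng = np.random.default_rng(0)
img = rng.random((64, 64))
approx = low_rank_approx(img, 8)

# Relative reconstruction error; smaller k saves space but loses detail.
err = np.linalg.norm(img - approx) / np.linalg.norm(img)
```

The space comparison in the abstract follows the same accounting: for a rank-k factorization of an m x n image, the per-image storage is the factors U[:, :k], s[:k], and Vt[:k], whereas an autoencoder stores only the (much smaller) latent code per image and amortizes the decoder weights across the whole dataset.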
akshgarg7 / 104_project
Exposing Weaknesses of SVD in Dimensionality Reduction For Large Datasets