Machine learning algorithm to classify music by metadata
Today, we'll be examining data compiled by a research group known as The Echo Nest. Our goal is to look through this dataset and classify songs as being either 'Hip-Hop' or 'Rock', all without listening to a single one ourselves. In doing so, we will learn how to clean our data, do some exploratory data visualization, and use feature reduction before feeding our data through some simple machine learning algorithms, such as decision trees and logistic regression. The hope is that musical features alone will let us classify genres with acceptable accuracy.
- Load the dataset from two different sources, examine the data, and understand the features.
- Merge them into a single dataset with the required features and target labels, "Hip-Hop"/"Rock".
- Normalize the data using `StandardScaler` to decrease bias.
- Utilize `PCA` to enhance model interpretability and performance.
- Develop Decision Tree and Logistic Regression models to predict the desired class, using undersampling to overcome class imbalances.
- Apply cross-validation to evaluate which model generalizes better for the given dataset.
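
The steps above can be sketched end to end with pandas and scikit-learn. This is a minimal illustration rather than the project's actual code. The two data frames are synthetic stand-ins for the real sources. The column names (`track_id`, `genre_top`, `danceability`, `speechiness`, `tempo`, `energy`), the 400/100 class split, and the choice of three principal components are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two sources: one table of genre labels,
# one table of audio features. All names and values are made up.
n_rock, n_hiphop = 400, 100
n = n_rock + n_hiphop
tracks = pd.DataFrame({
    "track_id": np.arange(n),
    "genre_top": ["Rock"] * n_rock + ["Hip-Hop"] * n_hiphop,
})
is_hiphop = (tracks["genre_top"] == "Hip-Hop").to_numpy()
metrics = pd.DataFrame({
    "track_id": np.arange(n),
    "danceability": rng.normal(0.4, 0.15, n) + 0.25 * is_hiphop,
    "speechiness": rng.normal(0.1, 0.05, n) + 0.15 * is_hiphop,
    "tempo": rng.normal(120, 25, n),
    "energy": rng.normal(0.6, 0.2, n),
})

# Merge features and labels into a single dataset.
data = metrics.merge(tracks, on="track_id")

# Undersample: keep as many rows per genre as the smallest genre has.
n_min = data["genre_top"].value_counts().min()
balanced = pd.concat(
    [group.sample(n=n_min, random_state=0)
     for _, group in data.groupby("genre_top")]
)

X = balanced.drop(columns=["track_id", "genre_top"]).to_numpy()
y = (balanced["genre_top"] == "Rock").astype(int).to_numpy()


def make_model(classifier):
    """Scale, reduce with PCA, then classify. The scaler and PCA are
    re-fitted on the training part of every cross-validation fold."""
    return make_pipeline(StandardScaler(), PCA(n_components=3), classifier)


models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(random_state=0),
}

# 5-fold cross-validation to compare how well each model generalizes.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(make_model(clf), X, y, cv=cv).mean()
    for name, clf in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Wrapping the scaler and PCA in a pipeline is a design choice: it keeps each held-out fold out of the preprocessing fit, so the cross-validation scores are not inflated by leakage.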