The STP (segmentation, targeting, positioning) framework of marketing is the backbone of this machine learning project. Segmentation is done with unsupervised learning algorithms, and supervised learning then identifies the most promising segments for targeting and positioning products in the future.
For segmentation, hierarchical clustering is performed first. It returns a linkage matrix, which is then plotted as a dendrogram.
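A minimal sketch of this step, assuming SciPy's hierarchical clustering routines. The synthetic data, the standardization step, and the choice of Ward linkage are all assumptions made for illustration; the project's actual data and linkage method are not stated here.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer data (assumption): 60 rows, 4 numeric features.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 4))

# Standardize so that no single feature dominates the distance computation (assumption).
X_std = StandardScaler().fit_transform(X)

# Ward linkage is an assumption; any method supported by linkage() could be used instead.
Z = linkage(X_std, method="ward")

# Each of the n - 1 rows of Z records one merge:
# [cluster_a, cluster_b, merge_distance, size_of_new_cluster].
# no_plot=True only computes the tree layout; drop it to draw the tree with matplotlib.
tree = dendrogram(Z, truncate_mode="level", p=5, no_plot=True)
```

Long vertical gaps between merges in the dendrogram suggest a natural place to cut the tree, which gives a first guess for the number of segments.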
Next, perform K-means clustering. I consider 1 to 10 clusters, so the for loop runs 10 iterations. In addition, the starting centroids are chosen with k-means++ (init='k-means++'), which spreads the initial centroids apart instead of picking them purely at random.
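The loop could look roughly like the sketch below, assuming scikit-learn's KMeans. The synthetic blobs, `n_init=10`, and `random_state=42` are assumptions for illustration. The within-cluster sum of squares (WCSS) is collected for each candidate number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans

# Three well-separated synthetic blobs (assumption, not the project's data).
rng = np.random.default_rng(42)
X_std = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (-5, 0, 5)]
)

wcss = []
for k in range(1, 11):  # 1 to 10 clusters, i.e. 10 iterations
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    km.fit(X_std)
    # inertia_ is the sum of squared distances of samples to their closest centroid.
    wcss.append(km.inertia_)
```

Here `n_init=10` is what reruns the algorithm from several different starting configurations, while `init="k-means++"` controls how each starting configuration is drawn.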
Next, I determine the optimal number of clusters.
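How the number is picked is not spelled out here. One common option, shown purely as an assumption, is to compare silhouette scores across candidate values of k and keep the best one. The data below is synthetic, and the silhouette criterion is a swapped-in technique rather than the method confirmed for this project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Three well-separated synthetic blobs (assumption, not the project's data).
rng = np.random.default_rng(42)
X_std = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (-5, 0, 5)]
)

scores = {}
for k in range(2, 11):  # the silhouette score is undefined for a single cluster
    labels = KMeans(
        n_clusters=k, init="k-means++", n_init=10, random_state=42
    ).fit_predict(X_std)
    scores[k] = silhouette_score(X_std, labels)

# Keep the k with the highest average silhouette.
best_k = max(scores, key=scores.get)
```

On real customer data the scores are rarely this clear-cut, so it is worth checking the result against the dendrogram and against whether the segments make business sense.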
Now I create a new data frame that holds the original features plus the PCA scores and the assigned clusters. PCA is used to reduce the dimensionality of the data by condensing correlated variables into a few components.
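A sketch of how such a data frame could be assembled with pandas and scikit-learn. The column names, the synthetic data, the three components, the four clusters, and the choice to cluster on the PCA scores are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical feature names and synthetic values (assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(100, 5)),
    columns=["age", "income", "education", "occupation", "settlement_size"],
)

# Standardize, then project onto the first three principal components.
X_std = StandardScaler().fit_transform(df)
pca = PCA(n_components=3)
pca_scores = pca.fit_transform(X_std)

# Cluster on the PCA scores (assumption); 4 clusters is an arbitrary choice here.
kmeans = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(pca_scores)

# Original features + PCA scores + assigned cluster, side by side.
df_segments = pd.concat(
    [
        df.reset_index(drop=True),
        pd.DataFrame(pca_scores, columns=["pc_1", "pc_2", "pc_3"]),
    ],
    axis=1,
)
df_segments["segment"] = labels
```

Inspecting `pca.explained_variance_ratio_` shows how much of the original variance the retained components keep, which helps decide how many components to use.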
Let's now analyze the segment proportions.
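One simple way to compute the proportions, assuming the cluster labels sit in a column named `segment` (a hypothetical name) of a pandas data frame. The labels below are made up for illustration.

```python
import pandas as pd

# Hypothetical segment labels for ten customers (assumption).
df_segments = pd.DataFrame({"segment": [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]})

# Share of observations in each segment, ordered by segment label.
proportions = (
    df_segments["segment"].value_counts(normalize=True).sort_index()
)
```

The resulting series sums to 1, so it can be passed directly to a bar or pie chart to compare segment sizes.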