Introduction to caret with the Titanic dataset
The caret package, developed by Max Kuhn, takes its name from Classification And REgression Training. It provides a set of functions that streamline the process of creating predictive models, including tools for:
• data splitting
• pre-processing
• feature selection
• model tuning using resampling
• variable importance estimation
In this example I learn the basics of caret by applying a few of its features to the Titanic dataset.
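As a rough illustration of how data splitting, resampling-based tuning, and variable importance fit together, here is a minimal sketch. It rests on several assumptions that are not taken from this project: it uses R's built-in Titanic contingency table (expanded to one row per passenger) rather than any particular CSV file, a decision tree (method = "rpart") as the model, and 5-fold cross-validation. The names titanic_df, train_set, test_set, and ctrl are only illustrative.

```r
library(caret)

set.seed(42)  # make the split and the resampling reproducible

# Assumption: R's built-in Titanic table stands in for the project's data.
# Expand the frequency table to one row per passenger.
counts <- as.data.frame(Titanic)
titanic_df <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

# Data splitting: stratified 80/20 split on the outcome.
in_train <- createDataPartition(titanic_df$Survived, p = 0.8, list = FALSE)
train_set <- titanic_df[in_train, ]
test_set  <- titanic_df[-in_train, ]

# Model tuning using resampling: 5-fold cross-validation over the
# complexity parameter of an rpart decision tree (an illustrative choice).
ctrl <- trainControl(method = "cv", number = 5)
fit <- train(Survived ~ ., data = train_set, method = "rpart",
             trControl = ctrl)

# Variable importance estimation for the fitted model.
importance <- varImp(fit)

# Accuracy on the held-out rows.
pred <- predict(fit, newdata = test_set)
accuracy <- mean(pred == test_set$Survived)
```

Printing fit shows the cross-validated accuracy for each candidate tuning value, and print(importance) lists the predictors ranked by importance. Any other model supported by train() could be swapped in by changing the method argument.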
Data Pre-Processing with Caret
The following transforms are applied to the iris and PimaIndiansDiabetes datasets:
• Data scaling
• Data centering
• Data standardization
• Data normalization
• The Box-Cox Transform
• The Yeo-Johnson Transform
• PCA (Principal Component Analysis) Transform
• ICA (Independent Component Analysis) Transform
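Several of the transforms listed above can be sketched with caret's preProcess() function. This is a minimal example on the numeric columns of iris, not the project's actual script; the variable names (features, standardized, and so on) are illustrative. The general pattern is to estimate the transform with preProcess() and then apply it with predict().

```r
library(caret)

# Numeric predictors only; the Species column is left out.
features <- iris[, 1:4]

# Standardization: centering (subtract the mean) combined with
# scaling (divide by the standard deviation).
pp_standardize <- preProcess(features, method = c("center", "scale"))
standardized <- predict(pp_standardize, features)

# Normalization: rescale every column to the [0, 1] interval.
pp_range <- preProcess(features, method = "range")
normalized <- predict(pp_range, features)

# PCA: caret centers and scales the data first, then keeps enough
# components to reach its default variance threshold.
pp_pca <- preProcess(features, method = "pca")
pca_scores <- predict(pp_pca, features)

summary(standardized)
summary(normalized)
head(pca_scores)
```

The remaining transforms follow the same pattern with a different method value: "BoxCox" (which needs strictly positive data), "YeoJohnson", and "ica" (which also needs the number of components via n.comp and depends on the fastICA package).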
References
• https://github.com/datasciencedojo/meetup/blob/master/intro_to_ml_with_r_and_caret/IntroToMachineLearning.R
• https://github.com/topepo/caret/tree/master/pkg/caret/R
• https://machinelearningmastery.com/pre-process-your-dataset-in-r