The analysis file performs the following operations, in order:
- Read in the training data sets (X_train, y_train and subjects_train)
- Read in the test data sets (X_test, y_test and subjects_test)
- Combine the training and test data sets (data, activity_labels and subjects)
- Add the subject IDs (subjects) and activity labels (activity_labels) to the merged data
- Extract the proper feature names from the features.txt file and assign them to the columns of the merged data
- Create a logical variable that is true if the feature name contains either mean() or std()
- Use this logical variable to select only those columns of the data table that hold mean() and std() variables
- Replace the activity ID numbers with their corresponding activity names (from the activity_labels.txt file)
- Clean up the column names of the data table, replacing all instances of '-' with '.' and all instances of '()' with '' (empty string)
- Order the data table according to Subject IDs (ascending order)
- Split the data into a list according to the Subject ID and activity label
- Calculate the mean value for all features for each combination of Subject ID and activity label
- Reconstruct a table by rowbinding the vectors of the average values
- Recreate the Subject ID and Activity columns and add them to the table of average values
- Move the Subject ID and Activity columns to the front of the data table
- Save the final tidy data table to a txt file (tidy_data.txt)
- ???
- PROFIT!
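
The core of the steps above (selecting the mean() and std() features, cleaning the column names, and averaging per Subject ID and activity) can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the analysis file itself: the feature names and values are made-up toy data, and the names `rows`, `keep`, `clean_name` and `tidy` are hypothetical. Unlike the steps above, the sketch sorts by activity as well as Subject ID so the output order is fully determined.

```python
from collections import defaultdict
from statistics import mean

# Toy stand-ins for the merged table; names and values are made up for illustration.
feature_names = ["tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-mad()-X"]
rows = [
    # (subject ID, activity name, feature values in feature_names order)
    (2, "WALKING", [0.30, 0.10, 9.0]),
    (1, "WALKING", [0.20, 0.40, 9.0]),
    (1, "WALKING", [0.40, 0.60, 9.0]),
    (1, "SITTING", [0.10, 0.20, 9.0]),
]

# Logical mask: True where the feature name contains mean() or std().
keep = [("mean()" in name) or ("std()" in name) for name in feature_names]


def clean_name(name):
    """Replace '-' with '.' and remove '()' from a feature name."""
    return name.replace("-", ".").replace("()", "")


# Cleaned names of the features that survive the mask.
columns = [clean_name(n) for n, k in zip(feature_names, keep) if k]

# Split the kept feature values by (subject ID, activity) combination.
groups = defaultdict(list)
for subject, activity, values in rows:
    kept = [v for v, k in zip(values, keep) if k]
    groups[(subject, activity)].append(kept)

# Build one row of averages per combination, with the key columns in front.
# sorted() orders by subject ID first, then by activity name.
tidy = [
    [subject, activity] + [mean(col) for col in zip(*group_rows)]
    for (subject, activity), group_rows in sorted(groups.items())
]
header = ["Subject", "Activity"] + columns

# Print the result as space-separated text, one row per combination.
print(" ".join(header))
for row in tidy:
    print(" ".join(str(value) for value in row))
```

Here `zip(*group_rows)` transposes the rows of one group into per-feature columns, so each `mean(col)` is the average of one feature over every observation for that Subject ID and activity.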