### Description
Instructions were to create an R Script to perform the following:
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set.
- Appropriately labels the data set with descriptive variable names.
- Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

The result is an R script named run_analysis.R. Running it writes two files to the output directory: "tidy_output.txt" and "tidy_mean_output.txt".

### How run_analysis.R works

The run_analysis.R file contains comments that describe what each line of code does. Here is an overview of what happens when the file runs:
- Loads the libraries needed to process the data: plyr and reshape2.
- Sets global variables that point to the input data set and to the directory where the output files are written. Note: if you change the file structure between runs of run_analysis.R, you will need to update these constants.
- Imports the activityLabels data and the Features data into R data structures.
- Imports the test and train data into R data frames (data.frame). This includes importing the subject data and merging it with the X and Y data.
- Cleans up the feature names so that they can be referenced as column names in R code.
- Merges the test and train data by subject_id. This data object is eventually exported to the tidy_output.txt file.
- Removes the measurement columns that aren't means or standard deviations.
- Sets the column names to the filtered feature names imported in the third step above.
- Merges the activity labels with the primary data object that contains the train and test data, so that the activities are easier to understand.
- Generates the mean version of the output, which is written to tidy_mean_output.txt. This takes the data destined for tidy_output.txt, groups it by activity_name and subject_id, and computes the mean of every measurement within each group.
- Final step: writes the R data frames to their output text files.
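
The core of the steps above can be sketched on a toy data set. This is a minimal illustration, not the code in run_analysis.R: it uses base R (`rbind`, `merge`, `aggregate`) instead of plyr and reshape2, and every feature name, value, and helper (`make_set`, `measure_cols`) is a made-up assumption. Only `subject_id`, `activity_name`, and the output file name come from the description above.

```r
# Hypothetical stand-ins for the Features and activityLabels inputs.
feature_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- data.frame(activity_id = 1:2,
                              activity_name = c("WALKING", "SITTING"))

# Hypothetical helper: builds one set (train or test) with subject and
# activity columns attached to the measurement columns.
make_set <- function(subject_id, activity_id, values) {
  x <- as.data.frame(matrix(values, nrow = length(subject_id)))
  names(x) <- feature_names
  cbind(subject_id = subject_id, activity_id = activity_id, x)
}

# matrix() fills column by column: mean values, then std, then max.
train <- make_set(c(1, 1, 2), c(1, 1, 2), c(1, 3, 5, 2, 4, 6, 9, 9, 9))
test  <- make_set(c(3, 3),    c(2, 2),    c(10, 20, 1, 1, 0, 0))

# Combine the two sets into one data frame.
combined <- rbind(train, test)

# Clean the feature names: drop "()" and turn "-" into "_".
names(combined) <- gsub("-", "_", gsub("[()]", "", names(combined)))

# Keep the id columns plus the mean and standard deviation measurements.
measure_cols <- grep("_(mean|std)(_|$)", names(combined), value = TRUE)
combined <- combined[, c("subject_id", "activity_id", measure_cols)]

# Attach the descriptive activity names.
tidy <- merge(combined, activity_labels, by = "activity_id")

# Average every measurement for each activity and subject.
tidy_mean <- aggregate(tidy[measure_cols],
                       by = list(activity_name = tidy$activity_name,
                                 subject_id = tidy$subject_id),
                       FUN = mean)

# Write the result; a temporary directory stands in for the output directory.
out_file <- file.path(tempdir(), "tidy_mean_output.txt")
write.table(tidy_mean, out_file, row.names = FALSE)
```

In this toy example, `tidy` has one row per observation and `tidy_mean` has one row per activity and subject pair. The real script may order these steps differently or format its output files in another way.
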

### About the Input Data

The input data is in the UCI_HAR_Dataset directory. For more information on this data set, read the README file in that directory.