rraghav13/Cleaning-Data-Course-Project


The course project requires creating a tidy data set from data collected from the accelerometers of Samsung Galaxy S smartphones. The process involves loading, merging, and subsetting the data sets to create the requested tidy set.

When run on the given data, the run_analysis.R script produces a tidy data set with the average of each variable for each activity and each subject.

The steps taken in the run_analysis.R script to achieve this goal are as follows:

  • Load the reshape2 and dplyr libraries, whose functions are used in the steps below.
  • Load all the data sets relevant to the train and test sets:
    • Xtrain, Xtest, Ytrain, Ytest, subjecttrain, subjecttest, features, activitylabels
  • Bind the subjects and activities of the train and test sets to their respective measurement data sets.
  • Name the columns of both the train and test sets using features, which contains the variable names.
  • Merge the train and test sets using rbind.
  • Use the match function on the activity numbers in the merged set to create a factor variable holding the activity label for every measurement.
  • Bind the new activity variable to the merged data set and remove the old activity numbers. Also name the subject and activity columns.
  • Subset the mean and standard deviation measurements from the merged data set into a new data set, keeping the activity and subject columns.
  • Use the melt function to melt the extracted set, with subject and activity as the ids.
  • Order the molten data set by subject and activity using dplyr's arrange function.
  • Use the dcast function to cast the data back into a wide tidy data set, averaging the values of each variable for each subject and each activity.
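
The steps above can be sketched on a toy example. This is a minimal illustration and not the actual run_analysis.R: the small data frames stand in for the real files, their values are made up, and the column names (id, name, label, activitynum) and the regular expression used to pick the mean and standard deviation columns are assumptions.

```r
library(reshape2)
library(dplyr)

# Toy stand-ins for the loaded files (hypothetical values, not the Samsung data)
features <- data.frame(id = 1:3,
                       name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X"),
                       stringsAsFactors = FALSE)
activitylabels <- data.frame(id = 1:2, label = c("WALKING", "SITTING"),
                             stringsAsFactors = FALSE)
Xtrain <- data.frame(a = c(1, 3), b = c(0.1, 0.3), c = c(9, 9))
Ytrain <- data.frame(y = c(1, 1))
subjecttrain <- data.frame(s = c(1, 1))
Xtest <- data.frame(a = c(5, 7), b = c(0.5, 0.7), c = c(9, 9))
Ytest <- data.frame(y = c(2, 2))
subjecttest <- data.frame(s = c(2, 2))

# Bind subjects and activities to the measurements, then name the columns
train <- cbind(subjecttrain, Ytrain, Xtrain)
test <- cbind(subjecttest, Ytest, Xtest)
names(train) <- c("subject", "activitynum", features$name)
names(test) <- c("subject", "activitynum", features$name)

# Merge the train and test sets
merged <- rbind(train, test)

# Replace activity numbers with a factor of activity labels via match
merged$activity <- factor(
  activitylabels$label[match(merged$activitynum, activitylabels$id)],
  levels = activitylabels$label
)
merged$activitynum <- NULL

# Keep only mean() and std() measurements (the pattern is an assumption)
keep <- grepl("mean\\(\\)|std\\(\\)", names(merged))
extracted <- merged[, c("subject", "activity", names(merged)[keep])]

# Melt to long form, order, then cast back to wide form with averages
molten <- melt(extracted, id.vars = c("subject", "activity"))
molten <- arrange(molten, subject, activity)
tidy <- dcast(molten, subject + activity ~ variable,
              fun.aggregate = mean, value.var = "value")
```

In this sketch, tidy has one row per subject and activity pair and one column per retained measurement, each holding the average of that measurement.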

