Class Project for Getting and Cleaning Data

This project is based on data from the Human Activity Recognition Using Smartphones Dataset, Version 1.0. The observations are from a group of volunteers, each performing six activities (walking, waling upstairs, walking downstairs, sitting, standing, and laying). The participants were wearing a smartphone (Samsung Galaxy S II) on the waist using its embedded accelerometer and gyroscope.

For this project, the instructor provided a specific URL for students to download the data.

Course URL

This repository contains the downloaded data and all the files needed to reproduce my project. If you wish to download the data from the course URL separately, unzip the file so the "UCI HAR Dataset"" folder is in the same directory as the run-analysis.R script.

The output of this project is a tidy data set of the mean of each of the recorded data features, by partipant and by activity.

To replicate the data cleanup process

Clone the repository to your local machine

Set your R working directory to your cloned repository

Run the R script run_analysis.R to create the tidy data set (dataTidy.txt).

Checking the output file

The tidy data set output file can be loaded in R and examined by using commands

dataCheck <- read.table("dataTidy.txt", sep = ",")
View(dataCheck)

Prerequisites

The run_analysis.R script requires packages "plyr", and "reshape2".

Coding

The single R script is used to

Merge the training and the test sets to create one data set.
Use descriptive activity names to name the activities in the data set.
Appropriately label the data set with descriptive variable names.
Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

This is accomplished by the following operations.

Load "features.txt" to get the variables used on the feature vector.
Give the features more meaningful names.
Load and join "X" data which contains observations where 70% of the volunteers generated the training data and 30% generated the test data.
Load "Activity Labels" which link activity numbers with activity names.
Load and join "Y" data which contains activity numbers for training and test groups
Load and join "Subject" data for both training and test groups
Join obervations with subjects and activity names
Extract measurements of mean and standard deviation for each measurement.
Melt the joined data frame into a form suitable for casting.
Cast the molten data frame into a tidy data frame that meets the principles of tidy data taught in this class.
Write a tidy data text file to disk.

END OF README.md FILE

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Class Project for Getting and Cleaning Data

To replicate the data cleanup process

Checking the output file

Prerequisites

Coding

About

Uh oh!

Releases

Packages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
UCI HAR Dataset		UCI HAR Dataset
CodeBook.md		CodeBook.md
README.md		README.md
dataTidy.txt		dataTidy.txt
run_analysis.R		run_analysis.R

bryansloan/GettingAndCleaningData

Folders and files

Latest commit

History

Repository files navigation

Class Project for Getting and Cleaning Data

To replicate the data cleanup process

Checking the output file

Prerequisites

Coding

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages