Getting and Cleaning Data Course Project

Thomas Fischer January 31, 2018

Repository for the submission of the peer-graded assignment in Coursera's Getting and Cleaning Data course by Johns Hopkins University.

Overview

This project demonstrates how to collect and clean raw data and transform it into a tidy data set that can be used for further analysis. More information on tidy data can be found in Tidy Data by Hadley Wickham.

Repository Content

README.md - this file, which provides an overview of the repository
CodeBook.md - description of the variables, the data, and any transformations or work done to clean up the data
run_analysis.R - R script that performs the data cleaning. Details on Coursera's requirements for the script can be found in the Script Requirements section below.
UCI HAR Dataset - original raw data folder. Important Note: Due to GitHub's file size limits, this folder does not contain the training and test files, namely ./train/X_train.txt and ./test/X_test.txt. Furthermore, both ./train/Inertial Signals and ./test/Inertial Signals were removed, as they don't contribute necessary data to this project. Training and test data sets can be downloaded here. Please extract them to the same directory where the R script run_analysis.R resides.
tidy_data.txt - tidy data set with the average of each variable for each activity and each subject. This data set is created by the run_analysis.R script.

Raw Data

The raw data was collected from the accelerometers of Samsung Galaxy S smartphones. It was built from the recordings of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors. The data was labeled manually from video recordings. A full description is available at the site where the data was obtained: Human Activity Recognition Using Smartphones Data Set. The raw data consists of 561 attributes (already preprocessed measurements) and 10299 observations for 30 individuals. The labels fall into six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).
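
As a rough illustration of how the training and test parts of the raw data could be combined, here is a minimal sketch in base R. It is not necessarily how run_analysis.R does it. The helper names load_split and load_all are hypothetical, and the sketch assumes each split folder holds subject_<split>.txt, y_<split>.txt and X_<split>.txt as whitespace-separated text files without a header row.

```r
# Hypothetical helper: read one split ("train" or "test") from a
# UCI HAR style folder and bind subject, activity and measurements.
load_split <- function(base_dir, split) {
  dir_path <- file.path(base_dir, split)
  subject  <- read.table(file.path(dir_path, paste0("subject_", split, ".txt")),
                         col.names = "subject")
  activity <- read.table(file.path(dir_path, paste0("y_", split, ".txt")),
                         col.names = "activity")
  features <- read.table(file.path(dir_path, paste0("X_", split, ".txt")))
  cbind(subject, activity, features)
}

# Hypothetical helper: stack the training and test observations.
load_all <- function(base_dir = "UCI HAR Dataset") {
  rbind(load_split(base_dir, "train"), load_split(base_dir, "test"))
}
```

With the complete download extracted next to the script, dim(load_all()) would be expected to report 10299 rows and 561 + 2 columns if the figures above hold.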

Script Requirements

The R script run_analysis.R takes the raw data as input and produces a tidy data set according to the instructions for this Coursera assignment. More details can be found in CodeBook.md in this repository.
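
The final step, averaging each variable for each activity and each subject, can be sketched on toy data as follows. This is only an illustration under stated assumptions, not the code in run_analysis.R: the data frame merged, its column names and the output file name are made up for the example, and base R's aggregate is just one of several ways to compute grouped means.

```r
# Toy stand-in for the merged data; all names and values are hypothetical.
merged <- data.frame(
  subject       = c(1, 1, 1, 2, 2, 2),
  activity      = c("WALKING", "WALKING", "SITTING",
                    "WALKING", "SITTING", "SITTING"),
  tBodyAccMeanX = c(0.20, 0.30, 0.10, 0.40, 0.50, 0.70),
  tBodyAccStdX  = c(-0.90, -0.70, -0.50, -0.30, -0.20, -0.40)
)

# Average every measurement column for each subject/activity pair.
tidy <- aggregate(. ~ subject + activity, data = merged, FUN = mean)
tidy <- tidy[order(tidy$subject, tidy$activity), ]

# Write the result as plain text (to a temporary file in this sketch).
out_file <- file.path(tempdir(), "tidy_example.txt")
write.table(tidy, out_file, row.names = FALSE)
```

Each row of tidy then holds one subject/activity combination, and each measurement column holds the mean over that group. A file written this way can be read back with read.table(out_file, header = TRUE).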
