This repository contains a data analysis of feedback forms, carried out during my 6-month internship at R FOSSEE, IIT Bombay. The goal of the task was to use factor analysis to identify hidden (latent) features of the dataset and draw meaningful conclusions. The procedure consisted of the following steps:
- Data loading and exploration.
- Data cleaning.
- Data preprocessing.
- Data analysis.
The datasets that were used for this task are:
- 1-day Jmol Application Advanced Workshop, conducted on 12 September.
- 1-day ChemCollective Virtual Lab Workshop, conducted on 12 December 2020.
The specific methods used were as follows:
I) Data Exploration:
- Checking for duplicate entries.
- Changing column names for easier understanding.
- Checking whether any row or column is redundant because of missing values.
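The exploration steps above can be sketched roughly as follows. This is a minimal Python illustration using only the standard library, not the code from this repository; the sample rows, the column names and the `rename_map` are invented for the example.

```python
# Hypothetical sample of feedback rows; all names and values are invented.
rows = [
    {"Name of the participant": "A", "Q1": "4", "Q2": ""},
    {"Name of the participant": "B", "Q1": "5", "Q2": ""},
    {"Name of the participant": "A", "Q1": "4", "Q2": ""},  # exact duplicate of the first row
]

# 1. Drop exact duplicate entries while preserving the original order.
seen = set()
unique_rows = []
for row in rows:
    key = tuple(sorted(row.items()))
    if key not in seen:
        seen.add(key)
        unique_rows.append(row)

# 2. Rename columns to shorter, easier-to-read names (hypothetical mapping).
rename_map = {
    "Name of the participant": "name",
    "Q1": "overall_rating",
    "Q2": "comments",
}
renamed = [{rename_map.get(k, k): v for k, v in row.items()} for row in unique_rows]


# 3. Find columns that are redundant because every value is missing.
def missing_share(records, column):
    """Fraction of records whose value in `column` is empty or None."""
    return sum(1 for r in records if r[column] in ("", None)) / len(records)


empty_columns = [c for c in renamed[0] if missing_share(renamed, c) == 1.0]
print(len(unique_rows), empty_columns)
```

The same `missing_share` idea can be applied per row instead of per column if rows with mostly missing answers also need to be found.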
II) Data Cleaning:
A) Software Use:
- Participants gave a negative response when asked whether they had used any software other than the one taught in the workshop, but entered the name of an alternative software in the following section.
- Participants gave a positive response when asked whether they had used any software other than the one taught in the workshop, but did not mention the name of the software.
- Participants gave a positive response when asked whether they had used the software taught in the workshop before, but did not mention the purpose of use.
- Participants gave a negative response when asked whether they had used the software taught in the workshop before, but mentioned a purpose of use.
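One possible way to express the four software-use checks above is a small rule function applied to each response. The sketch below is in Python with only the standard library; the field names (`used_other_software`, `other_software_name`, `used_workshop_software`, `purpose_of_use`), the "Yes"/"No" coding and the sample answers are assumptions made for illustration, not the actual form fields.

```python
def software_inconsistencies(response):
    """Return the contradictions found in one response (a dict of answers)."""
    issues = []
    other_name = (response.get("other_software_name") or "").strip()
    purpose = (response.get("purpose_of_use") or "").strip()

    if response["used_other_software"] == "No" and other_name:
        issues.append("denied other software but named one")
    if response["used_other_software"] == "Yes" and not other_name:
        issues.append("claimed other software but named none")
    if response["used_workshop_software"] == "Yes" and not purpose:
        issues.append("claimed prior use but gave no purpose")
    if response["used_workshop_software"] == "No" and purpose:
        issues.append("denied prior use but gave a purpose")
    return issues


# Hypothetical responses; the values are invented for the example.
responses = [
    {"used_other_software": "No", "other_software_name": "",
     "used_workshop_software": "Yes", "purpose_of_use": "Teaching"},
    {"used_other_software": "No", "other_software_name": "Avogadro",
     "used_workshop_software": "No", "purpose_of_use": ""},
    {"used_other_software": "Yes", "other_software_name": "PyMOL",
     "used_workshop_software": "Yes", "purpose_of_use": ""},
]

# Map row index -> list of contradictions, keeping only inconsistent rows.
flagged = {}
for index, response in enumerate(responses):
    issues = software_inconsistencies(response)
    if issues:
        flagged[index] = issues

print(flagged)
```

Returning a list of issues rather than a single boolean keeps it visible which rule a response violated, which helps when deciding whether to correct or drop the row.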
B) Knowledge:
- Participants contradicted themselves by strongly agreeing with both the "Exposure to new knowledge" and "Didn't learn much" statements.
- Participants contradicted themselves by strongly disagreeing with both the "Exposure to new knowledge" and "Didn't learn much" statements.
- Participants indicating a reduced level of knowledge after the workshop.
- Participants indicating a reduced level of knowledge after the screening task.
- Checking for positive feedback in negative descriptive questions and vice versa.
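The first three knowledge checks above might look like the following in Python (standard library only). The sketch assumes a 1 to 5 Likert coding where 5 means "strongly agree", plus hypothetical field names such as `knowledge_before` and `knowledge_after`; the real forms may be coded differently. The screening-task check would follow the same before/after pattern, and the check on descriptive answers is not covered here.

```python
# Assumed Likert coding; the actual forms may use a different scale.
STRONGLY_DISAGREE, STRONGLY_AGREE = 1, 5


def knowledge_inconsistencies(response):
    """Return the contradictions found in one response (a dict of ratings)."""
    issues = []
    new = response["exposure_to_new_knowledge"]
    little = response["did_not_learn_much"]

    if new == STRONGLY_AGREE and little == STRONGLY_AGREE:
        issues.append("strongly agreed with both opposing statements")
    if new == STRONGLY_DISAGREE and little == STRONGLY_DISAGREE:
        issues.append("strongly disagreed with both opposing statements")
    if response["knowledge_after"] < response["knowledge_before"]:
        issues.append("reported lower knowledge after the workshop")
    return issues


# Hypothetical ratings; the values are invented for the example.
responses = [
    {"exposure_to_new_knowledge": 5, "did_not_learn_much": 1,
     "knowledge_before": 2, "knowledge_after": 4},
    {"exposure_to_new_knowledge": 5, "did_not_learn_much": 5,
     "knowledge_before": 2, "knowledge_after": 3},
    {"exposure_to_new_knowledge": 4, "did_not_learn_much": 2,
     "knowledge_before": 4, "knowledge_after": 2},
]

flagged = [i for i, r in enumerate(responses) if knowledge_inconsistencies(r)]
print(flagged)
```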
III) Data Preprocessing:
- Converting string values such as "Not Attempted" to NA in numeric data columns.
- Subsetting data into background, suggestions, quantitative and qualitative divisions.
- Converting the data object to class data.frame and creating a backup copy.
- Data type conversion from character to numeric for quantitative data columns.
- Data type conversion from character to factor for qualitative data columns.
- Removing columns containing participants' background information, comments, suggestions and/or opinions.
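As a rough illustration of the preprocessing steps above, the Python sketch below (standard library only) uses `None` as the analogue of NA and zero-based integer codes as the analogue of factor levels. The records, the column names and the grouping into `QUANTITATIVE`, `QUALITATIVE` and `DROPPED` are invented for the example and do not come from the actual datasets.

```python
import copy

# Hypothetical raw records, with every value stored as a string.
records = [
    {"college": "X", "suggestions": "More time", "score": "4", "pace": "Good"},
    {"college": "Y", "suggestions": "", "score": "Not Attempted", "pace": "Fast"},
    {"college": "Z", "suggestions": "None", "score": "5", "pace": "Good"},
]

# Assumed split of the columns into divisions.
QUANTITATIVE = ["score"]
QUALITATIVE = ["pace"]
DROPPED = ["college", "suggestions"]  # background information and free text

# Keep an untouched backup before transforming anything.
backup = copy.deepcopy(records)


def to_number(value):
    """Map placeholder strings to None (the analogue of NA), else parse a float."""
    if value is None or value.strip() in ("", "Not Attempted"):
        return None
    return float(value)


# Sorted distinct values per qualitative column, similar to factor levels.
levels = {c: sorted({r[c] for r in records}) for c in QUALITATIVE}

processed = []
for r in records:
    row = {c: to_number(r[c]) for c in QUANTITATIVE}
    # Encode each category as the index of its level.
    row.update({c: levels[c].index(r[c]) for c in QUALITATIVE})
    processed.append(row)  # columns listed in DROPPED are simply not copied

print(processed)
```

Building `processed` from an explicit list of kept columns, instead of deleting columns in place, leaves `records` and `backup` intact for later reference.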
IV) Data Analysis:
Exploratory and confirmatory factor analysis were attempted on the dataset after steps I, II and III. However, factor analysis turned out to be unsuitable for this dataset because its correlation matrix was singular, that is, it cannot be inverted, which indicates that some variables are (near-)perfectly linearly dependent on others. Hence, no further data analysis was performed.
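To illustrate what a singular correlation matrix looks like, the Python sketch below (standard library only) builds a Pearson correlation matrix and tests whether its determinant is numerically zero. It is only one possible diagnostic and not necessarily how singularity was detected in this analysis; the columns `q1`, `q2`, `q3` and the tolerance are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    return [[pearson(a, b) for b in columns] for a in columns]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            return 0.0  # no usable pivot: the matrix is singular
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det  # a row swap flips the sign
        det *= m[i][i]
        for r in range(i + 1, n):
            factor = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= factor * m[i][c]
    return det


def is_singular(matrix, tol=1e-8):
    return abs(determinant(matrix)) < tol


# Hypothetical item scores; q3 is an exact multiple of q1.
q1 = [1, 2, 3, 4, 5]
q2 = [2, 1, 4, 3, 5]
q3 = [2 * v for v in q1]

print(is_singular(correlation_matrix([q1, q2])))
print(is_singular(correlation_matrix([q1, q2, q3])))
```

Here the third column adds no independent information, so the three-variable matrix is flagged as singular while the two-variable one is not. In a real dataset, duplicated or perfectly dependent questionnaire items are one common cause of this situation.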