We should first understand the importance of data analysis.
- Data is collected all around us, whether manually by scientists or digitally every time you click on a website or use your mobile device.
- But data is not the same as information. Data analysis, and more broadly data science, helps us extract information and insights from raw data to answer our questions.
- So data analysis plays an important role by helping us discover useful information in the data, answer questions, and even predict the future or the unknown.
"Can we estimate the price of a used car based on its characteristics?"
Navadeep wants to sell his car, but he doesn't know how much to ask for it. He wants to get as much as he can, yet the price must be reasonable enough that someone will want to buy it. So the price he sets should reflect the value of the car.
How can we help Navadeep determine the best price for his car? Let's think like data scientists and clearly define some of his problems:
- For example, is there data on the prices of other cars and their characteristics?
- What features of cars affect their prices? Colour? Brand? Does horsepower also affect the selling price, or perhaps, something else? As a data analyst or data scientist, these are some of the questions we can start thinking about.
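One way to start on a question such as whether horsepower affects price is to measure how strongly the two values move together. The sketch below computes a Pearson correlation coefficient in plain Python. The `pearson` helper and the five horsepower/price pairs are made up for illustration and do not come from the auto.csv data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, for illustration only.
horsepower = [70, 90, 110, 150, 200]
price = [5000, 7000, 9500, 14000, 21000]

r = pearson(horsepower, price)
print(f"correlation between horsepower and price: {r:.3f}")
```

A value near 1 or -1 suggests a strong linear relationship, and a value near 0 suggests little linear relationship. Correlation alone does not show that a feature causes the price to change.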
Here, I would like to show how a data scientist approaches a problem when building a model.
As a case study, I use the auto.csv data set, which contains used car appraisal data, and with it I explain the following:
- The problem
- How to Understand the Dataset
- How to Import and Export Data
- Basic Insights from Datasets
- Data pre-processing techniques
- Identify and Handle Missing Values
- Dealing with missing values
- Data Formatting
- Data Normalization
- Binning
- Turning categorical variables into quantitative variables
- Exploratory Data Analysis
- Descriptive Statistics
- Basics of Grouping
- ANOVA
- Correlation
- Correlation Statistics
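As a rough preview of the pre-processing topics listed above, the sketch below applies four of them to a few records in plain Python: filling a missing value with the mean, min-max normalization, equal-width binning, and turning a categorical variable into indicator (0/1) variables. The records, column names, and bin labels are all assumptions made up for illustration; they are not taken from auto.csv, and in practice a data-analysis library would normally do this work.

```python
# Hypothetical records, for illustration only. None marks a missing value.
cars = [
    {"fuel": "gas", "horsepower": 70.0, "price": 5000.0},
    {"fuel": "diesel", "horsepower": None, "price": 7000.0},
    {"fuel": "gas", "horsepower": 110.0, "price": 9500.0},
    {"fuel": "gas", "horsepower": 150.0, "price": 14000.0},
]

# 1. Missing values: replace a missing horsepower with the mean of the known ones.
known = [c["horsepower"] for c in cars if c["horsepower"] is not None]
mean_hp = sum(known) / len(known)
for c in cars:
    if c["horsepower"] is None:
        c["horsepower"] = mean_hp

# 2. Normalization: min-max scale horsepower into the range [0, 1].
lo = min(c["horsepower"] for c in cars)
hi = max(c["horsepower"] for c in cars)
for c in cars:
    c["hp_scaled"] = (c["horsepower"] - lo) / (hi - lo)

# 3. Binning: split the horsepower range into three equal-width bins.
labels = ["low", "medium", "high"]
width = (hi - lo) / len(labels)
for c in cars:
    # Clamp the index so the maximum value falls in the last bin.
    idx = min(int((c["horsepower"] - lo) / width), len(labels) - 1)
    c["hp_bin"] = labels[idx]

# 4. Categorical to quantitative: one 0/1 indicator column per fuel type.
fuels = sorted({c["fuel"] for c in cars})
for c in cars:
    for f in fuels:
        c[f"fuel_{f}"] = 1 if c["fuel"] == f else 0

for c in cars:
    print(c)
```

Mean imputation and min-max scaling are only one choice for each step; other strategies (dropping rows, z-score scaling) may suit a given data set better.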