Why data analysis matters, how to understand the problem, what data analysis involves, and how to clean data for building machine learning models.

Navadeeppasala/Data-Analysis-with-Python

Data-Analysis-with-Python

Why Data analysis?

We should first understand the importance of data analysis.

  1. Data is collected everywhere around us, whether manually by scientists or digitally every time you click on a website or use your mobile device.
  2. But data is not the same as information. Data analysis, and in essence data science, helps us unlock information and insights from raw data to answer our questions.
  3. So data analysis plays an important role by helping us discover useful information in the data, answer questions, and even predict the future or the unknown.

Used Car Appraisal

Problem statement:

"Can we estimate the price of a used car based on its characteristics?"

Business understanding:

Navadeep wants to sell his car. But the problem is, he doesn't know how much he should sell his car for. Navadeep wants to sell his car for as much as he can. But he also wants to set the price reasonably so someone would want to purchase it. So the price he sets should represent the value of the car.

How can we help Navadeep determine the best price for his car? Let's think like data scientists and clearly define some of his problems:

  1. For example, is there data on the prices of other cars and their characteristics?
  2. What features of cars affect their prices? Colour? Brand? Does horsepower also affect the selling price, or perhaps, something else? As a data analyst or data scientist, these are some of the questions we can start thinking about.
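As a rough illustration of the second question, the sketch below measures how strongly one feature moves together with price using the Pearson correlation coefficient. The five records are made-up numbers for illustration only, not values from any real data set, and the `pearson` helper is a hypothetical name. In practice a library such as pandas would usually do this work; plain Python is used here only to keep the example self-contained.

```python
from math import sqrt

# Made-up illustrative records (NOT real car data).
cars = [
    {"horsepower": 70, "price": 6000},
    {"horsepower": 90, "price": 8000},
    {"horsepower": 110, "price": 10500},
    {"horsepower": 150, "price": 16000},
    {"horsepower": 200, "price": 24000},
]


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


horsepower = [car["horsepower"] for car in cars]
price = [car["price"] for car in cars]

r = pearson(horsepower, price)
print(f"Correlation between horsepower and price: {r:.3f}")
```

A coefficient close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. It does not by itself show that the feature causes the price to change.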

Here, I describe how a data scientist approaches a problem before building a model.

As a case study, I use the auto.csv data set for used car appraisal to explain the following topics:

  1. The problem
  2. How to Understand the Dataset
  3. How to Import and Export Data
  4. Basic Insights from Datasets
  5. Data pre-processing techniques
  6. Identify and Handle Missing Values
  7. Dealing with missing values
  8. Data Formatting
  9. Data Normalization
  10. Binning
  11. Turning categorical variables into quantitative variables
  12. Exploratory Data Analysis
  13. Descriptive Statistics
  14. Basics of Grouping
  15. ANOVA
  16. Correlation
  17. Correlation Statistics
