Zeocode/Salary_prediction

Salary_prediction

For newly graduated students, it is hard to find a source that gives them an idea of their current worth in the IT industry. Given data about a job, we can predict its salary. This prediction supports several scenarios: estimating what a job will pay, deciding what a position should pay (useful to a company), and checking whether a current salary is adequate (useful to an employee).

Measures used

Clustering

Clustering is a machine learning technique that groups data points. Given a set of data points, a clustering algorithm assigns each point to a specific group. In data science, cluster analysis yields insights by showing which groups the data points fall into.

Methods Used
  1. K-means
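
As a rough illustration of how K-means groups points, here is a minimal sketch in plain Python. It is not the repository's code: the `kmeans` function, the `jobs` data, and the choice to seed the centroids with the first `k` points are all assumptions made for the example.

```python
import math

def kmeans(points, k, max_iterations=100):
    """Plain K-means on a list of numeric tuples; returns (centroids, labels)."""
    # Assumption: seed the centroids with the first k points so the run is
    # deterministic. Real implementations usually pick seeds at random.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: centroids stopped moving
        centroids = updated
    return centroids, labels

# Hypothetical (years of experience, salary in thousands) pairs.
jobs = [(1, 30), (9, 95), (2, 32), (8, 90), (1.5, 31), (10, 92)]
centroids, labels = kmeans(jobs, k=2)
print(labels)
```

On this toy data the junior and senior jobs end up in separate clusters. With real features on very different scales, normalizing the columns first is usually advisable, since the salary column would otherwise dominate the distance.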

Distance Measures

Choosing a good distance metric is important here. The distance metric is how an algorithm measures similarity between data points, so a well-chosen metric improves how well a classification or clustering algorithm performs.

Methods Used
  1. Euclidean distance
  2. Manhattan distance
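
The two metrics listed above can be written in a few lines. This is a small sketch with hypothetical function names, not the repository's implementation: Euclidean distance is the straight-line distance, while Manhattan distance sums the absolute differences along each axis.

```python
import math

def euclidean_distance(a, b):
    # Square root of the summed squared differences per coordinate.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    # Sum of the absolute differences per coordinate.
    return sum(abs(x - y) for x, y in zip(a, b))

p, q = (0, 0), (3, 4)
print(euclidean_distance(p, q))  # 5.0
print(manhattan_distance(p, q))  # 7
```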

Regression

Regression is a machine learning technique that can be trained to predict real-valued outputs, such as temperature or stock price. It is based on a hypothesis that can be linear, quadratic, polynomial, or otherwise non-linear. The hypothesis is a function of some hidden parameters and the input values. In the training phase, the hidden parameters are optimized against the inputs and targets in the training data. This optimization is commonly done with the gradient descent algorithm. With neural networks, the backpropagation algorithm is also needed to compute the gradient at each layer. Once the parameters are trained (that is, once they give the least error on the training data), the same hypothesis with the trained parameters is applied to new input values to predict outcomes, which are again real values.
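
To make the training loop concrete, here is a minimal sketch of gradient descent for a linear hypothesis `y = w * x + b`, minimizing mean squared error. The function name `fit_linear`, the learning rate, the epoch count, and the toy data are assumptions chosen for illustration, not values taken from this project.

```python
def fit_linear(xs, ys, learning_rate=0.05, epochs=5000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # the "hidden parameters" being optimized
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
w, b = fit_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

The learned `w` and `b` approach 2 and 1 here. A learning rate that is too large makes the updates diverge, so it generally has to be tuned to the scale of the inputs.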

Methods Used
  1. Linear Regression - COMPLETED
  2. Polynomial Regression - COMPLETED (additional task: find optimal degree - NOT COMPLETED)
  3. Logistic Regression - NOT REQUIRED (reason: not applicable to this data)
  4. Lasso Regression - COMPLETED
  5. Ridge Regression - CANNOT BE IMPLEMENTED HERE
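
For the open task of finding an optimal polynomial degree, one common approach is to fit each candidate degree on training data and keep the degree with the lowest error on held-out validation data. The sketch below assumes that approach; it is not the project's method, and every name in it (`fit_polynomial`, `predict`, `mean_squared_error`, `best_degree`) and the toy data are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations; returns [c0, c1, ...]."""
    n = degree + 1
    # Normal equations: ata[i][j] = sum(x^(i+j)), aty[i] = sum(y * x^i).
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs

def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def mean_squared_error(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def best_degree(train_xs, train_ys, valid_xs, valid_ys, max_degree):
    """Return (degree with lowest validation error, all scores by degree)."""
    scores = {}
    for degree in range(1, max_degree + 1):
        coeffs = fit_polynomial(train_xs, train_ys, degree)
        scores[degree] = mean_squared_error(coeffs, valid_xs, valid_ys)
    return min(scores, key=scores.get), scores

# Hypothetical data following y = x^2, split into training and validation sets.
train_xs = [0, 1, 2, 3, 4, 5]
train_ys = [x * x for x in train_xs]
valid_xs = [0.5, 1.5, 2.5, 3.5, 4.5]
valid_ys = [x * x for x in valid_xs]
degree, scores = best_degree(train_xs, train_ys, valid_xs, valid_ys, max_degree=3)
print(degree, scores)
```

Solving the normal equations directly is fine for a small illustration but becomes numerically fragile at high degrees. On noisy data, validation error typically rises again once the degree is large enough to overfit, which is what makes this selection meaningful.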
