This repository has been archived by the owner on Sep 11, 2023. It is now read-only.

Commit

Merge pull request #38 from Nima-Jamshidi/milestone-03
Milestone 03
Nima-Jamshidi authored Mar 17, 2020
2 parents 5444b32 + dae9cca commit 5105145
Showing 3 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions docs/milestone3.Rmd
@@ -152,8 +152,8 @@ kable(head(augmented), caption = "Estimated values their statistics")
# Discussion

Based on the "Residuals vs Fitted" and "Real vs Fitted" graphs, we can see that the model fairly works for charges under 2000\$. There are three clusters in these graphs with similar slopes. There is a gap between charges under and over 2000\$ which might be relevant to the weak estimates of the model over 2000\$.
-If we apply linear regression on each cluster we will get similar coefficients for the variables with different intercepts. Each cluster might be attributed to a different desease group and in each of them the impacts of age, smoking, bmi and etc. are similar.
+If we apply linear regression on each cluster we will get similar coefficients for the variables with different intercepts. Each cluster might be attributed to a different disease group and in each of them the impacts of age, smoking, bmi and etc. are similar.

# Conclusion

-We were able to do a linear regression on our dataset. The results show that there is an association relationship between age, bmi, number of children, and smoking with medical charges. interestingly, gender does not affect medical charges. Diagnostic plots reveal that the data is not completely normally distributed. Moreover, three clusters of records are present in the dataset, which might be representative of different types of deseases.
+We were able to do a linear regression on our dataset. The results show that there is an association relationship between age, bmi, number of children, and smoking with medical charges. The estimated coefficients for these variables are all positive, meaning that higher age, bmi, number of children and/or being a smoker increase medical charges. interestingly, gender does not affect medical charges. Diagnostic plots reveal that the data is not completely normally distributed. Moreover, three clusters of records are present in the dataset, which might be representative of different types of diseases.
4 changes: 2 additions & 2 deletions docs/milestone3.html
@@ -809,11 +809,11 @@ <h1><span class="header-section-number">6</span> Results</h1>
<div id="discussion" class="section level1">
<h1><span class="header-section-number">7</span> Discussion</h1>
<p>Based on the “Residuals vs Fitted” and “Real vs Fitted” graphs, we can see that the model fairly works for charges under 2000$. There are three clusters in these graphs with similar slopes. There is a gap between charges under and over 2000$ which might be relevant to the weak estimates of the model over 2000$.
-If we apply linear regression on each cluster we will get similar coefficients for the variables with different intercepts. Each cluster might be attributed to a different desease group and in each of them the impacts of age, smoking, bmi and etc. are similar.</p>
+If we apply linear regression on each cluster we will get similar coefficients for the variables with different intercepts. Each cluster might be attributed to a different disease group and in each of them the impacts of age, smoking, bmi and etc. are similar.</p>
</div>
<div id="conclusion" class="section level1">
<h1><span class="header-section-number">8</span> Conclusion</h1>
-<p>We were able to do a linear regression on our dataset. The results show that there is an association relationship between age, bmi, number of children, and smoking with medical charges. interestingly, gender does not affect medical charges. Diagnostic plots reveal that the data is not completely normally distributed. Moreover, three clusters of records are present in the dataset, which might be representative of different types of deseases.</p>
+<p>We were able to do a linear regression on our dataset. The results show that there is an association relationship between age, bmi, number of children, and smoking with medical charges. The estimated coefficients for these variables are all positive, meaning that higher age, bmi, number of children and/or being a smoker increase medical charges. interestingly, gender does not affect medical charges. Diagnostic plots reveal that the data is not completely normally distributed. Moreover, three clusters of records are present in the dataset, which might be representative of different types of diseases.</p>
</div>


Binary file modified docs/milestone3.pdf
Binary file not shown.
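
The Discussion text in this diff suggests that fitting a separate linear regression to each cluster would give similar coefficients with different intercepts. A minimal sketch of how that could be checked is shown here. It uses synthetic data only: the data frame `sim`, the `cluster` labels, the slope of 20, and the `offsets` are all hypothetical and are not taken from the report's dataset or code.

```r
# Synthetic illustration only: three hypothetical clusters that share one
# slope for age but have different intercepts. None of these numbers come
# from the report's data.
set.seed(1)
n <- 50
sim <- data.frame(
  age     = rep(runif(n, min = 18, max = 64), times = 3),
  cluster = factor(rep(c("a", "b", "c"), each = n))
)
offsets <- c(a = 0, b = 1000, c = 2500)  # assumed per-cluster intercept shifts
sim$charges <- 20 * sim$age +
  offsets[as.character(sim$cluster)] +
  rnorm(3 * n, sd = 10)

# Fit one simple linear model per cluster and collect the coefficients.
fits  <- lapply(split(sim, sim$cluster),
                function(d) coef(lm(charges ~ age, data = d)))
coefs <- do.call(rbind, fits)

# Rows are clusters; columns are the intercept and the age slope.
print(coefs)
```

If the pattern described in the Discussion holds, the `age` column should be nearly constant across rows while the `(Intercept)` column differs. On the real data, the cluster labels would first have to be assigned by some method the report does not specify.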
