---
title: "Sample Report - Data Science Capstone"
author: "Student name"
date: '`r Sys.Date()`'
format:
  html:
    code-fold: true
course: STA 6257 - Advanced Statistical Modeling
bibliography: references.bib # file contains BibTeX for references
# always_allow_html: true # allows PDF output with HTML features
self-contained: true
execute:
  warning: false
  message: false
editor:
  markdown:
    wrap: 72
---
## Introduction

### What is the "method"?

This is an introduction to kernel regression, a non-parametric
estimator of the conditional expectation of one random variable given
another. The goal of kernel regression is to discover a possibly
non-linear relationship between two random variables. To discover this
relationship, the kernel estimator, also called kernel smoothing, is
the main method for estimating the regression curve in non-parametric
statistics. In a kernel estimator, the weight function is known as the
kernel function [@efr2008]. Cite this paper [@bro2014principal]. The
GEE [@wang2014]. The PCA [@daffertshofer2004pca].

This is my work and I want to add more work...
### Related work

This section will cover the literature review...
## Methods

The common non-parametric regression model is
$Y_i = m(X_i) + \varepsilon_i$, where $Y_i$ is the value of the unknown
regression function $m(x)$ at $X_i$ plus an error term $\varepsilon_i$.
With this definition, we can estimate $m(x)$ by local averaging: the
estimate of $m(x)$ is a weighted average of the $Y_i$ whose $X_i$ are
near $x$. In other words, we are drawing a curve through the data
points with the help of the surrounding data points. The estimator is
given below [@R-base]:

$$
M_n(x) = \sum_{i=1}^{n} W_n(x, X_i)\, Y_i \tag{1}
$$

Here the weights $W_n(x, X_i)$ are non-negative, sum to one, and are
small when $X_i$ is far from $x$.
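As a minimal sketch (not code from this report; the function name,
kernel choice, and bandwidth value are illustrative assumptions),
Equation (1) with a Gaussian kernel is the Nadaraya-Watson estimator:

```{r}
# Sketch: Nadaraya-Watson estimator for Eq. (1) with a Gaussian kernel.
# nw_estimate, h, and the simulated data are illustrative, not from the report.
nw_estimate <- function(x0, x, y, h = 1) {
  w <- dnorm((x - x0) / h) # kernel weights: large when X_i is near x0
  w <- w / sum(w)          # normalize so the weights sum to one
  sum(w * y)               # weighted average of the Y_i, as in Eq. (1)
}

# Example on simulated data: m(x) = sin(x) plus noise
set.seed(1)
x <- runif(100, 0, 2 * pi)
y <- sin(x) + rnorm(100, sd = 0.2)
nw_estimate(pi / 2, x, y, h = 0.3) # close to sin(pi/2) = 1
```

The bandwidth `h` controls the trade-off between bias and variance:
small values track the data closely, large values give a smoother
curve.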
Another example is the simple linear regression model:

$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$$
## Analysis and Results
### Data and Visualization
A study was conducted to determine how...
```{r, warning=FALSE, echo=T, message=FALSE}
# loading packages
library(tidyverse)
library(knitr)
library(ggthemes)
library(ggrepel)
library(dslabs)
```
```{r, warning=FALSE, echo=TRUE}
# Load data and preview the first rows
kable(head(murders))

# Scatter plot of total murders vs. population on log-log scales
ggplot1 <- murders %>%
  ggplot(mapping = aes(x = population / 10^6, y = total))
ggplot1 + geom_point(aes(col = region), size = 4) +
  geom_text_repel(aes(label = abb)) +
  scale_x_log10() +
  scale_y_log10() +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
  xlab("Population in millions (log10 scale)") +
  ylab("Total number of murders (log10 scale)") +
  ggtitle("US Gun Murders in 2010") +
  scale_color_discrete(name = "Region") +
  theme_bw()
```
### Statistical Modeling
```{r}
```
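As a placeholder sketch for this section (assuming the `murders` data
loaded above), the linear model $y_i = \beta_0 + \beta_1 x_i +
\varepsilon_i$ can be fit on the log10 scale, matching the line drawn
by `geom_smooth()` in the plot:

```{r}
# Sketch: fit the linear model on the log10 scale; illustrative only.
fit <- lm(log10(total) ~ log10(population / 10^6), data = murders)
summary(fit)$coefficients
```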
### Conclusion
## References