Fitting a calibration curve is not the end of the story. We need to know how confident we are in the fitted slope and intercept.
### A Theoretical Interlude

In OLS, the **sum of squared errors (SSE)**, also called the **sum of squared residuals (SSR)**, is key in determining the confidence intervals for the slope and intercept. The SSR is defined as

$$
SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

where $y_i$ is the observed value of the dependent variable, $\hat{y}_i$ is the predicted value of the dependent variable, and $n$ is the number of data points. Looking at the plot above, this corresponds to summing the squares of the vertical distances (gray lines) between the observed data points and the line. The SSR is related to the **standard error of the regression**, which is defined as

````{margin}
```{note}
We divide SSR by $n-2$ (not $n$) because estimating the slope and intercept uses up two degrees of freedom.
```
````

$$
s_{y/x} = \sqrt{\frac{SSR}{n-2}}
$$

where $n$ is the number of data points. The standard error of the regression is used to calculate the standard errors of the slope and intercept, which in turn determine the confidence intervals. The **standard errors of the slope and intercept** are defined as

$$
SE(\hat{\beta}_1) = \frac{s_{y/x}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}, \qquad
SE(\hat{\beta}_0) = s_{y/x} \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}
$$

where $\hat{\beta}_1$ is the estimated slope, $\hat{\beta}_0$ is the estimated intercept, $x_i$ is the value of the independent variable, $\bar{x}$ is the mean of the independent variable, and $n$ is the number of data points. The **confidence intervals for the slope and intercept** are then calculated as

$$
\hat{\beta}_1 \pm t_{\alpha/2} \, SE(\hat{\beta}_1), \qquad
\hat{\beta}_0 \pm t_{\alpha/2} \, SE(\hat{\beta}_0)
$$

where $t_{\alpha/2}$ is the critical value of the $t$-distribution with $n-2$ degrees of freedom and a significance level of $\alpha/2$. The confidence intervals give us a range of values likely to contain the true slope and intercept at a chosen confidence level.

Let's calculate the confidence intervals for the calibration curve's slope and intercept.

```{code-cell} ipython3
residuals = absorbance - line

# Calculate the sum of the squared residuals
def ssr(residuals):
    return np.sum(residuals ** 2)

# Test the function
print(ssr(residuals))
```

Now, let us write a function to compute the standard error of the regression.

```{code-cell} ipython3
# Calculate the standard error of the regression
def se_regression(residuals):
    return np.sqrt(ssr(residuals) / (len(residuals) - 2))

# Test the function
print(se_regression(residuals))
```

OK, now we can calculate the standard errors of the slope and intercept.
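A minimal sketch of that calculation, including 95% confidence intervals, might look like the following. The `x` and `y` arrays here are illustrative stand-ins for the concentration and absorbance data (not values from this lecture), and `scipy.stats` is assumed to be available for the $t$ critical value.

```{code-cell} ipython3
import numpy as np
from scipy import stats

# Illustrative stand-in data (NOT the lecture's measurements)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])       # e.g., diacetyl concentration
y = np.array([0.02, 0.21, 0.39, 0.62, 0.79])  # e.g., absorbance

# Fit the calibration line and compute the residuals
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Standard error of the regression, s_{y/x}
n = len(x)
s_yx = np.sqrt(np.sum(residuals ** 2) / (n - 2))

# Standard errors of the slope and intercept
sxx = np.sum((x - x.mean()) ** 2)
se_slope = s_yx / np.sqrt(sxx)
se_intercept = s_yx * np.sqrt(1.0 / n + x.mean() ** 2 / sxx)

# 95% confidence intervals from the t-distribution with n-2 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 2)
print(f"slope:     {slope:.4f} +/- {t_crit * se_slope:.4f}")
print(f"intercept: {intercept:.4f} +/- {t_crit * se_intercept:.4f}")
```

For the lecture's data, `x` and `y` would be replaced by the concentration and absorbance arrays used above.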

The last step in analyzing calibration data is to perform correlation analysis. Correlation analysis assesses the strength of the relationship between the two variables. In this case, we are interested in the correlation between the diacetyl concentration and the absorbance value. The correlation coefficient measures the strength and direction of the relationship between two variables. It ranges from -1 to 1, with 1 indicating a perfect positive relationship, -1 indicating a perfect negative relationship, and 0 indicating no relationship. The correlation coefficient is calculated as

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

where $x_i$ is the value of the independent variable, $\bar{x}$ is the mean of the independent variable, $y_i$ is the value of the dependent variable, and $\bar{y}$ is the mean of the dependent variable. The correlation coefficient gives us an indication of how well the two variables are related. A correlation coefficient close to 1 or -1 indicates a strong relationship, while a correlation coefficient close to 0 indicates a weak relationship.
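As a sketch, the correlation coefficient can be computed directly from this definition and cross-checked against NumPy's built-in `np.corrcoef`. The arrays below are illustrative stand-ins, not the lecture's data.

```{code-cell} ipython3
import numpy as np

# Illustrative stand-in data (NOT the lecture's measurements)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.02, 0.21, 0.39, 0.62, 0.79])

# Correlation coefficient computed from its definition
r = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2)
)
print(r)

# Cross-check against NumPy's built-in implementation
print(np.corrcoef(x, y)[0, 1])
```

Both approaches should agree to machine precision; a value this close to 1 indicates a strong positive linear relationship.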