Near-infrared (NIR) spectroscopy is a technique used in a number of industries, including food and agrochemical quality control. The goal is to determine a quantity of interest, for instance the protein content of milk, from spectrometer measurements.
More information: [Near-infrared spectroscopy](https://en.wikipedia.org/wiki/Near-infrared_spectroscopy)
The data for this assignment consists of the following columns (in this particular order):
- Sample number.
- The quantity to predict.
- The rest of the columns are the spectral data.
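The column layout above can be sketched as a simple split into sample numbers, target values, and spectra. This is a minimal illustration, not the required loading code: the file name, delimiter, and presence of a header row are not stated here, so the snippet assumes comma-separated values without a header and reads a small made-up table from memory. The names `sample_ids`, `target`, and `spectra` are hypothetical.

```python
import csv
import io

# Synthetic stand-in for the data file. The real file name, delimiter,
# and header row are not given, so these three rows are made up.
raw = io.StringIO(
    "1,3.2,0.10,0.12,0.15\n"
    "2,3.5,0.11,0.14,0.18\n"
    "3,2.9,0.09,0.11,0.13\n"
)

sample_ids, target, spectra = [], [], []
for row in csv.reader(raw):
    sample_ids.append(int(row[0]))               # column 1: sample number
    target.append(float(row[1]))                 # column 2: quantity to predict
    spectra.append([float(v) for v in row[2:]])  # remaining columns: spectral data

print(len(sample_ids), len(target), len(spectra[0]))
```

The sample number is kept separate because it identifies a row rather than describing it, so it would normally not be used as a model input.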
Your task:
- Perform an exploratory analysis of the data.
- Develop a predictive model for the second column and report its accuracy in terms of R-squared and RMSE.
- Explain your model.
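The two metrics named in the task can be written out directly from their usual definitions: RMSE is the square root of the mean squared residual, and R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses only the standard library; the function names `rmse` and `r_squared` and the example values are assumptions made for illustration, not part of the assignment data.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up reference values and predictions, for illustration only.
y_true = [3.0, 3.5, 2.5, 4.0]
y_pred = [3.0, 3.0, 3.0, 4.0]

print("RMSE:", rmse(y_true, y_pred))
print("R-squared:", r_squared(y_true, y_pred))
```

Computing both metrics on samples that were not used to fit the model generally gives a more honest estimate of accuracy than computing them on the training samples. A model that always predicts the mean of `y_true` scores an R-squared of 0, which is a useful baseline to compare against.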
We suggest using a Jupyter notebook for this assignment.