Near-infrared (NIR) spectroscopy is a technique used in a number of industries, including food and agrochemical quality control. The goal is to determine a quantity of interest, for instance the protein content of milk, from spectrometer measurements.
The data for this assignment consists of the following columns (in this particular order):
Sample number.
The quantity to predict.
The rest of the columns are the spectral data.
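The column layout above can be sketched as a simple split into identifiers, target, and features. This is only an illustration: the array below is a small synthetic stand-in for the real file, and the names `table`, `sample_id`, `y`, and `X` are assumptions introduced here, not part of the assignment data.

```python
import numpy as np

# Hypothetical stand-in for the real data: 5 samples, 4 spectral channels.
rng = np.random.default_rng(0)
n_samples, n_channels = 5, 4
table = np.column_stack([
    np.arange(1, n_samples + 1),                     # column 0: sample number
    rng.uniform(2.5, 4.0, n_samples),                # column 1: quantity to predict
    rng.uniform(0.0, 1.0, (n_samples, n_channels)),  # remaining columns: spectra
])

# Split by position, following the stated column order.
sample_id = table[:, 0].astype(int)  # identifier only, not a feature
y = table[:, 1]                      # target
X = table[:, 2:]                     # one row per sample, one column per channel

print(X.shape, y.shape)
```

Keeping the sample number out of `X` matters: it is an identifier, and a model that uses it as a feature may pick up ordering artifacts rather than spectral information.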
Your task:
Perform an exploratory analysis of the data.
Develop a predictive model for the second column and report its accuracy in terms of R-squared and root mean squared error (RMSE).
Explain your model.
We suggest using a Jupyter notebook for this assignment.