
Completed LCR assignment_1#1

Open
Alexatug wants to merge 3 commits into main from assignment-1

Owner

Alexatug commented Sep 5, 2025

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

I am adding code to extract information from the dataset and explaining some classification concepts, such as standardization and setting a random seed.

What did you learn from the changes you have made?

I learnt how to split a dataset into training and testing sets, standardize predictors, set a random seed, and fit a model with KNeighborsClassifier, using scikit-learn, numpy, and pandas. I also learnt how to train, test, and evaluate a classification model.

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

None

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

I had trouble removing and re-adding the response variable when performing standardization and data splitting.
I overcame it by participating in the work period.

How were these changes tested?

The changes were tested and worked well.

A reference to a related issue in your repository (if applicable)

Checklist

  • I can confirm that my changes are working as intended

@efantinatti

Hi Alexandre (GitHub: @Alexatug) – PR #1 on branch assignment-1

Hits:

Used info(), shape and unique() to inspect data, which correctly answered Q1.

Standardized predictors and set up KNN with GridSearchCV and 10‑fold CV.

Issues & Advice:

Data leakage: After standardizing the predictors, class was added back into the same DataFrame before splitting. If that DataFrame is then used as the feature matrix, the model sees the target during both training and testing. Keep class out of your feature matrix when standardizing and splitting.

Incorrect data split: The code used wine_df_train/wine_df_test with the class column still inside. You should split into X_train, y_train, X_test, y_test.
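One possible way to keep the target out of the features is sketched below. It is only an illustration under assumptions: it uses scikit-learn's bundled wine data as a stand-in for your wine_df, renames its target column to class, and picks test_size=0.25 and random_state=123 arbitrarily. It also fits the scaler on the training rows only, which is a slightly stricter variant than standardizing everything before the split.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): scikit-learn's bundled wine dataset,
# with the target column renamed to "class" to mirror the assignment.
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})

# Separate predictors and response BEFORE any scaling or splitting.
X = wine_df.drop(columns="class")
y = wine_df["class"]

# Split into four pieces; test_size and random_state are arbitrary choices here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123, stratify=y
)

# Fit the scaler on the training predictors only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = pd.DataFrame(
    scaler.fit_transform(X_train), columns=X.columns, index=X_train.index
)
X_test_scaled = pd.DataFrame(
    scaler.transform(X_test), columns=X.columns, index=X_test.index
)

print(X_train_scaled.shape, X_test_scaled.shape)
```

Because y never enters the scaler or the feature matrices, there is nothing to remove and re-add afterwards.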

Grid range: Parameter grid was range(1,50) (excluding 50). It should be range(1,51) to include all values from 1 to 50.
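A quick check of the off-by-one, as a small sketch (the names old_grid and new_grid are just for illustration): Python's range excludes its stop value, so the upper bound has to be 51 to reach k = 50.

```python
# range(start, stop) excludes stop, so range(1, 50) ends at 49.
old_grid = {"n_neighbors": list(range(1, 50))}
new_grid = {"n_neighbors": list(range(1, 51))}

print(max(old_grid["n_neighbors"]), len(old_grid["n_neighbors"]))  # 49 49
print(max(new_grid["n_neighbors"]), len(new_grid["n_neighbors"]))  # 50 50
```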

Ensure you remove the target from predictors before scaling/splitting and show the grid‑search results explicitly.
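To show the grid-search results explicitly, something along these lines could work. Treat it as a sketch rather than the expected solution: the dataset is again scikit-learn's bundled wine data standing in for yours, and the split settings and the accuracy scoring are assumptions.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): bundled wine dataset, target renamed to "class".
wine_df = load_wine(as_frame=True).frame.rename(columns={"target": "class"})
X = wine_df.drop(columns="class")
y = wine_df["class"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=123, stratify=y
)

# Scale using statistics from the training predictors only.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Search k = 1..50 with 10-fold cross-validation on the training set.
param_grid = {"n_neighbors": list(range(1, 51))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10, scoring="accuracy")
grid.fit(X_train_scaled, y_train)

# Report the results instead of leaving them inside the fitted object.
results = pd.DataFrame(grid.cv_results_)[
    ["param_n_neighbors", "mean_test_score", "std_test_score", "rank_test_score"]
]
print(results.sort_values("rank_test_score").head())
print("Best k:", grid.best_params_["n_neighbors"])
print("Best CV accuracy:", round(grid.best_score_, 3))
print("Test accuracy:", round(grid.score(X_test_scaled, y_test), 3))
```

Printing the table of mean_test_score per k alongside best_params_ makes it easy to see how the chosen k compares with its neighbours.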
