added bagging with decision tree classifier
samirash committed Apr 26, 2017
1 parent 486bb6a commit ed811de
Showing 1 changed file with 46 additions and 5 deletions.
51 changes: 46 additions & 5 deletions labs/lab5-draft.ipynb
@@ -171,12 +171,53 @@
"to check the parameters. This function provides more than just bagging, for instance in addition to you can also take random subsets of the features (max_features). In order to implement bagging, you need to keep all the features but use random subsets of the samples. You can use the bootstrap parameter to specify that your samples are drawn with replacement. Use n_estimators and max_samples to specify the number of the estimators you want to use and the number of samples you want to use for each of them."
]
},
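{
"cell_type": "markdown",
"metadata": {},
"source": [
"*An illustrative cell added for clarity (not part of the original lab): instantiate a BaggingClassifier and print its parameters. Note that in scikit-learn 1.2+ the base_estimator argument is named estimator.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Illustrative sketch: list the available BaggingClassifier parameters.\n",
"from sklearn.ensemble import BaggingClassifier\n",
"\n",
"print(BaggingClassifier().get_params())"
]
},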
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Bagging with Decision Tree Classifier\n",
"Use Decision Tree as your base classifier. You can start with depth 20 for your decision trees. Since the data\n",
"is very unbalanced regarding to the number of True and False samples, use the class_weight parameter to specify\n",
"how much the model should prefer correctly classifying one class over another.\n",
"\n",
"Define your classifier. Use fit and score functions to fit your model and compute the score."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
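{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A possible sketch for the cell above (not the official solution). It assumes train/test arrays named X_train, y_train, X_test, y_test from earlier in the lab; n_estimators=10 and max_samples=0.5 are arbitrary choices.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Possible sketch: bagging with a depth-20 decision tree base classifier.\n",
"# Assumes X_train, y_train, X_test, y_test were defined earlier in the lab.\n",
"# In scikit-learn 1.2+, base_estimator is named estimator.\n",
"from sklearn.ensemble import BaggingClassifier\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"base = DecisionTreeClassifier(max_depth=20, class_weight='balanced')\n",
"clf = BaggingClassifier(base_estimator=base, n_estimators=10,\n",
"                        max_samples=0.5, bootstrap=True)\n",
"clf.fit(X_train, y_train)\n",
"print(clf.score(X_test, y_test))"
]
},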
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###What happens when you try different bagging parameters?\n",
"Try n_estimator = { 5 ,10, 20 } , max_depth = {10, 20} and max_samples = { 0.35, 0.5, 0.65 }\n",
"and report the results.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
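{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A possible sketch for the parameter sweep above (not the official solution); it assumes the same X_train, y_train, X_test, y_test arrays.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Possible sketch: sweep the suggested bagging parameters and print scores.\n",
"# Assumes X_train, y_train, X_test, y_test were defined earlier in the lab.\n",
"from itertools import product\n",
"from sklearn.ensemble import BaggingClassifier\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"for n_est, depth, frac in product([5, 10, 20], [10, 20], [0.35, 0.5, 0.65]):\n",
"    base = DecisionTreeClassifier(max_depth=depth, class_weight='balanced')\n",
"    clf = BaggingClassifier(base_estimator=base, n_estimators=n_est,\n",
"                            max_samples=frac, bootstrap=True)\n",
"    clf.fit(X_train, y_train)\n",
"    print(n_est, depth, frac, clf.score(X_test, y_test))"
]
},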
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Bagging with logistic regression\n",
"Use logistic regression as your base classifier. To keep it simple use l2 norm and C = 1. Since the data is very unbalanced regarding to the number of True and False samples, use the class_weight parameter to specify how much the model should prefer correctly classifying one class over another.\n",
"Now you will try using another base classifier. Use logistic regression as your base classifier. To keep it simple use l2 norm and C = 1. Since the data is very unbalanced regarding to the number of True and False samples, use the class_weight parameter to specify how much the model should prefer correctly classifying one class over another.\n",
"\n",
"Define your classifier. Use fit and score functions to fit your model and compute the score."
]
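},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A possible sketch for bagging with logistic regression (not the official solution); it assumes the same X_train, y_train, X_test, y_test arrays, and n_estimators=10 and max_samples=0.5 are arbitrary choices.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Possible sketch: bagging with logistic regression base classifiers.\n",
"# Assumes X_train, y_train, X_test, y_test were defined earlier in the lab.\n",
"from sklearn.ensemble import BaggingClassifier\n",
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"base = LogisticRegression(penalty='l2', C=1.0, class_weight='balanced')\n",
"clf = BaggingClassifier(base_estimator=base, n_estimators=10,\n",
"                        max_samples=0.5, bootstrap=True)\n",
"clf.fit(X_train, y_train)\n",
"print(clf.score(X_test, y_test))"
]
},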
@@ -231,7 +272,7 @@
"metadata": {},
"source": [
"## AdaBoost with decision tree of depth one\n",
"Use decision tree of depth one as your base classifier. For Adaboost parameters use n_estimators = 20. Again use the class_weight parameter for your decision tree classifier to deal with the unbalanced data. You may use the 'balanced' option.\n",
"Use decision tree of depth one as your base classifier. For Adaboost parameters use n_estimators = 5. Again use the class_weight parameter for your decision tree classifier to deal with the unbalanced data. You may use the 'balanced' option.\n",
"\n",
"Define your classifier. Use fit and score functions to fit your model and compute the score."
]
@@ -253,7 +294,7 @@
"source": [
"### What happens when you decrease or increase the number of your estimators?\n",
"\n",
"Try using n_estimators = { 10, 20, 40} and report the results."
"Try using n_estimators = { 1, 2, 5, 10} and report the results."
]
},
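{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A possible sketch for the AdaBoost experiments above (not the official solution); it assumes the same X_train, y_train, X_test, y_test arrays. In scikit-learn 1.2+ the base_estimator argument is named estimator.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Possible sketch: AdaBoost with depth-one trees, varying n_estimators.\n",
"# Assumes X_train, y_train, X_test, y_test were defined earlier in the lab.\n",
"from sklearn.ensemble import AdaBoostClassifier\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"stump = DecisionTreeClassifier(max_depth=1, class_weight='balanced')\n",
"for n_est in [1, 2, 5, 10]:\n",
"    clf = AdaBoostClassifier(base_estimator=stump, n_estimators=n_est)\n",
"    clf.fit(X_train, y_train)\n",
"    print(n_est, clf.score(X_test, y_test))"
]
},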
{
@@ -275,7 +316,7 @@
"\n",
"In this part you will make a random forest classifier. Refer to\n",
"\n",
"http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier\n",
"http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html\n",
"\n",
"for the specifications and the parameters. You can use n_estimators and max_features to specify the number of the estimators and the number of features you want to use for building each tree. \n",
"\n"
Expand All @@ -287,7 +328,7 @@
"source": [
"## Exeriment with Random Forest \n",
"\n",
"Use n_estimator = 10 and max_features=sqrt(n_features). You may use max_depth=20 in combination with min_samples_split=1 to stop the trees from growing too deep.\n",
"Use n_estimator = 10 and max_features=sqrt(n_features). You may use max_depth=20 in combination with min_samples_split=1 to stop the trees from growing too deep. Again use the class_weight parameter for your decision tree classifier to deal with the unbalanced data.\n",
"\n",
"Define your classifier. Use fit and score functions to fit your model and compute the score.\n"
]
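},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*A possible sketch for the random forest experiment above (not the official solution); it assumes the same X_train, y_train, X_test, y_test arrays.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Possible sketch: random forest with the suggested parameters.\n",
"# Assumes X_train, y_train, X_test, y_test were defined earlier in the lab.\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"\n",
"clf = RandomForestClassifier(n_estimators=10, max_features='sqrt',\n",
"                             max_depth=20, class_weight='balanced')\n",
"clf.fit(X_train, y_train)\n",
"print(clf.score(X_test, y_test))"
]
},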
