Commit cb74479

Merge pull request #4 from kvsikand/master
put stuff on 2 pages, added short mode seeking section
2 parents 7ffa50a + 2c71a03 commit cb74479

2 files changed

Lines changed: 8 additions & 1 deletion

File tree

189cheatSheet.pdf

-182 Bytes
Binary file not shown.

189cheatSheet.tex

Lines changed: 8 additions & 1 deletion
@@ -255,7 +255,8 @@ \subsection{Boosting}
255255
Weak Learner: a classifier with better than 50\% accuracy (better than random guessing on a binary problem).\\
256256
Train a weak learner to get a weak classifier. Test it on the training data, up-weight the misclassified points, and down-weight the correctly classified points. Train a new weak learner on the reweighted data. Repeat. A new point is classified by every weak learner, and the output class is the sign of a weighted average of the weak learner outputs. Boosting can overfit: if there is label noise, it keeps up-weighting the mislabeled data.\\
257257
{\bf AdaBoost} is a boosting algorithm. The weak learner weights are given by $\alpha_t=\frac{1}{2}\ln\left(\frac{1-\epsilon_t}{\epsilon_t}\right)$ where $\epsilon_t=\Pr_{D_t}(h_t(x_i)\ne y_i)$ (the probability of misclassification under the current data weights $D_t$). The data weights are updated as $D_{t+1}(i)=\frac{D_t(i)\exp(-\alpha_t y_i h_t(x_i))}{Z_t}$ where $Z_t$ is a normalization factor.
258-
\newpage
258+
\vfill
259+
\columnbreak
259260
\subsection{Neural Networks}
260261
Neural Nets explore what you can do by combining perceptrons, each of which is a simple linear classifier. We use a soft threshold for each activation function $\theta$ because it is twice differentiable.
261262
\includegraphics[scale=0.31]{NN.pdf} \ \includegraphics[scale=0.2]{NN2.pdf}
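The AdaBoost formulas in the Boosting hunk above can be sketched as a single reweighting round. This is a minimal illustration, not code from the cheat sheet: the function name `adaboost_round` and its arguments are hypothetical, labels and weak classifier outputs are assumed to be in $\{-1,+1\}$, and the input weights are assumed to already sum to 1.

```python
import math


def adaboost_round(weights, labels, predictions):
    """One AdaBoost round: return (alpha_t, D_{t+1}) from D_t, y_i and h_t(x_i).

    Hypothetical helper for illustration; labels and predictions are +1 or -1,
    and weights is the current distribution D_t (assumed to sum to 1).
    """
    # epsilon_t: total weight of the points the weak classifier gets wrong
    eps = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < eps < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")

    # alpha_t = 1/2 * ln((1 - eps) / eps)
    alpha = 0.5 * math.log((1.0 - eps) / eps)

    # D_t(i) * exp(-alpha_t * y_i * h_t(x_i)): mistakes grow, correct points shrink
    unnormalized = [
        w * math.exp(-alpha * y * h)
        for w, y, h in zip(weights, labels, predictions)
    ]

    # Z_t normalizes the new weights back to a distribution
    z = sum(unnormalized)
    return alpha, [u / z for u in unnormalized]


if __name__ == "__main__":
    alpha, new_weights = adaboost_round(
        [0.25, 0.25, 0.25, 0.25], [1, 1, -1, -1], [1, 1, -1, 1]
    )
    print(alpha, new_weights)
```

In the usage example one of four equally weighted points is misclassified, so $\epsilon_t = 0.25$ and the misclassified point ends up carrying half of the total weight after normalization.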
@@ -298,6 +299,12 @@ \subsection{Clustering}
298299
\\{\bf Nonparametric Discriminative Clustering}: Histogram, Kernel Density Estimation.
299300
\\Kernel: $P(x) = \frac{1}{n} \sum_{i=1}^n K(x-x_i)$, s.t. $K$ is normalized, symmetric, and $\lim_{\|x\| \rightarrow \infty} \|x\|^d K(x) = 0$.
300301

302+
\subsection{Mode Seeking}
303+
To find ``bumps'' (modes) in the distribution. Mean Shift: calculate
304+
\[m(x)=\left[ \frac{\sum_{i=1}^n x_i\, g\left(\frac{\|x-x_i\|^2}{h}\right)}{\sum_{i=1}^n g\left(\frac{\|x-x_i\|^2}{h}\right)} - x \right].\]
305+
Then translate the kernel window by $m(x)$ and repeat until convergence. To deal with saddle points, perturb modes and prune.
306+
307+
301308
% You can even have references
302309
\rule{0.3\linewidth}{0.25pt}
303310
\newpage
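The mean shift step added in the Mode Seeking section can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the commit: the names `mean_shift_vector` and `seek_mode` are hypothetical, the profile is assumed to be $g(u)=e^{-u}$ (the cheat sheet does not specify $g$), and the stopping tolerance and iteration cap are arbitrary choices.

```python
import math


def mean_shift_vector(x, points, h, g=lambda u: math.exp(-u)):
    """Return m(x) = sum_i x_i g(||x - x_i||^2 / h) / sum_i g(||x - x_i||^2 / h) - x.

    Hypothetical helper: x is a sequence of floats, points is a list of
    same-length sequences, h is the bandwidth, and g is an assumed profile.
    """
    numerator = [0.0] * len(x)
    denominator = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        weight = g(sq_dist / h)
        denominator += weight
        for k, coord in enumerate(p):
            numerator[k] += weight * coord
    # Weighted mean of the data minus the current position
    return [n / denominator - a for n, a in zip(numerator, x)]


def seek_mode(x, points, h, tol=1e-8, max_iter=500):
    """Translate the window by m(x) until the shift is smaller than tol."""
    x = list(x)
    for _ in range(max_iter):
        shift = mean_shift_vector(x, points, h)
        x = [a + s for a, s in zip(x, shift)]
        if math.sqrt(sum(s * s for s in shift)) < tol:
            break
    return x


if __name__ == "__main__":
    data = [(0.0,), (0.1,), (-0.1,), (5.0,), (5.1,), (4.9,)]
    print(seek_mode([0.5], data, h=1.0))  # climbs toward the cluster near 0
    print(seek_mode([4.5], data, h=1.0))  # climbs toward the cluster near 5
```

Starting points near different clusters climb to different modes, which is why the procedure is usually run from many starting points. The perturb-and-prune step for saddle points is not shown here.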
