Commit 3009035

Lazy Programmer (lazyprogrammer) authored and committed
initial commit
add url to tutorial Create README.md linear regression tutorial update a comment for linear regression add Classifier class and change main() to test() rename classifier to regressor. oops! add line of best fit for absolute error add k-means example fix bug and add link to tutorial update readme add bayes classifier add knn add tutorial links matricize linear regression matricize regressor also matricize regressor also add predict and fit to regressor add logistic regression for mnist add tutorial link logistic add linear regression 1-d class code, add GMM for some reason was not added yet oops, now add GMM add r-squared add 2d and poly examples wip logistic regression class add course urls ann examples add class links try batch on donut problem ann oops more oopses change slow code too nlp class fix lsa add url more links misc stuff oops add new class update comment add gpu script add course links add one more i forget cnn update cnn class add unsupervised class oops erase unneeded line use eye renet test fix reshaping add GRU add real mnist add batches and downcast add all scan version unsupervised deep learning updates visualize features add class url fix load and save use get_value instead of eval visualize better e-commerce example visualize lda ann updates update tf update deep unsupervised + a test minor unsupervised fix some wip files add lstm wiki + minor fixes compartmentalize gru,lstm add decent embeddings add visualize embeddings add vanishing gradient demo and xwing 30 epochs add url to rnn rrnn variable learning rate add linear regression examples wip add word2vec and glove update remove some unnecessary comments add rnn init cnn additions rntn update util print out remove divide by len labels, only look at root for scoring extra comment increment j recursive nn tensorflow faster train score rntn add theano renn and rntn messing around with rntn add pos hmm make sure hmm class init exists nlp2 url wiki data split by paragraph update gitignore better 
cost unsupervised2 ssl add dropout files fix class name overfitting bias in right order add clustering extras + urls supervised class update comment add visualization airline example add bonus message remove irrelevant comments minor changes numpy class add nonlinearity to tf cnn tf load and save remove unnecessary code small update to logistic code bayesian ab testing tiny fix update url fix add word2idx json add batch + tf rnn update tensorflow dropout update bias correctly comment out the right derivative bayesian examples k-means additions add url add regularization code and overfitting code for linear and logistic supervised class regression init for unsupervised class init for ann2 ensembles class tiny fix alt nesterov add dlc url ab testing add dlc url to ann add dlc url to ann2 update cnn urls update hmm urls update linear urls update logistic urls update nlp urls update urls nlp2 update urls numpy update urls rnn update urls supervised update urls supervised2 update urls unsupervised update urls unsupervised2 add course url to readme cleanup change it back remove old hmm change it back add tic tac toe to rl add policy iteration and value iteration fix dt update readme with course links update update comment tiny update more correct add url add course url update remove old line add course urls and rl files add dlc url re-add linear programming linear regression update cnn typo fix relu derivative cast y to int misc updates update tf update for tf1.0 tiny xor update update xor label change var names initial commit rl2 update readme update rl2 make tf example compatible with python 3 tiny update add done flag use list copy make purity work different optimizer use brown corpus instead of wiki add brown corpus update glove "a" only gets updated at top of loop in q-learning updating a to a2 is ineffectual as it gets overwritten at top of loop before use anyway. 
q-learning is off policy so that line is misleading tiny fix plot mean images add new examples one more add brown update add sklearn example update add brown oops oops Set theme jekyll-theme-cayman and migrate Page Generator content Update index.md test tf scan hmm tf tf language model actual tf language model updates different way of catching loops minor change extra help finding files help finding files cnn oops add extra reading ann 2 Updating cnn_theano code according to theano API last update. update update add extra reading rl extra reading rl2 add hints for data oops update name clean up more reading add relu update add new theano fix tf unsupervised deep learning update custom cnn test single autoencoder theano autoencoder tiny update tiny update softplus py 3 test other means change it back better rmsprop make it like tensorflow more materials python3 compatibility for web service example add updated stuff that had not been pushed reading material fix ucb1 change back numbers fix fix s,a,r tuples Fix conv gans add link more links minor updates small changes tiny fix misc updates new examples update kmeans mnist just for fun oops
1 parent 54f4201 commit 3009035

311 files changed, +724037 -0 lines changed
.gitignore

+5
@@ -0,0 +1,5 @@
+*.DS_Store
+*.pyc
+large_files
+large_files/*
+nlp_class2/chunking/*

README.md

+67
@@ -0,0 +1,67 @@
+machine_learning_examples
+=========================
+
+A collection of machine learning examples and tutorials.
+
+Find associated tutorials at https://lazyprogrammer.me
+
+Find associated courses at https://deeplearningcourses.com
+
+
+Direct Course Links
+===================
+
+Deep Learning Prerequisites: The Numpy Stack in Python
+https://deeplearningcourses.com/c/deep-learning-prerequisites-the-numpy-stack-in-python
+
+Deep Learning Prerequisites: Linear Regression in Python
+https://deeplearningcourses.com/c/data-science-linear-regression-in-python
+
+Deep Learning Prerequisites: Logistic Regression in Python
+https://deeplearningcourses.com/c/data-science-logistic-regression-in-python
+
+Deep Learning in Python
+https://deeplearningcourses.com/c/data-science-deep-learning-in-python
+
+Cluster Analysis and Unsupervised Machine Learning in Python
+https://deeplearningcourses.com/c/cluster-analysis-unsupervised-machine-learning-python
+
+Data Science: Supervised Machine Learning in Python
+https://deeplearningcourses.com/c/data-science-supervised-machine-learning-in-python
+
+Bayesian Machine Learning in Python: A/B Testing
+https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+
+Easy Natural Language Processing in Python
+https://deeplearningcourses.com/c/data-science-natural-language-processing-in-python
+
+Practical Deep Learning in Theano and TensorFlow
+https://deeplearningcourses.com/c/data-science-deep-learning-in-theano-tensorflow
+
+Ensemble Machine Learning in Python: Random Forest and AdaBoost
+https://deeplearningcourses.com/c/machine-learning-in-python-random-forest-adaboost
+
+Deep Learning: Convolutional Neural Networks in Python
+https://deeplearningcourses.com/c/deep-learning-convolutional-neural-networks-theano-tensorflow
+
+Unsupervised Deep Learning in Python
+https://deeplearningcourses.com/c/unsupervised-deep-learning-in-python
+
+Unsupervised Machine Learning: Hidden Markov Models in Python
+https://deeplearningcourses.com/c/unsupervised-machine-learning-hidden-markov-models-in-python
+
+Deep Learning: Recurrent Neural Networks in Python
+https://deeplearningcourses.com/c/deep-learning-recurrent-neural-networks-in-python
+
+Advanced Natural Language Processing: Deep Learning in Python
+https://deeplearningcourses.com/c/natural-language-processing-with-deep-learning-in-python
+
+Artificial Intelligence: Reinforcement Learning in Python
+https://deeplearningcourses.com/c/artificial-intelligence-reinforcement-learning-in-python
+
+Advanced AI: Deep Reinforcement Learning in Python
+https://deeplearningcourses.com/c/deep-reinforcement-learning-in-python
+
+Deep Learning: GANs and Variational Autoencoders
+https://deeplearningcourses.com/c/deep-learning-gans-and-variational-autoencoders
+

ab_testing/bayesian_bandit.py

+68
@@ -0,0 +1,68 @@
+# From the course: Bayesian Machine Learning in Python: A/B Testing
+# https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+# https://www.udemy.com/bayesian-machine-learning-in-python-ab-testing
+import matplotlib.pyplot as plt
+import numpy as np
+from scipy.stats import beta
+
+
+NUM_TRIALS = 2000
+BANDIT_PROBABILITIES = [0.2, 0.5, 0.75]
+
+
+class Bandit(object):
+    def __init__(self, p):
+        self.p = p
+        self.a = 1
+        self.b = 1
+
+    def pull(self):
+        return np.random.random() < self.p
+
+    def sample(self):
+        return np.random.beta(self.a, self.b)
+
+    def update(self, x):
+        self.a += x
+        self.b += 1 - x
+
+
+def plot(bandits, trial):
+    x = np.linspace(0, 1, 200)
+    for b in bandits:
+        y = beta.pdf(x, b.a, b.b)
+        plt.plot(x, y, label="real p: %.4f" % b.p)
+    plt.title("Bandit distributions after %s trials" % trial)
+    plt.legend()
+    plt.show()
+
+
+def experiment():
+    bandits = [Bandit(p) for p in BANDIT_PROBABILITIES]
+
+    sample_points = [5,10,20,50,100,200,500,1000,1500,1999]
+    for i in xrange(NUM_TRIALS):
+
+        # take a sample from each bandit
+        bestb = None
+        maxsample = -1
+        allsamples = [] # let's collect these just to print for debugging
+        for b in bandits:
+            sample = b.sample()
+            allsamples.append("%.4f" % sample)
+            if sample > maxsample:
+                maxsample = sample
+                bestb = b
+        if i in sample_points:
+            print "current samples: %s" % allsamples
+            plot(bandits, i)
+
+        # pull the arm for the bandit with the largest sample
+        x = bestb.pull()
+
+        # update the distribution for the bandit whose arm we just pulled
+        bestb.update(x)
+
+
+if __name__ == "__main__":
+    experiment()
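The loop in `experiment()` is Thompson sampling: draw one value from each arm's Beta posterior, pull the arm with the largest draw, then update only that arm. The following is a minimal sketch of the same idea, not the file as committed. It assumes Python 3 and uses only the standard library (`random.betavariate` in place of `np.random.beta`); the names `BetaBandit` and `thompson` and the `seed` parameter are hypothetical and exist only for this illustration.

```python
import random


class BetaBandit:
    """Bernoulli arm with a Beta(a, b) posterior over its win rate (illustrative name)."""

    def __init__(self, p):
        self.p = p  # true win probability, hidden from the sampler
        self.a = 1  # Beta(1, 1) is a uniform prior
        self.b = 1

    def pull(self):
        # Simulate one Bernoulli trial: 1 on a win, 0 otherwise.
        return 1 if random.random() < self.p else 0

    def sample(self):
        # Draw one plausible win rate from the current posterior.
        return random.betavariate(self.a, self.b)

    def update(self, x):
        # Conjugate update: wins are added to a, losses to b.
        self.a += x
        self.b += 1 - x


def thompson(probabilities, num_trials, seed=0):
    """Run Thompson sampling and return the bandits with their final posteriors."""
    random.seed(seed)  # assumption: fixed seed so repeated runs match
    bandits = [BetaBandit(p) for p in probabilities]
    for _ in range(num_trials):
        # max() calls the key once per arm, so each arm is sampled exactly once.
        best = max(bandits, key=lambda b: b.sample())
        best.update(best.pull())
    return bandits


bandits = thompson([0.2, 0.5, 0.75], 2000)
# Each pull adds exactly 1 to a + b, so a + b - 2 counts the pulls of an arm.
pulls = [b.a + b.b - 2 for b in bandits]
print(pulls)
```

With these probabilities, most of the 2000 pulls should end up on the 0.75 arm, because the posteriors of the weaker arms rarely produce the largest sample once a few losses have been recorded.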

ab_testing/chisquare.py

+63
@@ -0,0 +1,63 @@
+# From the course: Bayesian Machine Learning in Python: A/B Testing
+# https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+# https://www.udemy.com/bayesian-machine-learning-in-python-ab-testing
+import numpy as np
+import matplotlib.pyplot as plt
+from scipy.stats import chi2, chi2_contingency
+
+# contingency table
+#        click       no click
+#------------------------------
+# ad A |   a            b
+# ad B |   c            d
+#
+# chi^2 = (ad - bc)^2 (a + b + c + d) / [ (a + b)(c + d)(a + c)(b + d)]
+# degrees of freedom = (#cols - 1) x (#rows - 1) = (2 - 1)(2 - 1) = 1
+
+# short example
+
+# T = np.array([[36, 14], [30, 25]])
+# c2 = np.linalg.det(T)**2 * T.sum() / ( T[0].sum()*T[1].sum()*T[:,0].sum()*T[:,1].sum() )
+# p_value = 1 - chi2.cdf(x=c2, df=1)
+
+# equivalent:
+# (36-31.429)**2/31.429+(14-18.571)**2/18.571 + (30-34.571)**2/34.571 + (25-20.429)**2/20.429
+
+
+class DataGenerator:
+    def __init__(self, p1, p2):
+        self.p1 = p1
+        self.p2 = p2
+
+    def next(self):
+        click1 = 1 if (np.random.random() < self.p1) else 0
+        click2 = 1 if (np.random.random() < self.p2) else 0
+        return click1, click2
+
+
+def get_p_value(T):
+    # same as scipy.stats.chi2_contingency(T, correction=False)
+    det = T[0,0]*T[1,1] - T[0,1]*T[1,0]
+    c2 = float(det) / T[0].sum() * det / T[1].sum() * T.sum() / T[:,0].sum() / T[:,1].sum()
+    p = 1 - chi2.cdf(x=c2, df=1)
+    return p
+
+
+def run_experiment(p1, p2, N):
+    data = DataGenerator(p1, p2)
+    p_values = np.empty(N)
+    T = np.zeros((2, 2)).astype(np.float32)
+    for i in xrange(N):
+        c1, c2 = data.next()
+        T[0,c1] += 1
+        T[1,c2] += 1
+        # ignore the first 10 values
+        if i < 10:
+            p_values[i] = None
+        else:
+            p_values[i] = get_p_value(T)
+    plt.plot(p_values)
+    plt.plot(np.ones(N)*0.05)
+    plt.show()
+
+run_experiment(0.1, 0.11, 20000)
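The commented "short example" in this file can be checked by hand. Below is a rough sketch, assuming Python 3 and the standard library only, that evaluates the closed-form 2x2 statistic and the equivalent observed-versus-expected sum for the table `[[36, 14], [30, 25]]`. The function names are hypothetical. The p-value relies on the identity that, for one degree of freedom, the chi-square survival function equals `erfc(sqrt(x / 2))`, used here in place of `1 - chi2.cdf(x, df=1)`.

```python
import math


def chi2_closed_form(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    det = a * d - b * c
    return det * det * n / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_from_expected(table):
    """Same statistic as the sum of (observed - expected)^2 / expected over all cells."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


stat = chi2_closed_form(36, 14, 30, 25)
stat_expected = chi2_from_expected([[36, 14], [30, 25]])
p = p_value_df1(stat)
print(stat, stat_expected, p)
```

Both forms give a statistic of roughly 3.42, so this table alone would not reach the 0.05 threshold that `run_experiment` draws as a horizontal line.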

ab_testing/ci_comparison.py

+37
@@ -0,0 +1,37 @@
+# From the course: Bayesian Machine Learning in Python: A/B Testing
+# https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+# https://www.udemy.com/bayesian-machine-learning-in-python-ab-testing
+import matplotlib.pyplot as plt
+import numpy as np
+from scipy.stats import beta, norm
+
+T = 501 # number of coin tosses
+true_ctr = 0.5
+a, b = 1, 1 # beta priors
+plot_indices = (10, 20, 30, 50, 100, 200, 500)
+data = np.empty(T)
+for i in xrange(T):
+    x = 1 if np.random.random() < true_ctr else 0
+    data[i] = x
+
+    # update a and b
+    a += x
+    b += 1 - x
+
+    if i in plot_indices:
+        # maximum likelihood estimate of ctr
+        p = data[:i+1].mean() # include the sample just drawn at index i
+        n = i + 1 # number of samples collected so far
+        std = np.sqrt(p*(1-p)/n)
+
+        # gaussian
+        x = np.linspace(0, 1, 200)
+        g = norm.pdf(x, loc=p, scale=std)
+        plt.plot(x, g, label='Gaussian Approximation')
+
+        # beta
+        posterior = beta.pdf(x, a=a, b=b)
+        plt.plot(x, posterior, label='Beta Posterior')
+        plt.legend()
+        plt.title("N = %s" % n)
+        plt.show()
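The two curves plotted by this script can also be compared numerically through their means and standard deviations. The sketch below assumes Python 3 and a Beta(1, 1) prior as in the script; the function names and the click counts (7 of 10, 350 of 500) are made up for illustration. It uses the standard closed forms: the Gaussian approximation has mean p and standard deviation sqrt(p(1 - p)/n), and a Beta(a, b) distribution has mean a/(a + b) and variance ab / ((a + b)^2 (a + b + 1)).

```python
import math


def gaussian_approx(clicks, n):
    """Mean and std of the normal approximation around the MLE clicks / n."""
    p_hat = clicks / n
    return p_hat, math.sqrt(p_hat * (1 - p_hat) / n)


def beta_posterior(clicks, n, a0=1, b0=1):
    """Mean and std of the Beta posterior under a Beta(a0, b0) prior."""
    a = a0 + clicks
    b = b0 + (n - clicks)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


# Hypothetical counts: a small sample and a large one with the same click rate.
for clicks, n in [(7, 10), (350, 500)]:
    g_mean, g_std = gaussian_approx(clicks, n)
    b_mean, b_std = beta_posterior(clicks, n)
    print(n, round(g_mean, 4), round(g_std, 4), round(b_mean, 4), round(b_std, 4))
```

At n = 10 the prior still pulls the posterior mean noticeably toward 0.5, while at n = 500 the two summaries nearly coincide, which matches what the plots show as N grows.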

ab_testing/convergence.py

+34
@@ -0,0 +1,34 @@
+# From the course: Bayesian Machine Learning in Python: A/B Testing
+# https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+# https://www.udemy.com/bayesian-machine-learning-in-python-ab-testing
+import matplotlib.pyplot as plt
+import numpy as np
+from bayesian_bandit import Bandit
+
+
+def run_experiment(p1, p2, p3, N):
+    bandits = [Bandit(p1), Bandit(p2), Bandit(p3)]
+
+    data = np.empty(N)
+
+    for i in xrange(N):
+        # thompson sampling
+        j = np.argmax([b.sample() for b in bandits])
+        x = bandits[j].pull()
+        bandits[j].update(x)
+
+        # for the plot
+        data[i] = x
+    cumulative_average_ctr = np.cumsum(data) / (np.arange(N) + 1)
+
+    # plot moving average ctr
+    plt.plot(cumulative_average_ctr)
+    plt.plot(np.ones(N)*p1)
+    plt.plot(np.ones(N)*p2)
+    plt.plot(np.ones(N)*p3)
+    plt.ylim((0,1))
+    plt.xscale('log')
+    plt.show()
+
+
+run_experiment(0.2, 0.25, 0.3, 100000)
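The quantity plotted here, `np.cumsum(data) / (np.arange(N) + 1)`, is a running mean: entry i is the average reward over the first i + 1 pulls. A small standard-library sketch of the same computation, assuming Python 3, with a hypothetical function name and a made-up reward sequence:

```python
from itertools import accumulate


def cumulative_average(rewards):
    """Return the running mean: element i averages rewards[0..i]."""
    return [total / (i + 1) for i, total in enumerate(accumulate(rewards))]


# Made-up sequence of four pulls: win, loss, win, win.
averages = cumulative_average([1, 0, 1, 1])
print(averages)
```

If Thompson sampling is converging, this running mean should drift toward the best arm's rate (0.3 in the call above) rather than toward the average of the three rates.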

ab_testing/demo.py

+27
@@ -0,0 +1,27 @@
+# From the course: Bayesian Machine Learning in Python: A/B Testing
+# https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+# https://www.udemy.com/bayesian-machine-learning-in-python-ab-testing
+import numpy as np
+import matplotlib.pyplot as plt
+from scipy.stats import beta
+
+def plot(a, b, trial, ctr):
+    x = np.linspace(0, 1, 200)
+    y = beta.pdf(x, a, b)
+    mean = float(a) / (a + b)
+    plt.plot(x, y)
+    plt.title("Distributions after %s trials, true rate = %.1f, mean = %.2f" % (trial, ctr, mean))
+    plt.show()
+
+true_ctr = 0.3
+a, b = 1, 1 # beta parameters
+show = [0, 5, 10, 25, 50, 100, 200, 300, 500, 700, 1000, 1500]
+for t in xrange(1501):
+    coin_toss_result = (np.random.random() < true_ctr)
+    if coin_toss_result:
+        a += 1
+    else:
+        b += 1
+
+    if t in show:
+        plot(a, b, t+1, true_ctr)
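The loop above applies the conjugate Beta-Bernoulli update one toss at a time. Because each toss only increments a counter, the final posterior depends on how many successes and failures occurred, not on their order. A tiny sketch of that property, assuming Python 3, with a hypothetical helper name and a made-up outcome list:

```python
def update_beta(a, b, outcomes):
    """Apply the update one outcome at a time: success -> a += 1, failure -> b += 1."""
    for x in outcomes:
        if x:
            a += 1
        else:
            b += 1
    return a, b


# Start from the uniform Beta(1, 1) prior and observe 2 successes and 3 failures.
a, b = update_beta(1, 1, [1, 0, 0, 1, 0])
posterior_mean = a / (a + b)
print(a, b, posterior_mean)
```

Here the posterior is Beta(3, 4), the same result as adding the success and failure counts to the prior parameters in one step.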

ab_testing/ttest.py

+23
@@ -0,0 +1,23 @@
+# From the course: Bayesian Machine Learning in Python: A/B Testing
+# https://deeplearningcourses.com/c/bayesian-machine-learning-in-python-ab-testing
+# https://www.udemy.com/bayesian-machine-learning-in-python-ab-testing
+import numpy as np
+from scipy import stats
+
+# generate data
+N = 10
+a = np.random.randn(N) + 2 # mean 2, variance 1
+b = np.random.randn(N) # mean 0, variance 1
+
+# roll your own t-test:
+var_a = a.var(ddof=1) # unbiased estimator, divide by N-1 instead of N
+var_b = b.var(ddof=1)
+s = np.sqrt( (var_a + var_b) / 2 ) # pooled standard deviation (equal group sizes)
+t = (a.mean() - b.mean()) / (s * np.sqrt(2.0/N)) # t-statistic
+df = 2*N - 2 # degrees of freedom
+p = 1 - stats.t.cdf(np.abs(t), df=df) # one-sided test p-value
+print "t:\t", t, "p:\t", 2*p # two-sided test p-value
+
+# built-in t-test:
+t2, p2 = stats.ttest_ind(a, b)
+print "t2:\t", t2, "p2:\t", p2
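The hand-rolled statistic above is the equal-variance two-sample t-test for groups of the same size N. The sketch below reproduces the arithmetic on fixed numbers so it can be checked by hand. It assumes Python 3 and the standard library; the function name and both data lists are invented for illustration. It stops at the statistic and the degrees of freedom, since the p-value needs the t distribution's CDF (`stats.t.cdf` in the file), which the standard library does not provide.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t-statistic assuming equal variances and equal group sizes."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal group sizes")
    n = len(a)
    # statistics.variance divides by n - 1, matching var(ddof=1) in the file.
    s = math.sqrt((variance(a) + variance(b)) / 2)
    t = (mean(a) - mean(b)) / (s * math.sqrt(2.0 / n))
    df = 2 * n - 2
    return t, df


# Made-up samples: group_a has mean 2.0, group_b has mean 0.0.
group_a = [2.1, 1.9, 2.5, 2.0, 1.5]
group_b = [0.1, -0.2, 0.4, 0.0, -0.3]
t, df = pooled_t_statistic(group_a, group_b)
print(t, df)
```

For these numbers the sample variances are 0.13 and 0.075, so the pooled standard deviation is sqrt(0.1025) and the statistic is 2.0 / sqrt(0.1025 * 0.4), a little under 9.9 on 8 degrees of freedom.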
