- This is a note on how I interpret the tutorials, articles, and books that I've read.
- All sources are cited; if a note has no citation, it is based on my personal experience.
- There could be information distortion, so I would be very grateful if you could let me know via Twitter or LinkedIn.
- Boltzmann Machine
- Backward Pass
- Call optimizer.zero_grad() after each .step() to prevent the gradients from .backward() accumulating across iterations (see the sketch below).
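- A minimal sketch of the usual ordering in a training loop; the model, data, and hyperparameters here are placeholders, not from the original note:

```python
import torch
import torch.nn as nn

# Toy setup: a tiny linear model on random data (placeholder values).
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
x, y = torch.randn(32, 10), torch.randn(32, 1)

for epoch in range(5):
    optimizer.zero_grad()   # clear gradients accumulated by the previous .backward()
    loss = loss_fn(model(x), y)
    loss.backward()         # accumulate gradients into each parameter's .grad
    optimizer.step()        # update parameters using the current .grad
```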
-
CNN: better understanding of CNNs by visualizing them (see the sketch after this list)
- 2D Visualization of a CNN [src: ryerson.ca]
- Visualizing convolutional features using PyTorch [github: fg91]
- pytorch_visualization [github: pedrodiamel]
- pytorch-cnn-visualizations [github: utkuozbulak]
- CNN Explainer [github: poloclub]
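- A minimal sketch of one common starting point: plotting the learned filters of the first conv layer of a pretrained torchvision model. The model choice (resnet18) and the torchvision >= 0.13 weights API are my assumptions, not from the resources above:

```python
import torchvision.models as models
import matplotlib.pyplot as plt

# Load a pretrained model; resnet18's first conv weight has shape [64, 3, 7, 7].
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
filters = model.conv1.weight.detach()

# Normalize the filters to [0, 1] so each one can be shown as a small RGB image.
filters = (filters - filters.min()) / (filters.max() - filters.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    ax.imshow(f.permute(1, 2, 0))  # CHW -> HWC for matplotlib
    ax.axis("off")
plt.show()
```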
-
Covariate Shift
- Covariate shift is a change in the input distribution between the training dataset and the test dataset, while the relationship between inputs and targets stays the same (see the sketch below)
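- A toy NumPy illustration of the idea; the distributions and numbers are synthetic, purely for intuition:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training inputs drawn from one distribution, test inputs from a shifted one.
x_train = rng.normal(loc=0.0, scale=1.0, size=10_000)
x_test = rng.normal(loc=1.5, scale=1.0, size=10_000)  # mean has shifted

# The conditional relationship y|x is unchanged; only P(x) differs => covariate shift.
print("train mean:", x_train.mean(), "test mean:", x_test.mean())
```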
- DCGAN
- print(netD.main[5].weight.size()) | torch.Size([256, 128, 4, 4]) means 256 output feature maps, 128 input feature maps, and a 4x4 kernel (see the sketch after this DCGAN list)
- On every iteration, the convolution produces a different result for each feature map
- If Loss D is near zero and Loss G is still high, the generator is producing garbage
- Loss G ↑ = fooling D with garbage; Loss D ↓ = D doesn't learn anything
- Loss G ↓ = generating good images; Loss D ↓ = D can distinguish fake and real
- D(x) - the average output (across the batch) of the discriminator for the all-real batch. This should start close to 1 and then theoretically converge to 0.5 as G gets better. Why? Initially the discriminator easily recognizes the real samples (mean output ≈ 1), then becomes increasingly confused as the generator improves while D keeps training on the fake batches.
- D(G(z)) - the average discriminator output for the all-fake batch. The first number is before D is updated and the second number is after D is updated. These numbers should start near 0 and converge to 0.5 as G gets better. Why? Initially the discriminator easily recognizes the fakes (mean output ≈ 0), then becomes confused because the generator starts producing samples almost as good as the real ones.
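- A minimal sketch of what that weight shape means for a standalone nn.Conv2d; the channel sizes mirror the middle of the DCGAN discriminator, but the layer itself is just an illustration:

```python
import torch.nn as nn

# A conv layer with 128 input feature maps, 256 output feature maps, and a 4x4 kernel,
# similar to one of the middle layers of the DCGAN discriminator.
conv = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=4,
                 stride=2, padding=1, bias=False)

# PyTorch stores the weight as [out_channels, in_channels, kernel_height, kernel_width].
print(conv.weight.size())  # torch.Size([256, 128, 4, 4])
```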
- Fractional Strided Convolution
- Transposed Convolution = Fractional Strided Convolution ≠ deconvolution [src: ArXiv] [src: Implementing the Generator of DCGAN on FPGA] (see the sketch below)
- The stride of the convolution is equal to half of the stride of the transposed convolution [src: ArXiv]
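- A small sketch contrasting the two layers; the channel counts and spatial sizes are arbitrary. A stride-2 convolution halves the spatial size, while a stride-2 transposed convolution doubles it back, which is why the latter is also described as having a fractional (1/2) stride:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 16, 16)

down = nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1)
up = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)

h = down(x)
print(h.shape)      # torch.Size([1, 128, 8, 8])  -> spatial size halved
print(up(h).shape)  # torch.Size([1, 64, 16, 16]) -> spatial size doubled back
```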
- Training on GPU
- I found that TensorFlow can harness more GPU power than PyTorch while training DCGAN using each framework's tutorial code
- Hook PyTorch
- First create the hook function, then create the model, then register the hook (see the sketch below)
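- A minimal sketch of that order using a forward hook; the model, the layer being hooked, and the printing hook are just examples:

```python
import torch
import torch.nn as nn

# 1. Create the hook function.
def print_activation(module, inputs, output):
    print(module.__class__.__name__, "output shape:", output.shape)

# 2. Create the model.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
)

# 3. Register the hook on the layer you want to inspect.
handle = model[2].register_forward_hook(print_activation)

model(torch.randn(1, 3, 32, 32))  # the hook fires during the forward pass
handle.remove()                   # unregister when done
```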
- Latent Space
- Latent space is a compressed representation of a dataset (see the sketch below).
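- A tiny sketch of the idea with an untrained, purely illustrative encoder that compresses 784-dimensional inputs into a 32-dimensional latent vector:

```python
import torch
import torch.nn as nn

# An encoder mapping a flattened 28x28 image (784 values) to a 32-dim latent code.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))

x = torch.randn(16, 784)  # a batch of flattened images (random placeholders)
z = encoder(x)
print(z.shape)            # torch.Size([16, 32]) -- the compressed / latent representation
```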
- Neuroscience
- If your cells can turn into eyeballs or teeth, your cells can probably do backpropagation or something similar to backpropagation [YouTube: Preserve Knowledge]
-
P Value
- p-value: the probability of observing a result at least as extreme as the current one, assuming the null hypothesis is true
- The lower the p-value, the more significant the independent variable, i.e., the stronger the evidence that it has an impact on the dependent variable. < 5% is commonly treated as significant, > 5% as not significant (see the sketch below)
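- A small SciPy sketch with synthetic data: a two-sample t-test returning a p-value under the null hypothesis that the two group means are equal. The groups and effect size are made up for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two synthetic samples whose true means differ slightly.
group_a = rng.normal(loc=0.0, scale=1.0, size=200)
group_b = rng.normal(loc=0.3, scale=1.0, size=200)

# p-value: probability of a result at least this extreme if the null hypothesis
# (equal means) were true. A small p-value is evidence against the null.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```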
-
Polynomial Linear Regression
- Even though the relation between x and y is non-linear, you can use Polynomial Linear Regression; the model is still linear in its coefficients (see the sketch below)
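- A short scikit-learn sketch with synthetic data: the features are polynomial in x, but the regression itself stays linear in the coefficients. The data-generating function is just an example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x**2 - x + rng.normal(scale=0.3, size=(200, 1))  # non-linear relation

# Expand x into [1, x, x^2] and fit an ordinary linear regression on those features.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y.ravel())
print(model.predict([[2.0]]))
```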
-
R
- Name parts are separated using a dot (e.g., read.csv) rather than a space or underscore
-
Preview .md files in vscode
- Press Ctrl+Shift+V [src: Visual Studio Code]
-
Reactjs Concepts
- Split components as needed, and name props from the component's own point of view rather than the context in which it is being used. [src: React Docs]
-
Data Security in ML
- Even with decentralized deep learning, a GAN can generate prototypical samples of the targeted data. [src: ArXiv]
-
Sparse coding
-
Spyder
- Some objects cannot be viewed in Spyder (e.g., in the Variable Explorer)
- torch.detach
- detach(): removes a tensor from the computation graph (it is excluded from further tracking of operations) [src: B. Nikolic Software and Computing Blog] (see the sketch below)
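- A small sketch of the effect; the tensor values are placeholders:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).detach()    # y shares data with x*2 but is cut out of the graph

print(y.requires_grad)  # False -- no further operations on y are tracked
```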
- torch.grad
- .grad is an attribute that holds a tensor with the accumulated gradient
- Use torch.no_grad() to prevent the memory overhead of tracking gradients [src: PyTorch] (see the sketch below)
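- A small sketch of both notes: inspecting .grad after a backward pass, and running inference under torch.no_grad(); the model and values are placeholders:

```python
import torch
import torch.nn as nn

w = torch.tensor(2.0, requires_grad=True)
loss = (w * 3.0) ** 2
loss.backward()
print(w.grad)             # tensor(36.) -- d(9w^2)/dw = 18w, evaluated at w = 2

model = nn.Linear(4, 1)
with torch.no_grad():     # no graph is built, so intermediate results for backward are not stored
    out = model(torch.randn(8, 4))
print(out.requires_grad)  # False
```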
- Unbalanced Data
- Do oversampling or undersampling (see the sketch below)
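- A minimal oversampling sketch using scikit-learn's resample on synthetic labels; the class sizes are made up, and libraries like imbalanced-learn offer more complete tooling:

```python
import numpy as np
from sklearn.utils import resample

X = np.random.randn(1000, 5)
y = np.array([0] * 950 + [1] * 50)  # heavily imbalanced labels

X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]

# Oversample the minority class (with replacement) up to the majority class size.
X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=len(y_maj), random_state=0)

X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])
print(np.bincount(y_bal))  # [950 950]
```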