Deep Learning Practice

Neural-network-based implementations and the corresponding notes.

More general machine learning notes are kept in my Machine Learning repository.

If you want to clone this repository, please use the following command:

GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/daviddwlee84/DeepLearningPractice.git

The notes in this repository haven't been updated for a long time; I will update them once I organize my local notes.

Environment

  • Using Python 3

Dependencies

  • tensorflow
    • github
    • Brief Notes - Placeholder, Graph, Session
    • TensorFlow 2.0 Notes
    • Model Save and Restore Notes - ckpt, transfer learning
    • Data Manipulating Notes - TFRecord, Iterator
    • Multi-thread Notes
    • High-level API Notes - tf.keras, tf.layers
    • simple demos with maybe jupyter notebook?!
  • keras
  • pytorch
    • github
    • Brief Notes
    • torch friends
      • tensorboardX - tensorboard for pytorch (and chainer, mxnet, numpy, ...)
      • pytorch-lightning - The lightweight PyTorch wrapper for ML researchers. Scale your models. Write less boilerplate
      • tnt - is torchnet for pytorch, supplying you with different metrics (such as accuracy) and abstraction of the train loop
      • inferno and torchsample - attempt to model things very similar to Keras and provide some tools for validation
      • skorch - is a scikit-learn wrapper for pytorch that lets you use all the tools and metrics from sklearn

Project

PKU Courses and Some side projects

  • Mostly based on TensorFlow 1.x and Keras
  • Ordered from the most basic models, then CV, then NLP

| Subject | Technique | Framework | Complexity | Remark |
| --- | --- | --- | --- | --- |
| Perceptron Practice | SLP, MLP | Numpy | ○○●●● | Truth Table (AND, OR, XOR) and Iris Dataset (simulate Keras API) |
| Softmax Derivation | FCNN | Numpy | ○○○●● | Backpropagation of Softmax with Cross Entropy Loss |
| MNIST Handwriting Digit | FCNN | TensorFlow (and tf.keras) | ○○●●● | Implemented in several different ways |
| Semeion Handwritten Digit | FCNN | TensorFlow | ○○○●● | Made a TensorFlow-like Dataset class |
| CIFAR-10 | FCNN, CNN | TensorFlow | ○○●●● | Comparison of FCNN and CNN |
| Chinese Named Entity Recognizer | RNN, LSTM | TensorFlow | ○●●●● | TODO: LSTM testing |
| Flowers | CNN | TensorFlow | ○○●●● | Transfer Learning |
| Fruits | CNN | TensorFlow (and tf.layers) | ○○●●● | Multi-thread training and TFRecord. TODO: Try more complex model |
| Trigonometric Function Prediction | RNN | TensorFlow | ○○○○● | Predict sine, cosine using LSTM |
| Penn TreeBank | RNN, LSTM | TensorFlow | ○○●●● | Language corpus preprocessing and training |
| Chinese Neural Machine Translation | RNN, Attention | TensorFlow | ○●●●● | A practice of Seq2Seq and Attention. TODO: Multi-graph, Try transformer |
| Dogs! | CNN | Keras | ○○●●● | Using images from ImageNet, Keras Transfer learning and Data augmentation |
| 2048 | FCNN with Policy Gradient | TensorFlow | ●●●●● | Reinforcement Learning |
| Text Relation Classification | Multiple Models | Multiple Libraries | ●●●●● | SemEval2018 Task 7 Semantic Relation Extraction and Classification in Scientific Papers |
| Medical Corpus | Human Labor | Naked Eyes | ●●●●● | From Chinese word segmentation to POS tagging to NER |
| Word Sense Induction | Multiple Models | Multiple Libraries | ●●●●● | SemEval2013 Task 13 Word Sense Induction for Graded and Non-Graded Senses |
| Chinese WS/POS/(NER) | RNN, CRF | TensorFlow | ●●●●● | The "from scratch" version of the previous project ("Medical Corpus") (paper) |
| Toxicity Classification | BiLSTM | Keras | ●●●●● | Jigsaw Unintended Bias in Toxicity Classification - Detect toxicity across a diverse range of conversations |
| CWS/NER | RNN, CRF | TensorFlow | ●●●●● | The sequence labeling model on the classic Chinese NLP task |
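
The "Softmax Derivation" entry above is about backpropagating through softmax combined with cross-entropy loss. As a rough illustration (not the code from that project), the sketch below uses the well-known result that the gradient with respect to the logits is `softmax(z) - one_hot(target)` and checks it against a finite-difference estimate. All function names here are made up for this example, and it uses only the standard library instead of Numpy so it runs anywhere.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Cross-entropy loss for one example whose label is target_index."""
    return -math.log(probs[target_index])


def softmax_cross_entropy_grad(logits, target_index):
    """Analytic gradient of the loss w.r.t. the logits: p - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]


def numerical_grad(logits, target_index, eps=1e-6):
    """Central-difference estimate of the same gradient, as a sanity check."""
    grad = []
    for i in range(len(logits)):
        plus = list(logits)
        minus = list(logits)
        plus[i] += eps
        minus[i] -= eps
        loss_plus = cross_entropy(softmax(plus), target_index)
        loss_minus = cross_entropy(softmax(minus), target_index)
        grad.append((loss_plus - loss_minus) / (2 * eps))
    return grad


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.1]  # arbitrary illustrative values
    print("analytic :", softmax_cross_entropy_grad(example_logits, 0))
    print("numerical:", numerical_grad(example_logits, 0))
```

The two printed vectors should agree to several decimal places, which is a quick way to validate a hand-derived backward pass.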

NLP PyTorch

  • Mostly based on PyTorch; most of the content is NLP

| Subject | Technique | Framework | Complexity | Remark |
| --- | --- | --- | --- | --- |
| Machine Translation | RNN, Transformer | PyTorch | ●●●●● | Machine translation model from Chinese to English based on WMT17 corpus (use result of CS224n) |
| Sentence Similarity | RNN | PyTorch | ●●●●● | Enhanced-RCNN and other baseline models on some sentence similarity dataset |

Other Projects

NCTU DL Course

| Subject | Technique | Framework | Complexity | Remark |
| --- | --- | --- | --- | --- |

Deep Learning Categories

TODO: Tasks, Subtasks, Structure, General Architecture, Elements, State-of-the-art model

  • General Architecture (DNN, CNN, RNNs, Attention, Transformer)
  • Categorized by Learning (supervised, ...)
  • Categorized by Tasks (NMT, NER, RE, ...)
  • Categorized by Structure (Seq2seq, Siamese)
  • Categorized by Learning Framework (GAN ?!)
  • State-of-the-art models and papers (BERT, ...)

Technique / Network Structure

Image Learning

Sequence Learning

Basic Block for Sequence Model!

  • Q Learning
  • Policy Gradient Methods (PG)

Uncategorized

  • Generative Adversarial Network (GAN)
  • Variational Autoencoder (VAE)
  • Self-Organizing Map (SOM)

Learning Framework / Model

Object Detection

Text and Sequence

"Pre-training in NLP" ≈ "Embedding"

Others

  • Neural Architecture Search

Ingredient of magic

  • BatchNorm
  • Convolution
  • Pooling
  • Fully Connected (Dense)
  • Dropout
  • Linear
  • LSTM
  • RNN

Generally speaking

  • Input
  • Hidden
  • Output
  • Sigmoid
  • Hyperbolic Tangent
  • Rectified Linear Unit (ReLU)
  • Leaky ReLU
  • Softmax
  • Cross-Entropy
  • Hinge
  • Huber
  • Kullback-Leibler
  • MAE (L1)
  • MSE (L2)
  • Exponential Moving Average (Exponentially Weighted Moving Average)
  • Adadelta
  • Adagrad
  • Adam
  • Conjugate Gradients
  • BFGS
  • Momentum
  • Nesterov Momentum
  • Newton’s Method
  • RMSProp
  • Stochastic Gradient Descent (SGD)
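
Several of the activation and loss functions named in the list above are short enough to write out directly. The following is a minimal sketch using only the standard library; the function names and the default values for `alpha` and `delta` are choices made for this example, not definitions taken from the notes.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent; squashes x into (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified Linear Unit: passes positives, zeroes out negatives."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: like ReLU but keeps a small slope alpha for x < 0."""
    return x if x > 0 else alpha * x


def mae(y_true, y_pred):
    """Mean absolute error (L1 loss)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error (L2 loss)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for small errors, linear beyond delta."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err * err
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, sigmoid(x), tanh(x), relu(x), leaky_relu(x))
```

Huber sits between MAE and MSE: it behaves like MSE near zero error and like MAE for outliers, which is why it is often picked when the data contains a few large errors.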

Parameter

  • Learning Rate: Scales how much each weight is corrected at every update.
  • Epochs: The number of full passes through the training data while updating the weights.
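
To make the two parameters above concrete, here is a minimal sketch of stochastic gradient descent fitting a single weight `w` so that `y ≈ w * x`. The toy data, the function name `train`, and the default values are assumptions chosen for illustration only.

```python
def train(xs, ys, learning_rate=0.05, epochs=50):
    """Fit y ≈ w * x with per-example gradient descent on squared error."""
    w = 0.0
    for _ in range(epochs):  # one epoch = one full pass over the data
        for x, y in zip(xs, ys):
            error = w * x - y
            gradient = 2.0 * error * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= learning_rate * gradient  # learning rate scales the correction
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]  # generated from y = 2x, so the ideal weight is 2
    print("1 epoch  :", train(xs, ys, epochs=1))
    print("50 epochs:", train(xs, ys, epochs=50))
    print("small lr :", train(xs, ys, learning_rate=0.001, epochs=1))
```

More epochs move `w` closer to 2, and a smaller learning rate makes each correction smaller, so the same number of epochs covers less distance.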

Regularization

  • Data Augmentation
  • Dropout
  • Early Stopping
  • Ensembling
  • Injecting Noise
  • L1 Regularization
  • L2 Regularization
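
For the L1 and L2 entries in the list above, the idea is to add a penalty on the weights to the loss, which in turn adds an extra term to each gradient step. The sketch below is one common formulation, assuming a penalty of `lam * sum(|w|)` for L1 and `lam * sum(w^2)` for L2; the names `lam`, `l1_penalty`, `l2_penalty`, and `regularized_step` are hypothetical.

```python
def l1_penalty(weights, lam):
    """L1 penalty: lam * sum of absolute weights (encourages sparsity)."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 penalty: lam * sum of squared weights (shrinks weights smoothly)."""
    return lam * sum(w * w for w in weights)


def regularized_step(weights, grads, learning_rate, lam, kind="l2"):
    """One gradient step on (loss + penalty), given the loss gradients."""
    updated = []
    for w, g in zip(weights, grads):
        if kind == "l2":
            reg_grad = 2.0 * lam * w  # derivative of lam * w^2
        elif kind == "l1":
            sign = (w > 0) - (w < 0)  # subgradient of |w|; 0 at w == 0
            reg_grad = lam * sign
        else:
            raise ValueError("kind must be 'l1' or 'l2'")
        updated.append(w - learning_rate * (g + reg_grad))
    return updated


if __name__ == "__main__":
    weights = [1.0, -2.0, 0.5]
    zero_grads = [0.0, 0.0, 0.0]  # isolate the effect of the penalty alone
    print(l1_penalty(weights, 0.1), l2_penalty(weights, 0.1))
    print(regularized_step(weights, zero_grads, 0.1, 0.5, kind="l2"))
    print(regularized_step(weights, zero_grads, 0.1, 0.5, kind="l1"))
```

With zero loss gradients, the L2 step scales every weight by the same factor, while the L1 step moves every non-zero weight toward zero by the same fixed amount.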

Common Concept

Big Picture: Machine Learning vs. Deep Learning

ML vs DL

Terminology / Tricks

  • one-hot encoding
  • ground truth
  • Data Parallelism
  • Vanilla - means standard, usual, or unmodified version of something.
    • Vanilla gradient descent (aka. Batch gradient descent) - means the basic gradient descent algorithm without any bells or whistles.
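
As a small illustration of the first term in the list above, one-hot encoding turns a class label into a vector that is all zeros except for a single 1 at the label's index. This is a minimal sketch; `one_hot` and `encode_labels` are illustrative names, and sorting the labels to assign indices is an arbitrary choice made for this example.

```python
def one_hot(index, num_classes):
    """Return a list of length num_classes with a 1 at position index."""
    if not 0 <= index < num_classes:
        raise ValueError("index must be in [0, num_classes)")
    vector = [0] * num_classes
    vector[index] = 1
    return vector


def encode_labels(labels):
    """Map arbitrary hashable labels to one-hot vectors.

    Returns the sorted class list (so vectors can be decoded later)
    together with one vector per input label.
    """
    classes = sorted(set(labels))
    index_of = {label: i for i, label in enumerate(classes)}
    vectors = [one_hot(index_of[label], len(classes)) for label in labels]
    return classes, vectors


if __name__ == "__main__":
    print(one_hot(2, 4))  # → [0, 0, 1, 0]
    classes, vectors = encode_labels(["cat", "dog", "cat"])
    print(classes, vectors)
```

The ground-truth label of an example is then simply the position of the 1, which is what a softmax output is compared against during training.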

Tricks for language model - a sort of overview

Network Framework

  • LeNet - CNN
  • AlexNet - CNN
  • ZFNet
  • VGG-Net - CNN
  • GoogLeNet - CNN
  • ResNet - CNN
  • DenseNet
  • ResNeXt
  • DPN (Dual Path Network)
  • CliqueNet

Applications

CV

NLP

  • Basis
    • Text segmentation
    • Part-of-speech tagging (POS tagging)
  • Speech Recognition
    • End-to-End Models:
      • (Traditional --> HMM)
      • CTC
      • RNN Transducer
      • Attention-based Model
    • Improved attention
      • Single head attention
      • Multi-headed attention
    • Word Pieces
    • Sequence-Training
      • Beam-Search Decoding Based EMBR
  • Named Entity Recognition (NER)
  • Neural Machine Translation (NMT)
    • Encoder LSTM + Decoder LSTM
    • Google NMT (GNMT)
  • Speech Synthesis
    • WaveNet: A Generative Model for Raw Audio
    • Tacotron: An end-to-end speech synthesis system
  • Personalized Recommendation
  • Machine Translation
  • Sentiment classification
  • Chatbot

Other Sequence Learning Problem

  • Music generation
  • DNA sequence analysis
  • Video activity recognition

Books Recommendation

Tools

Visualize Drawing Tools

Latex

Toy

Resources

Dataset/Corpus

Corpus/NLP Dataset

Animate Dataset

Github Repository

Example

Summary

Application

Mature Tools

NLP

  • Chinese
    • jieba
  • English
    • spaCy - Industrial-Strength Natural Language Processing in Python
    • gensim
    • nltk
    • fairseq - Facebook AI Research Sequence-to-Sequence Toolkit

Tutorial

Course

Interactive Learning

MOOC

Document

Github

Slides

Conclusion

NLP

Summaries

NLP

CV

Article

NLP

Lexical Database

Other

Manipulate Github Large File (>100MB)

.gitattributes

Time measure

Export Markdown

Machine Learning/Deep Learning Platform

Deprecated notes

  • h5py - HDF5 for Python: To store model in HDF5 binary data format
  • pyyaml - PyYAML: YAML framework

Programming Framework

| Framework | Organization | Support Language | Remark |
| --- | --- | --- | --- |
| TensorFlow | Google | Python, C++, Go, JavaScript, ... | |
| Keras | fchollet | Python | on top of TensorFlow, CNTK, or Theano |
| PyTorch | Facebook | Python | |
| CNTK | Microsoft | C++ | |
| OpenNN | | C++ | |
| Caffe | BVLC | C++, Python | |
| MXNet | DMLC | Python, C++, R, ... | |
| Torch7 | Facebook | Lua | |
| Theano | U. Montreal | Python | |
| Deeplearning4J | DeepLearning4J | Java, Scala | |
| Leaf | AutumnAI | Rust | |
| Lasagne | Lasagne | Python | |
| Neon | NervanaSystems | Python | |

Pending Project

| Subject | Technique | Framework | Complexity | Remark |
| --- | --- | --- | --- | --- |
| Online ImageNet Classifier | CNN | Keras | ○○●●● | (TODO) Using Keras Applications combined with a RESTful API |
| First TF.js | | | | (TODO) Using TensorFlow.js to load a pre-trained model and make predictions in the browser |
| YOLO | CNN | TensorFlow | | (TODO) Real-time Object Detection |
| Word Similarity | | | | (TODO) Word Similarity Based on Dictionary and Based on Corpus |