Neural-network-based implementations and the corresponding notes.
More general machine learning notes are kept in my Machine Learning repository.
If you want to clone this repository, please use the following command:
GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/daviddwlee84/DeepLearningPractice.git
The notes in this repository haven't been updated for a long time; I will update them once I organize my local notes.
- Using Python 3
tensorflow
- github
- Brief Notes - Placeholder, Graph, Session
- TensorFlow 2.0 Notes
- Model Save and Restore Notes - ckpt, transfer learning
- Data Manipulating Notes - TFRecord, Iterator
- Multi-thread Notes
- High-level API Notes - tf.keras, tf.layers
- simple demos with maybe jupyter notebook?!
keras
pytorch
- github
- Brief Notes
- torch friends
tensorboardX
- tensorboard for pytorch (and chainer, mxnet, numpy, ...)
pytorch-lightning
- The lightweight PyTorch wrapper for ML researchers. Scale your models. Write less boilerplate
tnt
- is torchnet for pytorch, supplying you with different metrics (such as accuracy) and abstraction of the train loop
inferno and torchsample
- attempt to model things very similar to Keras and provide some tools for validation
skorch
- is a scikit-learn wrapper for pytorch that lets you use all the tools and metrics from sklearn
- Basically based on TensorFlow 1.x and Keras
- Begin with the most basic model > CV > NLP
Subject | Technique | Framework | Complexity | Remark |
---|---|---|---|---|
Perceptron Practice | SLP, MLP | Numpy | ○○●●● | Truth Table (AND, OR, XOR) and Iris Dataset (simulate Keras API) |
Softmax Derivation | FCNN | Numpy | ○○○●● | Backpropagation of Softmax with Cross Entropy Loss |
MNIST Handwriting Digit | FCNN | Tensorflow (and tf.keras) | ○○●●● | Implemented in several different ways |
Semeion Handwritten Digit | FCNN | Tensorflow | ○○○●● | Made a TensorFlow-like Dataset class |
CIFAR-10 | FCNN, CNN | Tensorflow | ○○●●● | Comparison of FCNN and CNN |
Chinese Named Entity Recognizer | RNN, LSTM | Tensorflow | ○●●●● | TODO: LSTM testing |
Flowers | CNN | Tensorflow | ○○●●● | Transfer Learning |
Fruits | CNN | Tensorflow (and tf.layers) | ○○●●● | Multi-thread training and TFRecord; TODO: Try a more complex model |
Trigonometric Function Prediction | RNN | Tensorflow | ○○○○● | Predict sine, cosine using LSTM |
Penn TreeBank | RNN, LSTM | Tensorflow | ○○●●● | Language corpus preprocessing and training |
Chinese Neural Machine Translation | RNN, Attention | Tensorflow | ○●●●● | A practice of Seq2Seq and Attention TODO: Multi-graph, Try transformer |
Dogs! | CNN | Keras | ○○●●● | Using images from ImageNet, Keras Transfer learning and Data augmentation |
2048 | FCNN with Policy Gradient | Tensorflow | ●●●●● | Reinforcement Learning |
Text Relation Classification | Multiple Models | Multiple Libraries | ●●●●● | SemEval2018 Task 7 Semantic Relation Extraction and Classification in Scientific Papers |
Medical Corpus | Human Labor | Naked Eyes | ●●●●● | From Chinese word segmentation to POS tagging to NER |
Word Sense Induction | Multiple Models | Multiple Libraries | ●●●●● | SemEval2013 Task 13 Word Sense Induction for Graded and Non-Graded Senses |
Chinese WS/POS/(NER) | RNN, CRF | TensorFlow | ●●●●● | The "from scratch" version of the previous project ("Medical Corpus") (paper) |
Toxicity Classification | BiLSTM | Keras | ●●●●● | Jigsaw Unintended Bias in Toxicity Classification - Detect toxicity across a diverse range of conversations |
CWS/NER | RNN, CRF | TensorFlow | ●●●●● | The sequence labeling model on the classic Chinese NLP task |
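
The "Softmax Derivation" entry above concerns backpropagation through softmax with cross-entropy loss. As a rough illustration (not the repository's own code), the NumPy sketch below checks the well-known result that the gradient with respect to the logits is `softmax(logits) - target`. The function names, the example logits, and the finite-difference check are all assumptions made for this example.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def cross_entropy(logits, one_hot):
    """Cross-entropy of softmax(logits) against a one-hot target."""
    return -np.sum(one_hot * np.log(softmax(logits)))

def analytic_grad(logits, one_hot):
    """Gradient of the loss w.r.t. the logits: softmax(logits) - target."""
    return softmax(logits) - one_hot

def numeric_grad(logits, one_hot, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    grad = np.zeros_like(logits)
    for i in range(logits.size):
        step = np.zeros_like(logits)
        step[i] = eps
        grad[i] = (cross_entropy(logits + step, one_hot)
                   - cross_entropy(logits - step, one_hot)) / (2 * eps)
    return grad

# Hypothetical 3-class example: the true class is index 0.
logits = np.array([2.0, 1.0, 0.1])
target = np.array([1.0, 0.0, 0.0])
print(np.allclose(analytic_grad(logits, target),
                  numeric_grad(logits, target), atol=1e-5))
```

Comparing against finite differences is a common sanity check when deriving backpropagation by hand; it works here only for 1-D logits.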
- Basically based on PyTorch and most of the contents are NLP
Subject | Technique | Framework | Complexity | Remark |
---|---|---|---|---|
Machine Translation | RNN, Transformer | PyTorch | ●●●●● | Machine translation model from Chinese to English based on WMT17 corpus (uses results from CS224n) |
Sentence Similarity | RNN | PyTorch | ●●●●● | Enhanced-RCNN and other baseline models on sentence similarity datasets |
Subject | Technique | Framework | Complexity | Remark |
---|---|---|---|---|
TODO: Tasks, Subtasks, Structure, General Architecture, Elements, State-of-the-art model
- General Architecture (DNN, CNN, RNNs, Attention, Transformer)
- Categorized by Learning (supervised, ...)
- Categorized by Tasks (NMT, NER, RE, ...)
- Categorized by Structure (Seq2seq, Siamese)
- Categorized by Learning Framework (GAN ?!)
- State-of-the-art models and papers (BERT, ...)
- Feedforward Neural Network
- Multilayer Perceptron (MLP)
Fully Connected Neural Network (FCNN)
- And an overview of the neural network training process, including forward and back propagation
- Dense Neural Network (DNN)
Basic Block for Sequence Model!
Recurrent Neural Network (RNN)
- Basis of Sequence model
Long Short Term Memory (LSTM)
- Improvement of "memory" (with a brief introduction to other regular RNN blocks)
Gated Recurrent Units (GRUs)
Q Learning
Policy Gradient Methods (PG)
Generative Adversarial Network (GAN)
Variational Autoencoder (VAE)
Self-Organizing Map (SOM)
- Sequence-to-Sequence (seq-to-seq) (Encoder-Decoder) Architecture - Overview of sequence models
Bidirectional RNN (BRNN)
- RNN-Based seq-to-seq
- Convolution-based seq-to-seq
Attention Model
- Transformer-based seq-to-seq
Transformer
- Attention Is All You Need - Transformer-based multi-headed self-attention
- Word Piece Model (WPM) aka. SentencePiece
"Pre-training in NLP" ≈ "Embedding"
- Neural Architecture Search
- BatchNorm
- Convolution
- Pooling
- Fully Connected (Dense)
- Dropout
- Linear
- LSTM
- RNN
Generally speaking
- Input
- Hidden
- Output
- Sigmoid
- Hyperbolic Tangent
- Rectified Linear Unit (ReLU)
- Leaky ReLU
- Softmax
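For reference, the elementwise activations in the list above could be written in NumPy roughly as follows. This is a minimal sketch: the function names and the `negative_slope` default of 0.01 are assumptions, not taken from this repository. Softmax is left out because it normalizes a whole vector rather than acting on each element.

```python
import numpy as np

def sigmoid(x):
    """Squash values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Squash values into (-1, 1)."""
    return np.tanh(x)

def relu(x):
    """Keep positive values, zero out the rest."""
    return np.maximum(0.0, x)

def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but negative inputs keep a small slope."""
    return np.where(x > 0, x, negative_slope * x)

x = np.array([-2.0, 0.0, 2.0])
for fn in (sigmoid, tanh, relu, leaky_relu):
    print(fn.__name__, fn(x))
```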
- Cross-Entropy
- Hinge
- Huber
- Kullback-Leibler
- MAE (L1)
- MSE (L2)
- Exponential Moving Average (Exponentially Weighted Moving Average)
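An exponential moving average keeps a running value that mixes the previous average with each new observation. The sketch below assumes the common form `v_t = decay * v_{t-1} + (1 - decay) * x_t` with `v_0 = 0` and no bias correction; the function name and the `decay` parameter are illustrative choices.

```python
def exponential_moving_average(values, decay=0.9):
    """Return the running averages v_t = decay * v_{t-1} + (1 - decay) * x_t, with v_0 = 0."""
    averaged = []
    running = 0.0
    for x in values:
        running = decay * running + (1.0 - decay) * x
        averaged.append(running)
    return averaged

print(exponential_moving_average([1.0, 1.0, 1.0], decay=0.5))  # [0.5, 0.75, 0.875]
```

Because the average starts at zero, early values are biased toward zero; optimizers such as Adam correct for this, which this sketch does not attempt.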
- Adadelta
- Adagrad
- Adam
- Conjugate Gradients
- BFGS
- Momentum
- Nesterov Momentum
- Newton’s Method
- RMSProp
- Stochastic Gradient Descent (SGD)
Parameter
- Learning Rate: Scales how much each weight is corrected every time it is updated.
- Epochs: The number of full passes through the training data while updating the weights.
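To make the two parameters concrete, here is a small sketch under assumed conditions: a one-weight model `y_hat = w * x`, toy data generated by `y = 2 * x`, and squared error updated one sample at a time. The data, the `train` function, and the chosen values are all hypothetical.

```python
# Toy data generated by y = 2 * x; the model is y_hat = w * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def train(learning_rate, epochs):
    """Fit w with per-sample gradient steps on squared error."""
    w = 0.0
    for _ in range(epochs):                # one epoch = one full pass over the data
        for x, y in zip(xs, ys):
            error = w * x - y
            gradient = 2.0 * error * x     # d/dw of (w * x - y) ** 2
            w -= learning_rate * gradient  # the learning rate scales the correction
    return w

print(train(learning_rate=0.01, epochs=1))   # still far from 2.0
print(train(learning_rate=0.01, epochs=50))  # close to 2.0
```

With this data, more epochs bring `w` closer to 2.0, while a learning rate that is too large would make the updates overshoot.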
- Data Augmentation
- Dropout
- Early Stopping
- Ensembling
- Injecting Noise
- L1 Regularization
- L2 Regularization
Big Picture: Machine Learning vs. Deep Learning
- one-hot encoding
- ground truth
- Data Parallelism
- Vanilla - means standard, usual, or unmodified version of something.
- Vanilla gradient descent (aka. Batch gradient descent) - means the basic gradient descent algorithm without any bells or whistles.
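As an illustration of vanilla (batch) gradient descent, the sketch below fits a linear least-squares model and computes every update from the gradient over the entire dataset, with no momentum or adaptive step size. The function name, the default hyperparameters, and the toy data are assumptions made for this example.

```python
import numpy as np

def batch_gradient_descent(X, y, learning_rate=0.1, steps=500):
    """Minimize the mean squared error of X @ w against y with full-batch steps."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residual = X @ w - y
        gradient = 2.0 * X.T @ residual / len(y)  # gradient over the whole dataset
        w -= learning_rate * gradient             # plain update, nothing else
    return w

# Hypothetical data from y = 1 + 2 * x; the first column of X is the bias term.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(batch_gradient_descent(X, y))  # approximately [1. 2.]
```

Stochastic and mini-batch variants differ only in using a subset of the rows of `X` for each gradient.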
Tricks for language model - a sort of overview
CNN for NLP
RNN for NLP
Capsule net with GRU
- LeNet - CNN
- AlexNet - CNN
- ZFNet
- VGG-Net - CNN
- GoogleNet - CNN
- ResNet - CNN
- DenseNet
- ResNeXt
- DPN (Dual Path Network)
- CliqueNet
- Basis
- Text segmentation
- Part-of-speech tagging (POS tagging)
- Speech Recognition
- End-to-End Models:
- (Traditional --> HMM)
- CTC
- RNN Transducer
- Attention-based Model
- Improved attention
- Single head attention
- Multi-headed attention
- Word Pieces
- Sequence-Training
- Beam-Search Decoding Based EMBR
- Named Entity Recognition (NER)
- Neural Machine Translation (NMT)
- Encoder LSTM + Decoder LSTM
- Google NMT (GNMT)
- Speech Synthesis
- WaveNet: A Generative Model for Raw Audio
- Tacotron: An end-to-end speech synthesis system
- Personalized Recommendation
- Machine Translation
- Sentiment classification
- Chatbot
- Music generation
- DNA sequence analysis
- Video activity recognition
- Deep Learning - MIT
- Dive into Deep Learning (D2L Book) (d2l.ai) / 動手學深度學習
- Speech and Language Processing 2ed.
- Deep Learning with Python
Latex
Toy
- nico-opendata
- Danbooru2018 - A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset
- MyAnimeList Dataset
Example
Summary
- brightmart/text_classification - all kinds of text classification models and more with deep learning
NLP
- Chinese
- jieba
- English
- Tensorflow and deep learning without a PhD series by @martin_gorner
- YSDA Natural Language Processing course
- Tensorflow and deep learning without a PhD - Martin Görner
- dataflowr | deep learning courses - github
- Stanford - CS231n: Convolutional Neural Networks for Visual Recognition
- Stanford - CS224n: Natural Language Processing with Deep Learning
- Winter 2019 - first time using PyTorch
- Winter 2017 - using TensorFlow
- MIT Deep Learning
- Github - Tutorials, assignments, and competitions for MIT Deep Learning related courses
- PKU - Artificial Intelligence Practice: TensorFlow Notes
- DeepNotes
- deepnet - Implementations of CNNs, RNNs and cool new techniques in deep learning from scratch
- UFLDL Tutorial
- Machine Learning Cheatsheet
- Deep Learning 500 Questions
- Machine Learning Notebook
- Machine Learning cheatsheets for Stanford's CS 229
- Deep Learning cheatsheets for Stanford's CS 230
- graykode/nlp-tutorial - Natural Language Processing Tutorial for Deep Learning Researchers
- Microsoft Natural Language Processing Best Practices & Examples
- Microsoft AI education materials for Chinese students, teachers and IT professionals
- Lambda Deep Learning Demos
- Azure/MachineLearningNotebooks
- smilelight/lightNLP
- RasaHQ/rasa - Open source machine learning framework to automate text- and voice-based conversations
- xinjli/pytensor: A numpy deep learning framework
- boat-group/fancy-nlp: NLP for human. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.
- Hironsan/anago: Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.
- BrikerMan/Kashgari: Kashgari is a Production-ready NLP Transfer learning framework for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
- jiesutd/NCRFpp: NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
- jiqizhixin/ML-Tutorial-Experiment: Coding the Machine Learning Tutorial for Learning to Learn
NLP
NLP
- graykode/nlp-roadmap: ROADMAP(Mind Map) and KEYWORD for students those who have interest in learning NLP
- Tracking Progress in Natural Language Processing
CV
- Awesome Computer Vision - A curated list of awesome computer vision resources
NLP
Manipulate Github Large File (>100MB)
.gitattributes
- Git large file storage
- Bitbucket tutorial - Git LFS
- Configuring Git Large File Storage
- Moving a file in your repository to Git Large File Storage
- BFG Repo-Cleaner - `brew install bfg`
- Removing sensitive data from a repository - git filter-branch
Time measure
Export Markdown
Machine Learning/Deep Learning Platform
Framework | Organization | Support Language | Remark |
---|---|---|---|
TensorFlow | | Python, C++, Go, JavaScript, ... | |
Keras | fchollet | Python | on top of TensorFlow, CNTK, or Theano |
PyTorch | | Python | |
CNTK | Microsoft | C++ | |
OpenNN | | C++ | |
Caffe | BVLC | C++, Python | |
MXNet | DMLC | Python, C++, R, ... | |
Torch7 | | Lua | |
Theano | U. Montreal | Python | |
Deeplearning4J | DeepLearning4J | Java, Scala | |
Leaf | AutumnAI | Rust | |
Lasagne | Lasagne | Python | |
Neon | NervanaSystems | Python | |
Subject | Technique | Framework | Complexity | Remark |
---|---|---|---|---|
Online ImageNet Classifier | CNN | Keras | ○○●●● | (TODO) Using Keras Applications combine with RESTful API |
First TF.js | | | | (TODO) Using TensorFlow.js to load a pre-trained model and make predictions in the browser |
YOLO | CNN | Tensorflow | | (TODO) Real-time Object Detection |
Word Similarity | | | | (TODO) Word Similarity Based on Dictionary and Based on Corpus |