diff --git a/README.md b/README.md
index 3b3aadb..6ed492b 100644
--- a/README.md
+++ b/README.md
@@ -6,27 +6,27 @@ Discover pretrained models for deep learning in MATLAB.
 ### Computer Vision
 * [Image Classification](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#image-classification-)
-* [Object Detection](#ObjectDetection)
-* [Semantic Segmentation](#SemanticSegmentation)
-* [Instance Segmentation](#InstanceSegmentation)
-* [Image Translation](#ImageTranslation)
-* [Pose Estimation](#PoseEstimation)
-* [Video Classification](#VideoClassification)
-* [Text Detection & Recognition](#textdetection)
+* [Object Detection](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#object-detection-)
+* [Semantic Segmentation](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#semantic-segmentation-)
+* [Instance Segmentation](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#instance-segmentation-)
+* [Image Translation](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#image-translation-)
+* [Pose Estimation](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#pose-estimation-)
+* [Video Classification](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#video-classification-)
+* [Text Detection & Recognition](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#text-detection-and-recognition-)
 ### Natural Language Processing
-* [Transformers](#transformers)
+* [Transformers](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#transformers-text-)
 ### Audio
-* [Audio Embeddings](#AudioEmbeddings)
+* [Audio Embeddings](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#audio-embeddings-)
 * [Sound Classification](#SoundClassification)
 * [Pitch Estimation](#PitchEstimation)
-* [Speech to Text](#Speech2Text)
+* [Speech to Text](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#speech-to-text-)
 ### Lidar
-* [Point Cloud Classification](#PointCloud)
-* [Point Cloud Segmentation](#PointCloudSeg)
-* [Point Cloud Object Detection](#PointCloudObj)
+* [Point Cloud Classification](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#lidar-)
+* [Point Cloud Segmentation](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#lidar-)
+* [Point Cloud Object Detection](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#lidar-)
 ## Image Classification
@@ -70,7 +70,7 @@ Pretrained networks have different characteristics that matter when choosing a n
 ![](Images/pretrained.png)
 Figure. Comparing image classification model accuracy, speed and size.
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Object Detection
@@ -104,7 +104,7 @@ These networks have been trained to detect specific objects for a given applicat
 | [Single Shot Detector (SSD)](https://www.mathworks.com/help/vision/ug/object-detection-using-single-shot-detector.html) | Vehicle detection | 44 |[Doc ](https://www.mathworks.com/help/vision/ug/object-detection-using-single-shot-detector.html)||
 | [Faster R-CNN](https://www.mathworks.com/help/vision/ug/object-detection-using-faster-r-cnn-deep-learning.html) | Vehicle detection | 118 |[Doc](https://www.mathworks.com/help/vision/ug/object-detection-using-faster-r-cnn-deep-learning.html)||
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Semantic Segmentation
@@ -127,7 +127,7 @@ This network has been trained to detect 20 objects classes from the PASCAL VOC d
 | [AdaptSeg (GAN)](https://www.mathworks.com/help/deeplearning/ug/train-deep-learning-semantic-segmentation-network-using-3d-simulation-data.html) | Model tuning using 3-D simulation data | 54.4 | [Doc](https://www.mathworks.com/help/deeplearning/ug/train-deep-learning-semantic-segmentation-network-using-3d-simulation-data.html) ||
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Instance Segmentation
@@ -141,7 +141,7 @@ Inputs are RGB images, outputs are pixel classifications (semantic maps), boundi
 | ------------- | ------------- |------------- |
 | [Mask R-CNN](https://www.mathworks.com/help/vision/ref/maskrcnn.html) | 80 | [Doc](https://www.mathworks.com/help/vision/ref/maskrcnn.html) [Github](https://github.com/matlab-deep-learning/mask-rcnn)|
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Image Translation
@@ -159,7 +159,7 @@ Inputs are images, outputs are translated RGB images. This example workflow show
 | [CycleGAN](https://www.mathworks.com/help/images/unsupervised-medical-image-denoising-using-cyclegan.html) | Medical Image Denoising | 75.3 | [Doc](https://www.mathworks.com/help/images/unsupervised-medical-image-denoising-using-cyclegan.html) ||
 | [VDSR](https://www.mathworks.com/help/images/single-image-super-resolution-using-deep-learning.html) | Super Resolution (estimate a high-resolution image from a low-resolution image) | 2.4 | [Doc](https://www.mathworks.com/help/images/single-image-super-resolution-using-deep-learning.html) ||
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Pose Estimation
@@ -173,7 +173,7 @@ All inputs are RGB images, outputs are heatmaps and part affinity fields (PAFs)
 | ------------- | ------------- |------------- |
 | [OpenPose](https://www.mathworks.com/help/deeplearning/ug/estimate-body-pose-using-deep-learning.html) | 14 | [Doc](https://www.mathworks.com/help/deeplearning/ug/estimate-body-pose-using-deep-learning.html) |
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Video Classification
@@ -189,7 +189,7 @@ All inputs are Videos only or Video with Optical Flow data, outputs are gesture
 | [R(2+1)D](https://www.mathworks.com/help/vision/ref/r2plus1dvideoclassifier.html) | Video | 112 |400 |Faster convergence than Inflated-3D|[Doc](https://www.mathworks.com/help/vision/ref/r2plus1dvideoclassifier.html)
 | [Inflated-3D](https://www.mathworks.com/help/vision/ref/inflated3dvideoclassifier.html) | Video & Optical Flow data | 91 | 400 |Accuracy of the classifier improves when combining optical flow and RGB data.| [Doc](https://www.mathworks.com/help/vision/ref/inflated3dvideoclassifier.html)
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Text Detection and Recognition
@@ -209,7 +209,7 @@ Inputs are RGB images, outputs are bounding boxes that identify regions of text.
 | ------------- | ------------- |------------- |------------- |------------- |
 | [Seven Segment Digit Recognition](https://github.com/matlab-deep-learning/Seven-Segment-Digit-Recognition) |Seven segment digit recognition using deep learning and OCR. This is helpful in industrial automation applications where digital displays are often surrounded with complex background. | 3.8 |[GitHub](https://github.com/matlab-deep-learning/Seven-Segment-Digit-Recognition) |![](Images/7segment.png)|
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Transformers (Text)
@@ -231,7 +231,7 @@ Inputs are sequences of text, outputs are text feature embeddings.
 | [FinBERT](https://github.com/matlab-deep-learning/transformer-models#bert-and-finbert) | The FinBERT model is a BERT model for financial sentiment analysis | 388 |[GitHub](https://github.com/matlab-deep-learning/transformer-models#bert-and-finbert) |![](Images/finbert.png)|
 | [GPT-2](https://github.com/matlab-deep-learning/transformer-models#gpt-2) | The GPT-2 model is a decoder model used for text summarization.| 1.2GB |[GitHub](https://github.com/matlab-deep-learning/transformer-models#gpt-2) |![](Images/gpt2.png)|
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Audio Embeddings
@@ -267,7 +267,7 @@ Inputs are Audio signals, outputs is text.
 | [wav2vec](https://github.com/matlab-deep-learning/wav2vec-2.0) | Speech to Text | 236| 3.2 |[GitHub](https://github.com/matlab-deep-learning/wav2vec-2.0) |
 | [deepspeech](https://github.com/matlab-deep-learning/deepspeech) | Speech to Text | 167| 5.97 |[GitHub](https://github.com/matlab-deep-learning/deepspeech) |
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Lidar
@@ -287,7 +287,7 @@ Inputs are Lidar Point Clouds converted to five-channels, outputs are segmentati
 | [PointPillars](https://www.mathworks.com/help/lidar/ug/object-detection-using-pointpillars-network.html) | Object Detection | 8| 3 |[Doc](https://www.mathworks.com/help/lidar/ug/object-detection-using-pointpillars-network.html)|
 | [Complex YOLO v4](https://github.com/matlab-deep-learning/Lidar-object-detection-using-complex-yolov4) | Object Detection | 233 (complex-yolov4) 21 (tiny-complex-yolov4) | 3 |[GitHub](https://github.com/matlab-deep-learning/Lidar-object-detection-using-complex-yolov4)|
-[Back to top](#Models)
+[Back to top](https://github.com/matlab-deep-learning/MATLAB-Deep-Learning-Model-Hub#matlab-deep-learning-model-hub)
 ## Model requests
 If you'd like to request MATLAB support for additional pretrained models, [**please create an issue from this repo**](https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue).