Commit d5886dd

Improve root README and 2d classification documentation
Signed-off-by: Mingxin Zheng <[email protected]>
1 parent 8b90a16 commit d5886dd

File tree

4 files changed

+238
-95
lines changed

2d_classification/mednist_tutorial.ipynb

Lines changed: 34 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -17,14 +17,27 @@
1717
"\n",
1818
"# Medical Image Classification Tutorial with the MedNIST Dataset\n",
1919
"\n",
20-
"In this tutorial, we introduce an end-to-end training and evaluation example based on the MedNIST dataset.\n",
20+
"This comprehensive tutorial demonstrates how to build a complete medical image classification system using MONAI and the MedNIST dataset. You'll learn to integrate MONAI's powerful features into PyTorch workflows for medical AI applications.\n",
2121
"\n",
22-
"We'll go through the following steps:\n",
23-
"* Create a dataset for training and testing\n",
24-
"* Use MONAI transforms to pre-process data\n",
25-
"* Use the DenseNet from MONAI for classification\n",
26-
"* Train the model with a PyTorch program\n",
27-
"* Evaluate on test dataset\n",
22+
"## Tutorial Overview\n",
23+
"\n",
24+
"This end-to-end tutorial covers the complete machine learning pipeline for medical image classification:\n",
25+
"\n",
26+
"1. **Dataset Preparation**: Create training, validation, and test datasets\n",
27+
"2. **Data Preprocessing**: Apply medical image transforms and augmentations\n",
28+
"3. **Model Architecture**: Implement DenseNet121 for medical image classification\n",
29+
"4. **Training Workflow**: Train with PyTorch using MONAI optimizations\n",
30+
"5. **Model Evaluation**: Comprehensive performance assessment and visualization\n",
31+
"6. **Advanced Features**: Occlusion sensitivity analysis for model interpretability\n",
32+
"\n",
33+
"## Learning Objectives\n",
34+
"\n",
35+
"- Understand MONAI's integration with PyTorch workflows\n",
36+
"- Learn medical image preprocessing techniques\n",
37+
"- Implement data augmentation strategies for medical images\n",
38+
"- Train robust classification models for medical data\n",
39+
"- Evaluate model performance with medical AI metrics\n",
40+
"- Use interpretation techniques to understand model decisions\n",
2841
"\n",
2942
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Project-MONAI/tutorials/blob/main/2d_classification/mednist_tutorial.ipynb)"
3043
]
@@ -217,11 +230,21 @@
217230
"cell_type": "markdown",
218231
"metadata": {},
219232
"source": [
220-
"## Read image filenames from the dataset folders\n",
233+
"## Explore the Dataset Structure\n",
234+
"\n",
235+
"Let's examine our MedNIST dataset to understand its organization and characteristics. This exploration step is crucial for understanding the data before training.\n",
236+
"\n",
237+
"### Dataset Organization\n",
238+
"\n",
239+
"The MedNIST dataset contains 6 medical image categories:\n",
240+
"- **Hand**: X-ray images of hands\n",
241+
"- **AbdomenCT**: CT scans of the abdomen \n",
242+
"- **CXR**: Chest X-rays\n",
243+
"- **ChestCT**: CT scans of the chest\n",
244+
"- **BreastMRI**: MRI images of breast tissue\n",
245+
"- **HeadCT**: CT scans of the head\n",
221246
"\n",
222-
"First of all, check the dataset files and show some statistics. \n",
223-
"There are 6 folders in the dataset: Hand, AbdomenCT, CXR, ChestCT, BreastMRI, HeadCT, \n",
224-
"which should be used as the labels to train our classification model."
247+
"Each folder name serves as the class label for our classification model."
225248
]
226249
},
227250
{
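The rewritten cell says each folder name serves as the class label. A minimal sketch of that idea follows, using only the standard library. The helper `list_class_names` and the temporary folder layout are hypothetical stand-ins for the extracted dataset, not code taken from the notebook.

```python
import os
import tempfile


def list_class_names(data_dir):
    """Return the sorted sub-folder names in data_dir, one per class."""
    return sorted(
        name
        for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )


# Hypothetical stand-in for the extracted dataset: six empty class folders.
data_dir = tempfile.mkdtemp()
for name in ["Hand", "AbdomenCT", "CXR", "ChestCT", "BreastMRI", "HeadCT"]:
    os.makedirs(os.path.join(data_dir, name))

class_names = list_class_names(data_dir)

# Map each folder name to an integer label usable as a classification target.
label_of = {name: index for index, name in enumerate(class_names)}

print(class_names)
print(label_of["HeadCT"])
```

Sorting the folder names keeps the name-to-index mapping stable across runs, which matters when a saved model is later evaluated against the same label indices.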

2d_classification/monai_101.ipynb

Lines changed: 95 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -16,19 +16,31 @@
1616
"See the License for the specific language governing permissions and \n",
1717
"limitations under the License.\n",
1818
"\n",
19-
"# MONAI 101 tutorial\n",
19+
"# MONAI 101 Tutorial\n",
2020
"\n",
21-
"In this tutorial, we will introduce how simple it can be to run an end-to-end classification pipeline with MONAI.\n",
21+
"Welcome to MONAI 101! This tutorial introduces beginners to the basics of building an end-to-end medical image classification pipeline with MONAI.\n",
2222
"\n",
23-
"These steps will be included in this tutorial, and each of them will take only a few lines of code:\n",
24-
"- Dataset download\n",
25-
"- Data pre-processing\n",
26-
"- Define a DenseNet-121 and run training\n",
27-
"- Check the results on test dataset\n",
23+
"## What You'll Learn\n",
2824
"\n",
29-
"This tutorial will use about 7GB of GPU memory and 10 minutes to run.\n",
25+
"In this tutorial, you'll discover how simple it can be to create a complete medical image classification system. We'll cover each step with just a few lines of code:\n",
3026
"\n",
31-
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb)"
27+
"- **Dataset Download**: Automatically retrieve and set up the MedNIST dataset\n",
28+
"- **Data Preprocessing**: Transform medical images for training\n",
29+
"- **Model Definition**: Set up a DenseNet-121 neural network for classification\n",
30+
"- **Training**: Train your model with medical imaging data\n",
31+
"- **Evaluation**: Test your trained model's performance\n",
32+
"\n",
33+
"## Requirements\n",
34+
"\n",
35+
"- **GPU Memory**: Approximately 7GB\n",
36+
"- **Runtime**: About 10 minutes\n",
37+
"- **Level**: Beginner (no prior MONAI experience required)\n",
38+
"\n",
39+
"## Quick Start Options\n",
40+
"\n",
41+
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb)\n",
42+
"\n",
43+
"*Click the badge above to run this tutorial in Google Colab without any local setup.*"
3244
]
3345
},
3446
{
@@ -130,11 +142,15 @@
130142
"cell_type": "markdown",
131143
"metadata": {},
132144
"source": [
133-
"## Setup data directory\n",
145+
"## Setup Data Directory\n",
134146
"\n",
135-
"You can specify a directory with the `MONAI_DATA_DIRECTORY` environment variable. \n",
136-
"This allows you to save results and reuse downloads. \n",
137-
"If not specified a temporary directory will be used."
147+
"You can specify a directory for storing datasets and results using the `MONAI_DATA_DIRECTORY` environment variable. \n",
148+
"This allows you to:\n",
149+
"- Save results permanently\n",
150+
"- Reuse downloaded datasets across different sessions\n",
151+
"- Avoid re-downloading large datasets\n",
152+
"\n",
153+
"If not specified, a temporary directory will be used (data will be lost after the session ends)."
138154
]
139155
},
140156
{
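The fallback described in this cell (use `MONAI_DATA_DIRECTORY` when set, otherwise a temporary directory) can be sketched with the standard library alone. The function name `resolve_data_dir` is an assumption introduced for illustration; the code cell that follows this markdown cell is unchanged by the commit and is not shown in the diff.

```python
import os
import tempfile


def resolve_data_dir(env_var="MONAI_DATA_DIRECTORY"):
    """Return the directory named by env_var, or a fresh temporary directory."""
    directory = os.environ.get(env_var)
    if directory is None:
        # Nothing configured: fall back to a throwaway directory.
        return tempfile.mkdtemp()
    # Configured: create it if needed so downloads can be reused later.
    os.makedirs(directory, exist_ok=True)
    return directory


root_dir = resolve_data_dir()
print(root_dir)
```

Setting the variable before starting Jupyter (for example `export MONAI_DATA_DIRECTORY=/path/to/data` in a shell) makes later sessions reuse the same downloads.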
@@ -163,12 +179,21 @@
163179
"cell_type": "markdown",
164180
"metadata": {},
165181
"source": [
166-
"## Use MONAI transforms to preprocess data\n",
182+
"## Use MONAI Transforms to Preprocess Data\n",
183+
"\n",
184+
"Medical images require specialized methods for input/output (I/O), preprocessing, and augmentation. Unlike natural images, medical images often:\n",
185+
"- Follow specific formats (DICOM, NIfTI, etc.)\n",
186+
"- Are handled with specific protocols\n",
187+
"- Have high-dimensional data arrays\n",
188+
"- Require domain-specific preprocessing\n",
167189
"\n",
168-
"Medical images require specialized methods for I/O, preprocessing, and augmentation.\n",
169-
"They often follow specific formats, are handled with specific protocols, and the data arrays are often high-dimensional.\n",
190+
"In this example, we'll create a preprocessing pipeline using three MONAI transforms:\n",
170191
"\n",
171-
"In this example, we will perform image loading, data format verification, and intensity scaling with three `monai.transforms` listed below, and compose a pipeline ready to be used in next steps."
192+
"1. **`LoadImageD`**: Loads medical images from various formats\n",
193+
"2. **`EnsureChannelFirstD`**: Ensures the image has the correct channel dimension\n",
194+
"3. **`ScaleIntensityD`**: Rescales pixel intensities to a standard range (`[0, 1]` by default)\n",
195+
"\n",
196+
"These transforms are combined into a pipeline that will be applied to our data."
172197
]
173198
},
174199
{
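To make the last two steps concrete, here is a rough NumPy sketch of what adding a channel dimension and min-max intensity scaling do to a 2D grayscale array. It is an illustration under stated assumptions, not MONAI's implementation: `ensure_channel_first` and `scale_intensity` are hypothetical helpers, and the input is random data rather than a real image.

```python
import numpy as np


def ensure_channel_first(image):
    """Add a leading channel axis to a 2D (H, W) image, giving (1, H, W)."""
    if image.ndim == 2:
        return image[np.newaxis, ...]
    return image


def scale_intensity(image, minv=0.0, maxv=1.0):
    """Min-max rescale intensities into the range [minv, maxv]."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: this sketch maps everything to minv.
        return np.full_like(image, minv)
    return (image - lo) / (hi - lo) * (maxv - minv) + minv


# Random 64x64 8-bit array standing in for a loaded grayscale image.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)

processed = scale_intensity(ensure_channel_first(raw))
print(processed.shape, processed.dtype)
```

In the notebook itself these steps are handled by the dictionary-based MONAI transforms named above, which operate on the `"image"` entry of each sample rather than on a bare array.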
@@ -191,18 +216,29 @@
191216
"cell_type": "markdown",
192217
"metadata": {},
193218
"source": [
194-
"## Prepare datasets using MONAI Apps\n",
219+
"## Prepare Dataset Using MONAI Apps\n",
220+
"\n",
221+
"We'll use the `MedNISTDataset` from MONAI Apps to automatically download and set up our dataset. This convenience class will:\n",
222+
"- Download the dataset to your specified directory\n",
223+
"- Apply the preprocessing transforms we defined above\n",
224+
"- Split the data into training, validation, and test sets\n",
195225
"\n",
196-
"We use `MedNISTDataset` in MONAI Apps to download a dataset to the specified directory and perform the pre-processing steps in the `monai.transforms` compose.\n",
226+
"### About the MedNIST Dataset\n",
197227
"\n",
198-
"The MedNIST dataset was gathered from several sets from [TCIA](https://wiki.cancerimagingarchive.net/display/Public/Data+Usage+Policies+and+Restrictions),\n",
199-
"[the RSNA Bone Age Challenge](http://rsnachallenges.cloudapp.net/competitions/4),\n",
200-
"and [the NIH Chest X-ray dataset](https://cloud.google.com/healthcare/docs/resources/public-datasets/nih-chest).\n",
228+
"The MedNIST dataset is a collection of medical images from multiple sources:\n",
229+
"- [TCIA](https://wiki.cancerimagingarchive.net/display/Public/Data+Usage+Policies+and+Restrictions) (The Cancer Imaging Archive)\n",
230+
"- [RSNA Bone Age Challenge](http://rsnachallenges.cloudapp.net/competitions/4)\n",
231+
"- [NIH Chest X-ray Dataset](https://cloud.google.com/healthcare/docs/resources/public-datasets/nih-chest)\n",
201232
"\n",
202-
"The dataset is kindly made available by [Dr. Bradley J. Erickson M.D., Ph.D.](https://www.mayo.edu/research/labs/radiology-informatics/overview) (Department of Radiology, Mayo Clinic)\n",
203-
"under the Creative Commons [CC BY-SA 4.0 license](https://creativecommons.org/licenses/by-sa/4.0/).\n",
233+
"### Dataset Information\n",
234+
"- **Size**: 58,954 images\n",
235+
"- **Classes**: 6 medical image types (AbdomenCT, BreastMRI, CXR, ChestCT, Hand, HeadCT)\n",
236+
"- **Format**: 2D grayscale images\n",
237+
"- **License**: Creative Commons [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)\n",
204238
"\n",
205-
"If you use the MedNIST dataset, please acknowledge the source. "
239+
"The dataset is kindly made available by [Dr. Bradley J. Erickson M.D., Ph.D.](https://www.mayo.edu/research/labs/radiology-informatics/overview) (Department of Radiology, Mayo Clinic).\n",
240+
"\n",
241+
"*If you use the MedNIST dataset in your research, please acknowledge the source.*"
206242
]
207243
},
208244
{
@@ -236,11 +272,24 @@
236272
"cell_type": "markdown",
237273
"metadata": {},
238274
"source": [
239-
"## Define a network and a supervised trainer\n",
275+
"## Define Network and Supervised Trainer\n",
276+
"\n",
277+
"Now we'll set up our machine learning model and training configuration.\n",
278+
"\n",
279+
"### Model Selection: DenseNet-121\n",
280+
"\n",
281+
"We'll use DenseNet-121, a proven convolutional neural network architecture that:\n",
282+
"- Has shown excellent performance on ImageNet and medical imaging tasks\n",
283+
"- Features dense connections between layers for better gradient flow\n",
284+
"- Is computationally efficient for medical image classification\n",
240285
"\n",
241-
"To train a model that can perform the classification task, we will use the DenseNet-121 which is known for its performance on the ImageNet dataset.\n",
286+
"### Training Configuration\n",
242287
"\n",
243-
"For a typical supervised training workflow, MONAI provides `SupervisedTrainer` to define the hyper-parameters."
288+
"MONAI provides `SupervisedTrainer` to simplify the training process. This high-level API handles:\n",
289+
"- Training loops and optimization\n",
290+
"- Loss computation and backpropagation \n",
291+
"- Metric tracking and logging\n",
292+
"- Device management (CPU/GPU)"
244293
]
245294
},
246295
{
@@ -270,7 +319,15 @@
270319
"cell_type": "markdown",
271320
"metadata": {},
272321
"source": [
273-
"## Run the training"
322+
"## Run the Training\n",
323+
"\n",
324+
"Now let's start the training process! The trainer will:\n",
325+
"- Load batches of medical images\n",
326+
"- Forward them through the DenseNet-121 model\n",
327+
"- Calculate the loss and update model weights\n",
328+
"- Track training progress\n",
329+
"\n",
330+
"This should take about 10 minutes on a GPU."
274331
]
275332
},
276333
{
@@ -287,7 +344,15 @@
287344
"cell_type": "markdown",
288345
"metadata": {},
289346
"source": [
290-
"## Check the prediction on the test dataset"
347+
"## Evaluate Model Performance on Test Dataset\n",
348+
"\n",
349+
"Let's see how well our trained model performs! We'll:\n",
350+
"- Load the test dataset (images the model has never seen)\n",
351+
"- Run predictions on these images\n",
352+
"- Compare predictions with ground truth labels\n",
353+
"- Display the results to see classification accuracy\n",
354+
"\n",
355+
"This evaluation helps us understand if our model can generalize to new medical images."
291356
]
292357
},
293358
{
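The comparison of predictions with ground truth labels reduces to counting matches. The sketch below uses made-up class indices purely for illustration; the `accuracy` helper and the sample values are hypothetical and do not represent results from the notebook.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the predicted class equals the true class."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


class_names = ["AbdomenCT", "BreastMRI", "CXR", "ChestCT", "Hand", "HeadCT"]

# Illustrative values only: class indices, not real model output.
predicted = [0, 1, 2, 3, 4, 4]
ground_truth = [0, 1, 2, 3, 4, 5]

acc = accuracy(predicted, ground_truth)

# Collect (sample index, predicted name, true name) for every mistake.
mismatches = [
    (i, class_names[p], class_names[t])
    for i, (p, t) in enumerate(zip(predicted, ground_truth))
    if p != t
]

print(f"accuracy: {acc:.3f}")
print(mismatches)
```

Listing the mismatches alongside the overall score shows which classes are being confused, which a single accuracy number hides.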

2d_classification/monai_201.ipynb

Lines changed: 43 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -16,16 +16,31 @@
1616
"See the License for the specific language governing permissions and \n",
1717
"limitations under the License.\n",
1818
"\n",
19-
"# MONAI 201 tutorial\n",
19+
"# MONAI 201 Tutorial: Advanced Training Techniques\n",
2020
"\n",
21-
"In this tutorial we'll revisit the [MONAI 101 notebook](https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb) and add more features representing best practice concepts. This will include evaluation and tensorboard handler techniques.\n",
21+
"Welcome to MONAI 201! This tutorial builds upon [MONAI 101](https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb) and introduces advanced training techniques and best practices for production-ready medical AI models.\n",
2222
"\n",
23-
"These steps will be included in this tutorial, and each of them will take only a few lines of code:\n",
24-
"- Dataset download and Data pre-processing\n",
25-
"- Define a DenseNet-121 and run training\n",
26-
"- Run inference using SupervisedEvaluator\n",
23+
"## What You'll Learn\n",
2724
"\n",
28-
"This tutorial will use about 7GB of GPU memory and 10 minutes to run.\n",
25+
"This intermediate tutorial covers advanced concepts that are essential for building robust medical AI systems:\n",
26+
"\n",
27+
"- **Advanced Training Workflow**: Enhanced training with validation monitoring\n",
28+
"- **Model Evaluation**: Comprehensive evaluation using `SupervisedEvaluator`\n",
29+
"- **Experiment Tracking**: TensorBoard integration for training visualization\n",
30+
"- **Model Checkpointing**: Save and restore model states during training\n",
31+
"- **Production Best Practices**: Techniques used in real-world medical AI applications\n",
32+
"\n",
33+
"## Prerequisites\n",
34+
"\n",
35+
"- Complete [MONAI 101](https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb) or have basic MONAI knowledge\n",
36+
"- Understanding of deep learning concepts (training, validation, etc.)\n",
37+
"- Familiarity with PyTorch basics\n",
38+
"\n",
39+
"## Requirements\n",
40+
"\n",
41+
"- **GPU Memory**: Approximately 7GB\n",
42+
"- **Runtime**: About 10 minutes\n",
43+
"- **Level**: Intermediate\n",
2944
"\n",
3045
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Project-MONAI/tutorials/blob/main/2d_classification/monai_201.ipynb)"
3146
]
@@ -127,9 +142,9 @@
127142
"cell_type": "markdown",
128143
"metadata": {},
129144
"source": [
130-
"## Use MONAI transforms to preprocess data\n",
145+
"## Prepare Data with MONAI Transforms\n",
131146
"\n",
132-
"We'll first prepare the data very much like in the previous tutorial with the same transforms and dataset:"
147+
"We'll prepare our data using the same transforms as MONAI 101, but this time we'll also create a validation dataset. This separation is crucial for monitoring training progress and preventing overfitting."
133148
]
134149
},
135150
{
@@ -180,10 +195,18 @@
180195
"cell_type": "markdown",
181196
"metadata": {},
182197
"source": [
183-
"## Define a network and a supervised trainer\n",
198+
"## Advanced Training Setup with Evaluation and Monitoring\n",
184199
"\n",
185-
"For training we have the same elements again and will slightly change the `SupervisedTrainer` by expanding its train_handlers. This upgrade will be beneficial for efficient utilization of TensorBoard.\n",
186-
"Furthermore, we introduce a `SupervisedEvaluator` object that will efficiently track model progress. Accompanied by `TensorBoardStatsHandler`, it will log statistics for TensorBoard, ensuring precise tracking and management."
200+
"Now we'll create a more sophisticated training setup that includes validation monitoring and experiment tracking. This represents production-level best practices for medical AI development.\n",
201+
"\n",
202+
"### Key Components\n",
203+
"\n",
204+
"1. **`SupervisedEvaluator`**: Handles validation during training to monitor model performance\n",
205+
"2. **`TensorBoardStatsHandler`**: Logs training metrics for visualization\n",
206+
"3. **`CheckpointSaver`**: Automatically saves model checkpoints during training\n",
207+
"4. **`ValidationHandler`**: Coordinates validation runs at specified intervals\n",
208+
"\n",
209+
"This setup provides real-time monitoring of your model's learning progress and helps identify issues like overfitting early in the training process."
187210
]
188211
},
189212
{
@@ -252,9 +275,15 @@
252275
"cell_type": "markdown",
253276
"metadata": {},
254277
"source": [
255-
"## View training in tensorboard\n",
278+
"## Visualize Training Progress with TensorBoard\n",
279+
"\n",
280+
"TensorBoard visualizes whatever the training and validation handlers log. Depending on your logging setup, you can view:\n",
281+
"- Training and validation loss curves\n",
282+
"- Model performance metrics over time\n",
283+
"- Learning rate schedules\n",
284+
"- Model architecture graphs\n",
256285
"\n",
257-
"Please uncomment the following cell to load tensorboard results."
286+
"To view the results, uncomment and run the following cell. TensorBoard will load inline in the notebook and display the logged training metrics."
258287
]
259288
},
260289
{
