The final validation accuracy reaches 64%; this is a huge improvement over 30%.
::::
:::

::: challenge

## Fine-Tune the Top Layer of the Pretrained Model

So far, we've trained only the custom head while keeping the DenseNet121 base frozen. Let's now **unfreeze just the top layer group** of the base model and observe how performance changes.

### 1. Unfreeze top layers

Unfreeze just the final convolutional block of the base model using:

```python
# 1. Unfreeze top block of base model
set_trainable = False
for layer in base_model.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable
```
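The loop above uses a one-way flag: once a layer whose name contains `'conv5'` appears, that layer and every layer after it are unfrozen. A small standalone sketch makes the effect visible; the layer names here are simplified stand-ins for DenseNet121's real layer names:

```python
# Standalone sketch of the one-way flag pattern; the names below are
# simplified stand-ins for DenseNet121's layer names.
class FakeLayer:
    def __init__(self, name):
        self.name = name
        self.trainable = True  # Keras layers default to trainable

layers = [FakeLayer(n) for n in
          ["conv1_conv", "conv4_block1_concat", "conv5_block1_concat",
           "conv5_block2_concat", "bn", "relu"]]

set_trainable = False
for layer in layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable

print([(l.name, l.trainable) for l in layers])
# Everything before conv5 stays frozen (False); conv5 and everything
# after it, including the final bn/relu layers, become trainable (True).
```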

### 2. Recompile the model

Any time you change layer trainability, you **must recompile** the model.

Use the same optimizer and loss function as before:
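The exact compile call from earlier in the episode is not repeated in this excerpt. A minimal sketch, assuming the Adam optimizer and categorical cross-entropy loss (swap in whatever you compiled with before); the tiny stand-in model is only there so the snippet runs on its own:

```python
from tensorflow import keras

# Tiny stand-in model so the snippet is self-contained; in the episode this
# would be the DenseNet121 base plus the custom classification head.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(2, activation='softmax'),
])

# Recompile after changing layer trainability; otherwise the change is not
# picked up by training. Optimizer and loss are assumptions here: reuse your
# earlier settings. A lower learning rate is common when fine-tuning.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)
```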
```python
plt.title("Validation Accuracy: Frozen vs. Finetuned")
plt.show()

plot_two_histories(history, history_finetune)
```

349
+
350
+
**Discussion of results**: Validation accuracy improved across all epochs compared to the frozen baseline. Training time also increased slightly, but the model was able to adapt better to the new dataset by fine-tuning the top convolutional block.
351
+
352
+
This makes sense: by unfreezing the last part of the base model, you're allowing it to adjust high-level features to the new domain, while still keeping the earlier, general-purpose filters/feature-detectors of the model intact.
353
+
354
+
355
+
**What happens if you unfreeze too many layers?**
356
+
If you unfreeze most or all of the base model:
357
+
358
+
- Training time increases significantly because more weights are being updated.
359
+
- The model may forget some of the general-purpose features it learned during pretraining. This is called "catastrophic forgetting."
360
+
- Overfitting becomes more likely, especially if your dataset is small or noisy.
361
+
362
+
363
+
### When does this approach work best?

Fine-tuning a few top layers is a good middle ground: you adapt the model without retraining everything from scratch. If your dataset is small or very different from the original ImageNet data, be careful not to unfreeze too many layers.

For most use cases:

- Freeze most layers
- Unfreeze the top block or two
- Avoid full fine-tuning unless you have lots of data and compute
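One way to see the trade-off behind this advice is to count trainable parameters under each regime. A sketch using `keras.applications.DenseNet121` with `weights=None`, so the architecture is built without downloading ImageNet weights; the parameter counts are the same as for the pretrained model:

```python
import numpy as np
from tensorflow import keras

def count_trainable(model):
    # Total number of scalars in the model's trainable weights
    return int(sum(np.prod(w.shape) for w in model.trainable_weights))

# weights=None builds the architecture only; no download needed
base = keras.applications.DenseNet121(weights=None, include_top=False)

# Regime 1: everything frozen
for layer in base.layers:
    layer.trainable = False
print("frozen:   ", count_trainable(base))

# Regime 2: only the top (conv5) block unfrozen, as in the exercise
set_trainable = False
for layer in base.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable
print("top block:", count_trainable(base))

# Regime 3: full fine-tuning
for layer in base.layers:
    layer.trainable = True
print("all:      ", count_trainable(base))
```

The frozen model trains zero base weights, unfreezing `conv5` trains a fraction of them, and full fine-tuning updates every weight, which is where the extra compute and overfitting risk come from.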
::::
:::
## Concluding: The power of transfer learning
In many domains, large networks are available that have been trained on vast amounts of data, such as in computer vision and natural language processing. Using transfer learning, you can benefit from the knowledge that was captured from another machine learning task. In many fields, transfer learning will outperform models trained from scratch, especially if your dataset is small or of poor quality.