
Commit 1814867

committed
differences for PR #584
1 parent 60bcc8f commit 1814867

File tree

4 files changed

+115
-1
lines changed


5-transfer-learning.md

Lines changed: 114 additions & 0 deletions
@@ -258,6 +258,120 @@ The final validation accuracy reaches 64%, this is a huge improvement over 30% a
::::
:::

::: challenge
## Fine-Tune the Top Layer of the Pretrained Model

So far, we've trained only the custom head while keeping the DenseNet121 base frozen. Let's now **unfreeze just the top layer group** of the base model and observe how performance changes.

### 1. Unfreeze top layers
Unfreeze just the final convolutional block of the base model using:
```python
# 1. Unfreeze top block of base model
set_trainable = False
for layer in base_model.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable
```
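If you want to check what this loop actually did, you can list which layers ended up trainable. A minimal sketch on a toy model with DenseNet-style layer names (the names and sizes here are made up; in the episode you would inspect the real `base_model`):

```python
from tensorflow import keras

# Toy stand-in for DenseNet121, with made-up layer names and sizes.
inputs = keras.Input(shape=(8,))
x = keras.layers.Dense(4, name='conv4_block1_dense')(inputs)
x = keras.layers.Dense(4, name='conv5_block1_dense')(x)
outputs = keras.layers.Dense(2, name='bn_final')(x)
base_model = keras.Model(inputs, outputs)

# Same loop as above: everything from the first 'conv5' layer onward
# becomes trainable, including layers after the block (here 'bn_final').
set_trainable = False
for layer in base_model.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable

print([layer.name for layer in base_model.layers if layer.trainable])
# ['conv5_block1_dense', 'bn_final']
```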

### 2. Recompile the model
Any time you change layer trainability, you **must recompile** the model.

Use the same optimizer and loss function as before:

- `optimizer='adam'`
- `loss=SparseCategoricalCrossentropy(from_logits=True)`
- `metrics=['accuracy']`

### 3. Retrain the model
Retrain the model using the same setup as before:

- `batch_size=32`
- `epochs=30`
- Early stopping with `patience=5`
- Pass in the validation set using `validation_data`
- Store the result in a new variable called `history_finetune`

> You can reuse your `early_stopper` callback or redefine it.

### 4. Compare with baseline (head only)
Plot the **validation accuracy** for both the baseline and fine-tuned models.

**Questions to reflect on:**

- Did unfreezing part of the base model improve validation accuracy?
- Did training time increase significantly?
- Is there any evidence of overfitting?

:::: solution
## Solution
```python
# 1. Unfreeze top block of base model
set_trainable = False
for layer in base_model.layers:
    if 'conv5' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable

# 2. Recompile the model
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# 3. Retrain the model
early_stopper = keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=5)
history_finetune = model.fit(train_images, train_labels,
                             batch_size=32,
                             epochs=30,
                             validation_data=(val_images, val_labels),
                             callbacks=[early_stopper])

# 4. Plot comparison
def plot_two_histories(h1, h2, label1='Frozen', label2='Finetuned'):
    import pandas as pd
    import matplotlib.pyplot as plt
    df1 = pd.DataFrame(h1.history)
    df2 = pd.DataFrame(h2.history)
    plt.plot(df1['val_accuracy'], label=label1)
    plt.plot(df2['val_accuracy'], label=label2)
    plt.xlabel("Epochs")
    plt.ylabel("Validation Accuracy")
    plt.legend()
    plt.title("Validation Accuracy: Frozen vs. Finetuned")
    plt.show()

plot_two_histories(history, history_finetune)
```
347+
348+
![](episodes/fig/05-frozen_vs_finetuned.png)
349+
350+
**Discussion of results**: Validation accuracy improved across all epochs compared to the frozen baseline. Training time also increased slightly, but the model was able to adapt better to the new dataset by fine-tuning the top convolutional block.
351+
352+
This makes sense: by unfreezing the last part of the base model, you're allowing it to adjust high-level features to the new domain, while still keeping the earlier, general-purpose filters/feature-detectors of the model intact.
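You can quantify "how much of the model is adapting" by comparing trainable and frozen parameter counts; `model.summary()` prints the same totals. A sketch on a toy two-layer model with made-up sizes (run the same check on your real model):

```python
import numpy as np
from tensorflow import keras

# Toy model with made-up layer sizes, standing in for the real network.
model = keras.Sequential([
    keras.Input(shape=(16,)),
    keras.layers.Dense(8, name='frozen_dense'),
    keras.layers.Dense(4, name='tuned_dense'),
])
model.get_layer('frozen_dense').trainable = False  # simulate a frozen base

def n_params(weights):
    # Total number of scalar parameters in a list of weight tensors
    return sum(int(np.prod(w.shape)) for w in weights)

print("trainable:", n_params(model.trainable_weights))      # 8*4 + 4 = 36
print("frozen:", n_params(model.non_trainable_weights))     # 16*8 + 8 = 136
```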

**What happens if you unfreeze too many layers?**

If you unfreeze most or all of the base model:

- Training time increases significantly because more weights are being updated.
- The model may forget some of the general-purpose features it learned during pretraining. This is called "catastrophic forgetting."
- Overfitting becomes more likely, especially if your dataset is small or noisy.
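A common way to reduce these risks (not used in the solution above, which keeps `optimizer='adam'` at its default rate) is to recompile with a much smaller learning rate before fine-tuning, so the pretrained weights are only nudged. A hedged sketch, with a toy model standing in for the episode's `model`:

```python
from tensorflow import keras

# Toy model standing in for the episode's model (hypothetical architecture).
model = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(3)])

# Recompile with a small learning rate so pretrained weights change slowly.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
print(float(model.optimizer.learning_rate))
```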

### When does this approach work best?

Fine-tuning a few top layers is a good middle ground: you adapt the model without retraining everything from scratch. If your dataset is small or very different from the original ImageNet data, be careful not to unfreeze too many layers.

For most use cases:

- Freeze most layers
- Unfreeze the top block or two
- Avoid full fine-tuning unless you have lots of data and compute
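The "top block or two" recipe can also be written by index rather than by layer name, which is handy when an architecture's layer names are unfamiliar. A sketch on a toy base model (the real DenseNet121 has far more layers; `N` here is illustrative):

```python
from tensorflow import keras

# Toy base model; DenseNet121 in the episode has many more layers.
base_model = keras.Sequential(
    [keras.Input(shape=(8,))]
    + [keras.layers.Dense(8, name=f'dense_{i}') for i in range(6)]
)

# Unfreeze only the last N layers; everything below stays frozen.
N = 2
base_model.trainable = True
for layer in base_model.layers[:-N]:
    layer.trainable = False

print([layer.name for layer in base_model.layers if layer.trainable])
# ['dense_4', 'dense_5']
```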

::::
:::

## Concluding: The power of transfer learning
In many domains, large networks are available that have been trained on vast amounts of data, such as in computer vision and natural language processing. Using transfer learning, you can benefit from the knowledge that was captured from another machine learning task. In many fields, transfer learning will outperform models trained from scratch, especially if your dataset is small or of poor quality.

fig/.gitkeep

Whitespace-only changes.

fig/05-frozen_vs_finetuned.png

110 KB

md5sum.txt

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 "episodes/2-keras.md" "ebddf0ec35a0e97735a1ea7f1d204a40" "site/built/2-keras.md" "2025-02-11"
 "episodes/3-monitor-the-model.md" "93984f2bd862ddc2f10ba6749950b719" "site/built/3-monitor-the-model.md" "2025-02-11"
 "episodes/4-advanced-layer-types.md" "97b49e9dad76479bcfe608f0de2d52a4" "site/built/4-advanced-layer-types.md" "2025-02-11"
-"episodes/5-transfer-learning.md" "5808f2218c3f2d2d400e1ec1ad9f1f3c" "site/built/5-transfer-learning.md" "2025-02-11"
+"episodes/5-transfer-learning.md" "cd98f6e43ccedbcdf877a66e33f04449" "site/built/5-transfer-learning.md" "2025-05-15"
 "episodes/6-outlook.md" "007728216562f3b52b983ff1908af5b7" "site/built/6-outlook.md" "2025-02-11"
 "instructors/bonus-material.md" "382832ea4eb097fc7781cb36992c1955" "site/built/bonus-material.md" "2025-02-11"
 "instructors/design.md" "1537f9d0c90cdfdb1781bd3daf94dadd" "site/built/design.md" "2025-02-11"
