2 changes: 1 addition & 1 deletion docs/chapter2/第二章 Transformer架构.md
@@ -374,7 +374,7 @@ class MLP(nn.Module):

```

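- Note that the Transformer's feed-forward network consists of two linear layers with a ReLU activation between them, plus a Dropout layer to prevent overfitting.
+ Note that the Transformer's feed-forward network consists of two linear layers with a ReLU activation between them, plus a Dropout layer to prevent overfitting. The Dropout layer is active only during training and is disabled during inference/testing, which is why many Transformer architecture diagrams do not draw it.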
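The training-versus-inference behaviour described in the added sentence can be illustrated with a minimal pure-Python sketch of inverted dropout. This is not the tutorial's code: the `dropout` helper, its parameters, and the sample values are hypothetical, and the rescaling by `1 / (1 - p)` is assumed to mirror what `nn.Dropout` does in training mode.

```python
import random


def dropout(values, p=0.1, training=True, rng=None):
    """Inverted dropout over a list of floats (hypothetical helper for illustration)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Inference/testing: the layer is an identity mapping.
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    # Training: zero each element with probability p and rescale the
    # survivors so the expected value of each element is unchanged.
    return [0.0 if rng.random() < p else v * scale for v in values]


x = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(x, p=0.5, training=True, rng=random.Random(0))
eval_out = dropout(x, p=0.5, training=False)
```

Because the layer is an identity mapping when `training=False`, it contributes nothing to the computation at inference time, which is consistent with diagrams leaving it out.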

### 2.2.3 Layer Normalization
