
Commit 8ee55fd

Merge pull request #20 from TezRomacH/dev
2 parents: 54ebff9 + 148663d

10 files changed (+11818, -11508 lines)

README.md (+35, -10)
@@ -19,7 +19,9 @@ PyTorch implementation of L2L execution algorithm from paper [Training Large Neu
 
 You need to define a torch model where all layers are specified in ModuleList.
 
-for example
+See [examples folder](examples)
+
+### Basic usage
 
 ```python
 import torch
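The diff elides the body of `M` between this hunk and the next. As a rough sketch only, assuming a plain stack of linear blocks rather than the repo's actual example, a model satisfying the "all layers in a ModuleList" requirement could look like this:

```python
import torch
from torch import nn

class M(nn.Module):
    """Toy model: every layer is registered in an nn.ModuleList."""

    def __init__(self, depth: int, dim: int):
        super().__init__()
        # L2L requires the layers to be enumerable via a ModuleList attribute
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)]
        )

    def forward(self, batch: torch.Tensor) -> torch.Tensor:
        x = batch
        for layer in self.layers:
            x = layer(x)
        return x
```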
@@ -55,15 +57,15 @@ class M(nn.Module):
 
         return x
 
+
+model = M(depth=5, dim=40).train() # on CPU
 ```
 
 Then, you can use the L2L wrapper over this model.
 
 ```python
 from layer_to_layer_pytorch.l2l import Layer2Layer
 
-model = M(depth=5, dim=40).train() # on CPU
-
 l2l_model = Layer2Layer(
     model,
     layers_attr="layers", # attribute with ModuleList
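For readers skimming the diff: the wrapper being configured here implements the L2L execution scheme from the paper, where the model itself stays on the CPU and layers are shipped to the GPU one at a time to process the batch in microbatches. Below is a forward-only illustration of that scheme, not the library's actual implementation; `l2l_forward` is a hypothetical helper:

```python
import torch
from torch import nn

def l2l_forward(layers: nn.ModuleList, x: torch.Tensor,
                microbatch_size: int) -> torch.Tensor:
    """Forward-only illustration of layer-to-layer execution."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    for layer in layers:
        layer.to(device)  # only one layer occupies the GPU at a time
        outs = [layer(mb.to(device)).cpu() for mb in x.split(microbatch_size)]
        x = torch.cat(outs)  # activations accumulate back on the CPU
        layer.cpu()  # release GPU memory before loading the next layer
    return x
```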
@@ -81,23 +83,46 @@ x = torch.rand(1_000, 40) # on CPU
 y = torch.rand(1_000, 40) # on CPU
 
 losses = []
-loss_fn = nn.MSELoss(reduction="sum") # since L2L calcs average losses itself, we just need to save them
+criterion = nn.MSELoss()
 
-optimizer = optim.AdamW(l2l_model.main_model.parameters(), lr=0.001) # optimizer works with the main model on CPU
+optimizer = optim.AdamW(l2l_model.main_params) # optimizer works with the main model on CPU
 
-for i in trange(5000):
+for i in trange(2000):
     l2l_model.zero_grad()
-    l2l_model.forward(x)
+    _ = l2l_model.forward(x)
 
-    loss_value = l2l_model.backward(x, y, loss_fn)
+    loss_value: float = l2l_model.compute_loss(y, criterion)
 
     if i % 50 == 0:
-        tqdm.write(f"[{i}] loss = {loss_value.item()}")
-        losses.append(loss_value.item())
+        tqdm.write(f"[{i}] loss = {loss_value}")
+        losses.append(loss_value)
 
+
+    l2l_model.backward()
     optimizer.step()
+    l2l_model.update_main_model_params() # Sync params with CPU
+```
+
+### FP-16 usage
+
+Cross-mixed precision is available via init params
+
+```python
+from layer_to_layer_pytorch.l2l import Layer2Layer
+
+l2l_model = Layer2Layer(
+    model,
+    layers_attr="layers",
+    microbatch_size=100,
+
+    # fp-16
+    mixed_precision=True,
+    loss_scale=128.0
+)
 ```
 
+And then train the same way 😉
+
 ## Installation
 
 ```bash
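A note on the `loss_scale` parameter introduced above: static loss scaling multiplies the loss before backpropagation so that small FP-16 gradients do not underflow to zero, then unscales the gradients before the optimizer step. A minimal standalone sketch of that general technique in plain PyTorch, not the library's internals:

```python
import torch
from torch import nn, optim

model = nn.Linear(40, 40)
optimizer = optim.AdamW(model.parameters())
criterion = nn.MSELoss()
loss_scale = 128.0

x, y = torch.rand(8, 40), torch.rand(8, 40)

optimizer.zero_grad()
loss = criterion(model(x), y)
(loss * loss_scale).backward()  # scale up so fp16 gradients do not underflow

for p in model.parameters():  # unscale gradients before stepping
    if p.grad is not None:
        p.grad.div_(loss_scale)
optimizer.step()
```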
