Q1.
For anyone who wants to train models on an M1 Max, the biggest reason for choosing it is the huge unified memory (usable as VRAM), not the throughput.
I would also like to know how much memory can actually be used for training on an M1 Max. (60 GB? 55 GB?)
Q2.
Have you had any problems with tensorflow-metal?
Do you have any plans to extend this post with your experience of problems and solutions when training on the M1 with TF (some code, or compatibility notes for the major ML/deep learning packages)?
This is just a suggestion, but I think many people would appreciate this kind of experience sharing.
Thank you!
I only have the 32GB version, so I cannot answer with absolute certainty. I observe that roughly ~3 GB is needed for the OS etc., so in theory you would have about ~60 GB available for training.
That said, I do not believe it makes sense to train large models on this hardware: they will be even slower than smaller models, which are already 8x-10x slower than the GPUs we would usually train on (V100, 3090, A100, etc.). VRAM pressure can be mitigated through a variety of strategies: mixed precision, activation checkpointing, gradient accumulation, DeepSpeed, or even the choice of optimizer (Adafactor vs. Adam).
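Of the strategies above, gradient accumulation is the simplest to illustrate: run several small micro-batches, average their gradients, and apply a single optimizer step, so the memory footprint is that of a micro-batch rather than the full batch. Below is a minimal, framework-agnostic sketch in plain Python (the toy model `y = w * x` with squared loss and the helper names `grad`/`accumulate_step` are hypothetical, not from any library):

```python
def grad(w, x, y):
    # dL/dw for the toy loss L = (w*x - y)^2
    return 2.0 * (w * x - y) * x

def accumulate_step(w, batch, micro_size, lr=0.01):
    """One SGD step over `batch`, computed as an average of
    gradients over micro-batches of size `micro_size`."""
    accum = 0.0
    n_micro = 0
    for i in range(0, len(batch), micro_size):
        micro = batch[i:i + micro_size]
        # mean gradient over this micro-batch (only this slice
        # needs to be "in memory" at once)
        accum += sum(grad(w, x, y) for x, y in micro) / len(micro)
        n_micro += 1
    # average over micro-batches, then a single optimizer update
    return w - lr * accum / n_micro
```

When the micro-batches are equally sized, this produces exactly the same update as one full-batch step, which is why the trick trades memory for extra forward/backward passes rather than changing the optimization itself.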
Q2:
I have not encountered any problems so far; it has been surprisingly painless.
The M1 Max has 64 GB of RAM.