ValueError: math domain error #40
Comments
Hey @hayoung-jeremy, try reducing the value of global_step_period under val: in the train sample yaml file until it stops giving the error. That worked for me when I was trying to train with 350 objects.
Wow, you're my savior, thank you so much! I'll try it!
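For anyone wondering why a smaller global_step_period might matter on a tiny dataset, here is a rough back-of-the-envelope sketch. The formula is an assumption about how steps are counted (floor division, no validation split), not something taken from the OpenLRM code, and `num_processes=4` is a purely hypothetical GPU count that is not stated anywhere in this thread. The helper names are made up for illustration.

```python
def estimate_total_steps(num_samples, batch_size, accum_steps, epochs, num_processes=1):
    """Rough estimate of optimizer steps over a whole run.

    Assumed formula (not from OpenLRM): each step consumes
    batch_size * accum_steps * num_processes samples, and incomplete
    batches at the end of an epoch are dropped.
    """
    samples_per_step = batch_size * accum_steps * num_processes
    steps_per_epoch = num_samples // samples_per_step
    return steps_per_epoch * epochs


def validations_triggered(total_steps, global_step_period):
    """How many times a 'run every N global steps' hook would fire."""
    return total_steps // global_step_period


# 100 objects, batch_size 16, accum_steps 1; num_processes=4 is hypothetical.
steps_60 = estimate_total_steps(100, 16, 1, epochs=60, num_processes=4)
steps_100 = estimate_total_steps(100, 16, 1, epochs=100, num_processes=4)

print(steps_60, validations_triggered(steps_60, 1000))
print(steps_100, validations_triggered(steps_100, 100))
```

Under these assumptions, a period of 1000 is never reached on a run this short, while a period of 100 is reached once. Whether that is what actually triggers the math domain error is not confirmed here; it only shows that the period can easily exceed the total number of steps when the dataset is small.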
Thank you @kunalkathare , I've tried with the following config, modified ...
train:
    mixed_precision: bf16
    find_unused_parameters: false
    loss:
        pixel_weight: 1.0
        perceptual_weight: 1.0
        tv_weight: 5e-4
    optim:
        lr: 4e-4
        weight_decay: 0.05
        beta1: 0.9
        beta2: 0.95
        clip_grad_norm: 1.0
    scheduler:
        type: cosine
        warmup_real_iters: 3000
    batch_size: 16
    accum_steps: 1
    epochs: 100  # MODIFIED : 60 -> 100
    debug_global_steps: null

val:
    batch_size: 4
    global_step_period: 100  # MODIFIED : 1000 -> 100
    debug_batches: null
... and successfully generated a checkpoint as follows :
[TRAIN STEP]loss=0.642, loss_pixel=0.0695, loss_perceptual=0.572, loss_tv=0.7, lr=1.35e-5: 100%|███████████████████████████████████████████████| 100/100 [03:24<00:00, 5.10s/it]
But the loss value seems too high. What should I modify to decrease it?
The loss generally goes down as the dataset gets larger. You could also try increasing the number of epochs and see whether that helps.
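One more thing that can be read off the log above: the reported lr=1.35e-5 is far below the configured lr of 4e-4. A minimal sketch, assuming the scheduler warms up linearly from zero over warmup_real_iters steps (an assumption; the actual OpenLRM scheduler is not shown in this thread, and both function names are hypothetical):

```python
def warmup_lr(step, peak_lr, warmup_iters):
    """Learning rate under an assumed linear warmup from 0 to peak_lr."""
    return peak_lr * min(step, warmup_iters) / warmup_iters


def step_for_lr(lr, peak_lr, warmup_iters):
    """Invert the assumed linear warmup: which step gives this lr?"""
    return lr / peak_lr * warmup_iters


# Values from the config and log in this thread.
peak_lr = 4e-4
warmup_iters = 3000
logged_lr = 1.35e-5

# Works out to roughly step 101, i.e. about 3% of the way through warmup.
print(step_for_lr(logged_lr, peak_lr, warmup_iters))
print(warmup_lr(100, peak_lr, warmup_iters))
```

If that assumption holds, the run ended after about 100 steps while still early in the 3000-step warmup, so the learning rate never got close to its peak. That fits the suggestion to train for more steps, either with more data or more epochs.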
Thank you for the kind reply @kunalkathare !
Really great help from you, many thanks for your assistance.
summary
reproduction of the error
installation of OpenLRM was successful
data preparation using blender_script.py was successful, generated 100 pairs of data each containing rgba, pose, intrinsics.npy
configuration of training_sample.yaml and accelerate_training.yaml as follows :
the error message :