Description
Bug description
I got the following error:
  File "c:\Users\sean\miniconda3\envs\keras+torch+pl\Lib\site-packages\lightning\pytorch\loops\training_epoch_loop.py", line 459, in _update_learning_rates
    raise MisconfigurationException(
lightning.fabric.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau conditioned on metric val/loss which is not available. Available metrics are: ['lr-AdamW/pg1', 'lr-AdamW/pg2', 'train/a_pcc', 'train/loss']. Condition can be set using `monitor` key in lr scheduler dict
Here is the configure_optimizers function:
@final
def configure_optimizers(self):
    decay, no_decay = [], []
    for name, param in self.named_parameters():
        if not param.requires_grad:
            continue
        if "bias" in name or "Norm" in name:
            no_decay.append(param)
        else:
            decay.append(param)
    grouped_params = [
        {"params": decay, "weight_decay": self.weight_decay, "lr": self.lr * 0.3},
        {
            "params": no_decay,
            "weight_decay": self.weight_decay,
            "lr": self.lr * 1.7,
        },
    ]
    optimizer = self.optmizer_class(
        grouped_params, lr=self.lr, weight_decay=self.weight_decay
    )
    scheduler = self.lr_scheduler_class(
        optimizer, **self.lr_scheduler_args if self.lr_scheduler_args else {}
    )
    scheduler = {
        "scheduler": self.lr_scheduler_class(
            optimizer, **self.lr_scheduler_args if self.lr_scheduler_args else {}
        ),
        "monitor": "val/loss",
        "interval": "epoch",
        "frequency": 1,
        # "strict": False,
    }
    return {"optimizer": optimizer, "lr_scheduler": scheduler}
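One thing worth checking, offered as an assumption and not as a confirmed cause: if validation does not run after every training epoch (for example, the Trainer is created with `check_val_every_n_epoch` greater than 1), a scheduler with `"frequency": 1` will step after epochs in which `val/loss` was never logged. The sketch below uses a hypothetical helper, `build_scheduler_config`, to tie the scheduler's `frequency` to the validation cadence. The helper name and the value 2 are illustrative only; the dict keys are the ones already used in the function above.

```python
def build_scheduler_config(scheduler, monitor, check_val_every_n_epoch=1, strict=True):
    """Build an lr_scheduler dict whose step frequency matches validation.

    Hypothetical helper for illustration; it is not part of Lightning.
    """
    if check_val_every_n_epoch < 1:
        raise ValueError("check_val_every_n_epoch must be at least 1")
    return {
        "scheduler": scheduler,
        "monitor": monitor,
        "interval": "epoch",
        # Step only after epochs in which validation has run, so the
        # monitored validation metric exists when the scheduler needs it.
        "frequency": check_val_every_n_epoch,
        "strict": strict,
    }


# A plain object stands in for the real ReduceLROnPlateau instance here.
placeholder_scheduler = object()
config = build_scheduler_config(
    placeholder_scheduler, monitor="val/loss", check_val_every_n_epoch=2
)
print(config["frequency"], config["monitor"])
```

In `configure_optimizers`, the returned dict would take the place of the hand-written `scheduler` dict, with the real scheduler instance passed in.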
The lr_scheduler_class is passed in as
lr_scheduler_class: torch.optim.lr_scheduler.ReduceLROnPlateau
lr_scheduler_args:
  mode: min
  factor: 0.5
  patience: 10
  threshold: 0.0001
  threshold_mode: rel
  cooldown: 5
  min_lr: 1.e-9
  eps: 1.e-08
(These are set through YAML and the CLI, which I do not think is the cause here.)
The error appears to be raised at the end of the training epoch, since the progress bar reports only train/loss at that point. The scheduler is called before the validation epoch has finished.
I am fairly sure that val/loss is available once the validation epoch finishes, because the progress bar displays it correctly.
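The check behind this error can be approximated with a small stand-alone sketch. This is a simplified model and not Lightning's actual implementation: `check_monitor` is a hypothetical helper, and the metric names are copied from the error message above. It illustrates that the check fails whenever `val/loss` has not been logged by the time the scheduler steps, and that the `strict` key (commented out in the scheduler dict) decides whether a missing metric raises or is only reported as a warning.

```python
import warnings


def check_monitor(monitor, available_metrics, strict=True):
    """Return True if the scheduler can step on `monitor`.

    Hypothetical helper: a simplified model of the availability check,
    not Lightning's real code.
    """
    if monitor in available_metrics:
        return True
    message = (
        f"ReduceLROnPlateau conditioned on metric {monitor} which is not "
        f"available. Available metrics are: {sorted(available_metrics)}."
    )
    if strict:
        # Mirrors the reported behaviour: training stops with an exception.
        raise RuntimeError(message)
    # With strict disabled, the step is skipped and a warning is emitted.
    warnings.warn(message)
    return False


# Metrics listed in the error message: only training-side values exist.
after_train_epoch = {"lr-AdamW/pg1", "lr-AdamW/pg2", "train/a_pcc", "train/loss"}
# Assumed state once a validation epoch has logged its loss.
after_validation = after_train_epoch | {"val/loss"}

print(check_monitor("val/loss", after_validation))
```

Calling `check_monitor("val/loss", after_train_epoch)` raises, which matches the situation described above, while passing `strict=False` returns `False` with a warning.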
What version are you seeing the problem on?
v2.5
Reproduced in studio
No response
How to reproduce the bug
Error messages and logs
# Error messages and logs here please
Environment
StatusCode : 200
StatusDescription : OK
Content : # Copyright The Lightning AI team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the...
RawContent : HTTP/1.1 200 OK
Connection: keep-alive
Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
...
Forms : {}
Headers : {[Connection, keep-alive], [Content-Security-Policy, default-src 'none'; style-src 'unsafe-inline'; sandbox], [Strict-Transport-Security, max-age=31536000],
[X-Content-Type-Options, nosniff]...}
Images : {}
InputFields : {}
Links : {}
ParsedHtml : mshtml.HTMLDocumentClass
RawContentLength : 2775
More info
No response