hddtemp_smartctl: configure warning and critical temps per device #1560

Open
wants to merge 1 commit into base: master

ap-wtioit
Contributor

This is useful if you have a mix of drives in the system (e.g. HDD + SSD) that have different temperature limits.

Info @wt-io-it

Contributor

yunal left a comment

I'd like to see the $cfn change please.

if (defined($ENV{clean_fieldname($_) . ".critical"})) {
    $critical = $ENV{clean_fieldname($_) . ".critical"};
}
print clean_fieldname($_) . ".critical $critical\n";
my $id = get_drive_id($_, device_for_drive($_), $use_nocheck);
Contributor

Hi, thanks. This is good.

Could you please add my $cfn = clean_fieldname($_); at your line 237 and use $cfn in place of clean_fieldname($_) in the rest of the patch?

I'm not sure that we should introduce default temperature warning/critical levels here. The temperatures you chose are sort of sane but too narrow compared to the operating temperature of some of my disks. The first of my disks I checked has an "operating" envelope from 0 to 65 and a non-operating one from -40 to 70. I don't know if other disks are less or more temperature tolerant.

Having an env.warning and env.critical to use as defaults is entirely sane.

Munin::Plugin has an API to support this: print_thresholds, but there is no need to use it.
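
A minimal sketch of how the two suggestions above could fit together: cache the cleaned field name in $cfn once per drive, and resolve each threshold from a per-device setting, then a plugin-wide env.warning / env.critical default, then a built-in fallback. The helper names threshold_for, threshold_lines and clean_fieldname_stub are hypothetical and not part of the patch; the stub only stands in for Munin::Plugin's clean_fieldname so the example runs without Munin installed. The lookup order is an assumption, and the fallbacks 57 and 60 are the values mentioned in this conversation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Munin::Plugin::clean_fieldname so that this
# sketch runs on its own; the real function applies additional rules.
sub clean_fieldname_stub {
    my ($name) = @_;
    $name =~ s/[^A-Za-z0-9_]/_/g;
    return $name;
}

# Resolve one threshold for a drive (assumed lookup order):
#   1. per-device variable, e.g. "sda.critical"
#   2. plugin-wide default, e.g. "critical"
#   3. built-in fallback passed in by the caller
sub threshold_for {
    my ($cfn, $kind, $fallback) = @_;
    return $ENV{"$cfn.$kind"} if defined $ENV{"$cfn.$kind"};
    return $ENV{$kind}        if defined $ENV{$kind};
    return $fallback;
}

# Build the "<field>.warning" and "<field>.critical" config lines.
sub threshold_lines {
    my @drives = @_;
    my @lines;
    for my $drive (@drives) {
        # Compute the cleaned field name once and reuse it.
        my $cfn = clean_fieldname_stub($drive);
        push @lines, "$cfn.warning "  . threshold_for($cfn, 'warning',  57);
        push @lines, "$cfn.critical " . threshold_for($cfn, 'critical', 60);
    }
    return @lines;
}

# Example: raise only the critical limit of one (hypothetical) NVMe drive.
{
    local $ENV{'nvme0n1.critical'} = 80;
    print "$_\n" for threshold_lines('sda', 'nvme0n1');
}
```

With this shape, a drive without its own setting keeps the existing behaviour, so only the drives that need different limits have to be configured.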

Contributor Author

I will update the PR later today with the $cfn suggestion. I will also add documentation on how to get the warning and critical temperatures with nvme-cli (nvme).

The warning 57 and critical 60 values were not chosen by me but are already present in munin. I just kept them for backwards compatibility.
We mainly use this to fix the warning and critical values for SSDs (which can be checked with sudo nvme id-ctrl -H /dev/nvme0), as they often go past the 60°C that munin currently uses as the critical temperature (which I guess was chosen for spinning disks).

Contributor Author

Updated to use the $d variable already present in the current munin master (I missed that because I started the patch from the munin version in Debian).
