Conversation

JenksJ
Contributor

@JenksJ JenksJ commented Oct 8, 2025

Mounting a CIFS share whose source path contains white space causes the Slurm NHC (Node Health Check) to fail. This resulted in all the nodes being put into drain.
The Slurm warning output was something like:

NHC: check_fs_mount:
/mnt/gen_epi mounted from //10.10.10.10/3.040Projects/sequencing040list
(should match //10.10.10.10/3. Projects/sequencing list)

The Ansible mount entry in /etc/fstab writes each white space as the octal escape \040 (the ASCII code for a space), including the backslash.
/etc/fstab (and the CIFS kernel client) do not accept a pre-escaped \040 in the source path, so replace(' ', '\\040') was added to ansible/roles/nhc/templates/nhc.conf.j2 as a quick fix.
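
For illustration, the string transformation performed by replace(' ', '\\040') can be sketched in Python, on the assumption that the Jinja2 replace filter behaves like str.replace for plain strings. The function name fstab_escape_spaces is hypothetical and is not part of the role; this shows only the escaping step, not the template itself or how NHC compares the values.

```python
def fstab_escape_spaces(source: str) -> str:
    # Hypothetical helper mirroring the template's replace(' ', '\\040'):
    # each space becomes the four literal characters backslash, 0, 4, 0.
    return source.replace(" ", "\\040")


# Source path taken from the NHC message above.
configured = "//10.10.10.10/3. Projects/sequencing list"
print(fstab_escape_spaces(configured))
# prints //10.10.10.10/3.\040Projects/sequencing\040list
```

A path without spaces passes through unchanged, so applying the filter to every mount source should be harmless for existing mounts.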

@sjpb
Collaborator

sjpb commented Oct 10, 2025

@JenksJ is this still really a draft, or do you want us to review?

@sjpb
Collaborator

sjpb commented Oct 10, 2025

@sjpb sjpb changed the base branch from main to fix/NHC_fails_mount_with_space October 14, 2025 10:12
@sjpb sjpb marked this pull request as ready for review October 14, 2025 10:13
@sjpb sjpb requested a review from a team as a code owner October 14, 2025 10:13
@sjpb sjpb merged commit edad01a into stackhpc:fix/NHC_fails_mount_with_space Oct 14, 2025
4 of 6 checks passed
sjpb added a commit that referenced this pull request Oct 17, 2025