OCPEDGE-2188: embed fencing validator into TNF MCO #5285
base: main
Signed-off-by: nhamza <[email protected]>
@Neilhamza: This pull request references OCPEDGE-2188 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Neilhamza The full list of commands accepted by this bot can be found here.
Looking good, I had some suggestions and questions. This is my initial pass, I'll give it another review once I deploy and test it on a cluster.
templates/master/00-master/two-node-with-fencing/files/fencing-validator.yaml
fi

for ip in "$IP_A" "$IP_B"; do
  awk -F'|' -v ip="$ip" '
If we have the output as json, we should be able to check if both IPs exist with jq here. wdyt?
Same concern as above; what are your thoughts?
Signed-off-by: nhamza <[email protected]>
Just had some more minor suggestions.
Signed-off-by: nhamza <[email protected]>
@Neilhamza: The following tests failed, say
Looking good, had some small suggestions
get_internal_ip() {
  local node="$1"
  oc_run get node "$node" -o json |
    jq -r '
We can simplify this a bit since we just want the first internal IP of the node, or else either empty or add ? to return the status code.
jq -re '[.status.addresses[] | select(.type == "InternalIP")][0].address // empty'
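For illustration, the suggested filter could be wrapped as below. This is a minimal sketch, not the PR's final code: `first_internal_ip` is a hypothetical helper introduced here so the filter can be exercised on canned JSON, and the `oc_run` stub simply forwards to `oc` because the real wrapper is not shown in this thread.

```shell
#!/usr/bin/env bash

# Assumption: the real oc_run wrapper is not shown here; this stub just calls oc.
oc_run() { oc "$@"; }

# Hypothetical helper: reads a Node object (JSON) on stdin and prints the
# first InternalIP address. With -e, jq exits non-zero when nothing is printed.
first_internal_ip() {
  jq -re '[.status.addresses[] | select(.type == "InternalIP")][0].address // empty'
}

get_internal_ip() {
  local node="$1"
  oc_run get node "$node" -o json | first_internal_ip
}
```

With this shape a caller can rely on the exit status alone, for example `ip="$(get_internal_ip "$node")" || exit 1`, instead of checking for an empty string.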
EXIT_FENCING_SECRETS_MISMATCH=26
EXIT_DAEMONS_BAD=22
EXIT_ETCD_NOT_READY=23
EXIT_ETCD_FATAL=24
This seems to be unused
}

pcmk_online() {
  local want="$1" s="${1%%.*}" names
Let's avoid single-letter variables; change s to a word that describes its purpose.
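A rough sketch of what the rename could look like. `short_hostname` is the helper name used elsewhere in this script; its body here is assumed from the `${1%%.*}` expansion in the diff. `node_in_online_list` and its space-separated list argument are hypothetical, since the body of `pcmk_online` is not visible in this thread.

```shell
#!/usr/bin/env bash

# Strip the domain part: master-0.example.com -> master-0
short_hostname() {
  printf '%s\n' "${1%%.*}"
}

# Hypothetical stand-in for the matching step of pcmk_online: succeeds when
# the wanted node (FQDN or short name) appears in a space-separated list of
# online node names.
node_in_online_list() {
  local wanted_node="$1" online_names="$2" wanted_short name
  wanted_short="$(short_hostname "$wanted_node")"
  for name in $online_names; do
    if [[ "$name" == "$wanted_node" || "$name" == "$wanted_short" ]]; then
      return 0
    fi
  done
  return 1
}
```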
}

wait_not_ready() {
  local n="$1" deadline=$((SECONDS + TIMEOUT))
Same here: rename n to something descriptive.
out="$(host_run "$tgt" \
  "podman exec etcd sh -lc 'ETCDCTL_API=3 etcdctl -w json member list'")" &&
  jq -e --arg ipa "$IP_A" --arg ipb "$IP_B" '
    .members as $m
Since we're explicit about etcd version 3 API, we should be able to simplify this a bit to just
.members | map(select(.isLearner | not)) | any(.clientURLs[] | contains($ipa)) and any(.clientURLs[] | contains($ipb))
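To make the suggestion concrete, here is a sketch of the check as a standalone function that reads the `etcdctl -w json member list` output on stdin. The function name `both_ips_are_voters` is hypothetical, and the `?` after `.clientURLs[]` is an addition not in the suggestion above; it skips members that have no client URLs yet instead of raising an error.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if both IPs appear in the client URLs
# of non-learner (voting) etcd members. Expects member-list JSON on stdin.
both_ips_are_voters() {
  local ip_a="$1" ip_b="$2"
  jq -e --arg ipa "$ip_a" --arg ipb "$ip_b" '
    .members
    | map(select(.isLearner | not))
    | any(.clientURLs[]? | contains($ipa)) and any(.clientURLs[]? | contains($ipb))
  ' >/dev/null
}
```

One caveat: `contains` is a substring match on strings, so `192.168.111.2` would also match a URL for `192.168.111.20`. Matching on the host part with its delimiters would be stricter if that matters here.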
wait_ready() {
  local n="$1" deadline=$((SECONDS + TIMEOUT))
  log "Waiting for '$n' Ready (API)…"
Same here: not only the variable, but the log messages too should be more descriptive about what we are waiting on to become ready.
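Something along these lines, as a sketch only. `node_is_ready`, `POLL_INTERVAL`, the `log` format, and the default `TIMEOUT` value are assumptions for illustration; the actual readiness probe used by the script is not shown in this thread.

```shell
#!/usr/bin/env bash

TIMEOUT="${TIMEOUT:-300}"            # assumed default, not taken from the PR
POLL_INTERVAL="${POLL_INTERVAL:-5}"  # hypothetical knob

log() { printf '[INFO] %s\n' "$*"; }

# Hypothetical probe: succeeds when the node's Ready condition is "True".
node_is_ready() {
  local node_name="$1" ready_status
  ready_status="$(oc get node "$node_name" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)"
  [[ "$ready_status" == "True" ]]
}

wait_ready() {
  local node_name="$1" deadline=$((SECONDS + TIMEOUT))
  log "Waiting up to ${TIMEOUT}s for node '$node_name' to report Ready=True via the API"
  while (( SECONDS < deadline )); do
    if node_is_ready "$node_name"; then
      log "Node '$node_name' is Ready"
      return 0
    fi
    sleep "$POLL_INTERVAL"
  done
  log "Timed out waiting for node '$node_name' to become Ready"
  return 1
}
```

The log line now says what is being waited on (a node), which condition (Ready=True), through which channel (the API), and for how long.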
log_ok() {
  printf '\033[32m[OK]\033[0m %s\n' "$*"
}
lol small nit, this space is bothering me, it's the only cluster of functions that is not spread by one space, let's keep it uniform and add a single space between all these helper functions
  [[ "$1" == *:* ]]
}
fmt_host() {
  local h="$1"
Let's clarify this a bit: make fmt_host more descriptive, and change h to something more descriptive as well. Since this formats the host IP or URL so that IPv6 addresses are safely wrapped, something like fmt_host_ip should be clear enough.
local node="$1" ns="openshift-etcd" short_node
short_node="$(short_hostname "$node")"
oc_run -n "$ns" get secret -o json 2>/dev/null |
  jq -e --arg node "$node" --arg short "$short_node" '
Let's simplify this a bit. If we know that fencing-credentials will be a prefix, we can make this more legible with the query below. Since the short node name will always be present after the prefix, we can just look for fencing-credentials-$short.*. This avoids mistakenly using secrets that might just be called the short hostname.
'.items[] | select(.metadata.name | test("fencing-credentials-\($short).*")) | .metadata.name?'
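One detail worth spelling out: jq does not expand `$short` inside a string literal, so the pattern has to use `\($short)` interpolation. A minimal sketch, assuming the secret list JSON arrives on stdin; the function name `fencing_secret_names` is hypothetical, and the `^` anchor is an addition so the prefix must start the name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the names of secrets that start with
# fencing-credentials-<short node name>. Expects a Secret list (JSON) on stdin.
fencing_secret_names() {
  local short_node="$1"
  jq -r --arg short "$short_node" '
    .items[]
    | select(.metadata.name | test("^fencing-credentials-\($short)"))
    | .metadata.name
  '
}
```

Usage would be roughly `oc_run -n openshift-etcd get secret -o json | fencing_secret_names "$short_node"`. Note that a prefix match on `master-1` also matches `master-10`; that is not a concern with two nodes but is worth knowing.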
What I did
Added a new MachineConfig template file under templates/master/00-master/two-node-with-fencing/files/ that installs the fencing_validator.sh script to /usr/local/bin/ on control-plane nodes for Two-Node Fencing clusters.
How to verify it
Deploy a Two-Node Fencing cluster.
Verify the MachineConfig for masters includes the new file.
On a master node, run:
oc debug node/ -- chroot /host ls -l /usr/local/bin/fencing_validator
oc debug node/ -- chroot /host /usr/local/bin/fencing_validator --help
Copy it to the hypervisor:
oc debug node/ -- chroot /host cat /usr/local/bin/fencing_validator > fencing_validator
chmod +x fencing_validator
The script should be present, executable (0755), and runnable.
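The presence and mode checks could also be scripted. This is a sketch under assumptions: `check_installed_script` is a hypothetical helper, it relies on GNU `stat -c`, and it checks a local path, so it would need to run on the node itself (for example inside the `chroot /host` shell) or against the copy pulled to the hypervisor.

```shell
#!/usr/bin/env bash

# Hypothetical helper: verify that a script exists and has mode 0755.
check_installed_script() {
  local script_path="$1" mode
  if [[ ! -f "$script_path" ]]; then
    echo "missing: $script_path" >&2
    return 1
  fi
  mode="$(stat -c '%a' "$script_path")" || return 1
  if [[ "$mode" != "755" ]]; then
    echo "unexpected mode $mode on $script_path (want 755)" >&2
    return 1
  fi
  echo "ok: $script_path is present with mode 755"
}
```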
Ship /usr/local/bin/fencing_validator.sh via MCO for Two-Node Fencing clusters.