-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Defer the MAC clear on the VF until after cmdDel() returns #186
Conversation
This can help ensure a MAC clean up if cmdDel() fails
@@ -185,6 +185,10 @@ func cmdDel(args *skel.CmdArgs) error { | |||
|
|||
sm := sriov.NewSriovManager() | |||
|
|||
defer func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the issue #126, it mentions that device is not visible in init netns until after the last cmdDel finishes, will this defer function be able to detect VF device in the host namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will try to reset the mac on the VF device in the init netns after ReleaseVF() which is called by cmdDel() has renamed it back to its original name and moved it back to the init netns.
One issue I see is if ReleaseVF() didn't succeed in renaming and moving it back to the init netns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get the netdevice of the VF by conf.Master
conf.VFID
that way you dont rely on its name.
also MAC is restored during release VF. in case of a successful CmdDel (i.e netns was not deleted mid operation)
adding this defer will cause the operation to be performed twice
There is also a new use-case here right ? from what i understand from this PR is that the original MAC address is always saved and restored. Now, incomplete cleanup during CmdDel may occur if any of the operations during |
+1, I think we shall do the best-effort cleanup when |
This has been opened for a while, meanwhile a few improvements were introduced. see #115 and related issues. closing this one, feel free to rebase and update the PR if you feel additional changes are needed. |
This is a proposed partial fix for #126, which causes interfaces to be left in a configured state when the pause container is removed. This poses a problem specifically when bonding and MAC sharing is used, as you have the ability to create duplicate MACs if the interfaces aren't returned to their initial state when released by the CNI. This patch tries to ensure the MAC is eventually returned to its prior value.
This was written by a coworker, and I'm hoping to solicit some opinions. Is this still susceptible to the issue because the device might not always show in the init ns, or should it always be available after
cmdDel()
completes and this defer will always work? If I understand correctly there is still a small race condition, as the VF can be assigned back out from the the pool before the deferred call completes, if containers are spinning right back up.