add a cleanup loop for failed vm using configmap #59
Conversation
Force-pushed from 5b390fc to 555458c
@alonSadan please take a look
Force-pushed from cf696c9 to 123427c
pkg/manager/manager.go
Outdated
@@ -49,18 +50,20 @@ type KubeMacPoolManager struct {
	stopSignalChannel chan os.Signal
	podNamespace      string
	podName           string
	waitingTime       int
Please add a docstring explaining why we have this attribute
done
@@ -31,6 +31,7 @@ const (
	RuntimeObjectFinalizerName = "k8s.v1.cni.cncf.io/kubeMacPool"
	networksAnnotation         = "k8s.v1.cni.cncf.io/networks"
	networksStatusAnnotation   = "k8s.v1.cni.cncf.io/networks-status"
	vmWaitConfigMapName        = "kubemacpool-vm-configmap"
what is this CM good for?
This CM saves the allocated MAC addresses that are in a waiting state.
We need it in case of a race condition where our manager goes down while it still has that type of allocation, so that the new leader can continue from the same state.
Sounds dangerous synchronisation-wise, but that's not different from the rest of kubemacpool... :D
@@ -0,0 +1,15 @@
package names
could you introduce this change in a different commit please?
done
Expect(err).ToNot(HaveOccurred())
_, err = NewPoolManager(fakeClient, startPoolRangeEnv, endPoolRangeEnv, false)
Expect(err).ToNot(HaveOccurred())
createPoolManager("02:00:00:00:00:00", "02:FF:FF:FF:FF:FF")
could you please do this change in a separate commit?
done
@@ -333,6 +355,121 @@ func (p *PoolManager) revertAllocationOnVm(vmName string, allocations map[string
	p.releaseMacAddressesFromInterfaceMap(allocations)
}

func (p *PoolManager) getOrCreateVmMacWaitMap() (map[string]string, error) {
Do you think it would make sense to keep these helpers in a separate module? This logic is not trivial, and it would be easier to document there.
I would prefer to document it here, as we are going to have the same functions for pods in the future.
I will add a comment on top of every function.
This commit also extends the test timeout to 20 minutes.
Force-pushed from 55e0310 to 4a7fef0
I don't feel good about the state of kubemacpool. If I give it a +1, it is out of full faith in Sebastian. We have to summarize all the moving pieces of the controller, write them down, and try to come up with a better design. It has become a pile of synchronisation mechanisms and ping-pong with the Kubernetes API.
configMap, err := p.kubeClient.CoreV1().ConfigMaps(names.MANAGER_NAMESPACE).Get(vmWaitConfigMapName, metav1.GetOptions{})
if err != nil {
	// TODO: should we exit here?
Do we need to resolve this?
We can remove this; the next loop iteration will try again.
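The retry-instead-of-exit behaviour described here (tolerate a transient error and let the next iteration of the periodic loop retry) can be sketched with a simplified store abstraction. The `store` type and `errNotFound` sentinel below are invented for illustration; the real code talks to the ConfigMaps API via client-go:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the API's "configmap does not exist" error.
var errNotFound = errors.New("not found")

// store abstracts the configmap backend for this sketch.
type store struct {
	data map[string]string
}

func (s *store) get() (map[string]string, error) {
	if s.data == nil {
		return nil, errNotFound
	}
	return s.data, nil
}

// getOrCreate creates an empty map on "not found" and propagates any
// other error to the caller, which simply returns and lets the next
// loop iteration retry instead of exiting the manager.
func getOrCreate(s *store) (map[string]string, error) {
	m, err := s.get()
	if errors.Is(err, errNotFound) {
		s.data = map[string]string{}
		return s.data, nil
	}
	return m, err
}

func main() {
	s := &store{}
	m, err := getOrCreate(s)
	fmt.Println(len(m), err == nil)
}
```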
Open an issue to track the implementation #62
looks good!
Before this PR we had an issue: if VM creation failed because of a
validation error from the virt-api, we didn't clean up the allocation.
This PR introduces a cleanup loop into the system.
When we hit the create-VM mutating webhook we create the regular
allocation, but we also mark the MAC address in a waiting configmap.
We have a waiting loop that checks whether the object has been removed
from the map; if it has not been removed after 30 seconds, we assume
the object was never saved into etcd (no controller event), so we
release the allocation.
If we do get an event from the controller, we remove the MAC address
from the configmap.
This PR also adds/removes some empty lines in the code, fixes the
import positions, and fixes the output when a functional test fails.