
Pods networking is broken after openvswitch is restarted #47

Closed
bbl opened this issue Oct 7, 2019 · 9 comments

Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

bbl commented Oct 7, 2019

Description

After the OVS pod is restarted, all pods on the corresponding node come up with broken networking. The gateway is not reachable, so no egress connections are possible.
Running ovs-ofctl -O OpenFlow13 dump-ports-desc br0 inside the OVS pod shows that the old vethXXX interfaces are no longer attached to br0, even though they are still present on the host.
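For concreteness, the comparison can be made roughly like this (a sketch only; the openshift-sdn namespace and app=ovs label match a default 3.11 deployment, and the node name is a placeholder):

```sh
# Placeholder -- substitute the affected compute node.
NODE=compute-1.example.com

# Find the OVS pod scheduled on that node (column 7 of `-o wide` is NODE).
OVS_POD=$(oc -n openshift-sdn get pods -l app=ovs -o wide | awk -v n="$NODE" '$7 == n {print $1}')

# What OVS believes is attached to br0 -- after the restart the old
# vethXXX ports are missing from this output.
oc -n openshift-sdn exec "$OVS_POD" -- ovs-ofctl -O OpenFlow13 dump-ports-desc br0

# ...while on the node itself the veth devices still exist.
ip -o link show | grep veth
```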

Version
  • openshift-ansible (git describe): openshift-ansible-3.11.146-1-22-g37e13e5
  • OVS image: docker.io/openshift/origin-node:v3.11
Steps To Reproduce
  1. Delete/restart the OVS pod on the compute node.
  2. Run ovs-ofctl -O OpenFlow13 dump-ports-desc br0 inside the new OVS pod and verify that the veth interfaces are missing (see the sketch after this list).
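A minimal sketch of those two steps, under the same assumptions as above (default openshift-sdn namespace and labels, placeholder node name):

```sh
NODE=compute-1.example.com   # placeholder for the affected compute node

ovs_pod() {
  # Name of the OVS pod currently scheduled on $NODE.
  oc -n openshift-sdn get pods -l app=ovs -o wide | awk -v n="$NODE" '$7 == n {print $1}'
}

# Step 1: delete the OVS pod; its DaemonSet recreates it on the same node.
oc -n openshift-sdn delete pod "$(ovs_pod)"

# Step 2: once the replacement is Running, the pods' veth ports should be
# listed on br0 -- with this bug they are gone.
oc -n openshift-sdn exec "$(ovs_pod)" -- ovs-ofctl -O OpenFlow13 dump-ports-desc br0
```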
Expected Results

Pod networking keeps working after OVS is restarted; the existing vethXXX interfaces are picked up by OVS again after the restart.

Additional Information
  • Operating system and version: CentOS 7
@danwinship (Contributor)

Restarting OVS should cause the SDN pod to restart, and it should reattach the pods then. Is that not happening?

bbl (Author) commented Oct 7, 2019

Restarting OVS should cause the SDN pod to restart, and it should reattach the pods then. Is that not happening?

The SDN pod is restarted along with OVS, but pod networking is not reattached.
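The restart ordering is visible from the pod list, e.g. (same assumptions and placeholder node name as in the sketches above):

```sh
# Both the SDN and OVS pods on the node show a fresh AGE / bumped RESTARTS,
# yet the veth ports are still missing from br0.
oc -n openshift-sdn get pods -l 'app in (sdn,ovs)' -o wide | grep compute-1.example.com
```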

bbl (Author) commented Nov 4, 2019

@danwinship we've run into this issue again today. Is there a possible fix?

@danwinship (Contributor)

This may be fixed by #58, but it's not clear if/when that will be backported to 3.11. If that is the problem, then you could work around it by stopping the SDN pod before you restart the OVS pod, and then restarting it afterward.
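A sketch of that ordering, assuming the 3.11 defaults (openshift-sdn namespace, app=sdn and app=ovs DaemonSet labels; the node name is a placeholder). Since the DaemonSets recreate deleted pods immediately, this only approximates "stopping" the SDN pod; in practice an extra step may be needed to keep it down while OVS restarts:

```sh
NODE=compute-1.example.com   # placeholder for the affected node

pod_on_node() {
  # Name of the pod with label app=$1 currently scheduled on $NODE.
  oc -n openshift-sdn get pods -l "app=$1" -o wide | awk -v n="$NODE" '$7 == n {print $1}'
}

# Stop the SDN pod first...
oc -n openshift-sdn delete pod "$(pod_on_node sdn)"

# ...then restart the OVS pod...
oc -n openshift-sdn delete pod "$(pod_on_node ovs)"

# ...and watch the DaemonSets bring both back, SDN reattaching after OVS.
oc -n openshift-sdn get pods -o wide | grep "$NODE"
```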

@Conan-Kudo

@danwinship There's a backport PR proposed: openshift/origin#24318

@openshift-bot (Contributor)

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label on Sep 24, 2020
@openshift-bot (Contributor)

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Oct 25, 2020
@openshift-bot (Contributor)

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot (Contributor)

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
