[Flaky] When main path ISL is UP, and ISL of protected path becomes active, and other non-involved ISLs have not enough bandwidth, the flow does not become UP, it stays degraded with “protected-path”: “Down”
#5655
Open
izadorozhna opened this issue
May 7, 2024
· 0 comments
Steps to reproduce with the automated test:
Execute the test so that it runs with switches 7 and 8.
If the test passes, repeat step 3.
The issue is reproduced when the test fails at the step where the main ISL is restored: the flow is expected to be UP, but it is Degraded.
Steps to reproduce manually:
Select switches 7 and 8 and create a flow with a protected path. Usually, such a flow has a path of size 2 (7<-->8) for both the main and protected paths.
Select all ISLs that are not part of the flow's main or protected path and decrease their bandwidth to the minimum.
Break the ISL(s) of the main path so that the paths swap: the original protected path becomes the main path, and the original main path, with its broken ISL, becomes the protected path, which is down.
Check that the flow now has Degraded status because a protected path cannot be found (the original main path ISL is broken and cannot serve as the new protected path, and the other, non-involved ISLs do not have enough bandwidth).
Restore the original main path ISL that was broken earlier.
Check that the flow becomes active, with both the main and protected paths UP.
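Because the failure is intermittent, it helps to tell "slow to recover" apart from "stuck in Degraded" when performing the last check above. The following is a minimal polling sketch, not part of the Kilda test suite: `wait_for_flow_status` and its `get_status` callback are hypothetical names, and the status strings are assumed to match whatever the flow status source returns.

```python
import time
from typing import Callable, List, Tuple


def wait_for_flow_status(
    get_status: Callable[[], str],
    expected: str = "Up",
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List[str]]:
    """Poll get_status until it returns `expected` or the timeout expires.

    get_status is a hypothetical callback supplied by the caller; it is
    assumed to return the current flow status as a string.
    Returns (reached, observed), where observed lists every status seen.
    """
    observed: List[str] = []
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        observed.append(status)
        if status == expected:
            return True, observed
        if time.monotonic() >= deadline:
            return False, observed
        sleep(interval_s)


# Example with a fake status source (no Kilda involved).
_fake_statuses = iter(["Degraded", "Degraded", "Up"])
reached, seen = wait_for_flow_status(
    lambda: next(_fake_statuses), sleep=lambda _seconds: None
)
```

Recording every observed status makes a failed run easier to report: a list that contains only Degraded values up to the timeout matches the behaviour described in this issue.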
Test used for the automated reproduction:
"Flow swaps to protected path when main path gets broken, becomes DEGRADED if protected path is unable to reroute(no bw)"
Expected result:
The flow becomes UP.
When checking the history, there should be a reroute action after the ISL becomes Active. Because the protected path already exists (it was down only because of the broken ISL, and that ISL is now up), the same protected path is found, so Kilda skips creating a new protected path.
Actual result:
When the same test is executed several times (with the same switch pair 7-8), the result is not consistent. Sometimes the expected result is received. But sometimes, after the main ISL is restored, the flow stays in the Degraded state with “protected-path”: “Down”.
However, the history does have the reroute action after the ISL became active.
But for some reason, this time it does not have the "Found the same protected path. Skipped creating of it" message; it has "Couldn't find non overlapping protected path. Skipped creating it" instead.
Also, when I do a manual explicit reroute via the Northbound V2 API, it helps: the flow is rerouted and becomes UP. The flow history now has a new reroute action started via Northbound. However, the API response to the reroute action has rerouted: false for some reason.
Attaching the flow history JSON, which includes the manual explicit reroute action as well:
07May180118_375_cinnamon9255.json
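To compare a passing run with a failing one, the two history messages quoted above can be searched for in a history dump. The sketch below deliberately makes no assumption about the schema of the attached JSON: it walks every string value in the parsed document. The function names are hypothetical helpers, not Kilda code. Note that a dump may contain more than one reroute, so the result is an ordered list rather than a single verdict.

```python
from typing import Any, Iterator, List

# Message texts copied verbatim from the flow history described in this issue.
SAME_PATH_MSG = "Found the same protected path. Skipped creating of it"
NO_PATH_MSG = "Couldn't find non overlapping protected path. Skipped creating it"


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value in a parsed JSON document, in document order."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def protected_path_outcomes(history: Any) -> List[str]:
    """Return a "same-path" or "no-path" label per matching message, in order."""
    outcomes: List[str] = []
    for text in iter_strings(history):
        if SAME_PATH_MSG in text:
            outcomes.append("same-path")
        elif NO_PATH_MSG in text:
            outcomes.append("no-path")
    return outcomes
```

A possible use is `protected_path_outcomes(json.load(f))` on the opened attachment (with `import json`). If the last label before the Northbound-initiated reroute is "no-path", the run matches the failure described here.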
Attaching topology.yaml:
topology.yaml.log
P.S. Please note that the test case is flaky, so the steps may need to be repeated several times to reproduce the issue.
It is also important to note that there is a separate, similar test, "Flow swaps to protected path when main path gets broken, becomes DEGRADED if protected path is unable to reroute(no bw)", which has similar steps, except that the other ISLs (those not involved in the main or protected paths) are broken instead of having their bandwidth decreased. That test also fails sometimes with switch pair 7-8.