The switch validation returns different results when the switch is validated several times after the flow endpoints swap failed and reverted #5636

Open
izadorozhna opened this issue Apr 15, 2024 · 0 comments


This happens as part of issue #5635 and has the same steps to reproduce.

The issue was found through the failed test "Unable to swap endpoints for two flows when one of them is inactive"; please see its code for additional details.

Steps to reproduce:

Repeating the steps of "Unable to swap endpoints for two flows when one of them is inactive":

  1. Select 2 pairs of neighboring switches with the same destination switch (e.g. (Switch_1, Switch_2) and (Switch_3, Switch_2)).
  2. Create 2 flows with the same dst switch (e.g. Flow1 with Switch_1 -> Switch_2, and Flow2 with Switch_3 -> Switch_2).
  3. Break all ISLs of the Flow1 source switch (in our example it is Switch_1).
  4. Try to swap endpoints for these 2 flows.
  5. Catch the expected HttpServerErrorException: the flows cannot be swapped because the Flow1 source switch has its links down.
  6. Validate the involved switches.

Expected result:

If one of the switches (Switch_3 in our case) has discrepancies, they should be shown in the response of the switch validation operation. If the user repeats the switch validation and the system has not changed, the result should be the same.

Actual result:

See the actual result in #5635 for what happens to the flows and their rules. In the end, Switch_3 has a missing rule.
However, when the user validates Switch_3 several times, the response sometimes shows the missing rule 4611686018427471716. This is valid: Flow1 has the rule 0x4000000000014764 (decimal 4611686018427471716) only on Switch_2, so it is missing on Switch_3:

"missing": [4611686018427471716],
"misconfigured": [4611686018427471716],

At other times, the switch validation response shows the missing rule 4665729213955917668 (hex 0x40C0000000014764):

"missing": [4665729213955917668],
"misconfigured": [4665729213955917668],
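To make the relation between the two reported cookies easier to see, here is a minimal Python sketch that converts both values to hex and XORs them. It only does arithmetic on the numbers quoted above; the constant and function names are illustrative, and the sketch makes no claim about what the differing bits mean in the cookie scheme.

```python
# Cookie values copied from the two validation responses above.
KNOWN_COOKIE = 4611686018427471716    # reported as 0x4000000000014764
UNKNOWN_COOKIE = 4665729213955917668  # reported as 0x40C0000000014764


def to_hex(cookie: int) -> str:
    """Render a 64-bit cookie as a zero-padded hex string."""
    return f"0x{cookie:016X}"


# XOR leaves only the bits in which the two cookies differ.
differing_bits = KNOWN_COOKIE ^ UNKNOWN_COOKIE

# The low 32 bits are identical in both cookies.
same_low_bits = (KNOWN_COOKIE & 0xFFFFFFFF) == (UNKNOWN_COOKIE & 0xFFFFFFFF)

print(to_hex(KNOWN_COOKIE))
print(to_hex(UNKNOWN_COOKIE))
print(to_hex(differing_bits))
print(same_low_bits)
```

The two cookies share the same low bits (0x14764) and differ only in the high part of the value, so the unknown cookie looks like a variant of the known one rather than an unrelated rule.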

We are not sure what this rule is: it is absent on all switches, and the Kibana logs mention it only 3 times, all in the context of switch validation:

On switch 00:00:00:00:00:00:00:03 rule 4665729213955917669 is misconfigured. Actual: RuleInfoEntryV2(cookie=4665729213955917669, cookieHex=null, cookieKind=null, tableId=2, priority=24576, flowId=null, flowPathId=null, flags=[RESET_COUNTERS], match={IN_PORT=RuleInfoEntryV2.FieldMatch(value=13, mask=null), METADATA=RuleInfoEntryV2.FieldMatch(value=269084536, mask=4294967288)}, instructions=RuleInfoEntryV2.Instructions(goToTable=null, goToMeter=null, writeMetadata=null, applyActions=[SetFieldActionEntry(value=3, field=ETH_SRC), SetFieldActionEntry(value=2, field=ETH_DST), PushVlanActionEntry(), SetFieldActionEntry(value=214, field=VLAN_VID), PortOutActionEntry(portNumber=2, portType=null)], writeActions=[]), yFlowId=null) : expected : RuleInfoEntryV2(cookie=4665729213955917668, cookieHex=null, cookieKind=null, tableId=2, priority=24576, flowId=null, flowPathId=null, flags=[RESET_COUNTERS], match={IN_PORT=RuleInfoEntryV2.FieldMatch(value=13, mask=null), METADATA=RuleInfoEntryV2.FieldMatch(value=269084536, mask=4294967288)}, instructions=RuleInfoEntryV2.Instructions(goToTable=null, goToMeter=null, writeMetadata=null, applyActions=[SetFieldActionEntry(value=3, field=ETH_SRC), SetFieldActionEntry(value=2, field=ETH_DST), PushVlanActionEntry(), SetFieldActionEntry(value=213, field=VLAN_VID), PortOutActionEntry(portNumber=2, portType=null)], writeActions=[]), yFlowId=null)
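Because the log entry is long, a field-by-field comparison helps to spot where the actual and expected rules diverge. The Python sketch below copies a subset of the values from the log line into two dictionaries and lists the fields that differ. The dictionary keys are simplified names chosen for illustration, not the real RuleInfoEntryV2 field names.

```python
# Subset of the values from the log entry above; key names are illustrative.
actual = {
    "cookie": 4665729213955917669,
    "table_id": 2,
    "priority": 24576,
    "in_port": 13,
    "metadata_value": 269084536,
    "vlan_vid": 214,
    "out_port": 2,
}
expected = {
    "cookie": 4665729213955917668,
    "table_id": 2,
    "priority": 24576,
    "in_port": 13,
    "metadata_value": 269084536,
    "vlan_vid": 213,
    "out_port": 2,
}


def diff_fields(left: dict, right: dict) -> dict:
    """Return {field: (left_value, right_value)} for every field that differs."""
    keys = sorted(left.keys() | right.keys())
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


differences = diff_fields(actual, expected)
for field, (got, want) in differences.items():
    print(f"{field}: actual={got} expected={want}")
```

For these values, only the cookie (which differs by 1) and the VLAN_VID set action (214 versus 213) differ; the remaining compared fields match.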

Attaching my investigation of a real case, with its IDs, timestamps, and logs.
Investigation_with_ids_and_time.pdf

When the test is executed several times:

In other executions, the switch validation sometimes returns missing rules and sometimes returns empty lists of missing and misconfigured rules:

"missing": [4665729213955917667, 4611686018427471715],
"misconfigured": [4665729213955917667, 4611686018427471715],

Validate the same switch again:

"missing": [],
"misconfigured": [],

Validate a few more times and the 2 missing rules appear again; validate again and the lists are empty again.
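The flapping described above can be detected automatically by validating the same switch several times and counting the distinct responses. The Python sketch below is only an illustration under stated assumptions: `validate` is a hypothetical callable standing in for the real switch validation request, and `fake_validate` replays the two responses quoted in this issue instead of calling a real system.

```python
from collections import Counter
from itertools import cycle


def distinct_validation_results(validate, attempts=10):
    """Call validate() repeatedly and count each distinct (missing, misconfigured) pair."""
    seen = Counter()
    for _ in range(attempts):
        result = validate()
        key = (
            tuple(sorted(result["missing"])),
            tuple(sorted(result["misconfigured"])),
        )
        seen[key] += 1
    return seen


# Hypothetical stand-in for the real request: alternates between the two
# responses observed in this issue.
_responses = cycle([
    {"missing": [4665729213955917667, 4611686018427471715],
     "misconfigured": [4665729213955917667, 4611686018427471715]},
    {"missing": [], "misconfigured": []},
])


def fake_validate():
    return next(_responses)


results = distinct_validation_results(fake_validate, attempts=6)
is_stable = len(results) == 1  # expected behaviour: exactly one distinct result

print(f"distinct results: {len(results)}, stable: {is_stable}")
```

With an unchanged system, the expectation from this issue is a single distinct result; more than one indicates the inconsistency reported here.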
