[EVPN] Missing reference check during deletion of vxlan tunnel object #21237

Open
dgsudharsan opened this issue Dec 20, 2024 · 0 comments

Description

The following commands are executed from the CLI as a teardown sequence:

config vrf del_vrf_vni_map Vrf1
config vxlan map del vtep101032 100 500100
config vxlan evpn_nvo del my-nvo
config vxlan del vtep101032

However, they can reach orchagent in a different order: VRF and VXLAN configuration is processed by different managers, and if orchagent is busy the relative order of the resulting operations can change. For example:

2024-12-19.08:43:38.065836|VXLAN_TUNNEL_TABLE:vtep101032|DEL
2024-12-19.08:43:38.066248|VXLAN_EVPN_NVO_TABLE:my-nvo|DEL
2024-12-19.08:43:38.066985|VXLAN_TUNNEL_MAP_TABLE:vtep101032:map_500100_Vlan100|DEL
2024-12-19.08:43:38.067439|VXLAN_VRF_TABLE:vtep101032:evpn_map_500100_Vrf1|DEL
2024-12-19.08:43:38.067948|VRF_TABLE:Vrf1|SET|NULL:NULL|vni:0

With this ordering, the tunnel is removed first:

2024 Dec 19 08:43:38.065755 r-panther-03 INFO swss#orchagent: :- set: setting attribute 0x10000004 status: SAI_STATUS_SUCCESS
2024 Dec 19 08:43:38.066160 r-panther-03 NOTICE swss#orchagent: :- delOperation: Vxlan tunnel 'vtep101032' was removed

The VLAN tunnel map is not removed because the VRF tunnel map removal has not arrived yet. (In orchagent the VXLAN tunnel map is removed when the VRF VNI map is created, so the map does not exist until the VRF tunnel map is removed.)

2024 Dec 19 08:43:38.067339 r-panther-03 ERR swss#orchagent: :- delOperation: Error removing tunnel map vtep101032:map_500100_Vlan100: Can't delete a tunnel map entry object
2024 Dec 19 08:43:38.067339 r-panther-03 WARNING swss#orchagent: :- delOperation: NVO not deleted as hw delete is pending
2024 Dec 19 08:43:38.067339 r-panther-03 ERR swss#orchagent: :- meta_sai_validate_oid: object key SAI_OBJECT_TYPE_TUNNEL_MAP_ENTRY:oid:0x3b000000000751 doesn't exist
2024 Dec 19 08:43:38.067339 r-panther-03 ERR swss#orchagent: :- delOperation: Error removing tunnel map vtep101032:map_500100_Vlan100: Can't delete a tunnel map entry object

The VRF tunnel map is not removed either: its removal throws an exception because the tunnel object has already been removed.

2024 Dec 19 08:43:38.067759 r-panther-03 NOTICE swss#orchagent: :- delOperation: VxlanVrfMapOrch VRF VNI mapping 'vtep101032:evpn_map_500100_Vrf1' remove vrf Vrf1
2024 Dec 19 08:43:38.067809 r-panther-03 ERR swss#orchagent: :- doTask: Logic error in 15VxlanVrfMapOrch: map::at

Because this surfaces as an exception, the VLAN map is never removed and the error log keeps repeating.
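
The `map::at` text in the log is consistent with `std::map::at` throwing `std::out_of_range` (a `std::logic_error`) for a key that is no longer present. The sketch below contrasts that failure mode with a guarded lookup. `TunnelTable`, `lookupUnchecked`, `lookupGuarded` and `checkLookups` are hypothetical names for illustration and are not taken from `vxlanorch.cpp`.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a tunnel table keyed by tunnel name (value is a dummy id).
using TunnelTable = std::map<std::string, int>;

// Unchecked lookup: std::map::at throws std::out_of_range, which derives from
// std::logic_error, when the tunnel has already been erased.
int lookupUnchecked(const TunnelTable& tunnels, const std::string& name)
{
    return tunnels.at(name);
}

// Guarded lookup: returns false instead of throwing, so the caller can
// finish its own cleanup even though the tunnel is gone.
bool lookupGuarded(const TunnelTable& tunnels, const std::string& name, int& id)
{
    auto it = tunnels.find(name);
    if (it == tunnels.end())
    {
        return false;
    }
    id = it->second;
    return true;
}

// Small self-check that exercises both paths after the tunnel is erased.
void checkLookups()
{
    TunnelTable tunnels{{"vtep101032", 1}};
    tunnels.erase("vtep101032");

    int id = 0;
    assert(!lookupGuarded(tunnels, "vtep101032", id));

    bool threw = false;
    try
    {
        lookupUnchecked(tunnels, "vtep101032");
    }
    catch (const std::logic_error&)
    {
        threw = true;
    }
    assert(threw);
}
```

A guarded lookup only avoids the exception. It does not restore the missing tunnel, so the ordering problem itself still needs a reference check on tunnel removal.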

The removal of the VXLAN tunnel doesn't check for references. A reference check needs to be introduced here:

https://github.com/sonic-net/sonic-swss/blob/eae729e22864ba31a412efcaa575fc0d61a2b25c/orchagent/vxlanorch.cpp#L1553
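
One possible shape for such a check is sketched below, assuming each dependent object (VLAN map, VRF map, NVO) takes a reference on the tunnel and tunnel removal is refused while any remain. `Tunnel` and `TunnelRegistry` are hypothetical and do not mirror the real classes in `vxlanorch.cpp`. How the caller retries a refused delete is left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a tunnel; refCount counts dependents that still use it.
struct Tunnel
{
    int refCount = 0;
};

class TunnelRegistry
{
public:
    void add(const std::string& name) { m_tunnels.emplace(name, Tunnel{}); }

    bool exists(const std::string& name) const { return m_tunnels.count(name) != 0; }

    // Called when a dependent (map entry, NVO) is created on the tunnel.
    bool addRef(const std::string& name)
    {
        auto it = m_tunnels.find(name);
        if (it == m_tunnels.end())
        {
            return false;
        }
        ++it->second.refCount;
        return true;
    }

    // Called when a dependent is removed.
    bool release(const std::string& name)
    {
        auto it = m_tunnels.find(name);
        if (it == m_tunnels.end())
        {
            return false;
        }
        assert(it->second.refCount > 0);
        --it->second.refCount;
        return true;
    }

    // Returns false while dependents remain, so the caller can retry the
    // delete later instead of removing the tunnel underneath them.
    bool remove(const std::string& name)
    {
        auto it = m_tunnels.find(name);
        if (it == m_tunnels.end())
        {
            return true;  // already gone, nothing to do
        }
        if (it->second.refCount > 0)
        {
            return false;
        }
        m_tunnels.erase(it);
        return true;
    }

private:
    std::map<std::string, Tunnel> m_tunnels;
};
```

With this model, the out-of-order `VXLAN_TUNNEL_TABLE` delete from the trace would be refused until the map and NVO deletes have released their references, after which a retried `remove` succeeds.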

Steps to reproduce the issue:

  1. Please check the above commands

Describe the results you received:

Describe the results you expected:

Output of show version:

Seen in the 202405 branch, but it's day-1 behavior.

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):
