-
Notifications
You must be signed in to change notification settings - Fork 520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Chassis][LAG_ID] Address the same lagid been used in two different LCs issue #3303
base: master
Are you sure you want to change the base?
Conversation
local lagid = redis.call("lpop", "SYSTEM_LAG_IDS_FREE_LIST") | ||
redis.call("hset", "SYSTEM_LAG_ID_TABLE", pcname, lagid) | ||
redis.call("sadd", "SYSTEM_LAG_ID_SET", lagid) | ||
if dblagid then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why use dblag here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why use dblag here?
This is just for handling the out of sync cases. As normal operation, this should not happen.
In the old method, when there is a valid dblagid number (it is from a broken entry in SYSTEM_LAG_ID_TABLE), if a new lagId is assigned, the dblagid needs to be removed from the SYSTEM_LAG_ID_SET.
With the new modification, we need to handle it as the same -- if old lagId (dblagId) is valid lag ID, we need to it to the free_list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should do
redis.call("srem", "SYSTEM_LAG_ID_SET", tostring(dblagid))
here too ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. In case of the old lagId (dblagid) still in the SYSTEM_LAG_ID_SET in a broken scenario, we just remove it. Anyway, this SYSTEM_LAG_ID_SET is just keeped for debugging purposes. We don't use it for any functionality. SYSTEM_LAG_ID_SET and SYSTM_LAG_IDS_FREE_LIST are mutual exclusive.
can you update the UT as well. |
f670342
to
509c5d0
Compare
…Cs issue. Signed-off-by: mlok <[email protected]>
Signed-off-by: mlok <[email protected]>
80f2a59
to
e9938d5
Compare
Signed-off-by: mlok <[email protected]>
UT has been updated. |
local lagid = redis.call("lpop", "SYSTEM_LAG_IDS_FREE_LIST") | ||
redis.call("hset", "SYSTEM_LAG_ID_TABLE", pcname, lagid) | ||
redis.call("sadd", "SYSTEM_LAG_ID_SET", lagid) | ||
if dblagid then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code under "if dblagid then" is duplicated in both if and else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are the same in the if and else. Not able to simplify them in one because of "if" case returns "tonumber(lagid)" while the "else" case returns "plagid"
What I did
Create SYSTEM_LAG_IDS_FREE_LIST for assign lagId for all portchannel creation.
a) If Portchannel is created with a valid plagid
* check if plagid is in free list, use plagid and remove it from SYSTEM_LAG_IDS_FREE_LIST. Add this lagid to the SYSTEM_LAG_ID_SET for debug info
* If plagid is not in the FREE_LIST, lpop and use the first lagid from the SYSTEM_LAG_IDS_FREE_LIST. Add this lagid to the SYSTEM_LAG_ID_SET for debug info
b) If Portchannel is created with invalid plagid or without any lagid
This PR works with the following 2 PRs:
sonic-net/sonic-platform-daemons#542
sonic-net/sonic-buildimage#20369
Why I did it
To address the issue of the same lagid could be used by two Portchannels in two different linecards. This issue occurs when reboot many Linecards together with 20 seconds delay in each LC reboot.
How I verified it
Details if related