Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added SmartSwitch support in chassisd and enabling chassisd #467

Open
wants to merge 85 commits into
base: master
Choose a base branch
from

Conversation

rameshraghupathy
Copy link

@rameshraghupathy rameshraghupathy commented Apr 15, 2024

Added SmartSwitch support in chassisd and enabling chassisd for fixed SmartSwitches

Description

chassisd is enabled only for modular chassis. Smartswitch is a fixed chassis. However it has been treated like a modular chassis to manage the DPU cards just like the line-cards of a modular chassis. chassisd will be enabled only on the smartswitch NPU and hence the scope of these changes are limited only to NPU and not applicable to DPU.

Motivation and Context

chassisd is enabled only for modular chassis. Smartswitch is a fixed chassis. However it has been treated like a modular chassis to manage the DPU cards just like the line-cards of a modular chassis. Hence, chassid is needed for SmartSwitch and enabled here. Also, some of the table updates and clean up are not required for SmartSwitch platform and hence using the is_smartswitch API to selectively enable it. the text "fixes #xxxx", "closes #xxxx" or "resolves #xxxx" here

How Has This Been Tested?

  1. Enabled and built and image and checked if the chassisd is running all the time without crashing
  2. Verify if the config change handler is working as expected by issuing config CLIs to startup and shutdown DPUs

Additional Information (Optional)

@oleksandrivantsiv
Copy link
Collaborator

What are the states supported by the DPUs in the Smart Switch?

@rameshraghupathy
Copy link
Author

What are the states supported by the DPUs in the Smart Switch?

”dpu_midplane_link_state”
”dpu_control_plane_state"
"dpu_data_plane_state"

sonic-chassisd/scripts/chassisd Outdated Show resolved Hide resolved
sonic-chassisd/scripts/chassisd Outdated Show resolved Hide resolved
accidental removal of a few lines from the original chassisd file
@@ -184,7 +188,7 @@ class ModuleUpdater(logger.Logger):
self.chassis = chassis
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that instead of modifying the existing ModuleUpdater class, we should implement SmartSwitchModuleUpdater(ModuleUpdater). SmartSwitchModuleUpdater should be derived from ModuleUpdater and overwrite the methods that should behave differently for the Smart Switch. This approach allows us to keep the original implementation untouched and guarantees full backward compatibility with the chassisd.

Copy link
Author

@rameshraghupathy rameshraghupathy Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will consider this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented this change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@oleksandrivantsiv oleksandrivantsiv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as commented

@rameshraghupathy
Copy link
Author

rameshraghupathy commented Oct 2, 2024

@rameshraghupathy can we add monitoring of the DPU midplane in Chassisd?

This is already part of chassisd -> check_midplane_reachability
@prgeor

@rameshraghupathy
Copy link
Author

rameshraghupathy commented Oct 2, 2024

@rameshraghupathy can you point to the Yang model change for admin_status for DPU in config DB table? https://github.com/sonic-net/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-chassis-module.yang#L23 needs update for DPU

I have created a PR sonic-net/sonic-buildimage#20445 for this and working on it.
@prgeor

@rameshraghupathy
Copy link
Author

@rameshraghupathy Please specify in the PR description if this chassisd changes is limited to NPU or DPU or both

Done
@prgeor


# Start configuration manager task on supervisor module
if self.module_updater.supervisor_slot == self.module_updater.my_slot:
if self.smartswitch:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rameshraghupathy comment above is incorrect

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor Fixed


(key, op, fvp) = sst.pop()

if op == 'SET':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rameshraghupathy SET operation can set ADMIN down or UP. please parse the key to know if the user is setting admin UP/DOWN

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rameshraghupathy SET operation can set ADMIN down or UP. please parse the key to know if the user is setting admin UP/DOWN

@prgeor Good question. The original behavior is: The default is no config - meaning admin up when the user wants to shutdown a module adds a config "admins_status: down". So, the SET operation has been used for "admins_status: down" and DEL has been used for setting the module up. If you want we can change this for Smartswitch. Please let me know.


if op == 'SET':
admin_state = MODULE_ADMIN_DOWN
elif op == 'DEL':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rameshraghupathy on DEL of the admin state why do module ADMIN UP?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rameshraghupathy on DEL of the admin state why do module ADMIN UP?

@prgeor The original behavior is: The default is no config - meaning admin up when the user wants to shutdown a module adds a config "admins_status: down". So, the SET operation has been used for "admins_status: down" and DEL has been used for setting the module up. If you want we can change this for Smartswitch. Please let me know.

@@ -73,6 +73,7 @@ CHASSIS_MODULE_REBOOT_INFO_TABLE = 'CHASSIS_MODULE_REBOOT_INFO_TABLE'
CHASSIS_MODULE_REBOOT_TIMESTAMP_FIELD = 'timestamp'
CHASSIS_MODULE_REBOOT_REBOOT_FIELD = 'reboot'
DEFAULT_LINECARD_REBOOT_TIMEOUT = 180
DEFAULT_DPU_REBOOT_TIMEOUT = 180
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 180 secs sufficient for entire DPU reboot time?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we add this to platform.json as vendor configurable timeout?

:return:
"""

def module_config_update(self, key, admin_state):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rameshraghupathy could you please add a quick summary of function here and also remaining functions for future clarity?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants