Skip to content

Add debuggability for reboot function#14341

Closed
JibinBao wants to merge 1 commit intosonic-net:masterfrom
JibinBao:add_debugability_reboot
Closed

Add debuggability for reboot function#14341
JibinBao wants to merge 1 commit intosonic-net:masterfrom
JibinBao:add_debugability_reboot

Conversation

@JibinBao
Copy link
Copy Markdown
Contributor

@JibinBao JibinBao commented Aug 30, 2024

Description of PR

  1. Add function to collect console log from starting reboot to dut up
  2. When dut is not up, check if dut is pingable and collect the mgmt interface config

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

Enhance the debugability for reboot fucntion

How did you do it?

  1. Add function to collect console log from starting reboot to dut up
  2. When dut is not up, check if dut is pingable and collect the mgmt interface config

How did you verify/test it?

Run reboot cases

Any platform specific information?

Any

Supported testbed topology if it's a new test case?

Documentation

@JibinBao JibinBao force-pushed the add_debugability_reboot branch from 45c9ba3 to 7df7160 Compare August 30, 2024 07:17
@JibinBao
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-mgmt

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-mgmt

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@JibinBao
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-mgmt

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-mgmt

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@JibinBao
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-mgmt

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-mgmt

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@JibinBao
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-mgmt

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-mgmt

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@bingwang-ms
Copy link
Copy Markdown
Collaborator

The concern for this change is we may see test failure if the console connection is not right or not working fine, even though it's not related to the reboot feature.
@prgeor Can you please help review as well?

@JibinBao
Copy link
Copy Markdown
Contributor Author

When console connection is not right, it will not affect the reboot function, it just does not collect the console logs

@bingwang-ms
Copy link
Copy Markdown
Collaborator

Hi @JibinBao, I still have concern on this change. It's because there are various console types and some of them are not stable enough. We don't want to fail the reboot test for a console connection issue.

@JibinBao
Copy link
Copy Markdown
Contributor Author

JibinBao commented Oct 15, 2024

Hi @JibinBao, I still have concern on this change. It's because there are various console types and some of them are not stable enough. We don't want to fail the reboot test for a console connection issue.

Hi @bingwang-ms , You are right, I have the same conncern. So, we use try to create the dutconsole. If fail to create dutconsole, we will not collect the logs. so it will not affect the original test logic. You can check the funciton of try_create_dut_console

@prgeor
Copy link
Copy Markdown
Contributor

prgeor commented Oct 15, 2024

@JibinBao please resolve the conflicts

@JibinBao
Copy link
Copy Markdown
Contributor Author

@JibinBao please resolve the conflicts

Will handle it. Thanks

Comment thread tests/common/reboot.py
logger.warning("dut console is not ready, we can get log by console")


def collect_mgmt_config_by_console(duthost, localhost):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JibinBao what is the objective of collecting this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes, when dut is not up, i am not sure if it is the mgmt netowrk issue, so check if the mgmt ip is pingable and
the mgmt config is correct or not.

Comment thread tests/common/reboot.py
return False


def try_create_dut_console(duthost, localhost, conn_graph_facts, creds):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JibinBao How are you ensuring the reboot command is issued via console connection? What if the reboot is issued via SSH connection and there is no console connection?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is to create dut_console to collect conosle logs, and it is not related to reboot test logic.
If there is no console connection, it will not collect the console logs, and it does not affect the original reboot test.
It is an enhancement. If there is a console connection and the console connection can be created successfully, it collect the console logs, otherwise, it will do nothing.

1. Add function to collect console log from starting reboot to dut up
2. When dut is not up, check if dut is pingable and collect the mgmt interface config

Change-Id: I700340d3fbfc00d61bc6e508ac5be4e4dc0a25e6
@JibinBao JibinBao force-pushed the add_debugability_reboot branch from 2653915 to b5dcc6f Compare October 15, 2024 06:49
@JibinBao
Copy link
Copy Markdown
Contributor Author

prgeor

hi @prgeor , the Conflict has been fixed. Can you help review it again?

@JibinBao
Copy link
Copy Markdown
Contributor Author

Hi @bingwang-ms ,
I push a fix for resloving the conflict, but it is stuck on the status(please see picture below). Can you help check it? and What I need to do next?
image

Copy link
Copy Markdown
Collaborator

@bingwang-ms bingwang-ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JibinBao, I don't agree to merge this change.
Unstable console connection can lead to test_reboot failure.
Can you keep the change in your internal repo?

@JibinBao
Copy link
Copy Markdown
Contributor Author

JibinBao commented Nov 1, 2024

Hi @JibinBao, I don't agree to merge this change. Unstable console connection can lead to test_reboot failure. Can you keep the change in your internal repo?

Hi @bingwang-ms ,
Because we it will not be affect by the console issue.

  1. If it fails to create a console, it will not collect console logs, and execute the code as the original logic
  2. We have run the test with the fixed for more than tow months, there is not a failure caused by it
  3. If we keep it internal repo, it will alway confict when updating code from communtiy
  4. Can you merge it? if it is not stable, we can revert it.

Thanks

Comment thread tests/common/reboot.py
@liat-grozovik
Copy link
Copy Markdown
Collaborator

and lets please resolve conflict so we can approve and merge

@JibinBao JibinBao marked this pull request as draft December 4, 2024 03:13
@JibinBao JibinBao marked this pull request as ready for review December 4, 2024 03:14
@JibinBao JibinBao closed this Dec 4, 2024
@JibinBao
Copy link
Copy Markdown
Contributor Author

JibinBao commented Dec 4, 2024

cannot reopen

@JibinBao
Copy link
Copy Markdown
Contributor Author

JibinBao commented Dec 4, 2024

reopen it

@JibinBao
Copy link
Copy Markdown
Contributor Author

JibinBao commented Dec 4, 2024

When pushing the new fix it is hung in "processing-updates" status. So close it and open new PR: #15868

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants