Add test failure troubleshooting docs #3528

Open: wants to merge 4 commits into base: main (showing changes from 2 commits)
Binary file added docs/img/environment_logs.png
Binary file added docs/img/log_dir_structure.png
Binary file added docs/img/log_file_segments.png
Binary file added docs/img/remote_command_output.png
Binary file added docs/img/serial_console_logs.png
Review comment (Member):

This screenshot is from Azure Storage, not from upstream LISA. Please use a screenshot from upstream LISA.

Binary file added docs/img/smoke_test_result copy.png
Binary file added docs/img/test_case_logs.png
Binary file added docs/img/test_results_summary.png
1 change: 1 addition & 0 deletions docs/run_test/run.rst
@@ -14,3 +14,4 @@ Run LISA
Use command line <command_line>
Use transformers <transformers>
Analyze test results <understand_results>
Troubleshoot test failures <troubleshoot_failures>
116 changes: 116 additions & 0 deletions docs/run_test/troubleshoot_failures.rst
@@ -0,0 +1,116 @@
Troubleshoot Test Failures
Review comment (Member):

Please merge the understand_results page into this one. It doesn't need a separate page to explain test status.

==========================

- `Overview <#overview>`__
- `Console output <#console-output>`__
Review comment (Member):

The sections should follow what we recommended in the Overview. Like the "Error Message" section, it explains the test results, and where the error message can be found, and so on.

- `Log Folder Structure <#log-folder-structure>`__
Review comment (Member):

Log folder structure should be a sub-topic of Log files.


Overview
--------

Understanding the results is an essential step after running tests. LISA
has 7 kinds of test results in total: 3 intermediate results and 4 final
results, as explained in :ref:`run_test/understand_results:overview`. To
understand a test failure, the recommended troubleshooting path is:

1. Check the test result error message in the console output.
2. Check the log files. Search the root log file, which contains traces and command output, or the split log files, which are smaller.
3. Search the LISA code for the source of the issue.
4. Try to reproduce the failure manually: deploy the resources and run the failing steps.

Console Output
--------------------

The results of a test run are displayed in the console and saved in the
log files generated by LISA. At the end of each run, the console lists
the test suite and case name, the test status, and a message if
applicable, followed by a summary that tallies the results of all tests.

.. figure:: ../img/test_results_summary.png
   :alt: test_results_summary

The test result message is the fastest way to understand a test
failure. It is derived from the assertion or exception message. Failures
with similar messages are grouped together.

Log Folder Structure
--------------------

After a test run, LISA writes its logs under the ``runtime/log``
directory. Navigate the subfolders until you find the folder whose
timestamp corresponds to the time of the test run. Inside that
timestamped folder, the logs are further split by environment and test
case. By default, the logs show entries at INFO level and above.
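
Navigating to the right run folder can also be scripted. The following is
a minimal sketch, not part of LISA: it assumes the log root is
``runtime/log`` relative to the current directory and that the file
modification time is a good proxy for the run timestamp. The helper name
``newest_run_dir`` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def newest_run_dir(log_root: Path) -> Optional[Path]:
    """Return the folder holding the most recently modified lisa-*.log file.

    Assumption: each run folder contains one root log named
    lisa-<timestamp>.log, as described in this page.
    """
    root_logs = sorted(
        log_root.rglob("lisa-*.log"),
        key=lambda path: path.stat().st_mtime,
    )
    # The last element is the newest log; its parent is the run folder.
    return root_logs[-1].parent if root_logs else None


if __name__ == "__main__":
    # Assumed default location; adjust if the logs are written elsewhere.
    print(newest_run_dir(Path("runtime/log")))
```

The function returns ``None`` when no root log is found, so the caller
can tell "no runs yet" apart from a real folder.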

- **LOG FOLDER CONTENTS**

  * **environment** folder: contains the logs split by environment.
  * **tests** folder: contains the logs split by test case.
  * **lisa.html**: a formatted summary of the test results. Open the
    file in a web browser to view it.
  * **lisa-<timestamp>.log**: the full log of the test run. It contains
    all the information about the run, including the test cases,
    environments, and results.

  .. figure:: ../img/log_dir_structure.png
     :alt: log_dir_structure

- **LOG FILE SEGMENTS**

  Each line (log entry) in the log file contains the following segments,
  from left to right:

  * **timestamp**: the time at which the entry was logged.
  * **thread number**: the number of the thread that produced the entry.
  * **log level**: the severity level of the entry.
  * **component level**: the component that is the source of the entry.

  .. figure:: ../img/log_file_segments.png
     :alt: log_file_segments

- **REMOTE COMMAND LOGS**

  LISA logs all the commands executed on the remote machine. The
  commands are logged in the **lisa-<timestamp>.log** file, unless they
  are too long. Each command has a random id that is used to correlate
  the output of asynchronous commands. The output of an earlier command
  may be reused; in that case, check the environment log for the
  earlier output. Each command is logged in the following format:

  * **Command line info**: the command line that was executed.
  * **stdout**: the standard output of the command.
  * **exit info**: the exit code of the command.

  .. figure:: ../img/remote_command_output.png
     :alt: remote_command_output

- **ENVIRONMENT LOGS**

  The environment logs are ordered by timestamp. An environment may
  have multiple nodes.

  .. figure:: ../img/environment_logs.png
     :alt: environment_logs

- **SERIAL CONSOLE LOGS**

  Serial console logs apply to the Azure platform. Use the name column
  in ``environment_stats.log`` to locate the matching environment
  folder. The serial console log is uploaded when the guest is in a bad
  state.

  .. figure:: ../img/serial_console_logs.png
     :alt: serial_console_logs

- **TEST RESULT LOGS - SPLIT BY CASE**

  The **tests** folder may contain more logs, split by test case. If
  so, a folder named in the format ``<timestamp>-<testcase>`` is
  created, containing log files named ``<timestamp>-<testcase>.log``.

  .. figure:: ../img/test_case_logs.png
     :alt: test_case_logs
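
When a root log is large, it can help to split each entry into the four
segments described above and keep only the higher-severity lines. The
sketch below is illustrative only. The exact line layout is an
assumption (``<timestamp>[<thread>][<LEVEL>] <component> <message>``),
and the names ``LogEntry``, ``parse_line`` and ``entries_at_or_above``
are hypothetical; compare ``LINE_RE`` against a real log line and adjust
it before relying on it.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Standard Python logging severities, used to compare levels.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}


class LogEntry(NamedTuple):
    timestamp: str
    thread: int
    level: str
    component: str
    message: str


# ASSUMED layout: "2023-01-02 03:04:05.678[1234][ERROR] component message"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r"\[(?P<thread>\d+)\]"
    r"\[(?P<level>[A-Z]+)\]\s+"
    r"(?P<component>\S+)\s*"
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into segments, or return None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        # Likely a continuation line, such as multi-line command output.
        return None
    return LogEntry(
        match["timestamp"],
        int(match["thread"]),
        match["level"],
        match["component"],
        match["message"],
    )


def entries_at_or_above(
    lines: Iterable[str], minimum: str = "WARNING"
) -> Iterator[LogEntry]:
    """Yield parsed entries whose level is at least `minimum`."""
    threshold = LEVELS[minimum]
    for line in lines:
        entry = parse_line(line)
        if entry is not None and LEVELS.get(entry.level, 0) >= threshold:
            yield entry
```

For example, passing an open ``lisa-<timestamp>.log`` file object to
``entries_at_or_above`` with ``minimum="ERROR"`` yields only the error
entries. Lines that do not match the assumed layout are skipped rather
than raising an exception.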