Add test failure troubleshooting docs #3528

Open
wants to merge 4 commits into
base: main
Binary file added docs/img/environment_logs.png
Binary file added docs/img/log_dir_structure.png
Binary file added docs/img/log_file_segments.png
Binary file added docs/img/remote_command_output.png
Binary file added docs/img/serial_console_logs.png
Binary file added docs/img/smoke_test_result copy.png
Binary file added docs/img/test_case_logs.png
Binary file added docs/img/test_results_summary.png
2 changes: 1 addition & 1 deletion docs/run_test/run.rst
@@ -13,4 +13,4 @@ Run LISA
Use runbook <runbook>
Use command line <command_line>
Use transformers <transformers>
Analyze test results <understand_results>
Troubleshoot test failures <troubleshoot_failures>
184 changes: 184 additions & 0 deletions docs/run_test/troubleshoot_failures.rst
@@ -0,0 +1,184 @@
Troubleshoot Test Failures
Member

Please merge the understand_results page into this one. Test status does not need a separate page to explain it.

==========================

- `Overview <#overview>`__
- `Test results <#test-results>`__
- `Console output <#console-output>`__
Member

The sections should follow the order recommended in the Overview. For example, the "Error Message" section explains the test results, where the error message can be found, and so on.

- `Log Folder Structure <#log-folder-structure>`__
Member

Log folder structure should be a sub-topic of Log files.


Overview
--------

To understand a test failure, the recommended troubleshooting path is:
Member

Please explain why we recommend this path: the test results should be enough to troubleshoot most issues, and they are simple to read and obtain.


1. Check the test result error messages in the console output.
2. Check the log files. Search the root log file, which contains traces and command output, as well as the split log files, which are smaller.
3. Search the LISA code for issues.
4. Try to reproduce the failure manually by deploying the resources and running the test yourself.

Test results
------------

It's essential to understand the results after running tests. LISA has 7
kinds of test results in total: 3 intermediate results and 4 final
results, as explained in the sections below. Each test case can move
from one result to another, but it never has two or more results at the
same time.

.. figure:: ../img/test_results.png
   :alt: test_results

- **Intermediate results**

An intermediate result describes a test that has not finished. It
appears whenever a test changes its state. If a test run terminates
because of an error or exception before a test case runs, only the
intermediate result is provided.

- **QUEUED**
  QUEUED tests have been created and are planned to run, but have not
  started yet. They are pre-selected by extension/runbook criteria. You
  can check the log to see which test cases those criteria include. A
  QUEUED result indicates that tests are waiting to be performed.

  QUEUED tests try to match every created environment. They move
  forward to ASSIGNED if they match any environment, and to SKIPPED if
  they match none.

- **ASSIGNED**
  ASSIGNED tests have been assigned to an environment and will start to
  run, if applicable, once the environment is deployed and initialized.
  An ASSIGNED result indicates that environment setup is in progress.

  ASSIGNED tests end with FAILED if the environment fails to deploy.
  Otherwise, they move forward to RUNNING. They will also move backward
  to QUEUED if the environment is deployed and initialized
  successfully.

- **RUNNING**
  RUNNING tests are currently executing their test procedure. They end
  with one of the following final results.

- **Final results**

A final result describes a test that has terminated. It provides more
valuable information than an intermediate result, and it appears only
at the end of a successful test run.

- **FAILED**
  FAILED tests did not finish successfully and terminated because of
  failures such as ``LISA exceptions`` or ``Assertion failure``. You can
  use them to trace where the problem occurred and why.

- **PASSED**
  PASSED tests passed, or at least partially passed. A partial pass is
  reported with a special ``PASSException``, which warns that minor
  errors occurred in the run but did not affect the test result.

- **SKIPPED**
  SKIPPED tests did not start and will not run. They indicate that the
  environments involved with the test failed to meet some requirements.

- **ATTEMPTED**
  ATTEMPTED tests are a special category of FAILED tests: they failed
  because of known issues that are not likely to be fixed soon.
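
The transitions described in this section can be summarized as a small state table. The following is a minimal sketch in Python, not LISA's actual implementation: the names ``ResultState``, ``ALLOWED_TRANSITIONS``, and ``can_move`` are hypothetical and only encode the rules stated above.

```python
from enum import Enum


class ResultState(Enum):
    """Hypothetical names for the 7 results described in this section."""

    QUEUED = "QUEUED"
    ASSIGNED = "ASSIGNED"
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    PASSED = "PASSED"
    SKIPPED = "SKIPPED"
    ATTEMPTED = "ATTEMPTED"


# The 4 final results; a test in one of these states does not move again.
FINAL_STATES = frozenset(
    {
        ResultState.FAILED,
        ResultState.PASSED,
        ResultState.SKIPPED,
        ResultState.ATTEMPTED,
    }
)

# Transitions as stated in the text (assumption: nothing else is allowed).
ALLOWED_TRANSITIONS = {
    ResultState.QUEUED: {ResultState.ASSIGNED, ResultState.SKIPPED},
    ResultState.ASSIGNED: {
        ResultState.RUNNING,
        ResultState.FAILED,
        ResultState.QUEUED,
    },
    ResultState.RUNNING: set(FINAL_STATES),
}


def can_move(current: ResultState, target: ResultState) -> bool:
    """Return True if the text allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

For example, ``can_move(ResultState.QUEUED, ResultState.SKIPPED)`` is true, while any move out of a final result such as PASSED is rejected.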

Console Output
--------------------

The results of a test run are displayed in the console and saved in the
log files generated by LISA. At the end of each run, the console lists
each test with its suite and case name, its status, and a message if
applicable, followed by a summary that tallies the results of all
tests.

.. figure:: ../img/test_results_summary.png
   :alt: test_results_summary

The test result message is the easiest, fastest way to understand a test
failure. It is derived from assertion or exception messages. Failures
are categorized by similar messages.
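
Grouping failures by similar messages can also be done by hand when reviewing many results. The sketch below is only an illustration under stated assumptions: the ``(name, status, message)`` tuples, the sample values, and the ``normalize`` rule (replace digit runs so messages that differ only in numbers fall together) are all hypothetical and are not taken from LISA.

```python
import re
from collections import defaultdict


def normalize(message: str) -> str:
    """Replace digit runs so messages differing only in numbers match."""
    return re.sub(r"\d+", "<n>", message).strip().lower()


def group_failures(results):
    """Map each normalized failure message to the cases that produced it."""
    groups = defaultdict(list)
    for name, status, message in results:
        if status == "FAILED":
            groups[normalize(message)].append(name)
    return dict(groups)


# Hypothetical sample data, for illustration only.
sample = [
    ("suite_a.case_1", "FAILED", "Timeout after 300 seconds"),
    ("suite_a.case_2", "FAILED", "Timeout after 600 seconds"),
    ("suite_b.case_3", "PASSED", ""),
    ("suite_b.case_4", "FAILED", "Assertion failure: expected 2 NICs"),
]

groups = group_failures(sample)
for message, cases in groups.items():
    print(f"{len(cases)} case(s): {message}")
```

Cases that share a normalized message are likely to share a root cause, so it is usually enough to investigate one case per group first.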

Log Folder Structure
--------------------

After a test run, LISA generates log files in the ``runtime/log``
directory. Navigate its subfolders until you find the folder whose
timestamp corresponds to the time of the test run. Inside that
timestamped folder, the contents are further split by environment and
test case. By default, the logs show entries at INFO level and above.
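
Finding the most recent full log can be scripted instead of browsing the subfolders. This is a minimal sketch that assumes only what is described here: logs live somewhere under ``runtime/log`` and the full log is named ``lisa-<timestamp>.log``. The helper name ``find_latest_log`` is hypothetical, and the sketch picks the newest file by modification time rather than by parsing the timestamp.

```python
from pathlib import Path
from typing import Optional


def find_latest_log(log_root: Path) -> Optional[Path]:
    """Return the most recently modified lisa-*.log under log_root."""
    if not log_root.is_dir():
        return None
    candidates = list(log_root.rglob("lisa-*.log"))
    if not candidates:
        return None
    # Newest by modification time; the timestamp in the name is not parsed.
    return max(candidates, key=lambda path: path.stat().st_mtime)


# Usage (run from the directory that contains runtime/log):
#   latest = find_latest_log(Path("runtime/log"))
#   print(latest if latest else "no log found")
```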

- **LOG FOLDER CONTENTS**

  * **environment** folder: contains the logs split by environment.
  * **tests** folder: contains the logs split by test case.
  * **lisa.html**: a formatted summary of the test results. Open the
    file in a web browser to view it.
  * **lisa-<timestamp>.log**: the full log of the test run. It contains
    all the information about the run, including the test cases,
    environments, and results.

.. figure:: ../img/log_dir_structure.png
   :alt: log_dir_structure

- **LOG FILE SEGMENTS**

  Each line (log entry) in the log file contains the following segments,
  from left to right:

  * **timestamp**: the time at which the entry was logged.
  * **thread number**: the number of the thread that wrote the entry.
  * **log level**: the severity level of the entry.
  * **component level**: the component that is the source of the entry.

.. figure:: ../img/log_file_segments.png
   :alt: log_file_segments

- **REMOTE COMMANDS LOGS**

  LISA logs all the commands executed on the remote machine. The
  commands are logged in the **lisa-<timestamp>.log** file, unless they
  are too long. Each command has a random id that is used to correlate
  async command outputs. Output from an earlier run of a command may be
  reused; in that case, check the environment log for the earlier
  output. The commands are logged in the following format:

  * **Command line info**: the command line that was executed.
  * **stdout**: the standard output of the command.
  * **exit info**: the exit code of the command.

.. figure:: ../img/remote_command_output.png
   :alt: remote_command_output

- **ENVIRONMENT LOGS**

  The environment logs are ordered by timestamp. An environment may
  have multiple nodes.

.. figure:: ../img/environment_logs.png
   :alt: environment_logs

- **SERIAL CONSOLE LOGS**

  The serial console logs are available for the Azure platform. Use the
  name column in environment_stats.log to locate the matching
  environment folder. The serial console log is uploaded when the guest
  is in a bad state.

.. figure:: ../img/serial_console_logs.png
   :alt: serial_console_logs

- **TEST RESULT LOGS - SPLIT BY CASE**

  The tests folder may contain more logs, split by test case. If so, a
  folder named in the format <timestamp>-<testcase> is created, which
  contains log files named <timestamp>-<testcase>.log.

.. figure:: ../img/test_case_logs.png
   :alt: test_case_logs
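
The log entry segments listed under LOG FILE SEGMENTS (timestamp, thread number, log level, component) can be pulled apart with a regular expression when filtering a large log. The exact text layout is shown only in a screenshot, so the pattern below is an assumption: it expects lines shaped like ``2024-01-02 03:04:05.678[1234][ERROR] lisa.runner message``. Adjust ``LINE_PATTERN`` to match the real files. The names ``LogEntry`` and ``parse_line`` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: "<timestamp>[<thread>][<LEVEL>] <component> <message>".
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r"\[(?P<thread>\d+)\]"
    r"\[(?P<level>[A-Z]+)\]\s+"
    r"(?P<component>\S+)\s*(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    thread: str
    level: str
    component: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into its segments, or return None if it
    does not match the assumed layout (for example, a traceback line)."""
    match = LINE_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    return LogEntry(**match.groupdict())


def entries_at_level(lines, level: str):
    """Yield only the parsed entries whose log level equals level."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.level == level:
            yield entry
```

Passing an open log file to ``entries_at_level(handle, "ERROR")`` yields only the error entries, which is often a quicker starting point than reading the full log from the top.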
87 changes: 0 additions & 87 deletions docs/run_test/understand_results.rst

This file was deleted.
