support new values for site.xcatdebugmode

penguhyang edited this page Jun 24, 2016 · 8 revisions

Mini-design for supporting new values for site.xcatdebugmode

Background

Currently, site.xcatdebugmode supports the values 0, 1, or null. When site.xcatdebugmode is 0 or null, xCAT debug mode is off. When site.xcatdebugmode=1, debug mode is on and the installer halts for SLES and Ubuntu. After xCAT is installed on the management node (MN), debug mode is off by default. A new value is needed to support logging the operating system installation without halting it.

Planning Outputs

1. Supported values for xcatdebugmode

xCAT provides a set of techniques to help users debug problems, especially during OS provisioning, such as collecting logs of the whole installation process and accessing the installing system via SSH. These techniques are enabled according to the xCAT debug level specified by "xcatdebugmode".

Planned supported values:

xcatdebugmode=0: Only the diagnose log is written to the corresponding files.
xcatdebugmode=1: The diagnose log is written to the corresponding files and the debug port is opened.
xcatdebugmode=2: SSH access is supported during installation, in addition to the diagnose log and the debug port.

List of Supported OS:

RHEL: 6.7 and above
SLES: 12 and above
UBT (Ubuntu): 14.04.3 and above

The following behavior is observed during OS install:

+-----------------+--------------+--------------+--------------+
|**xcatdebugmode**|      0       |       1      |       2      |
+-----------------+----+----+----+----+----+----+----+----+----+
|                 |RHEL|SLES|UBT |RHEL|SLES|UBT |RHEL|SLES|UBT |
+=================+====+====+====+====+====+====+====+====+====+
| Diagnose Log    | Y  | Y  | Y  | Y  | Y  | Y  | Y  | Y  | Y  |
+-----------------+----+----+----+----+----+----+----+----+----+
|Enable Debug Port| N  | N  | N  | Y  | Y  | Y  | Y  | Y  | Y  |
+-----------------+----+----+----+----+----+----+----+----+----+
|  Enable SSH     | N  | N  | N  | N  | N  | N  | Y  | Y  | Y  |
+-----------------+----+----+----+----+----+----+----+----+----+

Y means the behavior is supported for that OS at the given xcatdebugmode.

N means the behavior is not supported.
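The feature matrix can be restated as a small lookup. This is a minimal sketch, not xCAT code: the names FEATURES_BY_MODE and features_for are hypothetical, and treating a null value like 0 is an assumption based on the Background section.

```python
# Hypothetical helper for illustration only; these names are not part of xCAT.
FEATURES_BY_MODE = {
    0: {"diagnose_log": True, "debug_port": False, "ssh": False},
    1: {"diagnose_log": True, "debug_port": True, "ssh": False},
    2: {"diagnose_log": True, "debug_port": True, "ssh": True},
}


def features_for(value):
    """Return the features enabled for a site.xcatdebugmode value.

    Assumption: a null value (None or an empty string) behaves like 0.
    """
    mode = 0 if value in (None, "") else int(value)
    if mode not in FEATURES_BY_MODE:
        raise ValueError(f"unsupported xcatdebugmode: {value!r}")
    return FEATURES_BY_MODE[mode]


if __name__ == "__main__":
    for raw in (None, "0", "1", "2"):
        print(raw, features_for(raw))
```

The matrix is the same for RHEL, SLES, and UBT, so the lookup does not take the OS as a parameter.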

2. Collect logs during the diskful installation process

The ability to collect logs during the diskful installation process can be helpful when debugging installation problems.

  • Pre-Install logs: the logs of the pre-installation scripts. These scripts include the “%pre” section in anaconda, the “<pre-scripts/>” section for SUSE, and the “partman/early_command” and “preseed/early_command” sections for Ubuntu. The logs include the STDOUT and STDERR of the scripts as well as the debug trace output of bash scripts run with “set -x”.
  • Installer logs: the logs from the OS installer itself, i.e., the installation program (anaconda, AutoYaST, preseed, etc.).
  • Post-Install logs: the logs of the post-installation scripts. These scripts include the “%post” section in anaconda, the “<chroot-scripts/>” and “<post-scripts/>” sections for SUSE, and the “preseed/late_command” section for Ubuntu. The logs include the STDOUT and STDERR of the scripts as well as the debug trace output of bash scripts run with “set -x”.
  • PostBootScript logs: the logs produced while the post boot scripts run. These scripts are specified in the “postbootscripts” attribute of the node and osimage definitions and run during the first reboot after installation.
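To illustrate what the script logs contain, the following minimal sketch runs a one-line bash command with tracing enabled and captures both streams. It assumes bash is available on the machine running the example. The command "echo hello" is a placeholder, not an xCAT script.

```python
import subprocess

# "bash -x" enables the same tracing as "set -x" inside a script.
# bash writes the trace to STDERR, prefixing each command with "+".
result = subprocess.run(
    ["bash", "-x", "-c", "echo hello"],
    capture_output=True,
    text=True,
    check=True,
)

print("STDOUT:", result.stdout.strip())
print("STDERR:", result.stderr.strip())
```

The command output arrives on STDOUT and the "+"-prefixed trace on STDERR, which is why a log that captures both streams shows each traced command followed by its output.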

The following behavior is observed during OS install:

+---------------------+--------------+--------------+--------------+
|  **xcatdebugmode**  |      0       |      1       |      2       |
+---------------------+----+----+----+----+----+----+----+----+----+
| OS Distribution     |RHEL|SLES|UBT |RHEL|SLES|UBT |RHEL|SLES|UBT |
+================+====+====+====+====+====+====+====+====+====+====+
| Pre-Install    | MN | N            | N            | N            |
+ logs           +----+--------------+--------------+--------------+
|                | CN | Y1           | Y2           | Y2           |
+----------------+----+----+----+----+----+----+----+----+----+----+
| Installer      | MN | N  | N  | N  | Y6 | Y6 | Y6 | Y6 | Y6 | Y6 |
+ logs           +----+----+----+----+----+----+----+----+----+----+
|                | CN | Y5 | Y5 | Y5 | Y5 | Y5 | Y5 | Y5 | Y5 | Y5 |
+----------------+----+----+----+----+----+----+----+----+----+----+
| Post-Install   | MN | Y4           | Y3           | Y3           |
+ logs           +----+--------------+--------------+--------------+
|                | CN | Y1           | Y2           | Y2           |
+----------------+----+----+----+----+----+----+----+----+----+----+
| PostBootScript | MN | Y4           | Y3           | Y3           |
+ logs           +----+--------------+--------------+--------------+
|                | CN | Y1           | Y2           | Y2           |
+----------------+----+----+----+----+----+----+----+----+----+----+

Y1 means the installation logs are saved on the compute node (CN) in the /var/log/xcat/xcat.log file, without the bash debug trace (“set -x”) of the commands. For example:

For RHEL
Running Kickstart Pre-Installation script...
...
installstatus installing
[get_install_disk]Information from /proc/partitions:
major minor  #blocks  name
...
Running Kickstart Post-Installation script...
...
running /xcatpost/mypostscript.post
...
/xcatpost/mypostscript.post return


For SLES
Running AutoYaST Pre-Installation script...
...
[get_install_disk]Information from /proc/partitions:
installstatus installing
major minor  #blocks  name
...
Running AutoYaST Chroot-Installation script...
ready
done
Running AutoYaST Post-Installation script...
...
running /xcatpost/mypostscript.post
...
/xcatpost/mypostscript.post return


For UBT
Running preseeding early_command Installation script...
...
[get_install_disk]Information from /proc/partitions:
major minor  #blocks  name
...
Running preseeding late_command Installation script...
...
Generating grub configuration file ...

Y2 means the installation logs are saved on the CN in the /var/log/xcat/xcat.log file, together with the bash debug trace (“set -x”) of the commands. For example:

For RHEL
Running Kickstart Pre-Installation script...
...
+ /tmp/baz.py 'installstatus installing'
installstatus installing
+ echo '[get_install_disk]Information from /proc/partitions:'
[get_install_disk]Information from /proc/partitions:
major minor  #blocks  name
...
Running Kickstart Post-Installation script...
...
+ echo '/opt/xcat/xcatinfo generated'
/opt/xcat/xcatinfo generated
...
running /xcatpost/mypostscript.post
...
+ echo '/opt/xcat/xcatinstallpost generated'
/opt/xcat/xcatinstallpost generated
...
/xcatpost/mypostscript.post return
service xcatpostinit1 disabled


For SLES
Running AutoYaST Pre-Installation script...
...
+ /tmp/bar.awk 'installstatus installing'
installstatus installing
+ echo '[get_install_disk]Information from /proc/partitions:'
[get_install_disk]Information from /proc/partitions:
major minor  #blocks  name
...
Running AutoYaST Chroot-Installation script...
...
+ /tmp/updateflag.awk
ready
done
...
Running AutoYaST Post-Installation script...
...
running /xcatpost/mypostscript.post
...
+ echo 'finished node installation, reporting status...'
finished node installation, reporting status...
...
/xcatpost/mypostscript.post return
service xcatpostinit1 disabled


For UBT
Running preseeding early_command Installation script...
...
[get_install_disk]Information from /proc/partitions:
major minor  #blocks  name
...
Running preseeding late_command Installation script...
...
+ echo 'postscripts downloaded successfully'
postscripts downloaded successfully
...
+ echo 'mypostscript returned'
mypostscript returned
...
+ echo 'finished node installation, reporting status...'
finished node installation, reporting status...

Y3 means the installation logs are forwarded to the MN in the /var/log/xcat/computes.log file, without the bash debug trace (“set -x”) of the commands. For example:

/opt/xcat/xcatinfo generated
...
postscripts downloaded successfully
...
/xcatpost/mypostscript.post generated
...
/etc/init.d/xcatpostinit1 generated
...
/opt/xcat/xcatinstallpost generated
...
/opt/xcat/xcatdsklspost generated
...
running mypostscript

Y4 means that error messages are forwarded to the MN in the /var/log/xcat/computes.log file only when a critical error occurs. For example:

/usr/bin/openssl does not exist, halt ...
or
/usr/bin/wget does not exist, halt ...
or
failed to download postscripts from...
or
/xcatpost/getpostscript.awk does not exist, halt ...
or
generate mypostscript file failure, halt ...

Y5 means the installer log is saved on the CN in /var/log/anaconda (RHEL), /var/log/YaST2 (SLES), or /var/log/installer (UBT).

Y6 means the installer log is forwarded to the MN in the /var/log/xcat/computes.log file.

N means the logs are neither forwarded nor saved.
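When reading a log collected with the debug trace enabled, it can help to separate the trace lines from the script output. The following is a minimal sketch under one assumption: trace lines are the ones that start with "+", as in the samples in this section. The function name split_trace is hypothetical, and a real output line that happens to start with "+" would be misclassified.

```python
def split_trace(lines):
    """Split log lines into (trace, output).

    Heuristic: bash "set -x" trace lines start with one or more "+".
    """
    trace, output = [], []
    for line in lines:
        (trace if line.startswith("+") else output).append(line)
    return trace, output


# Excerpt taken from the RHEL sample in this section.
sample = [
    "Running Kickstart Pre-Installation script...",
    "+ echo '[get_install_disk]Information from /proc/partitions:'",
    "[get_install_disk]Information from /proc/partitions:",
    "major minor  #blocks  name",
]

trace, output = split_trace(sample)
print(len(trace), len(output))  # prints: 1 3
```

The same helper applies to an excerpt from /var/log/xcat/xcat.log on the CN or /var/log/xcat/computes.log on the MN once the lines have been read from the file.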

3. Collect logs during the diskless installation process

The ability to collect logs during the diskless installation process can be helpful when debugging installation problems.

  • Provision logs: the logs produced during diskless provisioning.
  • PostBootScript logs: the logs produced while the post boot scripts run. These scripts are specified in the “postbootscripts” attribute of the node and osimage definitions and run during the first reboot after installation.

The following behavior is observed during OS install:

+---------------------+--------------+--------------+--------------+
|  **xcatdebugmode**  |      0       |       1      |       2      |
+---------------------+----+----+----+----+----+----+----+----+----+
| OS Distribution     |RHEL|SLES|UBT |RHEL|SLES|UBT |RHEL|SLES|UBT |
+================+====+====+====+====+====+====+====+====+====+====+
| Provision      | MN | N            | Y3           | Y3           |
+  logs          +----+----+----+----+----+----+----+----+----+----+
|                | CN | N            | N            | N            |
+----------------+----+----+----+----+----+----+----+----+----+----+
| PostBootScript | MN | Y3           | Y4           | Y4           |
+  logs          +----+----+----+----+----+----+----+----+----+----+
|                | CN | Y1           | Y2           | Y2           |
+----------------+----+----+----+----+----+----+----+----+----+----+

Y1 means the installation logs are saved to the /var/log/xcat/xcat.log file on the CN.

downloaded postscripts successfully
node booted successfully,reporting status...

Y2 means the installation logs and the debug trace ("set -x" or "-o xtrace") of bash scripts are saved to the /var/log/xcat/xcat.log file on the CN.

running /opt/xcat/xcatdsklspost
+
++
node booted successfully,reporting status...

Y3 (xcatdebugmode=1 or xcatdebugmode=2) means the provision logs are forwarded to the /var/log/xcat/computes.log file on the MN. For RHEL and SLES:

running xcatroot....
Extracting root filesystem:
Done....

Y3 (xcatdebugmode=1 or xcatdebugmode=2) means the provision logs are forwarded to the /var/log/xcat/computes.log file on the MN. For UBT:

running init script...
Extracting root filesystem:
Done...

Y3 (xcatdebugmode=0 or null) means the PostBootScript logs are forwarded to the /var/log/xcat/computes.log file on the MN.

node booted successfully,reporting status...
ready
done

Y4 means the installation logs and the debug trace ("set -x" or "-o xtrace") of bash scripts are forwarded to the /var/log/xcat/computes.log file on the MN.

running /opt/xcat/xcatdsklspost
+
++
node booted successfully,reporting status...

4. Use SSH when site.xcatdebugmode=2

SSH access to the installer is enabled, and the admin can log in to the installer as follows:

  1. For RHEL, the installation does not halt; just log in to the installer with ssh root@<node>.
  2. For SLES, the installation halts after the SSH server is started, and the console output looks like:
***  sshd has been started  ***


***  login using 'ssh -X root@<node>'  ***
***  run 'yast' to start the installation  ***
As the message above suggests, the admin can open two sessions and run ssh -X root@<node>, using the system password configured in the passwd table, to log in to the installer. Then run yast in one session to continue the installation, and inspect the installation process from the other session.

After the installation is finished, the system requires a reboot. The installation halts again before the system configuration, and the console output looks like:

*** Preparing SSH installation for reboot ***
*** NOTE: after reboot, you have to reconnect and call yast.ssh ***

For SLES 11, the console output looks slightly different:

*** Starting YaST2 ***
*** Preparing SSH installation for reboot ***
*** NOTE: after reboot, you have to reconnect and call ***
*** /usr/lib/YaST2/startup/YaST2.ssh ***

Just as the message above suggests, the admin should run ssh -X root@<node> to access the installer and run yast.ssh or /usr/lib/YaST2/startup/YaST2.ssh to finish the installation.

Note: For SLES 12, YaST freezes during the second stage of an SSH installation. It is blocked by the SuSEFirewall service because the SYSTEMCTL_OPTIONS environment variable is not set properly. Workaround: when logging in for the second time to start the second stage of the SSH installation, call yast.ssh with SYSTEMCTL_OPTIONS set to --ignore-dependencies, as follows:

SYSTEMCTL_OPTIONS=--ignore-dependencies yast.ssh

  3. For UBT, the installation halts with a message similar to the following in the console:
┌───────────┤ [!!] Continue installation remotely using SSH ├───────────┐
│                                                                       │
│                               Start SSH                               │
│ To continue the installation, please use an SSH client to connect to  │
│ the IP address <node> and log in as the "installer" user. For         │
│ example:                                                              │
│                                                                       │
│    ssh installer@<node>                                               │
│                                                                       │
│ The fingerprint of this SSH server's host key is:                     │
│ <SSH_host_key>                                                        │
│                                                                       │
│ Please check this carefully against the fingerprint reported by your  │
│ SSH client.                                                           │
│                                                                       │
│                              <Continue>                               │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘

As the message shows, the admin can run ssh installer@<node> with the password “cluster” to log in to the installer. The following message is shown on login:

┌────────────────────┤ [!!] Configuring d-i ├─────────────────────┐
│                                                                 │
│ This is the network console for the Debian installer. From      │
│ here, you may start the Debian installer, or execute an         │
│ interactive shell.                                              │
│                                                                 │
│ To return to this menu, you will need to log in again.          │
│                                                                 │
│ Network console option:                                         │
│                                                                 │
│                Start installer                                  │
│                Start installer (expert mode)                    │
│                Start shell                                      │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

The admin can open two sessions, select “Start installer” in one session to continue the installation, and select “Start shell” in the other session to inspect the installation process in the installer.
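The per-OS login commands described in this section can be summarized in a small lookup. This is only a sketch: the names SSH_LOGIN and ssh_command and the node name "cn1" are hypothetical, and the command strings are copied from the text above.

```python
# Login command per OS when site.xcatdebugmode=2, as described above.
# Hypothetical helper for illustration; not part of xCAT.
SSH_LOGIN = {
    "RHEL": "ssh root@{node}",
    "SLES": "ssh -X root@{node}",
    "UBT": "ssh installer@{node}",
}


def ssh_command(os_name, node):
    """Return the SSH command for reaching the installer on the given node."""
    return SSH_LOGIN[os_name.upper()].format(node=node)


if __name__ == "__main__":
    for os_name in ("RHEL", "SLES", "UBT"):
        print(os_name, "->", ssh_command(os_name, "cn1"))
```

Only RHEL continues installing without interaction. For SLES and UBT the installer waits until the admin logs in and resumes it.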
