Automated installation of Ubuntu 22.04 on a GH200 system from a USB drive
A lot of this came from the Official Nvidia Ubuntu 22.04 Grace Installation Guide. If you have problems, please see that guide for more details.
This repo was created based on testing of the GH200 system (specifically the Supermicro ARS-111GL-NHR). The systems I'm testing on also have a BlueField-3 installed, but this should not matter for the installation.
The contained scripts will perform the following actions:
- Create a new user (as defined by user)
- Make `linux-nvidia-64k-hwe-22.04` the default kernel
- Create a 12G "recovery partition"
- Install the Nvidia CUDA SBSA repo
- Install the Nvidia MLNX-OFED repo
- Update/Upgrade all packages
- Install the first-boot service, which runs once on the first boot
- Install cuda-drivers and nvidia-kernel-open drivers
- Install cuda-toolkit-12-4 nvidia-container-toolkit
- Install mlnx-fw-updater mlnx-ofed-all
- Update and enable the nvidia-persistenced service to work around a bug
- Disable the first-boot service
- Reboot the system
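After the final reboot, a few quick checks can confirm that the steps above took effect. The following is a minimal sketch, not part of this repo: `report` and `page_size_is_64k` are hypothetical helpers, and it assumes `getconf`, `systemctl`, `nvidia-smi`, and `ofed_info` are on the PATH of the installed system. A 64k kernel reports a page size of 65536 bytes.

```shell
#!/bin/sh
# Post-install sanity check (sketch). Helper names are hypothetical.

# report LABEL COMMAND [ARGS...]
# Runs COMMAND quietly and prints OK or FAILED next to LABEL.
report() {
    label=$1; shift
    if "$@" >/dev/null 2>&1; then
        printf '%-8s%s\n' "OK" "$label"
    else
        printf '%-8s%s\n' "FAILED" "$label"
    fi
}

# page_size_is_64k SIZE
# Succeeds when SIZE (in bytes) equals 65536, i.e. 64 KiB pages.
page_size_is_64k() {
    [ "$1" -eq 65536 ]
}

report "64k page size in use"        page_size_is_64k "$(getconf PAGESIZE)"
report "nvidia-persistenced enabled" systemctl is-enabled nvidia-persistenced
report "GPU visible to the driver"   nvidia-smi
report "MLNX-OFED installed"         ofed_info -s
```

Any line marked FAILED points at the step to revisit. A missing command is also reported as FAILED, so the script is safe to run on a system where the installation did not complete.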
There are multiple ways to create an installable Ubuntu 22.04 USB drive. I used Rufus on a Windows 11 system.
- Download Rufus Portable and the latest Ubuntu 22.04.4 ISO.
- Select the USB drive.
- Write the image in ISO mode (ensuring you can modify the files on the USB drive afterwards).
Before copying over the files, you'll need to customize the `cidata/user-data` file with your installation details.
Replace the following items:
Item | Description |
---|---|
<HOSTNAME> | Hostname of the system |
<PASSWORD> | SHA-512 password hash (generate with `openssl passwd -6`) |
<USERNAME> | Initial username |
<ADDRESS-CIDR> | Address of network port in CIDR form (e.g. 10.1.1.10/24) |
<GATEWAY-ADDRESS> | Address of network gateway |
<NAMESERVER-N> | Addresses of the DNS nameservers |
<SEARCH-DOMAIN> | DNS Search domain (e.g. my-domain.com) |
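The replacements can also be scripted rather than made by hand. The sketch below is only an illustration: `replace_placeholder` is a hypothetical helper that is not part of this repo, the demo file contents are made up, and the real `user-data` layout may differ. It uses `|` as the sed delimiter because CIDR addresses and SHA-512 hashes contain `/`.

```shell
#!/bin/sh
# Fill <PLACEHOLDER> tokens in a copy of cidata/user-data (sketch).

# replace_placeholder FILE NAME VALUE
# Replaces every literal <NAME> in FILE with VALUE.
# VALUE must not contain '|', '&' or '\', which are special to sed here.
replace_placeholder() {
    rp_file=$1 rp_name=$2 rp_value=$3
    sed "s|<${rp_name}>|${rp_value}|g" "$rp_file" > "$rp_file.tmp" &&
        mv "$rp_file.tmp" "$rp_file"
}

# Demo on a throwaway file. The field names and values are illustrative only.
demo=$(mktemp)
printf 'hostname: <HOSTNAME>\naddress: <ADDRESS-CIDR>\n' > "$demo"
replace_placeholder "$demo" HOSTNAME gh200-01
replace_placeholder "$demo" ADDRESS-CIDR 10.1.1.10/24
cat "$demo"
rm -f "$demo"
```

For the password, `hash=$(openssl passwd -6)` prompts for the password and stores the hash, which can then be passed as `replace_placeholder cidata/user-data PASSWORD "$hash"`.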
After creating a bootable Ubuntu installation drive, copy the files from cidata to the USB drive:
- Create directory `cidata` in the root of the Ubuntu USB drive
- Copy all files over to the `cidata` directory on the Ubuntu USB drive
  - user-data : Ubuntu Autoinstall file
  - meta-data : Ubuntu Meta-data file
  - first-boot.service : One-shot service to launch the first-boot.sh script
  - first-boot.sh : Script that will run on first boot
- Update the `boot/grub/grub.cfg` file and add the following menuentry to the list:
```
menuentry "Install GH200 System (Requires Internet)" {
    set gfxpayload=keep
    linux /casper/vmlinuz quiet autoinstall 'ds=nocloud-net;s=file:///cdrom/cidata/'
    initrd /casper/initrd
}
```
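If the USB drive is mounted on a Linux or WSL system, the copy and the grub.cfg edit can be sketched as a small script. This is an assumption-laden example, not part of this repo: `stage_usb` and its two arguments are hypothetical, and the mount point in the usage comment is only a placeholder for wherever the drive appears on your system.

```shell
#!/bin/sh
# Copy the cidata files to the USB drive and append the menuentry (sketch).

# stage_usb SRC_CIDATA USB_ROOT
stage_usb() {
    src=$1 usb=$2
    grub_cfg="$usb/boot/grub/grub.cfg"
    [ -f "$grub_cfg" ] || return 1

    # Create cidata in the USB root and copy the four files into it.
    mkdir -p "$usb/cidata" || return 1
    for f in user-data meta-data first-boot.service first-boot.sh; do
        cp "$src/$f" "$usb/cidata/" || return 1
    done

    # Append the menuentry only if it is not already present.
    if ! grep -q 'Install GH200 System' "$grub_cfg"; then
        cat >> "$grub_cfg" <<'EOF'

menuentry "Install GH200 System (Requires Internet)" {
    set gfxpayload=keep
    linux /casper/vmlinuz quiet autoinstall 'ds=nocloud-net;s=file:///cdrom/cidata/'
    initrd /casper/initrd
}
EOF
    fi
}

# Example (hypothetical mount point):
#   stage_usb ./cidata /media/usb
```

Because the entry is appended to the end of the file, it shows up last in the boot menu and has to be selected manually. Move it above the existing entries if you want it to be the default.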