Warewulf Worker Node Images

All Limulus HPC systems use the Warewulf toolkit to manage worker node images. Warewulf allows worker nodes to boot “diskless” from a RAM disk; both the RAM disk and the bootable kernel are managed by Warewulf. This page describes how Warewulf is configured and run on Limulus systems. Consult the Warewulf site for more detailed information.

Note: Data Analytics systems use Apache Ambari to manage and install images to local disk on the worker nodes. This section does not apply to Data Analytics systems.

Using Warewulf (Quick Start)

Under normal circumstances, administrators will not need to manage node images. All HPC systems are configured to use default Limulus Warewulf images. The following background information will help with managing and updating worker node images. There are three main components:

  1. File System Images - Each worker node is provided a RAM disk file system image. This image contains all the “local” files for the node, including configuration, executables, and libraries. This file system is called a Virtual Network File System (VNFS) image.
  2. Bootstrap Kernel - A kernel image used to start the worker node. This image usually contains the same kernel version as the main node, but this is not required.
  3. Files - These are generally configuration files that are maintained on the main node and sent to worker nodes when booting and then periodically as the node runs (if Warewulf detects any changes in the files). These files can be the same as the ones used on the main node, or they can be specific to the worker nodes.

VNFS images, bootstrap kernels, and files can all be loaded into the Warewulf database. See the Modifying Limulus VNFS, Bootstrap, and System Files section below for more information on this process.

To view the installed VNFS images, use the wwsh command as shown below. Some helpful tips:

  • Running wwsh with no arguments starts an interactive Warewulf shell.
  • Incomplete arguments to wwsh bring up a help screen.

# wwsh vnfs list
VNFS NAME            SIZE (M)   ARCH       CHROOT LOCATION
centos7.7            353.4      x86_64     /opt/ohpc/admin/images/centos7.7
co7_base             403.5      x86_64     /var/warewulf/Limulus/co7_base

This example shows two VNFS images, centos7.7 and co7_base.
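
A script that needs to confirm an image is present before assigning it can filter this listing. The following is a minimal sketch, assuming the column layout shown above. The helper names vnfs_exists and sample_vnfs_list are hypothetical and not part of Warewulf; the sample listing is copied from the example so the snippet runs without a Warewulf installation.

```shell
#!/bin/sh
# vnfs_exists NAME: read a `wwsh vnfs list` listing on stdin and succeed
# if NAME appears in the first column. Hypothetical helper, not part of Warewulf.
vnfs_exists() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Sample listing copied from the example above (assumed column layout).
sample_vnfs_list() {
    cat <<'EOF'
VNFS NAME            SIZE (M)   ARCH       CHROOT LOCATION
centos7.7            353.4      x86_64     /opt/ohpc/admin/images/centos7.7
co7_base             403.5      x86_64     /var/warewulf/Limulus/co7_base
EOF
}

if sample_vnfs_list | vnfs_exists co7_base; then
    echo "co7_base is available"
else
    echo "co7_base is missing"
fi
```

On a live system, the output of wwsh vnfs list would be piped into vnfs_exists in place of the sample listing.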

Similarly, the available bootstrap kernels can be listed as follows.

# wwsh bootstrap list
BOOTSTRAP NAME            SIZE (M)      ARCH
5.4.1-1.el7.elrepo.x86_64 42.5          x86_64

Node-specific files can be delivered to the nodes at boot time. The wwsh file list command shows the files currently available in the Warewulf database. A file in the database is only delivered to a node once it has been assigned to that node with the wwsh provision command described below. The default files available for the nodes are listed below. To accommodate new users, each node checks for an updated /etc/{passwd,group,shadow} on the main node. (Note: after any change is made on the main node to a local file that is part of the Warewulf database, a wwsh file sync command must be issued to update the copy stored in the database.)

# wwsh file list
Limulus-node-startup.sh :  rwxr--r-- 1   root root             1113 /etc/Limulus/Limulus-node-startup.sh
dynamic_hosts           :  rw-r--r-- 0   root root              997 /etc/hosts
gmond.conf              :  rw-r--r-- 1   root root             8711 /etc/ganglia/gmond.conf
group                   :  rw-r--r-- 1   root root             1145 /etc/group
idmapd.conf             :  rw-r--r-- 1   root root             4849 /etc/idmapd.conf
munge.key               :  r-------- 1   munge munge           1024 /etc/munge/munge.key
passwd                  :  rw-r--r-- 1   root root             2811 /etc/passwd
report-ganglia-temp     :  rw-r--r-- 1   root root               74 /etc/cron.d/report-ganglia-temp
resolv.conf             :  rw-r--r-- 1   root root               54 /etc/resolv.conf
shadow                  :  rw-r----- 1   root root             1772 /etc/shadow
slurm.conf              :  rw-r--r-- 1   root root             2386 /etc/slurm/slurm.conf
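
The wwsh file sync step mentioned above is easy to forget after editing account files. The following is a minimal sketch of a wrapper that syncs a set of file objects by name. The names run, sync_files, and DRY_RUN are illustrative and not part of Warewulf; by default the commands are only printed, so the snippet can be tried on any machine.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0 is set.
# `run` and DRY_RUN are illustrative names, not part of Warewulf.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# sync_files NAME...: refresh each named Warewulf file object from its
# source path on the main node, one `wwsh file sync` call per object.
sync_files() {
    for name in "$@"; do
        run wwsh file sync "$name" || return 1
    done
}

# Example: after adding a user on the main node, refresh the account files.
sync_files passwd group shadow
```

Setting DRY_RUN=0 as root on the main node would execute the commands instead of printing them.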

The node provisioning can be viewed using the wwsh provision list command. This is a short listing of the VNFS images, bootstrap kernels, and files assigned to each node.

# wwsh provision list
NODE                VNFS            BOOTSTRAP             FILES                
================================================================================
n0                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...

A much longer and more detailed provision listing can be generated with the wwsh provision print command. If an essential image or file has not been assigned, the listing shows UNDEF in that column and the node will not be able to boot. In the wwsh provision list output below, no VNFS image has been assigned.

# wwsh provision list
NODE                VNFS            BOOTSTRAP             FILES                
================================================================================
n0                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
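
Because a node with an UNDEF entry cannot boot, it can be useful to scan the short listing for such nodes. The following is a minimal sketch, assuming the four-column layout shown above. The helper names undef_nodes and sample_provision_list are hypothetical, and the sample listing is copied from the example so the snippet runs without Warewulf installed.

```shell
#!/bin/sh
# undef_nodes: read a `wwsh provision list` listing on stdin and print the
# name of every node whose VNFS or BOOTSTRAP column is UNDEF.
# Hypothetical helper, not part of Warewulf.
undef_nodes() {
    awk '$1 != "NODE" && $1 !~ /^=/ && ($2 == "UNDEF" || $3 == "UNDEF") { print $1 }'
}

# Sample listing copied from the example above (assumed column layout).
sample_provision_list() {
    cat <<'EOF'
NODE                VNFS            BOOTSTRAP             FILES
================================================================================
n0                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
EOF
}

# Print one node name per line for every node that cannot boot.
sample_provision_list | undef_nodes
```

On a live system, wwsh provision list would be piped into undef_nodes; empty output then means every node has both a VNFS image and a bootstrap kernel assigned.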

The wwsh provision command is used to assign specific components (VNFS images, bootstrap kernels, and files) to nodes. See below.

Updating Limulus Node VNFS Images

Limulus Computing provides Warewulf node images in an RPM package. These are easily installed and managed on the cluster. See below for instructions on how to modify these images.

As configured, Limulus HPC systems have a basic node image that can be used to run the worker nodes. From time to time Limulus Computing may update the Warewulf images. To install an update, first remove the existing VNFS package and then install the new one with the yum package manager, as follows. (Note: the update can take several minutes.)

# rpm -e vnfs-co7_base
# yum install vnfs-co7_base
Loaded plugins: fastestmirror, langpacks
Determining fastest mirrors
...

The next step is to assign the new VNFS to the worker nodes. Assuming the new VNFS installed properly, the wwsh provision command can be used to set the VNFS image for the nodes. (Use wwsh vnfs list to check that the VNFS image is available.) The following command sets the VNFS image for nodes n0, n1, and n2. Enter Yes to the confirmation question. (Change the node range n[0-2] to n[0-6] for the eight-node double-wide systems.)

# wwsh provision set n[0-2] --vnfs co7_base 
Are you sure you want to make the following changes to 3 node(s):

     SET: VNFS                 = co7_base

Yes/No> Yes

The current node provisioning can be checked by using the “list” option for the wwsh provision command.

# wwsh provision list
NODE                VNFS            BOOTSTRAP             FILES                
================================================================================
n0                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...

The system should be ready to use the new images when the nodes are rebooted. To reboot all nodes enter:

# pdsh reboot

After a few minutes the nodes should be up and running (check with wwtop).
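
The update steps above can be collected into one script. The following is a minimal sketch that uses only the commands shown in this section. The names run, update_vnfs, and DRY_RUN are illustrative and not part of Warewulf; by default the commands are only printed so the sequence can be reviewed first. Note that wwsh provision set still asks for confirmation when run for real.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0 is set.
# `run` and DRY_RUN are illustrative names, not part of Warewulf.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# update_vnfs IMAGE NODES: replace the VNFS package, assign the image to
# the given node range, and reboot the nodes. Stops at the first failure.
update_vnfs() {
    image=$1   # VNFS name, e.g. co7_base
    nodes=$2   # Warewulf node range, e.g. n[0-2]
    run rpm -e "vnfs-$image" &&
    run yum install "vnfs-$image" &&
    run wwsh provision set "$nodes" --vnfs "$image" &&
    run pdsh reboot
}

# Quote the node range so the shell does not treat it as a file pattern.
update_vnfs co7_base 'n[0-2]'
```

Each step is chained with && so that, for example, a failed yum install does not lead to a reboot of nodes that have no usable image.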

Updating Limulus Node Bootstrap Images

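A bootstrap kernel is assigned to the nodes in the same way as a VNFS image, using the --bootstrap option of the wwsh provision command. (Use wwsh bootstrap list to check that the bootstrap image is available.)

# wwsh provision set n[0-2] --bootstrap 5.4.1-1.el7.elrepo.x86_64
Are you sure you want to make the following changes to 3 node(s):

     SET: BOOTSTRAP            = 5.4.1-1.el7.elrepo.x86_64

Yes/No> Yes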

Modifying Limulus VNFS, Bootstrap, and System Files

warewulf_worker_node_images.txt · Last modified: 2021/05/20 17:18 by brandonm