All Limulus HPC systems use the Warewulf Toolkit to manage worker node images. The Warewulf Toolkit allows worker nodes to boot "diskless" from a RAM disk. Both the RAM disk and the bootable kernel are managed by Warewulf. The following describes how Warewulf is configured and run on Limulus systems. Consult the Warewulf site for more detailed information.
Note: Data Analytics systems use Apache Ambari to manage and install images to local disk on the worker nodes. This section does not apply to Data Analytics Systems.
Under normal circumstances, administrators will not need to manage node images. All HPC systems are configured to use the default Limulus Warewulf images. The following background information will help with managing and updating worker node images. There are three main components: VNFS (Virtual Node File System) images, bootstrap kernels, and files.
All three can be loaded into the Warewulf database. See the Modifying Limulus VNFS, Bootstrap, and System Files section below for more information on this process.
To view the installed VNFS images, use the wwsh shell command as follows.
# wwsh vnfs list
VNFS NAME            SIZE (M)   ARCH       CHROOT LOCATION
centos7.7            353.4      x86_64     /opt/ohpc/admin/images/centos7.7
co7_base             403.5      x86_64     /var/warewulf/Limulus/co7_base
This example shows two VNFS images, centos7.7 and co7_base.
Similarly, the available bootstrap kernels can be listed as follows.
# wwsh bootstrap list
BOOTSTRAP NAME               SIZE (M)   ARCH
5.4.1-1.el7.elrepo.x86_64    42.5       x86_64
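Before assigning an image or kernel to a node, it can be useful to confirm that its name appears in one of these listings. The following is a minimal sketch, not part of Warewulf: listing_has_name is a hypothetical helper that reads a listing on standard input and succeeds if the given name appears in the first column. It assumes the column layout shown above.

```shell
#!/bin/sh
# listing_has_name NAME
# Hypothetical helper (not a Warewulf command): succeeds if NAME appears
# in the first column of a "wwsh vnfs list" or "wwsh bootstrap list"
# listing supplied on standard input.
listing_has_name() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Sample listing copied from the "wwsh vnfs list" output above.
sample='VNFS NAME            SIZE (M)   ARCH       CHROOT LOCATION
centos7.7            353.4      x86_64     /opt/ohpc/admin/images/centos7.7
co7_base             403.5      x86_64     /var/warewulf/Limulus/co7_base'

if printf '%s\n' "$sample" | listing_has_name co7_base; then
    echo "co7_base is available"
fi
```

On a live system, the sample text would be replaced by the command output itself, for example wwsh vnfs list | listing_has_name co7_base.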
Node-specific files can be delivered to the nodes at boot time. The wwsh file list command shows the files currently available in the Warewulf database (these files must be specifically assigned using the wwsh provision command described below). For example, the default files available for the nodes are listed below. To accommodate new users, each node will check for an updated /etc/{passwd,group,shadow} on the main host. (Note that after any change is made to a local file that is part of the Warewulf database on the main node, a wwsh file sync command must be issued to update the file in the "files" database.)
# wwsh file list
Limulus-node-startup.sh :  rwxr--r-- 1   root  root    1113 /etc/Limulus/Limulus-node-startup.sh
dynamic_hosts           :  rw-r--r-- 0   root  root     997 /etc/hosts
gmond.conf              :  rw-r--r-- 1   root  root    8711 /etc/ganglia/gmond.conf
group                   :  rw-r--r-- 1   root  root    1145 /etc/group
idmapd.conf             :  rw-r--r-- 1   root  root    4849 /etc/idmapd.conf
munge.key               :  r-------- 1   munge munge   1024 /etc/munge/munge.key
passwd                  :  rw-r--r-- 1   root  root    2811 /etc/passwd
report-ganglia-temp     :  rw-r--r-- 1   root  root      74 /etc/cron.d/report-ganglia-temp
resolv.conf             :  rw-r--r-- 1   root  root      54 /etc/resolv.conf
shadow                  :  rw-r----- 1   root  root    1772 /etc/shadow
slurm.conf              :  rw-r--r-- 1   root  root    2386 /etc/slurm/slurm.conf
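Because a wwsh file sync is needed only after editing a file that Warewulf tracks, it can help to check whether a given path is in the listing. The sketch below is illustrative only: tracked_name is a hypothetical helper that reads a file listing on standard input and prints the Warewulf file name whose path (assumed to be the last column, as in the listing above) matches the argument.

```shell
#!/bin/sh
# tracked_name PATH
# Hypothetical helper (not a Warewulf command): prints the Warewulf file
# name (first column) for the entry whose path (last column) equals PATH.
# Prints nothing if the path is not in the listing.
tracked_name() {
    awk -v path="$1" '$NF == path { print $1 }'
}

# A few entries copied from the "wwsh file list" output above.
sample='dynamic_hosts : rw-r--r-- 0 root root 997 /etc/hosts
passwd : rw-r--r-- 1 root root 2811 /etc/passwd
slurm.conf : rw-r--r-- 1 root root 2386 /etc/slurm/slurm.conf'

name=$(printf '%s\n' "$sample" | tracked_name /etc/slurm/slurm.conf)
if [ -n "$name" ]; then
    echo "$name is in the Warewulf database; run wwsh file sync after editing it"
fi
```

Note that the Warewulf name and the path can differ, as with dynamic_hosts and /etc/hosts.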
The node provisioning can be viewed using the wwsh provision list command. This is a short listing of the VNFS images, bootstrap kernels, and files assigned to each node.
# wwsh provision list
NODE                VNFS            BOOTSTRAP             FILES
================================================================================
n0                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
A much longer, more detailed provision listing can be generated with the wwsh provision print command. If an essential image or file has not been assigned (in the case below, a VNFS image), the listing will show UNDEF and the node will not be able to boot.
# wwsh provision list
NODE                VNFS            BOOTSTRAP             FILES
================================================================================
n0                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
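Since a node with an UNDEF entry cannot boot, the listing can be scanned for such nodes before a reboot. The following is a minimal sketch under the assumption that the listing has the column layout shown above (node name, VNFS, bootstrap, files); undef_nodes is a hypothetical helper, not a Warewulf command.

```shell
#!/bin/sh
# undef_nodes
# Hypothetical helper (not a Warewulf command): reads a
# "wwsh provision list" listing on standard input and prints the name of
# every node whose VNFS or BOOTSTRAP column is UNDEF.
undef_nodes() {
    awk '$2 == "UNDEF" || $3 == "UNDEF" { print $1 }'
}

# Sample listing copied from the output above.
sample='NODE                VNFS            BOOTSTRAP             FILES
================================================================================
n0                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...'

# Prints n0, n1, and n2, one per line, for this sample.
printf '%s\n' "$sample" | undef_nodes
```

On a live system, the helper would be fed directly from the command: wwsh provision list | undef_nodes. An empty result means every node has both a VNFS image and a bootstrap kernel assigned.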
The wwsh provision command is used to assign specific components (VNFS images, bootstrap kernels, and files) to nodes. See below.
Limulus Computing provides Warewulf node images in an RPM package. These are easily installed and managed on the cluster. See below for instructions on how to modify these images.
As configured, Limulus HPC systems have a basic node image that can be used to run the worker nodes.
From time to time, Limulus Computing may update the Warewulf images. These can be installed by first removing the existing VNFS package and then using the yum package manager to install the new VNFS, as follows. (Note: the update can take several minutes.)
# rpm -e vnfs-co7_base
# yum install vnfs-co7_base
Loaded plugins: fastestmirror, langpacks
Determining fastest mirrors
...
The next step is to assign the new VNFS to the worker nodes. Assuming the new VNFS installed properly, the wwsh provision command can be used to set the VNFS image for the nodes. (Use wwsh vnfs list to check that the VNFS image is available.) The following command sets the VNFS image for nodes n0, n1, and n2. Enter Yes at the confirmation prompt. (Change set n[0-2] to set n[0-6] for the eight-node double-wide systems.)
# wwsh provision set n[0-2] --vnfs co7_base
Are you sure you want to make the following changes to 3 node(s):

     SET: VNFS                 = co7_base

Yes/No> Yes
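The node range in this command depends on the system size. As a small sketch, the hypothetical helper worker_range below derives the range from the total node count, assuming that the count includes the main node and that workers are numbered from n0 (which gives n[0-6] for the eight-node systems and n[0-2] for the three workers used in the example). The script only prints the command; it does not run it.

```shell
#!/bin/sh
# worker_range TOTAL_NODES
# Hypothetical helper: prints the worker node range for a system with
# TOTAL_NODES nodes. Assumes one main node plus workers named n0, n1, ...
worker_range() {
    last=$(( $1 - 2 ))
    if [ "$last" -lt 0 ]; then
        echo "worker_range: need at least 2 nodes" >&2
        return 1
    fi
    printf 'n[0-%s]\n' "$last"
}

# Print (do not execute) the provisioning command for an eight-node system.
echo "wwsh provision set $(worker_range 8) --vnfs co7_base"
```

Printing the command first makes it easy to review the node range before running it by hand and answering the confirmation prompt.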
The current node provisioning can be checked by using the "list" option of the wwsh provision command.
# wwsh provision list
NODE                VNFS            BOOTSTRAP             FILES
================================================================================
n0                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n1                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
n2                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
The system should be ready to use the new images when the nodes are rebooted. To reboot all nodes enter:
# pdsh reboot
After a few minutes, the nodes should be up and running (check with wwtop).