=====Warewulf Worker Node Images=====

All Limulus HPC systems use the [[https://warewulf.lbl.gov/|Warewulf Toolkit]] to manage worker node images. The Warewulf toolkit allows worker nodes to boot "disk-less" from a RAM disk. Warewulf manages both the RAM disk and the bootable kernel. The following describes how Warewulf is configured and run on Limulus systems. Consult the Warewulf site for more detailed information.

Note: Data Analytics systems use [[https://ambari.apache.org|Apache Ambari]] to manage and install images to local disk on the worker nodes. This section does not apply to Data Analytics systems.

====Using Warewulf (Quick Start)====

Under normal circumstances, administrators will not need to manage node images. All HPC systems are configured to use default Limulus Warewulf images. The following background information will help with managing and updating worker node images. There are three main components:

  - **File System Images** - Each worker node is provided a RAM disk file system image. This image contains all the "local" files for the node, including configuration, executables, and libraries. This file system is called a Virtual Network File System (VNFS) image.
  - **Bootstrap Kernel** - A kernel image used to start the worker node. This image usually uses the same kernel version as the main node, but this is not required.
  - **Files** - These are generally configuration files that are maintained on the main node and sent to worker nodes when booting, and then periodically as the node runs (if Warewulf detects any changes in the files). These files can be the same as the ones used on the main node, or they can be specific to the worker nodes.

VNFS images, bootstrap kernels, and files can all be loaded into the Warewulf database. See the [[warewulf_worker_node_images#Modifying Limulus VNFS, Bootstrap, and System Files]] section below for more information on this process.

To view the installed **VNFS images**, use the ''wwsh'' shell command as follows. Some helpful tips:

  * Running ''wwsh'' without any arguments starts an interactive Warewulf shell.
  * Incomplete arguments to ''wwsh'' bring up a help screen.

  # wwsh vnfs list
  VNFS NAME            SIZE (M)   ARCH       CHROOT LOCATION
  centos7.7            353.4      x86_64     /opt/ohpc/admin/images/centos7.7
  co7_base             403.5      x86_64     /var/warewulf/Limulus/co7_base

This example shows two VNFS images, //centos7.7// and //co7_base//. Similarly, the available **bootstrap kernels** can be listed as follows.

  # wwsh bootstrap list
  BOOTSTRAP NAME              SIZE (M)   ARCH
  5.4.1-1.el7.elrepo.x86_64   42.5       x86_64

**Node-specific files** can be delivered to the nodes at boot time. The ''wwsh file list'' command shows the files currently available in the Warewulf database (these files must be specifically assigned to nodes using the ''wwsh provision'' command described below). To accommodate new users, each node will check for an updated ''/etc/{passwd,group,shadow}'' on the main host. (Note that after any change is made to a local file on the main node that is part of the Warewulf database, a ''wwsh file sync'' command must be issued to update the file in the "files" database.) For example, the default files available for the nodes are listed below.

  # wwsh file list
  Limulus-node-startup.sh :  rwxr--r-- 1   root  root   1113 /etc/Limulus/Limulus-node-startup.sh
  dynamic_hosts           :  rw-r--r-- 0   root  root    997 /etc/hosts
  gmond.conf              :  rw-r--r-- 1   root  root   8711 /etc/ganglia/gmond.conf
  group                   :  rw-r--r-- 1   root  root   1145 /etc/group
  idmapd.conf             :  rw-r--r-- 1   root  root   4849 /etc/idmapd.conf
  munge.key               :  r-------- 1   munge munge  1024 /etc/munge/munge.key
  passwd                  :  rw-r--r-- 1   root  root   2811 /etc/passwd
  report-ganglia-temp     :  rw-r--r-- 1   root  root     74 /etc/cron.d/report-ganglia-temp
  resolv.conf             :  rw-r--r-- 1   root  root     54 /etc/resolv.conf
  shadow                  :  rw-r----- 1   root  root   1772 /etc/shadow
  slurm.conf              :  rw-r--r-- 1   root  root   2386 /etc/slurm/slurm.conf

The **node provisioning** can be viewed using the ''wwsh provision list'' command. This is a short listing of the VNFS images, bootstrap kernels, and files assigned to each node.

  # wwsh provision list
  NODE                VNFS            BOOTSTRAP             FILES
  ================================================================================
  n0                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
  n1                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
  n2                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...

A much longer and more detailed [[provision listing]] can be generated by using the ''wwsh provision print'' command. If an essential image or file has not been assigned (in the case below, a VNFS image), the listing will show ''UNDEF'' and the node **will not be able to boot**.

  # wwsh provision list
  NODE                VNFS            BOOTSTRAP             FILES
  ================================================================================
  n0                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
  n1                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...
  n2                  UNDEF           5.4.1-1.el7.elrepo... Limulus-node-start...

The ''wwsh provision'' command is used to assign specific components (VNFS images, bootstrap kernels, and files) to nodes, as described in the sections below.

====Updating Limulus Node VNFS Images====

Limulus Computing provides Warewulf node images in an RPM package.
These are easily installed and managed on the cluster. See below for instructions on how to modify these images.

As configured, Limulus HPC systems have a basic node image that can be used to run the worker nodes. From time to time Limulus Computing may update the Warewulf images. An updated image is installed by first removing the existing VNFS package and then installing the new one with the ''yum'' package manager, as follows. (Note: the update can take several minutes.)

  # rpm -e vnfs-co7_base
  # yum install vnfs-co7_base
  Loaded plugins: fastestmirror, langpacks
  Determining fastest mirrors
  ...

The next step is to assign the new VNFS to the worker nodes. Assuming the new VNFS installed properly, the ''wwsh provision'' command can be used to set the VNFS image for the nodes. (Use ''wwsh vnfs list'' to check that the VNFS image is available.) The following command sets the VNFS image for nodes n0, n1, and n2. Enter ''Yes'' at the confirmation prompt. (Change ''n[0-2]'' to ''n[0-6]'' for the eight-node double-wide systems.)

  # wwsh provision set n[0-2] --vnfs co7_base
  Are you sure you want to make the following changes to 3 node(s):
       SET: VNFS                 = co7_base
  Yes/No> Yes

The current node provisioning can be checked by using the "list" option for the ''wwsh provision'' command.

  # wwsh provision list
  NODE                VNFS            BOOTSTRAP             FILES
  ================================================================================
  n0                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
  n1                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...
  n2                  co7_base        5.4.1-1.el7.elrepo... Limulus-node-start...

The system should be ready to use the new images when the nodes are rebooted. To reboot all nodes enter:

  # pdsh reboot

After a few minutes the nodes should be up and running (check with ''wwtop'').

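Because a node with an ''UNDEF'' entry will not boot, it can be worth scanning the provision listing before rebooting. The following is a minimal sketch, not part of the Limulus or Warewulf tooling: the function name ''undefined_nodes'' is hypothetical, and it assumes the ''wwsh provision list'' output has the column layout shown in the examples above (node name first, ''UNDEF'' as a separate field).

```shell
#!/bin/sh
# Hypothetical helper: print the name of every node whose short provision
# listing contains an UNDEF field. Reads `wwsh provision list` output on stdin.
# Assumes the layout shown in this section: node name in the first column.
undefined_nodes() {
    awk '
        /^NODE/ || /^=+$/ { next }   # skip the header and the ruler line
        {
            # Look at every column after the node name.
            for (i = 2; i <= NF; i++) {
                if ($i == "UNDEF") { print $1; next }
            }
        }
    '
}

# Only query the live cluster when wwsh is actually installed.
if command -v wwsh >/dev/null 2>&1; then
    bad=$(wwsh provision list | undefined_nodes)
    if [ -n "$bad" ]; then
        echo "Nodes with UNDEF entries (these will not boot):" >&2
        echo "$bad" >&2
    else
        echo "No UNDEF entries found in the provision listing."
    fi
fi
```

If any node names are reported, assign the missing component with ''wwsh provision set'' before rebooting the nodes.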
====Updating Limulus Node Bootstrap Images====

  # wwsh provision set n[0-2] --vnfs co7_base
  Are you sure you want to make the following changes to 3 node(s):
       SET: VNFS                 = co7_base
  Yes/No> Yes

====Modifying Limulus VNFS, Bootstrap, and System Files====