Each Limulus system consists of one login-node and three or seven worker nodes. As shipped, the login-node has the alias name headnode (or limulus), and the worker nodes are named n0, n1, and n2 on three-node systems, or n0 through n6 on seven-node systems. Do not attempt to change the node names; many services rely on this naming scheme.
Power to the three (or seven) worker nodes is controlled by the main login-node (the motherboard to which the monitor, keyboard, and mouse are connected on the back of the case). There is a set of relays underneath the 1 GbE switch inside the case. These relays can be controlled directly by using the relayset program (see below).
Direct use of relayset is not recommended as a method to control the nodes, because it essentially cuts or applies power to the nodes without shutting down the operating system.
Use of the node-poweron and node-poweroff utilities or the power control GUI is strongly recommended. These utilities are described below.
!!!Resetting or rebooting the login-node will shut down all worker nodes!!!
All nodes are in the powered-down state when the system is initially started. Nodes can be started using either the command line or the GUI tool described below. It is assumed that the user will control which nodes are powered on (or off) as they use the system. If all nodes are to be started when the system starts, add the line /usr/sbin/node-poweron at the end of the Limulus systemd startup file (this file acts like an rc.local file in SysV init systems). The file can be found in the “hidden” Limulus management directory: /etc/warewulf/.Limulus/Limulus-startup.sh. Be careful when modifying this file (make a backup before editing). As noted previously, node power control can be combined with the Slurm resource scheduler to automatically power nodes on and off as they are needed. See Slurm Workload Manager.
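The edit described above (back up the startup file, then add the node-poweron line at the end) can be sketched as a small shell function. This is only a sketch under stated assumptions, not a Limulus-supplied tool: the function name add_poweron_to_startup and the .bak backup suffix are illustrative choices; the file path and the line to add come from the text above.

```shell
# Sketch: back up the Limulus startup file, then append the node-poweron
# line once. The function name and the ".bak" suffix are assumptions made
# for illustration only.

add_poweron_to_startup() {
    local file="${1:-/etc/warewulf/.Limulus/Limulus-startup.sh}"
    local line="/usr/sbin/node-poweron"

    # Refuse to continue if the startup file is missing.
    if [ ! -f "$file" ]; then
        echo "not found: $file" >&2
        return 1
    fi

    # Nothing to do if the exact line is already present.
    if grep -qxF "$line" "$file"; then
        echo "already present in $file"
        return 0
    fi

    # Keep a copy of the original (with its permissions) before editing.
    cp -p "$file" "$file.bak" || return 1

    # Make sure the file ends with a newline so the new line stands alone.
    if [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi

    # Append the command at the end of the file.
    printf '%s\n' "$line" >> "$file"
    echo "added $line to $file"
}

# Usage (as root): add_poweron_to_startup
```

Running the function a second time leaves the file unchanged, so it does not add the line twice.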
Shutting down the login-node, either by rebooting or powering down, will gracefully shut down the worker nodes (i.e., they will receive a local poweroff command sent by the login-node).
On Data Analytics (Hadoop) systems, all nodes and all services (e.g. HDFS, YARN) are started when the system is initially powered on. Nodes can be started using either the command line or the GUI tool described below; however, power cycling nodes may disrupt the running service daemons.
Proper shutdown of Data Analytics (Hadoop) systems is described in the Using the Apache Ambari Cluster Manager section. In short, the Hadoop services should be gracefully shut down before the system is powered down. Note that these services are robust and can often recover from a sudden (or unexpected) power loss; however, data can be lost, particularly in HDFS.
node-poweron and node-poweroff are command-line utilities available on the login-node. These commands will only operate for the root user.
Executing either command with no arguments will power-on/off all nodes. An individual node can be supplied as an argument.
To turn on all nodes, simply enter:
# node-poweron
To turn on node n2, enter:
# node-poweron n2
The node-poweron command will wait until the node(s) have fully booted (i.e., the operating system is up and running), or, if no operating system can be detected, it will time out. Thus, the command can take up to several minutes to complete. Interrupting this command may put the system in an unstable state. The full option list is given below. Note that there is a -s option to run the command with no output in a script.
Also note that if a node is already up and running, turning the power on with node-poweron will have no effect.
# node-poweron -h
node-power-on [-h help] [-s silent] [nodes]
No node arguments turns all nodes ON.
If a node is already on, nothing will happen.
Node name(s) can be given as argument(s) in the range {n0,...,n6}.
For example:
  # node-poweron n0 n2
  # node-poweron -s n1
  # node-poweron -s
Invalid nodes will be ignored.
Default Limulus nodes are {n0,n1,n2}
The script waits until all nodes are started or the process times out.
-s runs in quiet mode; -h provides this help.
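When node-poweron is called from a script, it may help to check the node names first, because the help text says invalid names are silently ignored. The following is a minimal sketch of such a wrapper, assuming the root requirement and the n0 to n6 name range stated above. The helper names valid_nodes and poweron_checked and the POWERON_CMD variable (which allows a dry run) are hypothetical and are not part of the Limulus tools.

```shell
# Sketch: validate node names before calling node-poweron from a script.
# valid_nodes, poweron_checked, and POWERON_CMD are hypothetical names.

# Print only the arguments that look like Limulus worker names (n0..n6).
valid_nodes() {
    local name
    for name in "$@"; do
        case "$name" in
            n[0-6]) printf '%s\n' "$name" ;;
            *)      echo "skipping invalid node name: $name" >&2 ;;
        esac
    done
}

# Power on the requested nodes in silent mode; must be run as root.
poweron_checked() {
    local nodes
    if [ "$(id -u)" -ne 0 ]; then
        echo "node-poweron must be run as root" >&2
        return 1
    fi
    nodes=$(valid_nodes "$@")
    # If names were given but none were valid, stop here rather than
    # passing an empty list (no arguments means "all nodes").
    if [ "$#" -gt 0 ] && [ -z "$nodes" ]; then
        echo "no valid node names given" >&2
        return 1
    fi
    # Word splitting of $nodes is intended: one argument per node.
    ${POWERON_CMD:-node-poweron} -s $nodes
}

# Usage (as root):  poweron_checked n0 n2
# Dry run:          POWERON_CMD=echo poweron_checked n0 n2
```

The check for an empty list is a design choice: the help text does not say whether a call containing only invalid names behaves like a call with no arguments, so the wrapper refuses to find out.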
To turn all nodes off gracefully (remove power), simply enter:
# node-poweroff
To turn off just node n0, enter:
# node-poweroff n0
The node-poweroff command will wait until the node(s) have fully shut down (i.e., it cannot contact the node operating systems) or until a timeout occurs. If the timeout occurs AND the node relay indicates the power is applied, power is removed (relay is turned off) regardless of operating system state (i.e., the “plug is pulled” for the node).
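The shutdown decision described above can be illustrated with a small simulation. This is not the source of node-poweroff. The function wait_then_cut, the helpers node_is_up, relay_is_on, and cut_power, the SIM_* and POLL variables, and the default limit of 120 polls are all hypothetical placeholders. The helpers simulate a node so that the logic can be tried without touching real hardware, and the sketch models only the timeout branch described in the text.

```shell
# Sketch of the documented decision: wait for the node to stop responding;
# on timeout, remove power only if the relay still reports power applied.
# Everything here is a simulation with hypothetical names.

SIM_UP_POLLS=3   # simulated node answers this many polls, then goes silent
SIM_RELAY=on     # simulated relay state (on|off)

# Simulated helpers; a real version would contact the node and the relay.
node_is_up()  { [ "$SIM_UP_POLLS" -gt 0 ] && SIM_UP_POLLS=$((SIM_UP_POLLS - 1)); }
relay_is_on() { [ "$SIM_RELAY" = on ]; }
cut_power()   { SIM_RELAY=off; }

wait_then_cut() {
    # $1 = node name, $2 = maximum number of polls before timing out
    local node="$1" max_polls="${2:-120}" polls=0

    while node_is_up "$node"; do
        if [ "$polls" -ge "$max_polls" ]; then
            # Timed out: the operating system never confirmed shutdown.
            if relay_is_on "$node"; then
                cut_power "$node"
                echo "$node: timeout, power removed"
            else
                echo "$node: timeout, relay already off"
            fi
            return 0
        fi
        sleep "${POLL:-1}"   # seconds between polls
        polls=$((polls + 1))
    done
    echo "$node: shut down cleanly"
}

# Usage: SIM_UP_POLLS=100 POLL=0 wait_then_cut n0 3
```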
Like node-poweron, this command can take up to several minutes to complete. Interrupting this command may put the system in an unstable state. The full option list is given below. Note that there is a -s option to run the command with no output.
Also note that if a node is already down, turning the power off with node-poweroff will have no effect.
# node-poweroff -h
node-power-on [-h help] [-s silent] [nodes]
No node arguments turns all nodes OFF.
If a node is already off, nothing will happen.
Node name(s) can be given as argument(s) in the range {n0,...,n6}.
For example:
  # node-poweroff n0 n2
  # node-poweroff -s n1
  # node-poweroff -s
Invalid nodes will be ignored.
Default Limulus nodes are {n0,n1,n2}
A delay is included so nodes can properly shutdown before power is removed.
Any node attached drives are placed in stand-by mode.
-s runs in quiet mode; -h provides this help.
Power to the nodes can also be controlled using a GUI tool. Using both the command line and GUI tools at the same time may cause system instability.
The GUI power control tool can be started from the command line by entering, as the root user:
# NPstat
There is also a “Node Power Status” entry in the Applications/System Tools menu (only visible to the root user). An example menu is shown below.
The main power control panel shown below is displayed either through the command line or the Applications menus. In addition to power status there are several other status indicators. These indicators are described as follows:
Note, the Node Power Control tool is not intended as a monitoring tool. The response times can be slow due to how the information is obtained from the nodes. It is primarily designed to provide information needed for power control of the worker nodes.
There are three buttons at the bottom of the panel.
Selecting the node to control is done with the selection window shown below. Any combination of nodes can be powered on or off using this selection box. If a node is checked, it will be powered on. If a node is not checked, it will be powered off. The node name and the current status are indicated on the panel. Similar to the command line tools, if a node is already on (or off), setting it to on (or off) will have no effect.
In the selection window above, node n0 will have no change, node n1 will be turned off, and node n2 will be turned on. Clicking “OK” will display the Confirmation Window shown below.
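The selection rule (checked means on, unchecked means off, no effect if the node is already in that state) can be written as a small lookup. This is only an illustration of the rule as described; plan_action is a hypothetical name and is not part of the GUI tool.

```shell
# Sketch: the action implied by a node's current power state and its checkbox.
# plan_action is a hypothetical helper used only to illustrate the rule.

plan_action() {
    # $1 = current state (on|off), $2 = checkbox (checked|unchecked)
    case "$1:$2" in
        on:checked|off:unchecked) echo "no change" ;;
        off:checked)              echo "power on" ;;
        on:unchecked)             echo "power off" ;;
        *) echo "usage: plan_action on|off checked|unchecked" >&2; return 1 ;;
    esac
}

# Usage: plan_action on unchecked
```

For the example in the text, this corresponds to n0 being on and checked (no change), n1 being on and unchecked (power off), and n2 being off and checked (power on), assuming those were the node states in the selection window.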
This window will indicate what will happen for each node. The power control choices can be changed by using the “Back” button. Once the choice is correct, entering “Yes” will start the power control operations. An indicator of which nodes are powering down will be displayed until they are finished. Like the command line tools, the shutdown will time out and cut power if the operating system's shutdown cannot be confirmed. The “Close” button does not stop the shutdown process.
Next, all nodes slated to power up will be started. In this case, node n2 will start. This process cannot be interrupted and the window will remain until the startup process is complete or the timeout has been reached.
At the end of the power-up or power-down cycle, the main Power Control panel will display the current state of the system.
The low-level relayset utility is available to administrators; however, it should only be used as a last resort. The node-poweron and node-poweroff utilities provide a much more controlled and graceful method of turning systems on and off.
For reference, the options to relayset are provided below.
Important: All Limulus systems use the following convention: n0 is connected to relay-2, n1 is connected to relay-3, and n2 is connected to relay-4. Contact Limulus Computing for larger numbers of nodes.
# relayset
Not enough or wrong arguments.
To initialize (do first): relayset init
To turn relay on/off: relayset 1|2|3|4 on|off
To get status: relayset 1|2|3|4 status (Returns 1 if on, 0 if off)
To list the devices (with ID): relayset list
To create the relay node map: relayset map
For multiple boards add the board ID to the command line (not needed for list and map)
To print debug messages add "debug" to the command line
Returns -1 on error, 0 or 1 if successful.
Version: 0.3-05-30-17
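Because relayset takes relay numbers rather than node names, the convention stated above (n0 on relay-2, n1 on relay-3, n2 on relay-4) can be captured in a small lookup to avoid off-by-two mistakes. This is a minimal sketch; relay_for_node is a hypothetical helper and is not part of relayset. Nodes beyond n2 are deliberately rejected, since their relay assignments are not documented here.

```shell
# Sketch: translate a worker node name to its relay number using the
# documented convention (n0 -> 2, n1 -> 3, n2 -> 4).
# relay_for_node is a hypothetical helper, not part of relayset.

relay_for_node() {
    case "$1" in
        n0) echo 2 ;;
        n1) echo 3 ;;
        n2) echo 4 ;;
        *)  echo "no documented relay for node: $1" >&2; return 1 ;;
    esac
}

# Read-only usage (as root): relayset "$(relay_for_node n1)" status
```

The usage line only queries status, which does not change node power. The usage text says status returns 1 if on and 0 if off, but does not say whether that value is printed or given as the exit status, so check it on the system before relying on it in a script.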