Node Power Control

Each of the three worker nodes (seven nodes on the double-wide Limulus) can be powered up or down from the login node. This feature is generally not used in the Data Analytics systems because most of the daemons must be constantly running. In particular, the Hadoop Distributed File System (HDFS) runs on all worker nodes.

This feature can be useful for HPC-based Limulus systems. Users may turn nodes off and on as needed using the node-power{on/off} scripts or the NPstat utility. In addition, node power control can be fully managed by the Slurm Workload Manager. In this scenario, nodes that are not being used are turned off and then turned back on when jobs are submitted to the queue. See Configuring Slurm for Automatic Power Control.

The Relayset Program

The low-level relayset utility is available to administrators; however, it should only be used as a last resort. The node-power{on/off} utilities provide a much more controlled and graceful method of turning systems on and off.

Each four node Limulus has a relay-board with four power relays that control the 12V power to each node. (Nodes operate on the power supply 12V power rails using a DC-DC converter.) The relay-board is connected to a USB port on the login node. The eight node Limulus systems use two relay-boards that allow the login node to control the additional four motherboards.

The program that drives the relay-board is called relayset. This program allows each relay to be turned off and on. Relayset can also report the status of each relay.

Four node Limulus systems use the following convention:

  • n0 is connected to relay-2
  • n1 is connected to relay-3
  • n2 is connected to relay-4

There is nothing connected to relay-1.

On eight node systems, the additional nodes are mapped to the second relay-board as follows:

  • n3 is connected to relay-1 on second board
  • n4 is connected to relay-2 on second board
  • n5 is connected to relay-3 on second board
  • n6 is connected to relay-4 on second board

In multiple relay-board systems, boards are identified by their unique ID (see below). If a single board is used, there is no need for a board ID.
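
The wiring convention above can be sketched as a small lookup function. This is only an illustration: the function name node_to_relay is hypothetical, and the board numbers 1 and 2 are an assumption standing in for the unique board IDs that a real system uses. On a Limulus, the shipped limulus-map-node script (described below) is the supported way to get this mapping.

```shell
# Hypothetical helper: print "<board> <relay>" for a Limulus node name,
# following the node-to-relay convention listed above.
# Board "1" and "2" are placeholders for the real unique board IDs.
node_to_relay() {
    case "$1" in
        n0) echo "1 2" ;;
        n1) echo "1 3" ;;
        n2) echo "1 4" ;;
        n3) echo "2 1" ;;
        n4) echo "2 2" ;;
        n5) echo "2 3" ;;
        n6) echo "2 4" ;;
        *)  echo "unknown node: $1" >&2
            return 1 ;;
    esac
}
```

For example, node_to_relay n1 prints "1 3" (first board, relay-3), and node_to_relay n5 prints "2 3" (second board, relay-3).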

Running relayset without options will produce the following.

# relayset 
Not enough or wrong arguments.
  To initialize (do first):  relayset init
  To turn relay on/off:      relayset 1|2|3|4 on|off
  To get status:             relayset 1|2|3|4 status
                             (Returns 1 if on, 0 if off)
  To list the devices (with ID):  relayset list
  To create the relay node map: relayset map 
  For multiple boards add the board ID to the command line (not needed for list and map)
  To print debug messages add "debug" to the command line
  Returns -1 on error, 0 or 1 if successful.
Version: 0.3-05-30-17

There are several options available:

  • init - this option must be run once when the system boots (this is done automatically). Running relayset init after the system has started will cause any running nodes to reboot.
  • list - this option will list the number of boards in the system (can be only one or two)
  • map - creates the file /etc/warewulf/.Limulus/relayboardID, which is used when there are multiple relay-boards. The file contains the unique ID of each relay-board. Used when configuring the system.
  • on - combined with a relay number (1,2,3,4), the command will turn the relay ON. Returns -1 if not successful. Adding debug prints debug information.
  • off - combined with a relay number (1,2,3,4), the command will turn the relay OFF. Returns -1 if not successful. Adding debug prints debug information.
  • status - combined with a relay number (1,2,3,4), the command will return the relay status 0=off, 1=on. Returns -1 if not successful. Adding debug prints debug information.
  • boardid - for multiple boards the board ID is added to the command line. It is only needed in multiple board systems.
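
Because relayset cuts 12V power directly, a script that calls it may want to check its arguments first. The following is a minimal sketch of such a pre-check, based only on the "relayset 1|2|3|4 on|off|status" forms in the usage text. The function names valid_relay_args and safe_relayset are hypothetical and not part of the Limulus tools. The sketch does not handle board IDs, since the usage text does not say where on the command line the ID goes.

```shell
# Hypothetical pre-check: succeed only if the relay number is 1-4 and
# the action is one of on, off, or status.
valid_relay_args() {
    case "$1" in
        1|2|3|4) ;;
        *) return 1 ;;
    esac
    case "$2" in
        on|off|status) ;;
        *) return 1 ;;
    esac
    return 0
}

# Hypothetical wrapper: refuse malformed requests, otherwise run relayset
# with the relay number and action exactly as given.
safe_relayset() {
    if ! valid_relay_args "$1" "$2"; then
        echo "refusing: relay must be 1-4, action must be on, off, or status" >&2
        return 2
    fi
    relayset "$1" "$2"
}
```

With these definitions, safe_relayset 2 status runs relayset 2 status, while safe_relayset 5 on is rejected before relayset is ever called.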

Examples:

# relayset list
Number of FTDI devices found: 1
Checking device: 0
Manufacturer: TCTEC, Description: TCTEC USB RELAY, Serial: FT1PYQKM
# relayset 2 status debug
relay=2, set=-1, reportstate=1, debug=1 
ftdi open succeeded: 0
Reading current bits returned 0x1:  0000 0001
relay 2 is ON
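
The serial number printed by relayset list is the board ID used on multiple-board systems. As a rough sketch, it can be pulled out of the listing with sed. The function name list_board_ids is hypothetical, and the sketch assumes that every board is reported on a line with the same "Serial: ..." field as in the single-board example above.

```shell
# Hypothetical helper: read "relayset list" output on stdin and print
# one board serial number per line.
# Assumes each board appears on a line containing "Serial: <id>".
list_board_ids() {
    sed -n 's/.*Serial: *\([^ ,]*\).*/\1/p'
}
```

Typical use would be relayset list | list_board_ids, which for the listing above prints FT1PYQKM.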

Mapping Nodes to Relays

A utility script is available to determine the mapping between a node name (n0, n1, … n6) and the specific relay. The script returns the relay number. This script is intended for use by other management scripts. In normal usage, an administrator should never need to run this script.

Examples:

/etc/warewulf/.Limulus/limulus-map-node n1
3

If the file /etc/warewulf/.Limulus/relayboardID is present, the output will also include the relay-board ID.

/etc/warewulf/.Limulus/limulus-map-node n1
3 FT1PYQKM
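
A management script that consumes this output has to cope with both forms (relay number alone, or relay number plus board ID). The following is a minimal sketch of one way to do that. The function name parse_map_output and its "relay=... board=..." output format are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: read one line of limulus-map-node output on stdin
# ("3" or "3 FT1PYQKM") and print the relay number and the board ID,
# using "none" when no board ID is present.
parse_map_output() {
    relay=""
    board_id=""
    # Fail only if nothing at all could be read.
    read -r relay board_id || [ -n "$relay" ] || return 1
    echo "relay=$relay board=${board_id:-none}"
}
```

For example, /etc/warewulf/.Limulus/limulus-map-node n1 | parse_map_output would print "relay=3 board=FT1PYQKM" for the second output shown above, and "relay=3 board=none" for the first.
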
node_power_control.txt · Last modified: 2021/06/18 18:33 by deadline