This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
powering_up_down_nodes [2020/06/09 21:01] deadline created |
powering_up_down_nodes [2021/04/29 17:36] (current) brandonm [Using the Relayset Program] Punctuation, spelling, and word fixes |
||
---|---|---|---|
Line 1: | Line 1: | ||
===== Powering Nodes Up and Down===== | ===== Powering Nodes Up and Down===== | ||
- | ===HPC Systems:=== | + | Each Limulus system consists of one login-node and three or seven worker nodes. As shipped, the login-node has the alias name '' |
- | All nodes are in the powered down state when the systems boots. | + | Power to the three (or seven) worker |
- | To turn the nodes on, enter " | + | |
- | ===Data Analytics Systems=== | + | Direct use of '' |
+ | |||
+ | Use of '' | ||
+ | |||
+ | **!!!Resetting or rebooting the login-node will shut down all worker nodes!!!** | ||
+ | |||
+ | ===Worker Node Startup on HPC Systems=== | ||
+ | |||
+ | All worker nodes are in the powered-down state when the system is initially started. Nodes can be started using either the command line or the GUI tool described below. It is assumed that the user will control which nodes are powered on (or off) as they use the system. If all nodes are to be started when the system starts, add the line ''/ | ||
+ | |||
+ | ===HPC System Shutdown=== | ||
+ | |||
+ | Shutting down the login-node, either by rebooting or powering down, will gracefully shut down the worker nodes (i.e. they will receive a local '' | ||
+ | |||
+ | |||
+ | ===Worker Node Startup on Data Analytics | ||
+ | |||
+ | All nodes and all services (e.g. HDFS, YARN) are started when the system is initially powered on. Nodes can be started using either the command line or the GUI tool described below; however, power cycling nodes may disrupt the running service daemons. | ||
+ | |||
+ | ===Data Analytics (Hadoop) System Shutdown=== | ||
+ | |||
+ | Proper shutdown of Data Analytics (Hadoop) systems is described in the [[Using the Apache Ambari Cluster Manager|Using the Apache Ambari Cluster Manager]] section. In short, the Hadoop services should be gracefully shut down before the system is powered down. These services are robust and can often recover from a sudden (or unexpected) power loss; however, data can still be lost, particularly in HDFS. | ||
====Command Line Power Control==== | ====Command Line Power Control==== | ||
- | Power to the three worker nodes is controlled by the login node. There are a set of relays underneath the 1 GbE switch. These relays can be controlled directly by using the relayset program (n0 is connected to relay-2, n1 is connected to relay-3, and n2 is connected to relay-4) However, it is not recommended to control the nodes using relayset. We highly recommended using the node-poweron and node-poweroff utilities | + | The '' |
+ | |||
+ | Executing either command | ||
+ | |||
+ | ===node-poweron=== | ||
+ | |||
+ | To turn on all nodes, simply enter: | ||
+ | |||
+ | # node-poweron | ||
+ | |||
+ | To turn on node '' | ||
+ | |||
+ | # node-poweron n2 | ||
+ | |||
+ | The '' | ||
+ | Also note that if a node is already up and running, turning the power on with '' | ||
+ | | ||
< | < | ||
# node-poweron -h | # node-poweron -h | ||
Line 22: | Line 58: | ||
# node-poweron -s | # node-poweron -s | ||
Invalid nodes will be ignored. Default Limulus nodes are {n0,n1,n2} | Invalid nodes will be ignored. Default Limulus nodes are {n0,n1,n2} | ||
- | The script waits untill | + | The script waits until all nodes are started or the process times out. |
-s runs in quiet mode; -h provides this help. | -s runs in quiet mode; -h provides this help. | ||
</ | </ | ||
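Because the help text says invalid node names are ignored, a mistyped name can go unnoticed. The following is a minimal sketch, not part of the Limulus tools, that rejects unknown names before calling ''node-poweron''. The names ''VALID_NODES'', ''POWERON_CMD'', ''is_valid_node'', and ''power_on_checked'' are hypothetical, and the node list assumes the default three-worker naming from the help text.

```shell
# Sketch only: VALID_NODES and POWERON_CMD are hypothetical variables.
# The default names {n0,n1,n2} come from the node-poweron help text;
# a seven-worker system would need a longer list.
VALID_NODES="${VALID_NODES:-n0 n1 n2}"
POWERON_CMD="${POWERON_CMD:-node-poweron}"

# Return 0 if the given name is in VALID_NODES, 1 otherwise.
is_valid_node() {
    for known in $VALID_NODES; do
        [ "$known" = "$1" ] && return 0
    done
    return 1
}

# Check every requested name, then hand all of them to the power-on
# command in a single call.
power_on_checked() {
    for node in "$@"; do
        if ! is_valid_node "$node"; then
            echo "unknown node: $node" >&2
            return 2
        fi
    done
    # With no arguments this runs the command without node names,
    # which the documentation describes as turning on all nodes.
    $POWERON_CMD "$@"
}
```

For example, ''power_on_checked n0 n2'' would call ''node-poweron n0 n2'', while ''power_on_checked n9'' would stop with an error instead of silently doing nothing. Setting ''POWERON_CMD=echo'' gives a dry run.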
+ | |||
+ | ===node-poweroff=== | ||
+ | |||
+ | To turn all nodes off gracefully (remove power), simply enter: | ||
+ | |||
+ | # node-poweroff | ||
+ | | ||
+ | To turn off just node '' | ||
+ | |||
+ | # node-poweroff n0 | ||
+ | | ||
+ | The '' | ||
+ | |||
+ | Like '' | ||
+ | |||
+ | Also note that if a node is already down, turning the power off with '' | ||
+ | | ||
< | < | ||
- | node-poweroff -h | + | # node-poweroff -h |
node-power-on [-h help] [-s silent] [nodes] | node-power-on [-h help] [-s silent] [nodes] | ||
Line 41: | Line 94: | ||
</ | </ | ||
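Powering a node off ends whatever is running on it, so a script may want to check for logged-in users first. The sketch below is one possible approach under stated assumptions: it assumes the root user on the login-node can run commands on a worker with ''ssh'', and it assumes ''node-poweroff'' returns a non-zero status on failure (neither is confirmed by this page). The names ''REMOTE_SH'', ''POWEROFF_CMD'', ''count_users'', and ''power_off_if_idle'' are hypothetical.

```shell
# Sketch only: REMOTE_SH and POWEROFF_CMD are hypothetical overrides.
# Assumes the worker can be reached with ssh from the login-node.
REMOTE_SH="${REMOTE_SH:-ssh}"
POWEROFF_CMD="${POWEROFF_CMD:-node-poweroff}"

# Print the number of login sessions that who(1) reports on a node.
# Fails (status 1) if the remote command itself fails.
count_users() {
    sessions="$($REMOTE_SH "$1" who)" || return 1
    if [ -z "$sessions" ]; then
        echo 0
    else
        printf '%s\n' "$sessions" | wc -l | tr -d ' '
    fi
}

# Power a node off only when no user sessions are reported.
power_off_if_idle() {
    node="$1"
    users="$(count_users "$node")" || {
        echo "unreachable: $node" >&2
        return 2
    }
    if [ "$users" -gt 0 ]; then
        echo "skipped: $node ($users user sessions)"
        return 1
    fi
    $POWEROFF_CMD "$node" && echo "stopped: $node"
}
```

A call such as ''power_off_if_idle n0'' would then power off ''n0'' only when nobody is logged in. Background jobs without a login session are not detected by this check.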
+ | ====Power Control GUI==== | ||
+ | Power to the nodes can also be controlled using a GUI tool. Using both the command line and GUI tools at the same time may cause system instability. | ||
- | ====Power Control | + | The GUI power control tool can be started from the command line by entering, as the root user: |
- | {{: | + | # NPstat |
+ | |||
+ | There is also a "Node Power Status" | ||
+ | An example menu is shown below. | ||
- | {{: | ||
- | {{:wiki:node-power-control-select.png?320|}} | + | {{ :wiki:applications-menu.png?340 |}} |
- | {{: | + | The main power control |
- | {{: | + | {{ : |
- | {{: | + | * **Power** |
+ | * **OS Up** - indicates if the node Operating System is up and running (indicated by a " | ||
+ | * **Users** - indicates the number of users logged into the node. This information is provided so that user activity is not accidentally terminated. | ||
+ | * **Load** - an indication of how busy the node is. This information is provided so that background activity is not accidentally terminated. | ||
+ | * **Mem** - an indication of how much memory is being used. Similar to the Load and Users status, this information is provided so that background activity is not accidentally terminated. | ||
+ | * **Days Alive** - the number of days since the node was started. | ||
+ | **Note:** The Node Power Control tool is not intended as a monitoring tool. The response times can be slow due to how the information is obtained from the nodes. It is primarily designed to provide information needed for power control of the worker nodes. | ||
+ | |||
+ | There are three buttons at the bottom of the panel. | ||
+ | |||
+ | - **Refresh** - refresh the panel data. The last refresh time is shown at the top of the panel. The panel does not "auto refresh." | ||
+ | - **Node Power** - open the power control selection window shown below. | ||
+ | - **Quit** - quit the utility. | ||
+ | |||
+ | The nodes to control are chosen in the selection window shown below. Any combination of nodes can be powered on or off using this selection box. If a node is checked, it will be powered on. If a node is not checked, it will be powered off. The node name and the current status are indicated on the panel. Similar to the command line tools, if a node is already on (or off), setting it to on (or off) will have no effect. | ||
+ | |||
+ | {{ : | ||
+ | |||
+ | In the selection window above, node '' | ||
+ | |||
+ | {{ : | ||
+ | |||
+ | This window will indicate what will happen for each node. The power control choices can be changed by using the " | ||
+ | |||
+ | {{ : | ||
+ | |||
+ | Next, all nodes slated to power up will be started. In this case, node '' | ||
+ | |||
+ | At the end of the power-up or power-down cycle, the main Power Control panel will display the current state of the system. | ||
+ | |||
+ | {{ : | ||
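The selection rule used by the GUI (checked nodes are powered on, unchecked nodes are powered off) can be illustrated with a small sketch. This is not how the GUI tool is implemented; ''ALL_NODES'' and ''plan_power'' are hypothetical names, and the node list assumes the default three-worker naming.

```shell
# Sketch only: ALL_NODES and plan_power are hypothetical names.
# The default list uses the three-worker names {n0,n1,n2}.
ALL_NODES="${ALL_NODES:-n0 n1 n2}"

# Arguments are the names of the checked nodes. For every known node,
# print "on <node>" if it was checked and "off <node>" if it was not.
plan_power() {
    for node in $ALL_NODES; do
        action="off"
        for checked in "$@"; do
            [ "$checked" = "$node" ] && action="on"
        done
        echo "$action $node"
    done
}
```

For example, ''plan_power n0 n2'' prints one line per node, marking ''n0'' and ''n2'' as on and ''n1'' as off. Each line could then be passed to ''node-poweron'' or ''node-poweroff''; as with the command line tools, requesting the state a node is already in has no effect.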
+ | |||
+ | |||
+ | |||
+ | ====Using the Relayset Program==== | ||
+ | |||
+ | The low-level '' | ||
+ | |||
+ | For reference, the options to '' | ||
+ | |||
+ | **Important: | ||
+ | |||
+ | < | ||
+ | # relayset | ||
+ | Not enough or wrong arguments. | ||
+ | To initialize (do first): | ||
+ | To turn relay on/ | ||
+ | To get status: | ||
+ | | ||
+ | To list the devices (with ID): relayset list | ||
+ | To create the relay node map: relayset map | ||
+ | For multiple boards add the board ID to the command line (not needed for list and map) | ||
+ | To print debug messages add " | ||
+ | Returns -1 on error, 0 or 1 if successful. | ||
+ | Version: 0.3-05-30-17 | ||
+ | </ |