All Limulus systems have a mechanism for executing commands on all worker nodes at the same time. The pdsh
command allows the same command to be run on any combination of Limulus nodes. By default, and without any arguments, the pdsh
command works on all active nodes and uses ssh
to issue the command. For example:
# pdsh hostname
n2: n2
n1: n1
n0: n0
Piping the results into sort
will keep the returned results ordered by node number (the pipe happens on the node that issued the pdsh
command):
# pdsh uptime|sort
n0: 08:58:47 up 153 days, 18:29, 0 users, load average: 0.00, 0.02, 0.00
n1: 08:58:47 up 153 days, 18:29, 0 users, load average: 0.09, 0.08, 0.06
n2: 08:58:47 up 153 days, 18:29, 0 users, load average: 0.13, 0.12, 0.05
pdsh
also allows commands to be issued on specific nodes using the “[ ]” syntax. For example, to issue a command on a range of nodes (a trivially small range in this case), the following command can be used:
# pdsh -w n[0-1] date
n1: Mon Jan 18 09:00:39 EST 2021
n0: Mon Jan 18 09:00:39 EST 2021
Nodes do not need to be “sequential,” and individual nodes can be separated by commas.
# pdsh -w n[0,1] date
n0: Mon Jan 18 09:00:59 EST 2021
n1: Mon Jan 18 09:00:59 EST 2021
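The bracket syntax also allows ranges and individual nodes to be mixed. As a sketch (the node list here is hypothetical), pdsh -w n[0-2,5] would target n0, n1, n2, and n5; the expansion behaves much like this POSIX shell loop:

```shell
# Expand a pdsh-style list "n[0-2,5]" by hand (illustration only;
# pdsh performs this expansion internally when given -w n[0-2,5])
nodes=""
for i in 0 1 2 5; do
    nodes="$nodes n$i"
done
echo "Target nodes:$nodes"   # Target nodes: n0 n1 n2 n5
```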
Although not very useful on Limulus systems, host lists can also be used with pdsh.
There is a full man page for pdsh
(run man pdsh
) that describes these and other options.
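One host-list feature worth noting (shown here as a sketch; consult man pdsh on your system for details): pdsh can read a default target list from the file named by the WCOLL environment variable, one hostname per line:

```shell
# Build a host file listing a subset of nodes (the file name is arbitrary)
cat > /tmp/myhosts <<EOF
n0
n2
EOF

# With WCOLL set, pdsh run without -w targets only the nodes in the file:
# export WCOLL=/tmp/myhosts
# pdsh uptime
cat /tmp/myhosts
```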
There may be times when a file needs to be updated across the cluster. Be aware that important files are managed by either Warewulf or Ambari, and there is no need to manipulate these files “by hand.”
In addition, there are two NFS-mounted directories that appear across the cluster. The /home
directory is available on all nodes; this configuration is important for HPC systems but not strictly needed on Data Analytics (Hadoop) systems. The second system-wide NFS mount depends on the type of system: on HPC systems, /opt/ohpc
is mounted on all nodes, while on Data Analytics systems, /opt/cluster
is mounted on all nodes.
Under both of these mounts is a private admin/etc
path. Files needed on all nodes can be conveniently located in these system-wide directories, eliminating the need to copy files.
In the event that copying a file is absolutely necessary, the following procedure is the preferred way to copy a file to the nodes:
1. Copy the file into the admin/etc
directory on the headnode (use /opt/ohpc/admin/etc
on HPC systems):
# cp TEMP-FILE /opt/cluster/admin/etc
2. Use pdsh
to copy the file to the /root
directory on all the nodes (surround the command with “ or '):
# pdsh "cp /opt/cluster/admin/etc/TEMP-FILE /root"
3. Verify that the file arrived on each node:
# pdsh ls /root/TEMP-FILE
n1: /root/TEMP-FILE
n2: /root/TEMP-FILE
n0: /root/TEMP-FILE
4. When the file is no longer needed, remove it from the nodes:
# pdsh rm /root/TEMP-FILE
By keeping a copy of the file in the NFS-mounted admin/etc
path, a convenient record of file movement/changes can be consulted in the future.
While pdsh
is an immensely useful command, it does have some limitations and cautions.
pdsh
only works on the headnode (login node).
pdsh
cannot be used for interactive commands (e.g. pdsh top
will not work). It should be used with commands that “finish.” You can break out of pdsh
using multiple ctrl-c
keystrokes.
Quoting the command sends any pipes along with it, so each node runs the full pipeline (compare with the earlier example, where sort ran on the headnode):
# pdsh "uptime|sort"
n0: 09:08:14 up 153 days, 18:38, 0 users, load average: 0.09, 0.08, 0.02
n2: 09:08:14 up 153 days, 18:38, 0 users, load average: 0.04, 0.05, 0.01
n1: 09:08:14 up 153 days, 18:39, 0 users, load average: 0.02, 0.04, 0.04
Using pdsh
to make changes on the nodes is not recommended. On HPC systems, the changes will go away on the next restart of the nodes (see VNFS images). On Data Analytics (Hadoop) systems, changing nodes may cause “personalities” to develop and eventually make managing the system confusing or almost impossible. With few exceptions, full management of Analytics (Hadoop) systems should be possible through the Ambari Cluster Manager.