All Limulus systems have a mechanism for executing commands on all worker nodes at the same time. The pdsh
command allows the same command to be run on any combination of Limulus nodes. By default, and without any arguments, the pdsh
command works on all active nodes and uses ssh
to issue the command. For example:
# pdsh hostname
n2: n2
n1: n1
n0: n0
Piping the results into sort
will keep the returned results ordered by node number (the pipe happens on the node that issued the pdsh
command):
# pdsh uptime|sort
n0: 08:58:47 up 153 days, 18:29, 0 users, load average: 0.00, 0.02, 0.00
n1: 08:58:47 up 153 days, 18:29, 0 users, load average: 0.09, 0.08, 0.06
n2: 08:58:47 up 153 days, 18:29, 0 users, load average: 0.13, 0.12, 0.05
pdsh
also allows commands to be issued on specific nodes using the “[ ]” syntax. For example, to issue a command on a range of nodes (a trivially small range in this case), the following command can be used:
# pdsh -w n[0-1] date
n1: Mon Jan 18 09:00:39 EST 2021
n0: Mon Jan 18 09:00:39 EST 2021
Nodes do not need to be “sequential,” and individual nodes can be separated by commas.
# pdsh -w n[0,1] date
n0: Mon Jan 18 09:00:59 EST 2021
n1: Mon Jan 18 09:00:59 EST 2021
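The bracket syntax also allows ranges and individual nodes to be mixed. As a sketch (the node list here is hypothetical), pdsh -w n[0-2,5] would target n0, n1, n2, and n5; the expansion behaves much like this POSIX shell loop:

```shell
# Expand a pdsh-style list "n[0-2,5]" by hand (illustration only;
# pdsh performs this expansion internally when given -w n[0-2,5])
nodes=""
for i in 0 1 2 5; do
    nodes="$nodes n$i"
done
echo "Target nodes:$nodes"   # Target nodes: n0 n1 n2 n5
```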
Although not very useful on Limulus systems, host lists can also be used with pdsh.
There is a full man page for pdsh
(run man pdsh
) that describes these and other options.
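One host-list feature worth noting (shown here as a sketch; consult man pdsh on your system for details): pdsh can read a default target list from the file named by the WCOLL environment variable, one hostname per line:

```shell
# Build a host file listing a subset of nodes (the file name is arbitrary)
cat > /tmp/myhosts <<EOF
n0
n2
EOF

# With WCOLL set, pdsh run without -w targets only the nodes in the file:
# export WCOLL=/tmp/myhosts
# pdsh uptime
cat /tmp/myhosts
```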
There may be times when a file needs to be updated across the cluster. Be aware that important files are managed by either Warewulf or Ambari, and there is no need to manipulate these files “by hand.”
In addition, there are two NFS-mounted directories that appear across the cluster. The /home
directory is available on all nodes; this configuration is important for HPC systems but not strictly needed on Data Analytics (Hadoop) systems. The second system-wide NFS mount depends on the type of system: on HPC systems, /opt/ohpc
is mounted on all nodes, while on Data Analytics systems, /opt/cluster
is mounted on all nodes.
Under both of these mounts is a private admin/etc
path. Files needed on all nodes can be conveniently located in these system-wide directories, eliminating the need to copy files.
In the event that copying a file is absolutely necessary, the following procedure is the preferred way to copy a file to the nodes:
1. Copy the file into the admin/etc
directory on the headnode (use /opt/ohpc/admin/etc
on HPC systems):
# cp TEMP-FILE /opt/cluster/admin/etc
2. Use pdsh
to copy the file to the /root
directory on all the nodes (surround the command with “ or '):
# pdsh "cp /opt/cluster/admin/etc/TEMP-FILE /root"
3. Verify that the file arrived on each node:
# pdsh ls /root/TEMP-FILE
n1: /root/TEMP-FILE
n2: /root/TEMP-FILE
n0: /root/TEMP-FILE
4. When the file is no longer needed, remove it from the nodes:
# pdsh rm /root/TEMP-FILE
By keeping a copy of the file in the NFS-mounted admin/etc
path, a convenient record of file movement/changes can be consulted in the future.
While pdsh
is an immensely useful command, it does have some limitations and cautions.
pdsh
only works on the headnode (login node).
pdsh
cannot be used for interactive commands (e.g. pdsh top
will not work). It should be used with commands that “finish.” You can break out of pdsh
using multiple ctrl-c
keystrokes.
Quoting the command sends any pipes along with it, so each node runs the full pipeline (compare with the earlier example, where sort ran on the headnode):
# pdsh "uptime|sort"
n0: 09:08:14 up 153 days, 18:38, 0 users, load average: 0.09, 0.08, 0.02
n2: 09:08:14 up 153 days, 18:38, 0 users, load average: 0.04, 0.05, 0.01
n1: 09:08:14 up 153 days, 18:39, 0 users, load average: 0.02, 0.04, 0.04
Using pdsh
to make changes on the nodes is not recommended. On HPC systems, the changes will go away on the next restart of the nodes (see VNFS images). On Data Analytics (Hadoop) systems, changing nodes may cause “personalities” to develop and eventually make managing the system confusing or almost impossible. With few exceptions, full management of Analytics (Hadoop) systems should be possible through the Ambari Cluster Manager.