HP 2064-M ClusterPack V2.4 Tutorial - Page 99
Replace a Compute Node that has failed with a new machine
Problem: When I try to add a node, I get "Properties file for doesn't exist."

Solution:

- Make sure that the hostname is fully qualified in /etc/hosts on both the Management Server and the managed node, if it exists in /etc/hosts, and that any shortened host names are aliases instead of primary names. For example:

      10.1.2.3 cluster.abc.com cluster

  should be used instead of:

      10.1.2.3 cluster

- Make sure that AgentConfig is installed on the managed node, and that mxrmi and mxagent are running. The command

      ps -ef | grep mx

  should produce something like this:

      root 23332     1  0 15:42:17 ?      1:08 /opt/mx/lbin/mxagent
      root 23334     1  0 15:42:17 ?      0:59 /opt/mx/lbin/mxrmi
      root 24269 24252  1 01:30:51 pts/0  0:00 grep mx

- If AgentConfig is installed and running, uninstall it and then reinstall it:

      % /usr/sbin/swremove AgentConfig

- To install AgentConfig, type:

      % /usr/sbin/swinstall -s :/var/opt/mx/depot11 AgentConfig

  where the host named before the colon in the -s argument is the Management Server.

Problem: scmgr prints "out of memory" errors.

Solution:

- On the Management Server, using SAM or kmtune, make sure that the Kernel Configurable Parameter max_thread_proc is at least 256, and that nkthread is at least 1000.

1.9.7 Replace a Compute Node that has failed with a new machine

If a Compute Node fails due to a hardware problem and must be replaced, the new node can