HP Cluster Platform Interconnects v2010 Quadrics QsNetII Interconnect - Page 130

Using the qsnetsoak Command



# qsportmap name
Where name is an interconnect name, such as QR0T03. The command displays
a table of boards and ports on the top-level interconnect. Each location shows
the port on the node-level interconnect module to which a port on the top-level
interconnect is connected. Typical output follows:
# qsportmap QR0T01
Module QR0T01
Port "Board 0" "Board 1" "Board 2" "Board 3"
00 "QR0N00 B4 P12" "QR0N00 B6 P12" "QR0N00 B4 P08" "QR0N00 B6 P08"
01 "QR0N01 B4 P12" "QR0N01 B6 P12" "QR0N01 B4 P08" "QR0N01 B6 P08"
02 "QR0N02 B4 P12" "QR0N02 B6 P12" "QR0N02 B4 P08" "QR0N02 B6 P08"
<display truncated>
Module QR0T01
Port "Board 4" "Board 5" "Board 6" "Board 7"
00 "QR0N00 B4 P04" "QR0N00 B6 P04" "QR0N00 B4 P00" "QR0N00 B6 P00"
01 "QR0N01 B4 P04" "QR0N01 B6 P04" "QR0N01 B4 P00" "QR0N01 B6 P00"
02 "QR0N02 B4 P04" "QR0N02 B6 P04" "QR0N02 B4 P00" "QR0N02 B6 P00"
03 "QR0N03 B4 P04" "QR0N03 B6 P04" "QR0N03 B4 P00" "QR0N03 B6 P00"
04 "QR0N00 B4 P06" "QR0N00 B6 P06" "QR0N00 B4 P02" "QR0N00 B6 P02"
05 "QR0N01 B4 P06" "QR0N01 B6 P06" "QR0N01 B4 P02" "QR0N01 B6 P02"
06 "QR0N02 B4 P06" "QR0N02 B6 P06" "QR0N02 B4 P02" "QR0N02 B6 P02"
<display truncated>
Locations that have no information represent links that have a badly connected
or missing link cable. The mapping of cables to ports should match the
Cabling Tables for your cluster.
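
The check for locations with no information can be sketched in Python. This is a minimal sketch, not part of the HP or Quadrics tooling: the function names parse_qsportmap and missing_links are hypothetical, and the code assumes that a location with no information appears as an empty quoted field (""), which the text does not confirm. The second data row of SAMPLE is invented to show a missing cable.

```python
import shlex

# Illustrative input in the layout shown above. The empty "" entry in
# port 01 is a made-up example of a location with no information.
SAMPLE = '''\
Module QR0T01
Port "Board 0" "Board 1" "Board 2" "Board 3"
00 "QR0N00 B4 P12" "QR0N00 B6 P12" "QR0N00 B4 P08" "QR0N00 B6 P08"
01 "QR0N01 B4 P12" "" "QR0N01 B4 P08" "QR0N01 B6 P08"
<display truncated>
'''


def parse_qsportmap(text):
    """Return {(module, board, port): location} from qsportmap-style text."""
    mapping = {}
    module = None
    boards = []
    for line in text.splitlines():
        fields = shlex.split(line)  # keeps each quoted entry as one field
        if not fields:
            continue
        if fields[0] == "Module" and len(fields) >= 2:
            module = fields[1]
        elif fields[0] == "Port":
            boards = fields[1:]  # column headings, e.g. "Board 0"
        elif fields[0].isdigit() and module is not None:
            port = int(fields[0])
            for board, location in zip(boards, fields[1:]):
                mapping[(module, board, port)] = location
        # Any other line (prompt, "<display truncated>") is skipped.
    return mapping


def missing_links(mapping):
    """List the (module, board, port) keys whose location is empty."""
    return sorted(key for key, loc in mapping.items() if not loc.strip())


if __name__ == "__main__":
    for module, board, port in missing_links(parse_qsportmap(SAMPLE)):
        print(f"{module} {board} port {port:02d}: no link information")
```

Any location reported this way would still need to be checked by hand against the Cabling Tables for the cluster.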
12.17 Using the qsnetsoak Command
Invoke the qsnetsoak command with no options by using RMS prun or another
local cluster process scheduler, such as pdsh or SLURM srun. The test includes
a mirrored dmatest followed by a global exchange to ensure that the network is
saturated with data. You can detect the type and location of any reported errors
by using qsnetstat. When run on the entire cluster, the test exercises all
points of the network.
The qsnetsoak command tests the network from all the connected nodes at the
same time, verifying the following:
• The QM500 PCI cards.
• Link cables from the QM500 cards to the node-level interconnects.
• Link cables from the node-level interconnects to the top-level interconnects.
• All switch chips (on any installed switch card modules) and their
  interconnecting links.
• The QM501 switch cards in the node-level and top-level interconnects.
• The QM502 switch cards in the node-level interconnects.
The performance impact of this test is high because the test runs on all the
nodes, transferring a large amount of test data over the link cables. All the
routes through the network are loaded, and the test might take several hours to
complete on a large cluster.
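
The invocation described in this section can be sketched as a small Python wrapper. This is an illustration under stated assumptions, not a documented interface: build_soak_command and run_soak are hypothetical helper names, and the launcher prefix (including any node-selection options) is supplied by the caller because it depends on the scheduler in use at your site. The only detail taken from the text is that qsnetsoak is run with no options.

```python
import shutil
import subprocess

SOAK_COMMAND = "qsnetsoak"  # named in the text; invoked with no options


def build_soak_command(launcher):
    """Append qsnetsoak to a caller-supplied launcher prefix.

    `launcher` is a list such as ["prun"] or ["srun", "-N", "16"].
    Its options are site-specific and are not checked here.
    """
    if not launcher:
        raise ValueError("a launcher such as prun, pdsh, or srun is required")
    return list(launcher) + [SOAK_COMMAND]


def run_soak(launcher, dry_run=True):
    """Return the command line, and execute it only when dry_run is False."""
    command = build_soak_command(launcher)
    if dry_run:
        return command
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"launcher not found on PATH: {command[0]}")
    # No timeout is set: the test might take several hours on a large cluster.
    subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # The node count 16 is a made-up value for illustration.
    print(" ".join(run_soak(["srun", "-N", "16"])))
```

The wrapper defaults to a dry run because the test loads every route in the network, so it should be started deliberately. After a real run, use qsnetstat to find the type and location of any reported errors.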
12-28 Maintenance and Diagnostic Procedures