IBM E027SLL-H Troubleshooting Guide - Page 174



Distribution to a monitoring server on the Object group editor is not equivalent to
a monitoring server distribution in the historical collection configuration window.

To decrypt a password, KDS_VALIDATE_EXT='Y' is required
On a SLES 10 64-bit zLinux monitoring server, KDS_VALIDATE_EXT='Y' is required
to successfully decrypt a password that the portal server sends for validation.
On this operating system, the monitoring server uses Pluggable Authentication
Modules (PAM) together with this parameter for that purpose. Adding
KDS_VALIDATE_EXT=Y to a monitoring server configuration does not enable PAM
support for any other purpose.
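
The following is a minimal sketch, not an IBM-supplied tool, of how the presence of the parameter could be checked in the text of a monitoring server configuration file. The function name, the treatment of `#` lines as comments, and the rule that the last assignment wins are assumptions made for illustration. The location and name of the configuration file are not given in this section, so the text is passed in as a string.

```python
import re

# Accepts KDS_VALIDATE_EXT=Y and KDS_VALIDATE_EXT='Y' (both spellings appear in
# this section). Tolerating double quotes and a leading "export" is an assumption.
_PATTERN = re.compile(
    r"""^\s*(?:export\s+)?KDS_VALIDATE_EXT\s*=\s*(['"]?)([^'"\s]*)\1\s*$"""
)


def validate_ext_enabled(config_text: str) -> bool:
    """Return True if the last uncommented KDS_VALIDATE_EXT assignment is Y."""
    enabled = False
    for line in config_text.splitlines():
        # Assumption: lines starting with '#' are comments and are ignored.
        if line.lstrip().startswith("#"):
            continue
        match = _PATTERN.match(line)
        if match:
            # Assumption: a later assignment overrides an earlier one.
            enabled = match.group(2).upper() == "Y"
    return enabled
```

A call such as `validate_ext_enabled(open(path).read())`, where `path` is whatever configuration file applies to the installation, returns `True` only when the parameter is set to Y.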

Remote Tivoli Enterprise Monitoring Server consumes high CPU when a large
number of agents connect
In enterprise environments, a large number of agents can connect to a remote
Tivoli Enterprise Monitoring Server in a short period of time. This can happen,
for example, during startup of the Tivoli Enterprise Monitoring Server, or when
agents fail over from a primary to a secondary Tivoli Enterprise Monitoring
Server. In these cases, the amount of CPU processing is directly proportional to
the total number of situations that have been distributed to the agents
connected to the remote Tivoli Enterprise Monitoring Server. For example, if
1000 agents connect to the remote Tivoli Enterprise Monitoring Server, and each
agent has an average of 20 situations distributed to it, the total number of
situations distributed to the connected agents is 20,000.
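
The arithmetic in this example can be written out as a short sketch. The function name is hypothetical, and the "trimmed" average of 12 situations per agent is an invented figure used only to show how the proportional relationship behaves. It is not a measurement.

```python
def total_distributed_situations(agent_count: int, avg_situations_per_agent: int) -> int:
    """Total situations distributed to the agents connected to one remote server."""
    return agent_count * avg_situations_per_agent


# Figures from the example in the text: 1000 agents, 20 situations each.
baseline = total_distributed_situations(1000, 20)

# Hypothetical figure: unused situations removed, leaving 12 per agent on average.
trimmed = total_distributed_situations(1000, 12)

# Because CPU processing is described as directly proportional to the total,
# the relative reduction in the total is also the expected relative reduction
# in connect-time CPU processing.
relative_reduction = (baseline - trimmed) / baseline

print(baseline, trimmed, relative_reduction)
```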
To minimize the amount of CPU processing when a large number of agents
connect, consider reducing the total number of situations distributed by avoiding
distribution of situations that are not being used. Some situations, including
predefined situations, have the default distribution set as a managed system list.
These situations are distributed to all managed systems in the managed system list,
even if the situation is not being used. Limiting the distribution to only managed
systems where the situation will be used minimizes the total number of situations
distributed from the remote Tivoli Enterprise Monitoring Server, and minimizes the
CPU processing when a large number of agents connect.
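
As an illustration of the effect described above, the following sketch counts distributed situations before and after a distribution is narrowed from a managed system list to a single managed system. All names in it (`ALL_LINUX`, `Disk_Full`, `Proc_Missing`, `lnx01` and so on) are hypothetical, and the dictionaries stand in for data that would come from the actual environment.

```python
def expand_distribution(targets, managed_system_lists):
    """Resolve a distribution to the set of managed systems it reaches.

    A target that names a managed system list expands to every member of the
    list; any other target is treated as a single managed system.
    """
    systems = set()
    for target in targets:
        systems.update(managed_system_lists.get(target, [target]))
    return systems


def count_distributed(distributions, managed_system_lists):
    """Sum, over all situations, the number of managed systems each one reaches."""
    return sum(
        len(expand_distribution(targets, managed_system_lists))
        for targets in distributions.values()
    )


# Hypothetical managed system list with four members.
lists = {"ALL_LINUX": ["lnx01", "lnx02", "lnx03", "lnx04"]}

# Default: both situations are distributed to the whole list.
default = {"Disk_Full": ["ALL_LINUX"], "Proc_Missing": ["ALL_LINUX"]}

# Limited: Proc_Missing is only used on lnx02, so it is distributed only there.
limited = {"Disk_Full": ["ALL_LINUX"], "Proc_Missing": ["lnx02"]}

before = count_distributed(default, lists)
after = count_distributed(limited, lists)
print(before, after)
```

In this made-up inventory, the total drops from 8 distributed situations to 5 once the unused distribution is removed.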
The distribution specification for a situation can be changed using the
Situation editor or the tacmd editsit command.

Unable to start the Tivoli Enterprise Monitoring Server after the kdsmain
process is terminated abnormally
When the kdsmain process is terminated abnormally, a stale cms process is left
behind. This stale cms process prevents the Tivoli Enterprise Monitoring Server
from starting properly. Kill the cms process first, and then retry the startup
of the Tivoli Enterprise Monitoring Server. Attempt a restart of the Tivoli
Enterprise Monitoring Server only after verifying that the CMS.EXE process is
also terminated. A CMS.EXE process left running after the earlier failure is
likely to cause a subsequent start of the Tivoli Enterprise Monitoring Server
to fail.
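
The verification step described above can be sketched as a small helper that scans a process listing for a leftover cms or CMS.EXE process. This is an illustrative sketch under stated assumptions, not part of the product. The function names are hypothetical, and the input is assumed to be "PID command" lines such as those printed by `ps -e -o pid= -o comm=` on UNIX-like systems. The sample listing is invented. The helper only reports process IDs; ending the process is left to the operator.

```python
import os
from typing import List

# Process names taken from the text; matching is case-insensitive.
STALE_NAMES = {"cms", "cms.exe"}


def find_stale_cms(process_listing: str) -> List[int]:
    """Return the PIDs of cms / CMS.EXE processes found in 'PID command' lines."""
    pids = []
    for line in process_listing.splitlines():
        parts = line.split(None, 1)
        # Skip blank lines, headers, and anything that is not 'PID command'.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        name = os.path.basename(parts[1].strip()).lower()
        if name in STALE_NAMES:
            pids.append(int(parts[0]))
    return pids


def safe_to_restart(process_listing: str) -> bool:
    """True when no stale cms process remains, so a restart can be attempted."""
    return not find_stale_cms(process_listing)


# Invented sample listing for demonstration only.
sample = """\
  101 sshd
 2417 cms
 2533 cms_helper
"""

print(find_stale_cms(sample), safe_to_restart(sample))
```

Matching on the exact command name rather than on a substring keeps unrelated processes whose names merely contain "cms" from being reported.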
IBM Tivoli Monitoring: Troubleshooting Guide