Dell PowerEdge MX7000 EMC OpenManage Enterprise-Modular Edition Version 1.20.0 - Page 108

Monitoring the MCM group, Backup Sync, Overview, Group Information, Hardware Logs, Alerts, Alert Log



Initially, the backup health status is displayed as "Critical" while the configuration data is being synchronized, and then
changes to "OK". Wait for the backup health to transition to "OK" before proceeding. If the backup health still reports
"Critical" or "Warning" 30 minutes after the assign task, there are persistent communication issues. Unassign the backup
and repeat Step 5 to choose another member as the new backup. Dell EMC also recommends that you create an alert policy
on the lead to take notification actions (email, SNMP trap, system log) for backup health alerts. Backup health alerts are
part of the Chassis (Configuration, System Health) category.
6. Configure the member chassis that is designated as the backup.
The backup chassis must have its own management network IP. The IP enables the backup to forward backup health alerts.
Create an alert policy on the backup to take notification actions (email, SNMP trap, system log) for backup health alerts.
Backup health alerts are part of the Chassis (Configuration, System Health) category. The backup chassis raises warning or
critical alerts when it detects that backup synchronization is unhealthy because of communication or other unrecoverable
errors.
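The 30-minute wait described in Step 5 can be sketched as a simple polling loop. This is only an illustrative sketch: `wait_for_backup_ok` and the `get_health` callable are hypothetical names, and how the status string is actually obtained (for example, read from the GUI or from a management interface) is left to the caller and is not specified by this guide. The 60-second poll interval is likewise an assumption.

```python
import time
from typing import Callable

# 30 minutes, taken from the guidance in Step 5.
ASSIGN_TIMEOUT_S = 30 * 60


def wait_for_backup_ok(
    get_health: Callable[[], str],          # hypothetical: returns "OK", "Warning", or "Critical"
    timeout_s: float = ASSIGN_TIMEOUT_S,
    poll_interval_s: float = 60.0,          # assumed interval, not from the guide
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the backup health reads "OK".

    Return False if the timeout expires first, which the guide treats as a
    sign of persistent communication issues (unassign and pick a new backup).
    """
    deadline = clock() + timeout_s
    while True:
        if get_health().strip().upper() == "OK":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A `False` result corresponds to the case where the guide recommends unassigning the backup and repeating Step 5 with another member. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.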
Monitoring the MCM group
1. Complete all the configuration tasks before assigning the backup lead. If you must modify the configuration after
assigning the backup, the changes are automatically copied to the backup. Copying the changes to the backup may take up
to 90 minutes, depending on the configuration change.
2. The backup synchronization status of the lead and backup lead chassis is available at the following GUI locations:
a. On the lead chassis:
● Home page: Backup Sync status under the member (backup)
● Lead Overview page: Redundancy and backup synchronization status under Group Information
b. On the backup chassis:
● Home > Overview page: Backup Sync status under Group Information
3. Interpreting the backup health:
● If backup sync is healthy, the status is displayed as "Ok" and no further action is needed.
● If backup sync is not healthy, the status is displayed as "Warning" or "Critical". A "Warning" status indicates a transient
synchronization problem that is expected to resolve automatically. A "Critical" status indicates a permanent problem that
requires user action.
● When the backup sync status changes to "Warning" or "Critical", the associated alerts are generated under the alert
categories Chassis (Configuration, System Health). These alerts are logged to Home > Hardware Logs and Alerts > Alert Log.
The alerts are also shown as faults under Home > Chassis Subsystems (top right-hand corner), under the MM subsystem. If an
alert policy is configured, the actions are taken as configured in the policy.
4. Required user actions when backup health is "Warning" or "Critical":
● Warning: A transient status that should transition to "Ok" or "Critical". If the status continues to report "Warning"
for more than 90 minutes, Dell EMC recommends that you assign a new backup.
● Critical: A permanent status that indicates issues with the backup or lead. Identify the underlying issues and take
appropriate actions as described below:
○ Health is critical because of alert CDEV4006: The firmware version of the lead or member chassis has drifted, causing a
lead/backup incompatibility. Dell EMC recommends bringing the firmware of the lead or member chassis back to the same
version (1.10.00 or later).
○ Health is critical because of alert CDEV4007: One of several underlying issues contributes to this status. See the
following flow chart to determine the cause and take the recommended action.
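The interpretation rules in steps 3 and 4 can be summarized as a small decision function. This is a minimal sketch for illustration, not part of OME-Modular: the function name `recommended_action`, its parameters, and the returned wording are all hypothetical. The thresholds and alert IDs (90 minutes, CDEV4006, CDEV4007) are the ones stated above.

```python
from typing import Optional

# Threshold from step 4: a "Warning" lasting longer than this calls for a new backup.
WARNING_LIMIT_MIN = 90


def recommended_action(
    status: str,
    minutes_in_status: float = 0,
    alert_id: Optional[str] = None,
) -> str:
    """Map a backup health status to the action described in steps 3 and 4."""
    s = status.strip().lower()
    if s == "ok":
        return "No action needed."
    if s == "warning":
        if minutes_in_status > WARNING_LIMIT_MIN:
            return "Assign a new backup."
        return "Wait; the status should transition to Ok or Critical."
    if s == "critical":
        if alert_id == "CDEV4006":
            return "Bring lead and member firmware to the same version (1.10.00 or later)."
        if alert_id == "CDEV4007":
            return "Follow the CDEV4007 flow chart to determine the cause."
        return "Identify the underlying issue from the alert log."
    raise ValueError(f"Unknown backup health status: {status!r}")
```

For example, `recommended_action("Warning", minutes_in_status=120)` returns the advice to assign a new backup, while a "Critical" status without a recognized alert ID falls back to checking the alert log.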