Dell DX6004S DX Object Storage Administration Guide - Page 54




Copyright © 2010 Caringo, Inc. All rights reserved.
Version 5.0, December 2010
Appendix B. Using SNMP with DX Storage
This appendix explains how to integrate a DX Storage cluster into an enterprise SNMP monitoring
infrastructure. The DX Storage SNMP agent implementation provides the mechanism through which
to monitor the health of cluster nodes, collect usage data, and control node actions.
B.1. SNMP Management Information Base (MIB) Reference
For documentation of the DX Storage Object Identifiers (OIDs) referred to in this appendix, see the
SNMP MIB:
• If you boot from a CSN, an aggregate MIB for the entire cluster is available in /usr/share/snmp/mibs.
• If you do not boot from a CSN, the MIB is located in the root directory of the DX Storage software
distribution.
B.2. Managing DX Storage Nodes
DX Storage cluster nodes are controlled through the SNMP action commands. These commands
provide a mechanism through which nodes and volumes within nodes can be taken down for service
or retired from a DX Storage cluster.
1. castorShutdownAction
2. castorRetireAction
B.2.1. Shutdown Action for Nodes
To shut down a DX Storage node gracefully, write the string “shutdown” to the
castorShutdownAction OID. Similarly, writing the string “reboot” to this OID causes the DX Storage
node to reboot.
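The write described above can be sketched with the Net-SNMP snmpset command-line tool, wrapped here in Python. This is a minimal sketch under stated assumptions, not a documented procedure: it assumes Net-SNMP is installed on the management host, that the DX Storage MIB is on the tool's MIB search path so the name castorShutdownAction resolves, that the OID is a scalar addressed with a ".0" instance suffix, and that the agent accepts SNMP v2c with a write community string. The host address and community string are placeholders.

```python
import subprocess

# The two values the guide says can be written to castorShutdownAction.
ALLOWED_ACTIONS = ("shutdown", "reboot")


def build_shutdown_command(host, action, community,
                           oid="castorShutdownAction.0"):
    """Return the snmpset argument list for a graceful stop.

    Assumptions (not taken from the guide): SNMP v2c, a scalar OID
    addressed with the ".0" suffix, and a MIB that Net-SNMP can load
    via "-m ALL" so the symbolic name resolves.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action must be one of %r" % (ALLOWED_ACTIONS,))
    return [
        "snmpset",
        "-v", "2c",        # SNMP version (assumed)
        "-c", community,   # write community string (placeholder)
        "-m", "ALL",       # load every MIB found on the search path
        host,
        oid,
        "s",               # value type: string
        action,
    ]


def send_shutdown_action(host, action, community):
    """Run snmpset and raise CalledProcessError if the write fails."""
    command = build_shutdown_command(host, action, community)
    return subprocess.run(command, capture_output=True, text=True, check=True)
```

Calling build_shutdown_command("192.0.2.10", "reboot", "example-community") returns the argument list without contacting the node, which makes it possible to review the command before send_shutdown_action runs it.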
Upon receipt of a shutdown or reboot value, the node initiates a graceful stop by unmounting all
of its volumes and removing itself from the cluster. For a shutdown, the node is powered off if
the hardware supports this. For a reboot, the node reboots the machine, re-reads the node and/or
cluster configuration files, and starts up DX Storage.
A graceful node stop is necessary in order to reboot quickly. If a node stops ungracefully, it must
perform consistency checks on all its volumes before it can rejoin the cluster.
Before shutting down or rebooting a node, check its status page or the SNMP castorErrTable OID for
critical error messages. Any critical messages logged there are cleared upon reboot.
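The pre-stop check can be sketched the same way, by walking castorErrTable with the Net-SNMP snmpwalk tool and reviewing whatever rows come back. As a sketch only, it assumes Net-SNMP is installed, that the DX Storage MIB is loadable so the table name resolves, and that the agent accepts SNMP v2c with a read community string. The layout of the returned rows is not described in this section, so the helper returns them unfiltered for an operator to read rather than trying to pick out critical entries.

```python
import subprocess


def build_error_table_walk(host, community, table="castorErrTable"):
    """Return the snmpwalk argument list for reading the error table.

    Assumptions (not taken from the guide): SNMP v2c and a MIB that
    Net-SNMP can load via "-m ALL" so the table name resolves.
    """
    return ["snmpwalk", "-v", "2c", "-c", community, "-m", "ALL", host, table]


def rows_from_walk_output(output):
    """Split snmpwalk output into non-empty, stripped lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def fetch_error_rows(host, community):
    """Walk the error table on one node and return its rows as text."""
    result = subprocess.run(
        build_error_table_walk(host, community),
        capture_output=True, text=True, check=True,
    )
    return rows_from_walk_output(result.stdout)
```

Saving the rows returned by fetch_error_rows before issuing a shutdown or reboot preserves the messages that the reboot would otherwise clear.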
B.2.2. Retire Action for Nodes and Volumes
The retire action permanently removes a node, or a volume within a node, from the cluster.
Retire is intended for retiring old hardware or for pre-emptively pushing content away from a volume
that has seen an I/O error. Retired volumes and nodes remain visible in the Admin Console until
the cluster has been rebooted.
Note
Retire is not tuned for fast completion. Completing a retire action requires at least three
health processor cycles.