Dell DX6004S DX Object Storage Administration Guide - Page 49

Copyright © 2010 Caringo, Inc.

All rights reserved

Version 5.0

December 2010

Chapter 7. Managing Volumes

In normal operation, the administrator does not need to take any action to manage DX Storage volumes. Special cases arise, however, when a volume or node has a problem or when the administrator wants to perform hardware maintenance on a node.

7.1. Volume Expiration

The DX Storage cluster is designed to adapt automatically when a volume (hard disk) or a node fails. Every volume in the cluster is checked when its node starts up. If a volume has been disconnected from the cluster for more than 14 days, it is considered "stale" and its contents are not used unless an administrator specifically overrides this behavior.
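The staleness rule above can be sketched in a few lines of Python. This is only an illustration of the 14-day check as the text describes it: the function name, the `keep` flag, and the timestamps are hypothetical and are not part of any DX Storage interface.

```python
from datetime import datetime, timedelta

# The 14-day limit is taken from the text; everything else is illustrative.
STALE_AFTER = timedelta(days=14)


def is_stale(last_seen: datetime, now: datetime, keep: bool = False) -> bool:
    """Return True if a volume would be treated as stale at node startup.

    `keep` models an administrator override (hypothetical parameter).
    """
    if keep:
        return False
    # "More than 14 days" is read here as a strict comparison.
    return now - last_seen > STALE_AFTER


now = datetime(2010, 12, 31)
print(is_stale(datetime(2010, 12, 20), now))            # False (11 days offline)
print(is_stale(datetime(2010, 12, 1), now))             # True  (30 days offline)
print(is_stale(datetime(2010, 12, 1), now, keep=True))  # False (override applied)
```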

Although the 14-day limit applies to individual volumes, shutting down a node for more than 14 days makes all of its volumes stale, and none of them are used. After 14 days, an administrator can force a volume to be remounted by adding the :k (keep) policy option to the volume specification. See Section 6.5, “Managing Volumes”, for details about how this is done.
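As a rough illustration of adding the keep option, the helper below appends `:k` to one entry of a volume list. It assumes, without confirmation from this page, that the volume specification is a space-separated list of device names with optional colon-separated policy suffixes; the function name and device paths are hypothetical, and Section 6.5 remains the authority on the actual syntax.

```python
def add_keep_policy(vols: str, device: str) -> str:
    """Append the ':k' (keep) policy to `device` in a volume list.

    Assumption: entries look like '<device>[:<policy>...]' separated by spaces.
    """
    entries = []
    for entry in vols.split():
        parts = entry.split(":")
        # Add the option only to the requested device, and only once.
        if parts[0] == device and "k" not in parts[1:]:
            entry += ":k"
        entries.append(entry)
    return " ".join(entries)


print(add_keep_policy("/dev/sda /dev/sdb", "/dev/sdb"))  # /dev/sda /dev/sdb:k
```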

When a volume that has been offline for more than 14 days is forced back into service, take care: it can resurrect content that clients had explicitly deleted. This is not a problem for content that was deleted by automatic lifepoint policies, because DX Storage’s continuous health processor discovers and deletes that content.

7.2. Movement Between Nodes

Physical volumes can be moved between nodes if this becomes necessary due to hardware failures or other constraints as determined by an administrator.

When a volume goes offline because the volume fails, its node fails, or its node is shut down, the cluster immediately begins ensuring that the correct number of replicas exists for every stream in the cluster. If the volume or node returns to the cluster during this operation and before the 14-day limit, the checks continue, but the replicas on the returned volumes are counted when validating the stream constraints.
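The effect of a returning volume on replica checks can be pictured with a small Python sketch. It is a simplified model, not the cluster's actual algorithm: the data structures, the single `required` replica count for all streams, and the volume and stream names are all assumptions made for illustration.

```python
def replica_deficits(replicas, online, required):
    """Map each under-replicated stream to the number of replicas it lacks.

    replicas: dict of stream id -> set of volume ids holding a replica
    online:   set of volume ids currently in the cluster
    required: replica count assumed to apply to every stream
    """
    deficits = {}
    for stream, volumes in replicas.items():
        live = len(volumes & online)  # only replicas on online volumes count
        if live < required:
            deficits[stream] = required - live
    return deficits


replicas = {"s1": {"v1", "v2"}, "s2": {"v2", "v3"}, "s3": {"v1", "v3"}}

# Volume v2 is offline: two streams are short by one replica each.
print(replica_deficits(replicas, online={"v1", "v3"}, required=2))
# {'s1': 1, 's2': 1}

# Volume v2 returns before the 14-day limit: its replicas count again.
print(replica_deficits(replicas, online={"v1", "v2", "v3"}, required=2))
# {}
```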

Warning

When adding volumes to a node, whether new or moved from another machine, make sure that the node has enough RAM to handle the additional storage. If the RAM is not sufficient, the node might be unable to mount some of the volumes.

Volumes can also be moved to nodes in a different cluster. The streams on such a volume become part of the new cluster and are checked for the correct constraints within the context of that cluster.

7.3. Physical Errors

To support autonomous operation, a DX Storage node watches for physical errors when reading from and writing to its volumes. If the node receives any physical error from a volume, it retires the volume immediately and sends no further requests to the failed device.
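The retire-on-first-error behavior can be modeled with a toy example. The `Volume` and `Node` classes below are hypothetical stand-ins written for this illustration; they do not reflect how DX Storage is implemented, and the simulated `OSError` merely represents a physical error reported by a device.

```python
class Volume:
    """Toy volume; `faulty` simulates a device that reports physical errors."""

    def __init__(self, name, faulty=False):
        self.name = name
        self.faulty = faulty
        self.retired = False
        self.requests = 0  # number of times the device was actually accessed

    def read(self, key):
        self.requests += 1
        if self.faulty:
            raise OSError("simulated physical read error")
        return f"{key}@{self.name}"


class Node:
    """Toy node that retires a volume on its first physical error."""

    def __init__(self, volumes):
        self.volumes = {v.name: v for v in volumes}

    def read(self, volume_name, key):
        vol = self.volumes[volume_name]
        if vol.retired:
            return None  # a retired device receives no further requests
        try:
            return vol.read(key)
        except OSError:
            vol.retired = True  # retire immediately, with no retry
            return None


node = Node([Volume("vol0"), Volume("vol1", faulty=True)])
print(node.read("vol0", "abc"))  # abc@vol0
print(node.read("vol1", "abc"))  # None (error seen, volume retired)
node.read("vol1", "abc")         # refused without touching the device
print(node.volumes["vol1"].retired, node.volumes["vol1"].requests)  # True 1
```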

Modern disk storage devices and interfaces already perform many error detection steps, bad-sector remapping, and retry attempts in the underlying disk system. If a physical error propagates up to the DX Storage software level, there is little chance that a deterministic set of steps can be performed to work around the failure. Additionally, there is no
