Compaq ProLiant 1000 Architecting and Deploying High-Availability Solutions - Page 5

Availability Options

ECG064/1198
Loss can be measured in more than money. But if money is the measure then the figures can be astounding.
In a recent study, the Standish Group (1998) reports that costs of downtime typically range from $1,000 to
$27,000 per minute. What’s more, they report that in some cases, the cost of downtime for a single incident
has exceeded $10,000,000. And if you consider the estimates of the Gartner Group as noted, the costs can
be in the billions!
Think about what downtime means to your organization.
2. Recovery Point and Recovery Time
High availability means different things to different people. At the high end it is called “continuous
availability” or “nonstop computing” and has come to mean something on the order of 99.999% uptime,
some five minutes a year of downtime. Pretty impressive. But what is your definition of high availability?
Perhaps you don’t need “five-nines” but you’d like to come as close as you can. Your requirement may not
be for continuous computing 24 hours per day, 365 days per year, but you may require that when your
system is in operation it cannot go down. An airborne surveillance and target acquisition system might be
in operation for only eight hours over the forward edge of battle, but it had better be available every second that
it’s there. Or a retail operation that does 90% of its business during a holiday season had better not go down
during those few weeks or months. Each type of availability may demand very different requirements.
The first thing to keep in mind, though, is that defining availability depends on your needs in terms of
Recovery Point and Recovery Time.
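The "five-nines" figure and the per-minute cost range quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the function names are invented for this example, a 365-day year is assumed, and the default cost bounds are simply the $1,000 to $27,000 per minute range cited from the Standish Group.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year implied by an uptime percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def downtime_cost_range(minutes: float,
                        low_per_minute: float = 1_000,
                        high_per_minute: float = 27_000) -> tuple:
    """Low and high cost estimates for a given number of downtime minutes.

    The defaults are the per-minute range quoted in the text; substitute
    your own organization's figures.
    """
    return minutes * low_per_minute, minutes * high_per_minute


for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    low, high = downtime_cost_range(minutes)
    print(f"{pct}% uptime: {minutes:,.1f} min/year, ${low:,.0f} to ${high:,.0f}")
```

At 99.999% this works out to a little over five minutes of downtime a year, which matches the "some five minutes" quoted above. Each nine removed multiplies the downtime by ten.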
While the inherent reliability of Information Systems has been increasing, things still do happen that cause
applications to stop. Disaster Recovery specialists tend to examine the impact possibilities in terms of
Recovery Point -- the amount of “acceptable” loss -- and Recovery Time -- the amount of time needed to
get back in operation. Recovery Point is most important in data-centric operations where the loss of data is
unacceptable. Recovery Time is most important in transaction-centric operations where real-time continuity
is key.
Do you need fast recovery, or recovery to the exact state prior to the failure… or both? What is the impact
on your operations measured by a Recovery Point standard? If you don’t resume processing right where
you left off will it be inconvenient? Damaging? Catastrophic? What is the most effective and efficient
method to use to recover the information? What is the impact on business measured in Recovery Time? If
you don’t resume processing within a second will it be inconvenient? Damaging? Catastrophic?
Thus the recovery strategy you use depends on this assessment of Recovery Point and Recovery Time. The
diagram below displays four availability options measured in those terms.
[Figure: Availability Options. Axes: Recovery Time (from machine cycles to weeks) and Recovery Point (from 0 transactions to 1000's of transactions). Labels in the plot: On-Line, Hot Backup, Remote Hot Sites, Electronic Vaulting, and 24 x 365.]
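One way to make the assessment concrete is to write down your requirement as a pair of limits, a maximum Recovery Time and a maximum Recovery Point, and test each candidate strategy against both. The sketch below is a hypothetical illustration, not a method from this paper: the class, the function, the candidate names, and every number in it are invented placeholders and are not read from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryProfile:
    """A Recovery Time / Recovery Point pair (all values are placeholders)."""
    name: str
    recovery_time_s: float   # seconds needed to get back in operation
    recovery_point_tx: int   # transactions that may be lost


def meets(requirement: RecoveryProfile, candidate: RecoveryProfile) -> bool:
    """True if the candidate recovers fast enough AND loses little enough."""
    return (candidate.recovery_time_s <= requirement.recovery_time_s
            and candidate.recovery_point_tx <= requirement.recovery_point_tx)


# Hypothetical requirement: back within a minute, no lost transactions.
requirement = RecoveryProfile("order entry (hypothetical)", 60.0, 0)

# Hypothetical candidates; real figures come from your own assessment.
candidates = [
    RecoveryProfile("candidate A", 1.0, 0),
    RecoveryProfile("candidate B", 3600.0, 10),
    RecoveryProfile("candidate C", 7 * 24 * 3600.0, 5000),
]

acceptable = [c.name for c in candidates if meets(requirement, c)]
print(acceptable)  # prints ['candidate A']
```

Treating the two limits as independent tests mirrors the questions posed earlier: a strategy that recovers quickly but loses data, or preserves data but takes days to restore, fails a requirement that demands both.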