Compaq ProLiant 1000 Architecting and Deploying High-Availability Solutions - Page 8

Availability Technologies



ECG064/1198
the high-availability goals, once met, are maintained. These capabilities are very often described as
Availability Reviews, Disaster Recovery Services, or Business Impact Analyses and are primarily consulting
services to help you understand and/or implement a high-availability environment. In addition, at each
intersection in the matrix there are specific products designed to address the level of availability you
require.
What is most critical to know is that these technologies are available and that they play key roles in
implementing high-availability solutions. Your choice of an appropriate partner that has “been there, done
that” can make the difference in implementing the solution.
5. Availability Technologies
As we have seen thus far, there are a number of important elements to consider in architecting a
high-availability solution: the cost of downtime, the trade-offs between recovery point and recovery time,
and potential negative events. Only after these elements have been qualified and quantified can the next
step be taken: choosing a technical strategy for achieving the level of availability required by a
particular environment. The next few paragraphs provide a quick overview of some availability technology
options.
Power supply
Implementing solutions for any of the situations described requires some element of redundancy. At the
component level the most basic element is the power supply to the system. Uninterruptible Power Supplies
(UPS) are a common tool to deal with the possibility of power outages. Multiple power sources, dual
battery feeds, and power connections should also be considered in planning for component failures.
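The value of redundant components can be illustrated with a little arithmetic. The sketch below is not from the original paper: it assumes that the redundant units fail independently and that any single working unit is enough to keep the system up. The function names and the 99% figure are hypothetical, chosen only for illustration.

```python
HOURS_PER_YEAR = 8760  # 365 days, ignoring leap years


def parallel_availability(unit_availability: float, units: int) -> float:
    """Availability of `units` redundant components when any one suffices.

    Assumes independent failures, which real power feeds sharing a
    building or utility may not satisfy.
    """
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The group is down only when every unit is down at the same time.
    return 1.0 - (1.0 - unit_availability) ** units


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


single = parallel_availability(0.99, 1)  # one hypothetical 99% power source
dual = parallel_availability(0.99, 2)    # two such sources in parallel
print(f"single: {downtime_hours_per_year(single):.2f} h/year")
print(f"dual:   {downtime_hours_per_year(dual):.2f} h/year")
```

Under these assumptions a second power source reduces expected downtime by roughly a factor of one hundred, which is why redundancy is usually the first measure considered.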
Storage
Once the power supply is assured, the next element to consider in a particular environment might be
storage. The ability to maintain redundancy of the data and applications is key to any recovery situation.
After all, physical storage devices are electromechanical and that in itself makes them more failure-prone
than other elements in the environment.
Database
Database replication is a combination of hardware and software implementations specifically focused on
protecting the data in your database. It may or may not require redundant physical storage. It provides
application-transparent database backup and may also provide for use of multiple storage media.
Backup and Restore is typically a capability that integrates a number of database and storage technologies
and media with the intent of providing both database backup and fast restoration capabilities.
Among other things, a transaction processing (TP) monitor manages system resources, replication, load
balancing, and failover capabilities. TP monitors also enhance security and use a two-phase commit process
to ensure that a transaction has completed before its completion is confirmed.
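The two-phase commit process mentioned above can be sketched in a few lines. This is a minimal in-memory illustration, not the behavior of any particular TP monitor: the `Participant` class and `two_phase_commit` function are hypothetical names, and a real coordinator would also log its decisions durably and handle timeouts.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """A hypothetical resource manager, such as one database in a transaction."""
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote on whether the local work can be made permanent.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list) -> bool:
    """Return True only if every participant committed."""
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit or roll back): act on the unanimous decision.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


orders = Participant("orders-db")
billing = Participant("billing-db", can_commit=False)
print(two_phase_commit([orders, billing]), orders.state, billing.state)
```

Completion is confirmed to the caller only after every participant has voted to commit; a single refusal rolls back all of them, so no participant is left holding a partial transaction.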
Processors, operating systems, and interconnects
Fault-tolerant (FT) is a term used to describe the ability of a server or solution to tolerate failures. FT
includes both hardware and software components.
Hardware fault-tolerant: The ideal fault-tolerant solution is where the recovery point is instantaneous
through the use of alternate data paths to deliver continuous availability. To deliver continuous
availability, the components of the system need to operate in a fast-fail mode, that is, to identify the
problem quickly, isolate it from the integrity of operation of the total solution, and recover by using
alternate paths. (Note that this ability to route around failures allows the user to select a lower level of
support response time as mean-time-to-repair (MTTR) is not a critical component in ensuring the availability
of the solution.)
Software fault-tolerant: The ability to recover from software failures. The main cause of server failures in
today’s architectures is software. The ability to tolerate software failures is a key requirement for the
delivery of continuous availability.
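The identify, isolate, and recover sequence described under hardware fault tolerance can be sketched in software terms. The example below is an illustrative assumption rather than a description of any Compaq product: `PathFailed`, `send_with_failover`, and the two sample paths are hypothetical names, and each path is modeled as a plain function.

```python
from typing import Callable, List, Sequence, Tuple


class PathFailed(Exception):
    """Raised by a path that detects its own failure quickly (fast-fail)."""


def send_with_failover(
    paths: Sequence[Tuple[str, Callable[[str], str]]],
    payload: str,
) -> Tuple[str, List[str]]:
    """Try each path in order; return the result and the paths isolated."""
    isolated: List[str] = []
    for name, send in paths:
        try:
            result = send(payload)
        except PathFailed:
            # Identify: the path reported its failure immediately.
            # Isolate: record it so it can be repaired later.
            isolated.append(name)
            continue  # Recover: fall through to the next alternate path.
        return result, isolated
    raise RuntimeError("all paths failed")


def broken_path(payload: str) -> str:
    raise PathFailed("link down")


def healthy_path(payload: str) -> str:
    return f"delivered:{payload}"


result, isolated = send_with_failover(
    [("path-a", broken_path), ("path-b", healthy_path)], "block-42"
)
print(result, isolated)
```

Because the request succeeds on the alternate path, the failed path can be repaired later without affecting service, which is the reasoning behind the note that MTTR is not critical in such a design.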