HP DL740 hot plug RAID memory technology for fault tolerance and scalability - Page 2


hot plug RAID memory technology for fault tolerance and scalability
abstract
This technology brief describes the Hot Plug RAID Memory technology developed by HP
to give enterprise-class servers the level of memory fault tolerance today’s 7x24
applications demand. It provides background information on memory reliability, reviews
current error detection and correction techniques, and explains why the likelihood of
memory errors grows as memory capacity increases. It discusses Hot Plug RAID Memory
in depth and provides information on less robust, alternative fault-tolerant memory
solutions.
introduction
The 1990s brought fundamental changes in enterprise computing. The proliferation of
web browsers and the Internet led to a dynamic, global marketplace that demands
instant answers, products, and services. Customer requirements for a high-performance,
highly available, and easily managed computing infrastructure have increased
exponentially.
As a result, the changes of the 1990s spurred innovation in one of the most critical
subsystems of enterprise-class servers: memory. Operating system support for more than
4 gigabytes (GB) of memory and availability of low-cost, high-capacity memory modules
have driven requirements to support unprecedented memory capacity in today’s
industry-standard servers. Recent ProLiant servers support up to 64 GB of memory, and
memory capacities will continue to grow in the near future.
Error checking and correcting (ECC) memory, introduced in PC servers in 1992, still
offers excellent protection for many servers. As memory capacity grows, however, the
level of effectiveness ECC provides actually decreases.
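
One way to picture why protection weakens as capacity grows is a simple probability sketch. It assumes, purely for illustration, that each memory module independently has the same small chance of an error that ECC cannot correct during some period. The function name and the probability value below are hypothetical and are not taken from HP data.

```python
def prob_any_error(p_per_module: float, modules: int) -> float:
    """Chance that at least one of `modules` independent modules has an error.

    Assumes every module has the same error probability `p_per_module`
    and that modules fail independently (a simplifying assumption).
    """
    return 1.0 - (1.0 - p_per_module) ** modules


# Hypothetical per-module probability, chosen only to show the trend.
P_ILLUSTRATIVE = 0.001

for n in (4, 16, 64):
    print(f"{n:>3} modules: {prob_any_error(P_ILLUSTRATIVE, n):.4f}")
```

Under these assumptions the chance of at least one such error rises with every module added, even though no individual module becomes less reliable.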
HP developed Hot Plug RAID Memory to extend the effectiveness of ECC and give
enterprise-class servers the level of memory fault tolerance today’s 7x24 applications
demand. Hot Plug RAID Memory provides redundancy and hot-plug capabilities for
industry-standard dual inline memory modules (DIMMs) to deliver unprecedented levels
of availability, scalability, and fault tolerance.
memory reliability
A well-designed memory subsystem, such as those employed in ProLiant servers, can be
extremely reliable. The memory subsystems in ProLiant servers are designed and
extensively tested to ensure the highest quality possible, and their memory modules
undergo extensive qualification through the HP World Class Suppliers Process to ensure
compliance with industry-standard specifications.
Memory system integrity begins with the reliability of the DIMMs. All ProLiant servers use
industry-standard DIMMs, but just meeting industry standards is not enough. Rigorous
testing also ensures that all DIMMs in ProLiant servers meet exacting electrical standards.
Because memory is an electronic storage device, it has the potential to return information
different from what was originally stored. Dynamic random access memory (DRAM)
stores ones and zeros as charges on extremely small capacitors that must be frequently
refreshed to ensure the data is not lost. Each bit of memory holds either a zero or a one,
as in any digital system. A relatively small electrical disturbance near the memory
cell can alter the amount of charge on the capacitor, changing the state of the data bit
stored in that memory cell and causing a memory data error.
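
The effect of one cell changing state can be sketched in a few lines. The example below is only an illustration: it flips a single bit in a stored byte and uses one even-parity bit to notice the change. A single parity bit is a much simpler check than ECC and is not the mechanism used by Hot Plug RAID Memory. The function names and the sample value are hypothetical.

```python
def flip_bit(value: int, position: int) -> int:
    """Return `value` with the bit at `position` inverted."""
    return value ^ (1 << position)


def even_parity(value: int) -> int:
    """Return the parity bit: 1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") % 2


stored = 0b10110010                  # sample data written to memory
stored_parity = even_parity(stored)  # parity bit saved alongside the data

read_back = flip_bit(stored, 3)      # a disturbance changes one cell
error_detected = even_parity(read_back) != stored_parity

print(f"stored={stored:08b} read={read_back:08b} detected={error_detected}")
```

A single parity bit can reveal that one bit changed but cannot say which bit, and it misses the case where two bits change at once.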