HP DL740 hot plug RAID memory technology for fault tolerance and scalability - Page 4




figure 1: server outages during a one-year period due to memory failures (see footnote 1)
[Chart not reproduced. X-axis: Memory Capacity (64 MB, 1 GB, 16 GB). Y-axis: Cumulative
failures per 10,000 systems (logarithmic scale, 1 to 1,000,000). Series: Parity, ECC.
Data labels shown: 120%, 75%, 4.6%, 48%, 3%, .3%. Callouts: "ECC for large memory
systems is only about as good as parity checking is for smaller capacities" and
"Nearly 50% system failures per year".]
hot plug RAID memory
To help meet the availability and scalability demands of today’s eBusiness world, HP
developed a solution that lets customers use industry-standard memory technology while
increasing server fault tolerance, memory capacity, and server availability. Hot Plug
RAID Memory provides a level of protection far greater than standard ECC-based
solutions and detects errors that would otherwise go undetected (table 1).
table 1: comparison of protection provided by parity checking, ECC, and Hot Plug RAID Memory

Error Condition      Parity        Standard ECC   RAID Memory
Single-bit           Detect        Correct        Correct
Double-bit           Unreliable    Detect         Correct
4-bit DRAM           Unreliable    Detect         Correct
8-bit DRAM           Unreliable    Unreliable     Correct
Greater than DRAM    Unreliable    Unreliable     Detect
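
The comparison in table 1 can also be written as a small lookup, which makes it easy to ask which schemes correct a given error condition. This is only an illustrative sketch: the dictionary name PROTECTION and the helper schemes_that_correct are hypothetical, and the values are copied directly from table 1.

```python
# Table 1 encoded as: error condition -> protection scheme -> outcome.
# Values are transcribed from table 1; the names used here are illustrative only.
PROTECTION = {
    "Single-bit":        {"Parity": "Detect",     "Standard ECC": "Correct",    "RAID Memory": "Correct"},
    "Double-bit":        {"Parity": "Unreliable", "Standard ECC": "Detect",     "RAID Memory": "Correct"},
    "4-bit DRAM":        {"Parity": "Unreliable", "Standard ECC": "Detect",     "RAID Memory": "Correct"},
    "8-bit DRAM":        {"Parity": "Unreliable", "Standard ECC": "Unreliable", "RAID Memory": "Correct"},
    "Greater than DRAM": {"Parity": "Unreliable", "Standard ECC": "Unreliable", "RAID Memory": "Detect"},
}


def schemes_that_correct(condition):
    """Return the schemes that table 1 lists as correcting the given error condition."""
    return [scheme for scheme, outcome in PROTECTION[condition].items()
            if outcome == "Correct"]


print(schemes_that_correct("8-bit DRAM"))  # ['RAID Memory']
```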
For years, the computer industry has used redundant array of independent disks (RAID)
technology to provide fault tolerance and high availability for disk drive subsystems in
servers. The technology used in Hot Plug RAID Memory is conceptually similar to RAID
storage technology. In the context of the memory solution, however, RAID stands for
redundant array of industry-standard DIMMs.
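
This page does not describe the mechanism itself, so the following is a minimal sketch of the RAID idea it alludes to, assuming simple XOR parity of the kind used in parity-based RAID storage. The number of data DIMMs, the word values, and the function names xor_parity and rebuild are all hypothetical and are not taken from the HP design.

```python
from functools import reduce


def xor_parity(words):
    """XOR a list of equal-width integer words into a single parity word."""
    return reduce(lambda a, b: a ^ b, words, 0)


def rebuild(words, parity, failed_index):
    """Recover the word at failed_index from the surviving words and the parity word."""
    survivors = [w for i, w in enumerate(words) if i != failed_index]
    return xor_parity(survivors) ^ parity


# Hypothetical example: one word stored on each of four data DIMMs,
# plus a parity word stored on a fifth DIMM.
data = [0x12, 0x34, 0x56, 0x78]
parity = xor_parity(data)

# Simulate one DIMM returning bad data, then rebuild its word
# from the other three DIMMs and the parity word.
corrupted = list(data)
corrupted[2] = 0xFF
assert rebuild(corrupted, parity, 2) == 0x56
```

Because the parity word is the XOR of all data words, XOR-ing the survivors with the parity cancels everything except the missing word. That is why a scheme of this kind can, in principle, tolerate the loss of an entire device rather than only a few bits.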
1 Source: Timothy J. Dell, “A White Paper on the Benefits of Chipkill-Correct ECC for PC Server Main Memory,” IBM Microelectronics Division, Rev. 11/19/97