HP DL740 HP F8 Architecture Technology Brief - Page 8


Architectural Differences From Storage Subsystem RAID

The technology used in HP Hot-Plug RAID Memory is conceptually similar to RAID technology that provides fault tolerance and high availability in storage subsystems for servers. However, there are some key performance and implementation differences between Hot-Plug RAID Memory and typical storage subsystem RAID.

Hot-Plug RAID Memory does not have the mechanical delays of seek time and rotational latency associated with hard disk drive arrays. Storage subsystem arrays use a single bus to write the stripes sequentially across multiple drives. In contrast, HP Hot-Plug RAID Memory uses parallel point-to-point connections so that data is written simultaneously across multiple memory cartridges.

Also, HP Hot-Plug RAID Memory eliminates the write bottleneck associated with typical storage subsystem RAID implementations. In a storage array, the RAID controller generally must read the existing parity before it can complete a write operation. If a dedicated parity drive is being used, that drive becomes a bottleneck. Because HP Hot-Plug RAID Memory operates on an entire cache line of data, there is no need to read existing parity before a write operation, which eliminates this performance bottleneck.
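The difference between the two write paths can be sketched in a few lines. This is a minimal illustration, assuming simple XOR parity over four equal-sized data blocks plus one parity block. The function names, block sizes, and the 4+1 layout are illustrative assumptions, not the F8 chipset's actual logic.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def partial_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Storage-style small write: the old data and old parity must be READ first."""
    return xor_blocks(old_parity, old_data, new_data)

def full_line_parity(blocks: list[bytes]) -> bytes:
    """Full-line write: parity is computed from the new data alone, with no reads."""
    return xor_blocks(*blocks)

# Hypothetical stripe: four 2-byte data blocks (assumed sizes, for illustration only).
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"]
parity = full_line_parity(stripe)

# Updating a single block the storage way needs the old block and the old parity.
new_block = b"\xff\x00"
updated_parity = partial_write_parity(stripe[1], parity, new_block)
stripe[1] = new_block
```

Both paths arrive at the same parity value. The difference is that `partial_write_parity` depends on two values that must be fetched before the write can finish, while `full_line_parity` needs only the data being written.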

When a traditional striped RAID storage subsystem rebuilds data, there is no data protection should another drive fail. However, the F8 chipset operates in a typical (nonredundant) ECC mode while data is being rebuilt. As a result, even if a secondary memory failure occurs during a rebuild operation, the data is protected by ECC.
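To make the rebuild step concrete, the following minimal sketch regenerates one missing block from the survivors. It assumes XOR parity over a hypothetical 4+1 layout; the names and sizes are illustrative, and the ECC protection described above is not modeled.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_block(surviving_blocks):
    """XOR of all surviving blocks (data and parity) yields the missing block."""
    return xor_blocks(surviving_blocks)

# Hypothetical line: four data blocks plus one parity block (assumed layout).
data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"]
line = data + [xor_blocks(data)]

lost_index = 2  # pretend this block's cartridge was removed
survivors = [blk for i, blk in enumerate(line) if i != lost_index]
recovered = rebuild_block(survivors)
```

With one block already missing, single parity alone cannot recover from a second loss, because every surviving block is needed for the reconstruction. That is the exposure a separate ECC layer covers during the rebuild window.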

F8 Crossbar Switch

One of the key advantages that the Profusion architecture has over other 8-way designs is its use of a nonblocking, multiported crossbar switch. This switch allows simultaneous communication among the processors, I/O, and memory. The F8 architecture also uses a nonblocking, multiported crossbar switch that provides even higher performance than the Profusion crossbar switch and accommodates increased processor speeds and peripheral bandwidths. The F8 chipset also includes a cache coherency filter, or cache accelerator, similar to that in the Profusion architecture. The cache coherency filter removes (or filters) unnecessary snoop cycles on the processor buses.
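The general idea of filtering snoop cycles can be illustrated with a toy model. This is a sketch under stated assumptions, not the F8 design: the class name, the two-bus setup, and the tracking structure are hypothetical, and invalidation and eviction are omitted.

```python
class SnoopFilter:
    """Toy model: tracks which processor bus may hold each cache line."""

    def __init__(self, buses=("A", "B")):
        self.holders = {bus: set() for bus in buses}
        self.snoops_sent = 0
        self.snoops_filtered = 0

    def access(self, bus, line_addr):
        """Record an access from `bus`; return the other buses that must be snooped."""
        targets = []
        for other, lines in self.holders.items():
            if other == bus:
                continue
            if line_addr in lines:
                targets.append(other)      # the other bus may cache this line
                self.snoops_sent += 1
            else:
                self.snoops_filtered += 1  # no copy can exist there: skip the snoop
        self.holders[bus].add(line_addr)
        return targets

snoop_filter = SnoopFilter()
first = snoop_filter.access("A", 0x1000)   # bus B never held the line
second = snoop_filter.access("B", 0x1000)  # bus A may hold the line
third = snoop_filter.access("B", 0x2000)   # nobody else holds this line
```

In this model only the second access generates a snoop on the other bus; the other two are filtered, which is the bus bandwidth saving the paragraph above describes.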

HP engineers designed the F8 crossbar switch to increase bus efficiency far beyond that of the Profusion crossbar switch. The design includes:

• Larger and reorganized buffers. The F8 crossbar switch can hold 128 cache lines, twice the number that the Profusion chipset can hold in its buffers.

• More ports. The F8 crossbar switch has thirteen read and four write ports, compared with five read and five write ports used in the Profusion chipset. This increases the number of transactions that can run concurrently.

• Optimized cross-bus traffic through a patent-pending algorithm. Optimizing the cross-bus traffic significantly enhances the ability to scale beyond 4-way multiprocessing.

Buffer Design

The Profusion chipset uses a single centralized buffer, or queue, for storing data requests. In certain cases, a processor on one bus could request the same address as a processor on the other bus, so the chipset must arbitrate to decide which request is granted first. The other request has to go through a retry process, using up additional bandwidth on the processor bus.

In the F8 architecture, the crossbar switch (Figure 5) contains a separate buffer for each of the processor buses, the I/O subsystem, and the memory subsystem. The buffers in the

