HP ML530 RDMA protocol: improving network performance - Page 6

RDMA data transfer operations, Send operations, send with invalidate message



DDP allows upper layer protocol (ULP) data, such as application messages or disk I/O, contained
within DDP segments to be placed directly into its final destination in memory without processing by
the ULP. This may occur even when the DDP segments arrive out of order.
A DDP segment is the smallest unit of data transfer for the DDP protocol. It includes a DDP header and
ULP payload. The DDP header contains control and placement fields that define the final destination
for the payload, which is the actual data being transferred.
A DDP message is a ULP-defined unit of data interchange that is subdivided into one or more DDP
segments. This segmentation may occur for a variety of reasons, including segmentation to respect the
maximum segment size of TCP. A sequence of DDP messages is called a DDP stream.
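The segmentation described above can be sketched as follows. This is a minimal illustration, not the DDP wire format: the `DdpSegment` fields, `segment_message`, `place_segments`, and the `max_payload` parameter are hypothetical names chosen for the example. It shows that because each segment carries its own placement information, the receiver can place segments in any arrival order.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DdpSegment:
    """Simplified stand-in for a DDP segment (header fields plus ULP payload)."""
    msg_offset: int   # hypothetical placement field: byte offset within the message
    last: bool        # hypothetical control field: True on the final segment
    payload: bytes    # the actual data being transferred


def segment_message(message: bytes, max_payload: int) -> List[DdpSegment]:
    """Split one DDP message into segments of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    segments = []
    for off in range(0, len(message), max_payload):
        chunk = message[off:off + max_payload]
        is_last = off + len(chunk) == len(message)
        segments.append(DdpSegment(off, is_last, chunk))
    # An empty message still produces one (empty) final segment.
    return segments or [DdpSegment(0, True, b"")]


def place_segments(segments: Iterable[DdpSegment], size: int) -> bytes:
    """Place each payload at the offset named in its header, in arrival order."""
    buf = bytearray(size)
    for seg in segments:
        buf[seg.msg_offset:seg.msg_offset + len(seg.payload)] = seg.payload
    return bytes(buf)
```

In this sketch `max_payload` plays the role of the room left in a TCP segment after headers; a real implementation would derive it from the connection's maximum segment size.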
DDP uses two data transfer models: a tagged buffer model and an untagged buffer model.
The tagged buffer model is used to transfer tagged buffers between the two members of the transfer (the
local peer and the remote peer). A tagged buffer is explicitly advertised to the remote peer through
exchange of a steering tag (STag), tagged offset, and length. An STag is simply an identifier of a
tagged buffer on a node, and the tagged offset identifies the base address of the buffer. Tagged
buffers are typically used for large data transfers, such as large data structures and disk I/O.
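As a rough sketch of the tagged buffer model, the toy registry below advertises a buffer as an (STag, tagged offset, length) triple and then places incoming payloads at the offset carried with each segment. The class `TaggedBuffers` and its methods are hypothetical names for illustration only; real RNICs manage STags in hardware and enforce access rights that this sketch omits.

```python
class TaggedBuffers:
    """Toy registry of advertised buffers, keyed by STag (illustrative only)."""

    def __init__(self):
        self._buffers = {}
        self._next_stag = 1

    def register(self, length):
        """Create a buffer and return the triple advertised to the remote peer."""
        stag = self._next_stag
        self._next_stag += 1
        self._buffers[stag] = bytearray(length)
        return stag, 0, length  # (STag, tagged offset, length)

    def place(self, stag, tagged_offset, payload):
        """Place payload directly into the buffer named by the STag."""
        buf = self._buffers[stag]  # raises KeyError for an unknown STag
        end = tagged_offset + len(payload)
        if tagged_offset < 0 or end > len(buf):
            raise ValueError("placement outside the advertised buffer")
        buf[tagged_offset:end] = payload

    def read(self, stag):
        """Return the current contents of the buffer (for inspection)."""
        return bytes(self._buffers[stag])
```

Because every placement names its own STag and offset, the second half of a transfer can be placed before the first half without any reordering buffer.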
The untagged buffer model is used to transfer untagged buffers from the local peer to the remote peer.
Untagged buffers are not explicitly advertised to the remote peer. Untagged buffers are typically used
for small control messages, such as operation and I/O status messages.
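For contrast, the untagged model can be sketched as a queue of receive buffers that the data target posts in advance without advertising them. The sketch assumes each incoming message simply consumes the next posted buffer in order; `UntaggedQueue` and its methods are hypothetical names, and the queue numbers and sequence numbers a real DDP implementation uses are left out.

```python
from collections import deque


class UntaggedQueue:
    """Toy receive queue: buffers are posted locally, never advertised."""

    def __init__(self):
        self._posted = deque()
        self.delivered = []

    def post_receive(self, size):
        """The data target's ULP makes a buffer of the given size available."""
        self._posted.append(bytearray(size))

    def receive(self, payload):
        """Place an incoming untagged message into the next posted buffer."""
        if not self._posted:
            raise RuntimeError("no receive buffer posted")
        if len(payload) > len(self._posted[0]):
            raise ValueError("message larger than posted buffer")
        buf = self._posted.popleft()
        buf[:len(payload)] = payload
        self.delivered.append(bytes(buf[:len(payload)]))
```

The sender never learns where the data lands; it only relies on the target having posted a buffer large enough for the message.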
RDMA data transfer operations
The RDMA protocol provides seven data transfer operations. Except for the RDMA read operation,
each operation generates exactly one RDMA message. The RDMA information is carried in
fields of the DDP header.
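The relationship between operations and messages can be summarized in a small table. The four send variants are listed in this document; the remaining three entries (RDMA write, RDMA read, and terminate) and the names of the two messages for an RDMA read are taken from the RDMA protocol specification rather than from this page, so treat them as an assumption of the sketch. The dictionary name and keys are illustrative.

```python
# Operation -> RDMA messages it generates (names are illustrative labels).
MESSAGES_PER_OPERATION = {
    "send": ["Send"],
    "send_with_invalidate": ["Send with Invalidate"],
    "send_with_solicited_event": ["Send with Solicited Event"],
    "send_with_solicited_event_and_invalidate": [
        "Send with Solicited Event and Invalidate"
    ],
    "rdma_write": ["RDMA Write"],
    # The one exception: a read needs a request and a response.
    "rdma_read": ["RDMA Read Request", "RDMA Read Response"],
    "terminate": ["Terminate"],
}

# Operations that generate more than one message.
multi_message = [
    op for op, msgs in MESSAGES_PER_OPERATION.items() if len(msgs) != 1
]
```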
With an RDMA-aware network interface controller (RNIC), the data target and data source host
processors are not involved in the data transfer operations, so they can continue to do useful work.
The RNIC generates outgoing RDMA packets and processes incoming ones: the data is
placed directly where the application advertises that it wants the data to go and is pulled from where
the application indicates the data is located. This eliminates the copies of data that occur in the
traditional operating system protocol stack on both the send and receive sides.
Send operations
RDMA uses four variations of the send operation:
  • Send operation
  • Send with invalidate operation
  • Send with solicited event
  • Send with solicited event and invalidate
A send operation transfers data from the data source (the peer sending the data payload) into a
buffer that has not been explicitly advertised by the data target (the peer receiving the data payload).
The send message uses the DDP untagged buffer model to transfer the ULP message into the untagged
buffer of the data target. Send operations are typically used to transfer small amounts of control data
where the small amount of memory bandwidth consumed by the data copy does not justify the
overhead of creating an STag for DDP.
The send with invalidate message includes all functionality of the send message, plus the capability to
invalidate a previously advertised STag. After the message has been placed and delivered at the data
target, the data target’s buffer identified by the STag included in the message can no longer be
accessed remotely until the data target’s ULP re-enables access and advertises the buffer again.
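The effect of send with invalidate on the data target can be sketched as a per-STag access flag. This is a simplified model under stated assumptions: `DataTarget` and its method names are hypothetical, and the sketch tracks only whether remote access is currently allowed, not the buffer contents or the delivery semantics of a real RNIC.

```python
class DataTarget:
    """Toy data target that tracks which STags are valid for remote access."""

    def __init__(self):
        self.remote_access = {}  # STag -> True while remote access is allowed
        self.delivered = []      # ULP messages delivered by send operations

    def advertise(self, stag):
        """The ULP enables remote access and advertises the buffer."""
        self.remote_access[stag] = True

    def check_remote_access(self, stag):
        """Reject remote access to an STag that is unknown or invalidated."""
        if not self.remote_access.get(stag, False):
            raise PermissionError(f"STag {stag} is not valid for remote access")

    def on_send(self, payload):
        """Plain send: deliver the message to the ULP."""
        self.delivered.append(payload)

    def on_send_with_invalidate(self, payload, stag):
        """Deliver the message first, then invalidate the named STag."""
        self.on_send(payload)
        self.remote_access[stag] = False
```

A typical use is for the data source to signal completion and revoke its own access in one message, so the data target does not need a separate exchange to reclaim the buffer; calling `advertise` again models the ULP re-enabling access.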