Seagate ST3500630A Serial ATA Native Command Queuing (670K, PDF) - Page 4

Serial ATA Native Command Queuing

Keep in mind that re-ordering pending commands based strictly on where the heads end up, over the logical block address (LBA) of the last completed command, is not the most efficient solution. Much like an elevator that will not screech to a halt when a person pushes the button for a floor just being passed, HDDs use complex algorithms to determine the best command to service next. The complexity involves possible head switching, seek times to different tracks, and different modes of operation, for example, quiet seeks. Parameters taken into account include seek length, starting location and direction, acceleration profiles of actuators, rotational positioning (which includes differences between read and write settle times), read cache hits vs. misses, write cache enabled vs. disabled, I/O processes that address the same LBAs, and fairness algorithms to eliminate command starvation, to mention a few.

Rotational Latencies

Rotational latency is the amount of time it takes for the starting LBA to rotate under the head after the head is on the right track. In the worst-case scenario, the drive wastes one full rotation before it can access the starting LBA and then continue to read the remaining target LBAs. Rotational latency depends on the spindle RPM: a 7200-RPM drive has a worst-case rotational latency of 8.3 msec, a 5400-RPM drive needs up to 11.1 msec, and a 10K-RPM drive has up to 6 msec. With a random distribution of starting LBAs relative to the angular position of the drive’s head, the average rotational latency is one half of the worst-case latency.
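The figures above follow directly from the time of one platter rotation. A minimal sketch of that arithmetic, in which the function names are illustrative and not part of any drive interface:

```python
def worst_case_rotational_latency_ms(rpm: float) -> float:
    """Worst case: the platter must make one full rotation (60,000 msec per minute / RPM)."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """Average over uniformly distributed starting positions: half a rotation."""
    return worst_case_rotational_latency_ms(rpm) / 2.0


for rpm in (5400, 7200, 10_000):
    worst = worst_case_rotational_latency_ms(rpm)
    avg = average_rotational_latency_ms(rpm)
    print(f"{rpm:>6} RPM: worst case {worst:.1f} msec, average {avg:.2f} msec")
```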

I/O delays on the order of milliseconds are quite dramatic compared to the overall performance of any modern system. This is particularly true where modern operating systems use multi-threading or where Hyper-Threading Technology allows quasi-simultaneous execution of independent workloads, all of which need data from the same drive almost simultaneously.

Higher-RPM spindles are one approach to reducing rotational latencies. However, increasing spindle RPM carries a substantial additional cost. Rotational latencies can also be minimized by two other approaches. The first is to re-order the commands outstanding to the drive in such a way that the rotational latency is minimized. This optimization is similar to the linear optimization that reduces seek latencies, but instead takes into account the rotational position of the drive head in determining the best command to service next. A second-order optimization is to use a feature called out-of-order data delivery. Out-of-order data delivery means that the head does not need to access the starting LBA first but can start reading the data at any position within the target LBAs. Instead of wasting the fraction of a rotation needed to return to the first LBA of the requested data chunk, the drive starts reading the requested data as soon as it has settled on the correct track and adds the missing data at the end of the same rotation.

With out-of-order data delivery, in the worst case the entire transfer completes within exactly one rotation of the platter. Without out-of-order data delivery, the worst-case time needed to complete the transfer is one rotation plus the time it takes to rotate over all target LBAs.
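The two worst cases can be compared with a simplified single-track model. This is a sketch under stated assumptions, not a description of real drive firmware: `offset` (the fraction of a rotation until the starting LBA reaches the head) and `span` (the fraction of the track occupied by the target LBAs) are hypothetical parameters introduced here for illustration, and settle time is ignored.

```python
def transfer_time_ms(rpm: float, offset: float, span: float, out_of_order: bool) -> float:
    """Time from settling on the track until all target LBAs have been read.

    offset: fraction of a rotation until the starting LBA arrives, 0 <= offset < 1
    span:   fraction of the track covered by the target LBAs, 0 < span <= 1
    Both parameters are assumptions of this simplified model.
    """
    if not (0.0 <= offset < 1.0 and 0.0 < span <= 1.0):
        raise ValueError("offset must be in [0, 1) and span in (0, 1]")
    rotation_ms = 60_000.0 / rpm
    # The head has settled inside the target span if the starting LBA
    # passed under it less than `span` of a rotation ago.
    head_inside_span = (1.0 - offset) < span
    if out_of_order and head_inside_span:
        # Read the tail immediately, pick up the head of the data later
        # in the same rotation: exactly one rotation in total.
        return rotation_ms
    # Otherwise wait for the starting LBA, then read the whole span in order.
    return (offset + span) * rotation_ms


# Head settles just after the starting LBA has passed (7200 RPM, data covers 1/4 track).
print(transfer_time_ms(7200, offset=0.9, span=0.25, out_of_order=False))
print(transfer_time_ms(7200, offset=0.9, span=0.25, out_of_order=True))
```

When the head settles outside the target LBAs, both modes take the same time in this model. Out-of-order delivery only helps when the head lands inside the requested data.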

Benefits of Native Command Queuing

It is clear that outstanding commands need to be reordered to reduce mechanical overhead and consequently improve I/O latencies. It is also clear, however, that simply collecting commands in a queue is not worth the silicon they are stored on. Efficient reordering algorithms take both the linear and the angular position of the target data into account and optimize for both in order to yield the minimal total service time. This process is referred to as “command re-ordering based on seek and rotational optimization” or tagged command queuing. A side effect of command queuing and the reduced mechanical workload is less mechanical wear, providing the additional benefit of improved endurance. Serial ATA II provides an efficient protocol implementation of tagged command queuing called Native Command Queuing.
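To illustrate why both the linear and the angular position matter, the following toy model picks the queued command with the smallest estimated service time (seek time plus the rotational wait that remains after the seek). It is only a sketch: the linear seek cost `seek_ms_per_track`, the `Command` fields, and the example values are all assumptions made here. Real drives use proprietary algorithms that also weigh settle times, cache state, and fairness, none of which is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    tag: int      # queue tag identifying the command
    track: int    # target track (linear position)
    angle: float  # angular position of the starting LBA, as a fraction of a rotation


def service_time_ms(cmd: Command, head_track: int, head_angle: float,
                    rotation_ms: float, seek_ms_per_track: float) -> float:
    """Estimated seek time plus rotational wait (hypothetical linear seek model)."""
    seek_ms = abs(cmd.track - head_track) * seek_ms_per_track
    # The platter keeps spinning while the head seeks.
    angle_after_seek = (head_angle + seek_ms / rotation_ms) % 1.0
    wait_ms = ((cmd.angle - angle_after_seek) % 1.0) * rotation_ms
    return seek_ms + wait_ms


def pick_next(queue, head_track: int, head_angle: float,
              rotation_ms: float = 60_000.0 / 7200,
              seek_ms_per_track: float = 0.01) -> Command:
    """Return the queued command with the minimal estimated total service time."""
    return min(queue, key=lambda c: service_time_ms(
        c, head_track, head_angle, rotation_ms, seek_ms_per_track))


queue = [
    Command(tag=1, track=1010, angle=0.95),  # short seek, long rotational wait
    Command(tag=2, track=1400, angle=0.60),  # longer seek, data arrives just in time
]
best = pick_next(queue, head_track=1000, head_angle=0.0)
print(f"service tag {best.tag} first")
```

In this example the command on the nearer track is not the better choice, because almost a full rotation would pass before its starting LBA comes around.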


Section	Page
Serial ATA	1
Native Command Queuing	1
Summary	1
Introduction	2
Drive Basics	2
Seek Latency Optimization	3
Rotational Latencies	4
Benefits of Native Command Queuing	4
Detailed Description of NCQ	5
Building a Queue	5
Transferring Data	6
Status Return	7
How Applications Take Advantage of Queuing	8
Using Asynchronous I/O in Windows*	9
