Seagate ST3500630A Serial ATA Native Command Queuing (670K, PDF) - Page 4

Rotational Latencies, Benefits of Native Command Queuing


Serial ATA Native Command Queuing
Keep in mind that re-ordering pending commands based strictly on where the heads ended up, that is, over the logical block address (LBA) of the last completed command, is not the most efficient solution. Like an elevator that does not screech to a halt when someone pushes the button for a floor it is just passing, HDDs use complex algorithms to determine the best command to service next. The complexity comes from possible head switches, seek times to different tracks, and different modes of operation such as quiet seeks. Parameters taken into account include seek length, starting location and direction, actuator acceleration profiles, rotational positioning (which includes differences between read and write settle times), read cache hits vs. misses, write cache enabled vs. disabled, I/O processes that address the same LBAs, and fairness algorithms to eliminate command starvation, to mention a few.
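
One common way to keep a cost-driven scheduler from starving a command is aging. The sketch below is only an illustration of that idea, not the algorithm any particular drive uses: the function name `pick_with_aging`, the tuple layout, and the 500 msec limit are all assumptions made for this example.

```python
def pick_with_aging(pending, now_ms, max_wait_ms=500.0):
    """Pick the next command from a list of (tag, arrival_ms, est_cost_ms).

    Hypothetical policy: normally take the cheapest command, but once any
    command has waited max_wait_ms or longer, serve the oldest such command.
    """
    overdue = [cmd for cmd in pending if now_ms - cmd[1] >= max_wait_ms]
    if overdue:
        # Fairness path: the oldest overdue command wins, whatever its cost.
        return min(overdue, key=lambda cmd: cmd[1])
    # Normal path: the lowest estimated service cost wins.
    return min(pending, key=lambda cmd: cmd[2])


pending = [("A", 0.0, 9.0), ("B", 400.0, 1.0)]
print(pick_with_aging(pending, now_ms=450.0))  # cheap command B is chosen
print(pick_with_aging(pending, now_ms=600.0))  # A is overdue and is chosen
```

Without the aging path, command A could be passed over indefinitely as long as cheaper commands keep arriving.
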
Rotational Latencies
Rotational latency is the time it takes for the starting LBA to rotate under the head once the head is on the right track. In the worst case, the drive wastes one full rotation before it can access the starting LBA and then continue reading the remaining target LBAs. Rotational latency depends on the spindle RPM: a 7200-RPM drive has a worst-case rotational latency of 8.3 msec, a 5400-RPM drive needs up to 11.1 msec, and a 10K-RPM drive up to 6 msec. If the starting LBAs are randomly distributed relative to the angular position of the drive’s head, the average rotational latency is one half of the worst-case latency.
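
The figures above follow directly from the rotation period, 60,000 msec divided by the spindle RPM. As a quick check, a minimal sketch (the function names are chosen for this example only):

```python
def worst_case_rotational_latency_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """Half a rotation, assuming uniformly random starting positions."""
    return worst_case_rotational_latency_ms(rpm) / 2.0


for rpm in (5400, 7200, 10_000):
    worst = worst_case_rotational_latency_ms(rpm)
    average = average_rotational_latency_ms(rpm)
    print(f"{rpm:>6} RPM: worst {worst:.1f} msec, average {average:.1f} msec")
```
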
I/O delays on the order of milliseconds are quite dramatic compared to the overall performance of any modern system. This is particularly true where modern operating systems use multi-threading or where Hyper-Threading Technology allows quasi-simultaneous execution of independent workloads, all of which need data from the same drive almost simultaneously.
A higher spindle RPM is one approach to reducing rotational latencies. However, increasing the spindle RPM carries a substantial additional cost. Rotational latencies can also be minimized by two other approaches. The first is to re-order the commands outstanding to the drive in such a way that the rotational latency is minimized. This optimization is similar to the linear optimization that reduces seek latencies, but it instead takes into account the rotational position of the drive head when determining the best command to service next.
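
A minimal sketch of this first approach follows. It assumes, purely for illustration, that all pending commands sit on the current track, so only the angular distance matters, and that angles are expressed as fractions of a full turn measured in the direction of rotation. The names `rotational_wait` and `pick_next_by_rotation` are hypothetical.

```python
def rotational_wait(head_angle: float, target_angle: float) -> float:
    """Fraction of a rotation (0.0 up to, but not including, 1.0) until the
    target angle arrives under the head."""
    return (target_angle - head_angle) % 1.0


def pick_next_by_rotation(head_angle, pending):
    """Return the (tag, start_angle) pair with the shortest rotational wait."""
    return min(pending, key=lambda cmd: rotational_wait(head_angle, cmd[1]))


pending = [("A", 0.10), ("B", 0.60), ("C", 0.35)]
# With the head at 0.30, command C starts just ahead of it, while A was
# just passed and would cost most of a rotation.
print(pick_next_by_rotation(0.30, pending))
```
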
A second-order optimization is to use a feature called out-of-order data delivery. Out-of-order data delivery means that the head does not need to access the starting LBA first but can start reading the data at any position within the target LBAs. Instead of waiting out the fraction of a rotation needed to return to the first LBA of the requested data chunk, the drive starts reading the requested data as soon as it has settled on the correct track and adds the missing data at the end of the same rotation.
With out-of-order data delivery, the entire transfer completes within one rotation of the platter even in the worst case. Without out-of-order data delivery, the worst-case time needed to complete the transfer is one rotation plus the time it takes to rotate over all target LBAs.
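
The two worst cases can be compared with a small model. The sketch below assumes the target data lies on a single track and occupies a fraction `span` of it, and that the head settles at `offset`, a fraction of a turn past the starting LBA. Both parameter names, and the 8.0 msec rotation period used in the example, are assumptions for illustration only.

```python
def in_order_time_ms(offset: float, span: float, period_ms: float) -> float:
    """Wait for the starting LBA to come around, then read the whole span."""
    wait = (1.0 - offset) % 1.0
    return (wait + span) * period_ms


def out_of_order_time_ms(offset: float, span: float, period_ms: float) -> float:
    """Start reading at once if the head settled inside the target span."""
    if 0.0 < offset < span:
        # Read the tail of the data first, then pick up the skipped head of
        # the data as it comes around: one rotation in total.
        return period_ms
    # The head settled outside the span (or exactly on the starting LBA),
    # so nothing can be gained over in-order delivery.
    return in_order_time_ms(offset, span, period_ms)


period_ms, span = 8.0, 0.25
offsets = [i / 1000 for i in range(1000)]
print(max(in_order_time_ms(o, span, period_ms) for o in offsets))
print(max(out_of_order_time_ms(o, span, period_ms) for o in offsets))
```

In this model the in-order worst case approaches one rotation plus the span, while the out-of-order worst case never exceeds one rotation.
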
Benefits of Native Command Queuing
It is clear that outstanding commands need to be reordered in order to reduce mechanical overhead and consequently improve I/O latencies. It is also clear, however, that simply collecting commands in a queue is not worth the silicon they are stored on. Efficient reordering algorithms take both the linear and the angular position of the target data into account and optimize for both in order to yield the minimal total service time. This process is referred to as “command re-ordering based on seek and rotational optimization” or tagged command queuing. A side effect of command queuing and the reduced mechanical workload is less mechanical wear, which provides the additional benefit of improved endurance. Serial ATA II provides an efficient protocol implementation of tagged command queuing called Native Command Queuing.
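
To illustrate why both positions must be considered together, the sketch below estimates a total service time per command as seek time plus the rotational wait that remains once the seek has finished. The linear seek model (a fixed settle time plus a per-track cost), the `Command` fields, and all numeric values are hypothetical; real drives use far more detailed models, as noted earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    tag: int
    track: int
    start_angle: float  # fraction of a turn, in the direction of rotation


def seek_time_ms(from_track, to_track, settle_ms=1.0, ms_per_track=0.001):
    """Hypothetical linear seek model; zero when no seek is needed."""
    distance = abs(to_track - from_track)
    return 0.0 if distance == 0 else settle_ms + ms_per_track * distance


def service_time_ms(head_track, head_angle, cmd, period_ms):
    """Seek time plus the rotational wait left after the seek completes."""
    seek = seek_time_ms(head_track, cmd.track)
    # The platter keeps turning while the actuator moves.
    angle_after_seek = (head_angle + seek / period_ms) % 1.0
    wait = ((cmd.start_angle - angle_after_seek) % 1.0) * period_ms
    return seek + wait


def pick_next(head_track, head_angle, queue, period_ms):
    """Choose the queued command with the smallest estimated service time."""
    return min(queue, key=lambda c: service_time_ms(head_track, head_angle, c, period_ms))


queue = [
    Command(tag=0, track=1000, start_angle=0.9),  # same track, long wait
    Command(tag=1, track=1100, start_angle=0.2),  # farther, but well timed
    Command(tag=2, track=1010, start_angle=0.1),  # near, but just missed
]
best = pick_next(head_track=1000, head_angle=0.0, queue=queue, period_ms=8.0)
print(best.tag)
```

In this example the command on the most distant track is the cheapest to service, because its starting position arrives under the head shortly after the seek completes.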