Seagate ST9900805SS Savvio 10K.1 SCSI Product Manual - Page 27



Savvio SCSI Product Manual, Rev. D
Determining rate
S.M.A.R.T. monitors the rate at which errors occur and signals a predictive failure if the error rate degrades to an unacceptable level. To determine the rate, error events are logged and compared to the total number of operations for a given attribute. The interval defines the number of operations over which the rate is measured. The counter that keeps track of the current number of operations is referred to as the Interval Counter.
S.M.A.R.T. measures error rate, so each occurrence of an error is recorded for each attribute. A counter keeps track of the number of errors for the current interval. This counter is referred to as the Failure Counter.
Error rate is simply the number of errors per operation. To evaluate error rates, S.M.A.R.T. sets thresholds for the number of errors and for the interval. If the number of errors exceeds the threshold before the interval expires, the error rate is considered unacceptable. If the number of errors does not exceed the threshold before the interval expires, the error rate is considered acceptable. In either case, the interval and failure counters are reset and the process starts over.
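The interval and failure counters described above can be sketched as follows. This is a minimal illustration, not the drive firmware: the class name, method names, and the threshold values used in the example are hypothetical, and the manual does not say how an error on the final operation of an interval is treated (here it still counts toward that interval).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateMonitor:
    """Tracks one attribute's error rate over an interval of operations."""
    interval: int          # operations per interval (hypothetical value)
    error_threshold: int   # errors tolerated per interval (hypothetical value)
    interval_counter: int = 0
    failure_counter: int = 0

    def record(self, error: bool) -> Optional[bool]:
        """Record one operation.

        Returns True if the rate was judged unacceptable, False if the
        interval expired with an acceptable rate, or None while the
        interval is still open.
        """
        self.interval_counter += 1
        if error:
            self.failure_counter += 1
        if self.failure_counter > self.error_threshold:
            self._reset()   # threshold exceeded before the interval expired
            return True
        if self.interval_counter >= self.interval:
            self._reset()   # interval expired without exceeding the threshold
            return False
        return None

    def _reset(self) -> None:
        # In either case both counters are reset and the process starts over.
        self.interval_counter = 0
        self.failure_counter = 0
```

With `RateMonitor(interval=5, error_threshold=1)`, two errors in a row close the interval early as unacceptable, while five error-free operations close it as acceptable.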
Predictive failures
S.M.A.R.T. signals predictive failures when the drive is performing unacceptably for a period of time. The firmware keeps a running count of the number of times the error rate for each attribute is unacceptable. To accomplish this, a counter is incremented whenever the error rate is unacceptable and decremented (but never below zero) whenever the error rate is acceptable. Should the counter continually be incremented such that it reaches the predictive threshold, a predictive failure is signaled. This counter is referred to as the Failure History Counter. There is a separate Failure History Counter for each attribute.
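The Failure History Counter behavior can be sketched as below. This is an illustrative model only: the class name and the predictive threshold value are hypothetical, and the manual does not state what happens to the counter after a predictive failure has been signaled.

```python
class FailureHistoryCounter:
    """Running count of unacceptable error-rate results for one attribute."""

    def __init__(self, predictive_threshold: int):
        self.predictive_threshold = predictive_threshold  # hypothetical value
        self.count = 0

    def update(self, unacceptable: bool) -> bool:
        """Apply one interval result; return True if a predictive failure is signaled."""
        if unacceptable:
            self.count += 1
        else:
            # Decrement on an acceptable result, but never go below zero.
            self.count = max(0, self.count - 1)
        return self.count >= self.predictive_threshold
```

Because acceptable intervals pull the count back down, isolated bad intervals do not accumulate; only a sustained run of unacceptable results reaches the threshold.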
5.2.8 Thermal monitor
Savvio SCSI drives implement a temperature warning system which:
1. Signals the host if the temperature exceeds a value which would threaten the drive.
2. Signals the host if the temperature exceeds a user-specified value.
3. Saves a S.M.A.R.T. data frame on the drive when the temperature exceeds the threatening value.
A temperature sensor monitors the drive temperature and issues a warning over the interface when the temperature exceeds a set threshold. The temperature is measured at power-up and then at ten-minute intervals after power-up.
The thermal monitor system generates a warning code of 01-0B01 when the temperature exceeds the specified limit, in compliance with the SCSI standard. The drive temperature is reported in the FRU code field of mode sense data. You can use this information to determine whether the warning is due to the temperature exceeding the drive-threatening temperature or the user-specified temperature.
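A host-side check of which limit triggered a warning might look like the sketch below. It assumes the reported temperature has already been extracted as an integer in degrees Celsius (the manual does not describe the encoding here), uses the 68°C maximum that this section gives for the first trip point, and treats the user-specified value as an optional, hypothetical input. The function name and the returned labels are illustrative.

```python
from typing import List, Optional

DRIVE_LIMIT_C = 68  # first trip point stated in this section


def classify_temperature(temp_c: int, user_trip_c: Optional[int] = None) -> List[str]:
    """Return which trip points the reported temperature exceeds."""
    reasons = []
    if temp_c > DRIVE_LIMIT_C:
        reasons.append("drive-limit")
    # The user trip point is optional; skip the check if none was set.
    if user_trip_c is not None and temp_c > user_trip_c:
        reasons.append("user-specified")
    return reasons
```

For example, a reading of 60 with a user trip point of 55 exceeds only the user-specified limit, while a reading of 70 exceeds both.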
This feature is controlled by the Enable Warning (EWasc) bit, and the reporting mechanism is controlled by the Method of Reporting Informational Exceptions field (MRIE) of the Informational Exceptions Control (IEC) mode page (1Ch).
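As a rough illustration, the EWasc bit and MRIE field could be read from raw IEC mode page bytes as sketched below. The bit positions are an assumption taken from the general SCSI Primary Commands layout of this page (EWasc as bit 4 of byte 2, MRIE as the low four bits of byte 3); this manual page does not give them, so check them against the drive's mode page table before relying on them. The function name is hypothetical.

```python
def parse_iec_page(page: bytes) -> dict:
    """Decode selected fields of an Informational Exceptions Control mode page."""
    if len(page) < 4:
        raise ValueError("IEC mode page too short")
    page_code = page[0] & 0x3F          # low six bits of byte 0
    if page_code != 0x1C:
        raise ValueError("not mode page 1Ch")
    return {
        "ewasc": bool(page[2] & 0x10),  # assumed position: byte 2, bit 4
        "mrie": page[3] & 0x0F,         # assumed position: byte 3, bits 3..0
    }
```

The input is expected to start at the page header itself, with any mode parameter header and block descriptors already stripped.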
The current algorithm implements two temperature trip points. The first trip point is set at 68°C, which is the maximum temperature limit according to the drive specification. The second trip point is user-selectable using the Log Select command. The reference temperature parameter in the temperature log page (see Table 11)