HP 12000 HP StorageWorks 12000 Gateway Virtual Library System User Guide (AH81 - Page 91

Deduplication, How It Works



6 Deduplication
Deduplication is the functionality in which only a single copy of a data block is stored on a device.
Duplicate information is removed, allowing you to store more data in a given amount of space and
restore data using lower bandwidth links. The HP StorageWorks virtual library system uses
Accelerated deduplication.
NOTE: The deduplication feature is only available on systems running VLS software version 3.0 or higher.
This section describes deduplication including getting deduplication running on your system, configuring
deduplication, and viewing reports.
NOTE: See the HP StorageWorks VLS and D2D Solutions Guide for more detailed information.
How It Works
HP Accelerated deduplication compares the most recent version of a backup to the previous version
using object-level differencing code. It places pointers in the earlier version that identify duplicated
content in the new version. Deduplication then eliminates the redundant data in the earlier version
while retaining the complete, new version. You can improve deduplication performance by adding
nodes.
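The pointer scheme described above can be illustrated with a minimal Python sketch. It is not HP's implementation: the `Pointer` class, the `dedupe_older` and `restore` functions, and the use of whole chunks as the unit of comparison are all assumptions made for illustration. The sketch only shows the general idea that the newer backup stays complete while duplicated content in the older backup is replaced by references to it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Pointer:
    """Hypothetical reference to a chunk position in the newer backup."""
    new_index: int


Chunk = Union[bytes, Pointer]


def dedupe_older(old: list[bytes], new: list[bytes]) -> list[Chunk]:
    """Replace chunks of the older backup that also exist in the newer one."""
    # Index the newer backup by chunk content (first occurrence wins).
    index: dict[bytes, int] = {}
    for i, chunk in enumerate(new):
        index.setdefault(chunk, i)
    # Keep unique old chunks as data; swap duplicated ones for pointers.
    return [Pointer(index[c]) if c in index else c for c in old]


def restore(older: list[Chunk], new: list[bytes]) -> list[bytes]:
    """Rebuild the older backup by following pointers into the newer one."""
    return [new[c.new_index] if isinstance(c, Pointer) else c for c in older]


old_backup = [b"A", b"B", b"C"]
new_backup = [b"A", b"X", b"C", b"D"]  # stored complete, never modified
deduped_old = dedupe_older(old_backup, new_backup)
```

In this sketch only the chunk `b"B"` remains stored in the older version; the other two entries are pointers, and `restore(deduped_old, new_backup)` reproduces the original older backup.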
NOTE: Deduplication takes place after the data has been processed to the backup tapes. Therefore, any data
backed up to compression-enabled virtual tape drives (both software and hardware compression) is
compressed before it is deduplicated.
The following is an overview of the deduplication process. See the HP StorageWorks VLS and D2D
Solutions Guide for more detailed information.
1. When a backup runs, a data grooming exercise is performed on the fly. Using meta-data attached
by the backup application, data grooming maps the content or "objects" of the backup, and
assembles a content database. This process has minimal performance impact.
2. After the scheduled backups have completed, the content database is used to "delta-difference"
(compare) objects in current and previous backups from the same hosts. There are different levels
of comparison. For example, files may be compared using a strong hashing function, while other
objects may be compared at a byte level.
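The two steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the VLS algorithm: the `BackupObject` fields, the `(host, name)` key, the `"file"` kind label, and the choice of SHA-256 as the "strong hashing function" are all hypothetical. It shows a content database built from backup metadata, then a comparison pass that matches objects from the same host and picks a comparison level by object type.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class BackupObject:
    host: str    # host the object was backed up from
    name: str    # object name taken from backup metadata (assumed)
    kind: str    # "file" or any other label (assumed)
    data: bytes


def build_content_db(objects):
    """Step 1 (sketch): map each object by (host, name)."""
    return {(o.host, o.name): o for o in objects}


def is_duplicate(cur, prev):
    """Pick a comparison level based on the object kind."""
    if cur.kind == "file":
        # Files: compare strong hashes (SHA-256 is an assumption).
        cur_hash = hashlib.sha256(cur.data).digest()
        prev_hash = hashlib.sha256(prev.data).digest()
        return cur_hash == prev_hash
    # Other objects: compare byte for byte.
    return cur.data == prev.data


def delta_difference(current_db, previous_db):
    """Step 2 (sketch): keys whose content is unchanged since the previous backup."""
    duplicates = []
    for key, cur in current_db.items():
        prev = previous_db.get(key)  # same host and name only
        if prev is not None and is_duplicate(cur, prev):
            duplicates.append(key)
    return sorted(duplicates)


previous_db = build_content_db([
    BackupObject("host1", "a.txt", "file", b"hello"),
    BackupObject("host1", "db", "raw", b"\x00\x01"),
])
current_db = build_content_db([
    BackupObject("host1", "a.txt", "file", b"hello"),
    BackupObject("host1", "db", "raw", b"\x00\x02"),
    BackupObject("host2", "a.txt", "file", b"hello"),
])
duplicates = delta_difference(current_db, previous_db)
```

Here only `("host1", "a.txt")` is reported as a duplicate: the `db` object changed, and the identical file on `host2` is not matched because the comparison is restricted to backups from the same host.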