HP 12000 HP StorageWorks 12000 Gateway Virtual Library System User Guide (AH81 - Page 91
6 Deduplication

Deduplication is the functionality in which only a single copy of a data block is stored on a device. Duplicate information is removed, allowing you to store more data in a given amount of space and restore data using lower bandwidth links. The HP StorageWorks virtual library system uses Accelerated deduplication.

NOTE: The deduplication feature is only available on systems running VLS software version 3.0 or higher.

This section describes deduplication, including getting deduplication running on your system, configuring deduplication, and viewing reports.

NOTE: See the HP StorageWorks VLS and D2D Solutions Guide for more detailed information.

How It Works

HP Accelerated deduplication compares the most recent version of a backup to the previous version using object-level differencing code. It places pointers in the earlier version that identify duplicated content in the new version. Deduplication then eliminates the redundant data in the earlier version while retaining the complete, new version. You can improve deduplication performance by adding nodes.

NOTE: Deduplication takes place after the data has been written to the backup tapes. Therefore, any data backed up to compression-enabled virtual tape drives (both software and hardware compression) is compressed before it is deduplicated.

The following is an overview of the deduplication process. See the HP StorageWorks VLS and D2D Solutions Guide for more detailed information.

1. When a backup runs, data grooming is performed on the fly. Using metadata attached by the backup application, data grooming maps the content, or "objects," of the backup and assembles a content database. This process has minimal performance impact.
2. After the scheduled backups have completed, the content database is used to "delta-difference" (compare) objects in current and previous backups from the same hosts. There are different levels of comparison. For example, files may be compared using a strong hashing function, while other objects may be compared at a byte level.
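
The two steps above can be illustrated with a small conceptual sketch. It is not HP's implementation: the names `BackupObject`, `build_content_db`, `is_duplicate`, and `deduplicate` are hypothetical, matching objects by name is an assumption, and SHA-256 merely stands in for the unspecified "strong hashing function." The sketch builds a content database for the new backup, compares each object in the earlier backup against it (by hash for files, byte for byte otherwise), and replaces duplicates in the earlier backup with pointers to the new version.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class BackupObject:
    name: str            # object identity taken from backup metadata (assumed)
    data: bytes
    kind: str = "file"   # "file" -> hash comparison; anything else -> byte-level


def build_content_db(objects):
    """Stand-in for the content database: maps object name to object."""
    return {obj.name: obj for obj in objects}


def is_duplicate(old, new):
    """Compare two objects at the level appropriate to their kind."""
    if old.kind == "file" and new.kind == "file":
        # Files: compare strong hashes (SHA-256 chosen for illustration).
        return hashlib.sha256(old.data).digest() == hashlib.sha256(new.data).digest()
    # Other objects: compare byte for byte.
    return old.data == new.data


def deduplicate(previous, current):
    """Return the earlier backup with duplicated objects replaced by pointers.

    The new backup is left complete; only the earlier version is reduced.
    """
    current_db = build_content_db(current)
    result = {}
    for obj in previous:
        match = current_db.get(obj.name)
        if match is not None and is_duplicate(obj, match):
            result[obj.name] = ("pointer", match.name)
        else:
            result[obj.name] = ("data", obj.data)
    return result


previous = [
    BackupObject("a.txt", b"hello"),
    BackupObject("b.txt", b"old contents"),
    BackupObject("db.page", b"\x00\x01", kind="block"),
]
current = [
    BackupObject("a.txt", b"hello"),
    BackupObject("b.txt", b"new contents"),
    BackupObject("db.page", b"\x00\x01", kind="block"),
]
deduped = deduplicate(previous, current)
```

In this example only `b.txt` changed between backups, so it is the only object in the earlier version that still holds data; the other two become pointers into the complete new backup.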