HP 12000 HP VLS Solutions Guide Design Guidelines for Virtual Library Systems - Page 90




d. the type of backup – full or incremental
e. the type of data in the backup – files, database, etc.
The deduplication software then queries the metadata database to find an equivalent older
version of the same backup job to compare it against the new backup. If the current backup
is full, it will be compared against the last equivalent full backup version. If the current backup
is differential or incremental, on systems running firmware version 3.3 or higher it will be
compared against the previous incremental or differential. (TSM “incremental forever” backups
are all treated as full backups by deduplication.)
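
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the VLS implementation: `BackupRecord`, `find_reference`, the `sequence` field, and the tuple form of the firmware version are all hypothetical names introduced here. The text does not say what happens to incremental or differential backups on firmware older than 3.3, so the sketch assumes no reference is chosen in that case.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BackupRecord:
    """Hypothetical metadata record for one backup of a job."""
    job: str                       # backup job name
    kind: str                      # "full", "incremental", or "differential"
    sequence: int                  # higher means newer
    tsm_incremental_forever: bool = False


def effective_kind(backup: BackupRecord) -> str:
    # TSM "incremental forever" backups are all treated as full backups.
    return "full" if backup.tsm_incremental_forever else backup.kind


def find_reference(new: BackupRecord,
                   catalog: List[BackupRecord],
                   firmware: Tuple[int, int] = (3, 3)) -> Optional[BackupRecord]:
    """Pick the older backup that the new one would be compared against."""
    older = [b for b in catalog
             if b.job == new.job and b.sequence < new.sequence]
    if effective_kind(new) == "full":
        pool = [b for b in older if effective_kind(b) == "full"]
    elif firmware >= (3, 3):
        pool = [b for b in older
                if effective_kind(b) in ("incremental", "differential")]
    else:
        pool = []  # assumption: behavior before firmware 3.3 is not described
    # The most recent candidate wins; None means nothing to compare against.
    return max(pool, key=lambda b: b.sequence, default=None)
```

A call such as `find_reference(new_backup, catalog)` returns `None` when no equivalent older version exists, in which case there is nothing to difference against.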
3. Data comparison
This phase is also called delta differencing. After the data grooming phase is complete, and
if there is an older version of the backup, either full or incremental, the new backup is
compared against it (delta-differenced). The deduplication software looks for differences at
the byte level between backup objects, using one of two differencing schemes depending on
the type of backup:
• File-level differencing is for standard file backups and compares any changed file against
  the previous version of that file at the byte level. This improves deduplication performance
  because it does not have to compare unchanged files.
• Backup-level differencing is for any non-file backup types (all database agent backups
  and any unknown agent backups) and compares across the entire backup at the byte
  level. This differencing method can optionally be used for file backups that contain
  constantly changing file names, such as database dump files, because the differencing
  is done at the job level without any awareness of file names within the backup.
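
The contrast between the two schemes can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the product's algorithm: Python's standard `difflib.SequenceMatcher` stands in for the byte-level comparison, a plain equality check stands in for however changed files are actually detected, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def byte_delta(old: bytes, new: bytes) -> List[Tuple[int, int]]:
    """Return (offset, length) spans of `new` that have no match in `old`."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(j1, j2 - j1)
            for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
            if tag in ("replace", "insert")]


def file_level_diff(old_files: Dict[str, bytes],
                    new_files: Dict[str, bytes]) -> Dict[str, List[Tuple[int, int]]]:
    """Compare each changed file against the previous version of that file."""
    deltas = {}
    for name, data in new_files.items():
        previous = old_files.get(name)
        if previous == data:
            continue  # unchanged file: skipped, which is where the speedup comes from
        # A file name not seen before has no previous version to match against.
        deltas[name] = byte_delta(previous or b"", data)
    return deltas


def backup_level_diff(old_stream: bytes, new_stream: bytes) -> List[Tuple[int, int]]:
    """Compare the whole backup as one byte stream, ignoring file names."""
    return byte_delta(old_stream, new_stream)


# A dump file whose name changes on every run but whose content does not.
dump = b"INSERT INTO orders VALUES (1, 'widget');"
by_file = file_level_diff({"dump_001.sql": dump}, {"dump_002.sql": dump})
by_job = backup_level_diff(dump, dump)
```

In this example `by_file` reports the whole of `dump_002.sql` as new data because no earlier file carries that name, while `by_job` is empty because the streams are identical. That is the reason backup-level differencing suits backups with constantly changing file names.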
4. Data reassembly
In this phase, duplicate data is replaced with pointers, and pointers are readjusted so that
they point to the most recent instance of data.
If the new backup is a full backup, this phase does not touch the new data, so that the most
recent backup is complete and available for restores. Instead, a new copy of the older
version is created. See Figure 43 (page 91).
If the new backup is incremental, it is reassembled so that duplicate data is replaced with
pointers to the last full backup version.
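
The pointer replacement in this phase can be modeled with a short Python sketch. The segment format (`("data", bytes)` and `("ptr", offset, length)`), the function names, and the `min_match` threshold are hypothetical, and `difflib.SequenceMatcher` again stands in for the product's delta differencing. The sketch only shows the direction of the pointers in each case: for a full backup the older version is rewritten to point into the untouched new backup, and for an incremental the new backup is rewritten to point into the last full.

```python
from difflib import SequenceMatcher


def reassemble(target: bytes, reference: bytes, min_match: int = 4) -> list:
    """Rewrite `target` as literal data plus pointers into `reference`."""
    segments = []
    pos = 0  # next byte of `target` not yet emitted
    matcher = SequenceMatcher(None, reference, target, autojunk=False)
    for ref_off, tgt_off, size in matcher.get_matching_blocks():
        if size < min_match:
            continue  # tiny matches (and the size-0 sentinel) stay literal
        if tgt_off > pos:
            segments.append(("data", target[pos:tgt_off]))
        segments.append(("ptr", ref_off, size))  # duplicate data becomes a pointer
        pos = tgt_off + size
    if pos < len(target):
        segments.append(("data", target[pos:]))
    return segments


def restore(segments: list, reference: bytes) -> bytes:
    """Rebuild the original bytes by following pointers into `reference`."""
    out = bytearray()
    for seg in segments:
        if seg[0] == "data":
            out += seg[1]
        else:
            _, off, size = seg
            out += reference[off:off + size]
    return bytes(out)


old_full = b"customer table rows 0001-0500 unchanged; index v1"
new_full = b"customer table rows 0001-0500 unchanged; index v2"

# New backup is full: it is left whole, and a new pointer-based copy of the
# older version is created that refers to the most recent instance of the data.
old_as_pointers = reassemble(old_full, new_full)

# New backup is incremental: it is the one rewritten, pointing at the last full.
new_incremental = b"rows 0001-0500 unchanged; plus rows 0501-0510"
incremental_as_pointers = reassemble(new_incremental, new_full)
```

Calling `restore(old_as_pointers, new_full)` returns the original bytes of the older version, which is the property any reassembly scheme has to preserve.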
Accelerated Deduplication