Dell DR4000 Administrator Guide - Page 13
Block-level deduplication works efficiently where there are multiple duplicate versions of the same file, because it looks at the actual sequence of 0s and 1s that make up the data. Whenever a document is repeatedly backed up, the 0s and 1s stay the same because the file is simply being duplicated. Block deduplication can easily identify the similarities between two files because the sequence of their 0s and 1s remains exactly the same.

Online data is different. It has few exact duplicates; instead, online data files tend to be similar to one another without being identical. For example, a majority of the files that contribute to increased data storage requirements come pre-compressed by their native applications, such as:

• Images and video (such as the JPEG, MPEG, TIFF, GIF, and PNG formats)
• Compound documents (such as .zip files, e-mail, HTML, web pages, and PDFs)
• Microsoft Office application documents (including PowerPoint, Word, Excel, and SharePoint)

NOTE: The DR4000 system experiences a reduced savings rate when the data it ingests has already been compressed by the native data source. It is highly recommended that you disable data compression at the data source, especially for first-time backups. For optimal savings, the native data sources need to send data to the DR4000 system in a raw state for ingestion.

Block deduplication is not as effective on files that are already compressed, because compression changes their 0s and 1s from the original format.

Data deduplication is a specialized form of data compression that eliminates redundant data. The technique improves storage utilization, and it can also be used in network data transfers to reduce the number of bytes that must be sent across a link.
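The effect described in the NOTE above can be illustrated with a small sketch. It is not the DR4000 implementation: the 4096-byte fixed block size, the SHA-256 fingerprints, the use of zlib as a stand-in for source-side compression, and the helper names `make_versions`, `block_hashes`, and `shared_block_fraction` are all assumptions made for illustration. The sketch builds two versions of a file that differ in a single record, then measures what fraction of the new version's blocks already exist in the old version, first on the raw bytes and then on the compressed bytes.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def make_versions(records=5000):
    """Build two same-length versions of a file that differ in one record."""
    lines = [
        f"record {i:06d}: status=ok value={(i * 37) % 1000:03d}\n"
        for i in range(records)
    ]
    v1 = "".join(lines).encode("ascii")
    # Second version: only the first record changes; its length is unchanged.
    lines[0] = lines[0].replace("status=ok", "status=NO")
    v2 = "".join(lines).encode("ascii")
    return v1, v2


def block_hashes(data, block_size=BLOCK_SIZE):
    """Fingerprint each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def shared_block_fraction(old, new, block_size=BLOCK_SIZE):
    """Fraction of the new data's blocks that already appear in the old data."""
    known = set(block_hashes(old, block_size))
    new_blocks = block_hashes(new, block_size)
    matches = sum(1 for h in new_blocks if h in known)
    return matches / len(new_blocks)


if __name__ == "__main__":
    v1, v2 = make_versions()
    raw = shared_block_fraction(v1, v2)
    packed = shared_block_fraction(zlib.compress(v1), zlib.compress(v2))
    print(f"raw blocks already stored:        {raw:.0%}")
    print(f"compressed blocks already stored: {packed:.0%}")
```

Because the edit leaves the raw file's length unchanged, every raw block of the new version except the first matches a block of the old version. The compressed streams are expected to share far fewer blocks, since a small change near the start of the input alters the encoded bits that follow it.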
Using deduplication, unique chunks of data, or byte patterns, are identified and stored during analysis. As the analysis continues, other chunks are compared to the stored copy, and when a match occurs, the redundant chunk is replaced with a small reference that points to the stored chunk. This reduces the amount of data that must be stored or transferred.

DELL CONFIDENTIAL - PRELIMINARY 1/10/12 - FOR PROOF ONLY
Understanding the DR4000 System 5
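The store-and-reference process described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DR4000 system's actual method: the class `ChunkStore`, its methods, the fixed-size chunking, and the use of a SHA-256 fingerprint as the "small reference" are all hypothetical choices made for the example.

```python
import hashlib
from typing import Dict, List


class ChunkStore:
    """Hypothetical chunk store: keeps one copy of each unique chunk."""

    def __init__(self, chunk_size: int = 4096) -> None:
        self.chunk_size = chunk_size            # assumed fixed chunk size
        self.chunks: Dict[str, bytes] = {}      # fingerprint -> stored chunk

    def ingest(self, data: bytes) -> List[str]:
        """Split data into chunks and return a list of references."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this byte pattern has not been seen.
            self.chunks.setdefault(fingerprint, chunk)
            # A redundant chunk is represented by its reference alone.
            recipe.append(fingerprint)
        return recipe

    def restore(self, recipe: List[str]) -> bytes:
        """Rebuild the original data by following the references."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self) -> int:
        """Total size of the unique chunks actually kept."""
        return sum(len(chunk) for chunk in self.chunks.values())


if __name__ == "__main__":
    store = ChunkStore(chunk_size=4)
    first = store.ingest(b"AAAABBBBCCCC")
    second = store.ingest(b"AAAAXXXXCCCC")   # only the middle chunk is new
    print(len(store.chunks), "unique chunks,", store.stored_bytes(), "bytes stored")
    assert store.restore(second) == b"AAAAXXXXCCCC"
```

In this example the two inputs total 24 bytes, but only four unique 4-byte chunks (16 bytes) are kept; the second input adds a single new chunk and otherwise reuses references to chunks that are already stored.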