Improving Restore Speed for Backup Systems that Use Inline Chunk-Based Deduplication

Abstract: Slow restoration due to chunk fragmentation is a serious problem facing inline chunk-based data deduplication systems: restore speeds for the most recent backup can drop orders of magnitude over the lifetime of a system. We study three techniques--increasing cache size, container capping, and using a forward assembly area--for alleviating this problem. Container capping is an ingest-time operation that reduces chunk fragmentation at the cost of forfeiting some deduplication, while using a forward assembly area is a new restore-time caching and prefetching technique that exploits the perfect knowledge of future chunk accesses available when restoring a backup to reduce the amount of RAM required for a given level of caching at restore time. We show that using a larger cache per stream--we see continuing benefits even up to 8 GB--can produce up to a 5-16X improvement, that giving up as little as 8% deduplication with capping can yield a 2-6X improvement, and that using a forward assembly area is strictly superior to LRU, able to yield a 2-4X improvement while holding the RAM budget constant.
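Illustrative sketch: the following Python is one way to picture the forward assembly area described in the abstract. It is a minimal sketch and not the authors' implementation. It assumes a hypothetical backup recipe given as a list of `(container_id, chunk_id, length)` tuples and a hypothetical `read_container` callback that returns a container's chunks as a dictionary. Because the recipe lists every future chunk access, each container needed for the current window is read only once, and its chunks are copied straight to their final offsets.

```python
from collections import defaultdict


def restore_with_faa(recipe, read_container, faa_size):
    """Restore a backup using a fixed-size forward assembly area.

    recipe         -- list of (container_id, chunk_id, length); hypothetical format
    read_container -- callable: container_id -> {chunk_id: bytes}; hypothetical
    faa_size       -- size of the assembly area in bytes
    Returns (restored_bytes, number_of_container_reads).
    """
    out = bytearray()
    reads = 0
    i = 0
    while i < len(recipe):
        # Plan the window: take upcoming chunks while they fit in the area.
        # Always take at least one chunk so the loop makes progress.
        slots = defaultdict(list)  # container_id -> [(offset, chunk_id, length)]
        offset = 0
        j = i
        while j < len(recipe) and (j == i or offset + recipe[j][2] <= faa_size):
            cid, chunk_id, length = recipe[j]
            slots[cid].append((offset, chunk_id, length))
            offset += length
            j += 1

        # Fill the window: read each needed container once and copy every
        # chunk it supplies into its planned position.
        area = bytearray(offset)
        for cid, wanted in slots.items():
            chunks = read_container(cid)
            reads += 1
            for off, chunk_id, length in wanted:
                area[off:off + length] = chunks[chunk_id]

        out += area  # flush the assembled window to the restore stream
        i = j
    return bytes(out), reads


# Tiny usage example with made-up containers and chunks.
containers = {0: {"a": b"AA", "b": b"BB"}, 1: {"c": b"CC"}}
recipe = [(0, "a", 2), (1, "c", 2), (0, "b", 2), (0, "a", 2)]
data, reads = restore_with_faa(recipe, containers.__getitem__, faa_size=8)
```

In this sketch a larger assembly area lets more chunks from the same container be satisfied by a single read, which is the sense in which the RAM budget trades against container reads.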

15 pages

Additional Publication Information: Published at FAST 2013 (February 2013). Citation: Mark Lillibridge, Kave Eshghi, and Deepavali Bhagwat. Improving Restore Speed for Backup Systems that Use Inline Chunk-Based Deduplication. In Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST'13), pp. 183-197, San Jose, California, February 2013.

  • External Posting Date: June 21, 2013 [Fulltext]. Approved for External Publication
  • Internal Posting Date: June 21, 2013 [Fulltext]
