Instant Snapshots in a Federated Array of Bricks
Keyword(s): snapshots; checkpoints; storage; distributed systems; clusters
Abstract: Snapshots have become a fundamental requirement for mid-range to high-end storage systems. Their applications include archiving, recovery, report generation, decision-making tools, and remote mirroring. State-of-the-art snapshot techniques on existing storage systems typically work on a single (fault-tolerant) controller and need to pause applications or change the operating mode of file systems or databases when a snapshot is taken. In a federated array of bricks (FAB), a snapshot may involve tens to thousands of independent controllers or processors and may be taken at a high frequency, e.g., once every 30 seconds for atomic updates in remote mirroring. FAB therefore needs an efficient distributed snapshot algorithm that makes snapshot operations transparent to applications. In this paper, we propose such an algorithm, which avoids pausing or aborting write requests through the novel use of a tentative data structure during the two-phase commit of a snapshot creation. Snapshot operations are serializable with data operations (i.e., reads and writes), which ensures the consistency of the snapshots. Read-only operations on snapshots are optimized for the common case, requiring communication with only a small subset of the bricks: in particular, a single replica set, or three bricks, in FAB. The algorithm has been prototyped in FAB and tested with trace-based experiments.
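The idea of keeping writes flowing while a snapshot is being committed can be illustrated with a small, single-process sketch. This is not the algorithm from the paper: the `Brick` class, the per-brick epoch counter, and the `create_snapshot` coordinator are all hypothetical names chosen for illustration, and the sketch ignores replication, message loss, and concurrency. It only shows how a tentative state set in the prepare phase lets writes proceed, with those writes either staying above the snapshot (on commit) or being folded back into the current state (on abort).

```python
class Brick:
    """Hypothetical storage node that keeps per-epoch versions of each block."""

    def __init__(self):
        self.epoch = 0           # epoch that new writes are tagged with
        self.versions = {}       # block -> {epoch: value}
        self.tentative = False   # True between prepare and commit/abort
        self.snapshots = set()   # epochs that are committed snapshots

    def write(self, block, value):
        # Writes are never paused; they are tagged with the current epoch.
        self.versions.setdefault(block, {})[self.epoch] = value

    def read(self, block, snapshot=None):
        # Return the newest value visible at the given snapshot epoch,
        # or at the current epoch when no snapshot is given.
        at = self.epoch if snapshot is None else snapshot
        history = self.versions.get(block, {})
        visible = [e for e in history if e <= at]
        return history[max(visible)] if visible else None

    def prepare(self):
        # Phase 1: open a tentative epoch; later writes land in it.
        if self.tentative:
            return False
        self.tentative = True
        self.epoch += 1
        return True

    def commit(self):
        # Phase 2 (success): the previous epoch becomes a read-only snapshot.
        self.snapshots.add(self.epoch - 1)
        self.tentative = False

    def abort(self):
        # Phase 2 (failure): fold tentative writes back into the old epoch.
        if not self.tentative:
            return
        for history in self.versions.values():
            if self.epoch in history:
                history[self.epoch - 1] = history.pop(self.epoch)
        self.epoch -= 1
        self.tentative = False


def create_snapshot(bricks):
    """Hypothetical coordinator; assumes all bricks share the same epoch."""
    prepared = []
    for brick in bricks:
        if brick.prepare():
            prepared.append(brick)
        else:
            for p in prepared:
                p.abort()
            return None
    snapshot_id = bricks[0].epoch - 1
    for brick in bricks:
        brick.commit()
    return snapshot_id


if __name__ == "__main__":
    b = Brick()
    b.write("x", "old")
    b.prepare()
    b.write("x", "new")   # accepted while the snapshot is still tentative
    b.commit()
    print(b.read("x"), b.read("x", snapshot=0))
```

In this sketch a write that arrives between prepare and commit is accepted immediately and remains invisible to the snapshot, which is one simple way to avoid pausing or aborting writes. A real distributed implementation would also need to order snapshot operations against in-flight reads and writes across bricks, which the sketch does not attempt.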