Globally Distributed BookPrep
Reddy, Prakash; Dudekula, Shariff; Puthanveedu, Susanth; Milojicic, Dejan
Keyword(s): No keywords available
Abstract: BookPrep is a Print-On-Demand service that takes raw scans and converts them to print-ready files. It requires a large amount of storage and takes an average of 5 hours of CPU time to process a single book of about 300 pages. In our experiment, we processed books on Open Cirrus, where the data is close to the compute servers. We installed the BookPrep service at three Open Cirrus sites and pre-populated each site with region-specific scanned books. When a request comes in to process a book, it is routed to the compute node closest to the source data. The compute node is then expected to store the processed data on the same network. Compute nodes are allocated and de-allocated based on demand. A cloud-based metadata repository is used to update the metadata associated with each book, regardless of where the source and derived data are stored. The goal of this experiment is to determine whether performance improves when the computation is moved close to the data, and whether the same principle can be applied to a pull-based scheduling model.
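The data-local routing and on-demand node allocation described in the abstract can be illustrated with a minimal Python sketch. It is not the BookPrep implementation: the `Site` class, the functions `route`, `start_job`, and `finish_job`, and the site and book names are all hypothetical, and "closest to the source data" is simplified to "the site that already stores the book's raw scans".

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical model of one site hosting the service."""
    name: str
    books: set           # IDs of scanned books pre-populated at this site
    busy_nodes: int = 0  # compute nodes currently processing a book


def route(book_id, sites):
    """Return the site that stores the raw scans for book_id."""
    for site in sites:
        if book_id in site.books:
            return site
    raise KeyError(f"no site holds scans for {book_id!r}")


def start_job(book_id, sites):
    """Route the request to the data's site and allocate a node there."""
    site = route(book_id, sites)
    site.busy_nodes += 1  # allocate a compute node on demand
    return site


def finish_job(site):
    """Release the node once the processed files are stored at the same site."""
    site.busy_nodes -= 1  # de-allocate instead of keeping the node idle


# Assumed example data: two sites, each holding one region's scans.
sites = [Site("site-a", {"book-1"}), Site("site-b", {"book-2"})]
chosen = start_job("book-2", sites)
print(chosen.name, chosen.busy_nodes)
finish_job(chosen)
```

In this sketch the request never moves the scans; only the choice of site changes, which is the property the experiment sets out to evaluate.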
External Posting Date: August 21, 2011. Approved for External Publication
Internal Posting Date: August 21, 2011