HP Labs cloud-computing test bed: VideoToon demo

Video transcript
We are going to show an example of a research project utilizing our cloud computing test bed to investigate some of the resource allocation issues involved in sharing a distributed set of resources among applications and users.
As shared computational resources become a commodity and grow in scale, deciding who gets which resources becomes a hard problem. Demand for a resource may be bursty, so the price users are willing to pay for it will also vary over time. Fixed-price schemes suffer from both underutilization and demand truncation. Our approach is to provision resources on a computational market where prices vary with demand. This market uses virtualization (the Xen VMM and Intel VT) to provide performance isolation and fine-grained, agile, user-exposed control over resource configuration, and it allows users and applications to share the same infrastructure on demand.
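The transcript does not spell out the market mechanism, so here is a minimal sketch under the assumption of a Tycoon-style proportional-share market, where each user's fraction of a host is their bid divided by the sum of all bids; the function name and numbers are invented for illustration.

```python
def shares(bids):
    """Split one host's CPU among users in proportion to their bids.

    Hypothetical illustration of a proportional-share market: each
    user's fraction is their spending rate divided by the total.
    """
    total = sum(bids.values())
    return {user: bid / total for user, bid in bids.items()}

# Demand pushes the effective price up: with more bidders, the same
# spending rate buys a smaller fraction of the machine.
light = shares({"alice": 1.0, "bob": 1.0})
heavy = shares({"alice": 1.0, "bob": 1.0, "carol": 2.0})
```

Under light demand alice receives half the host; once carol joins with a larger bid, alice's unchanged spending rate buys only a quarter, which is the price-varies-with-demand behavior described above.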
One application of our market-based cloud computing platform uses an image processing algorithm that transforms arbitrary photos and videos into cartoons. Our demo showcases four HP, Yahoo! and Intel technologies; HP Labs Tycoon, a market based resource allocator; HP Labs VideoToon, an image processing algorithm; Yahoo! Hadoop, a distributed filesystem and programming model for cloud computing; and finally Intel VT Virtualization Technology.
In our demo we first configure the number of hosts and the memory and disk for each. We also specify how much we are willing to pay for our virtualized cluster through a spending rate. All of these settings can be changed at any point without interrupting running jobs; increasing the spending rate immediately increases our CPU share on the cluster nodes.
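The demo's actual configuration interface is not shown; the fragment below is a hypothetical request illustrating the parameters named above (hosts, memory, disk, spending rate) and the fact that they can be changed while jobs run. Field names and units are assumptions, not Tycoon's real API.

```python
# Hypothetical cluster request; the real Tycoon client/API may differ.
cluster_request = {
    "hosts": 10,           # number of virtual machines
    "memory_mb": 1024,     # memory per VM
    "disk_gb": 20,         # disk per VM
    "spending_rate": 5.0,  # currency units per hour we will pay
}

# Any field can be changed without interrupting running jobs; raising
# the spending rate immediately raises our CPU share on the nodes.
cluster_request["spending_rate"] *= 2
```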

The system both creates the virtual cluster and configures a Hadoop installation; the whole process takes less than five minutes. We can browse the distributed file system (HDFS) that will store our files. We have uploaded an MPEG stream of an HP marketing video into a directory called video, which will serve as our input.

The next step is to run our application. The video processing logic uses three Hadoop Map/Reduce jobs: the first splits the video stream into substreams, the second performs the transformation on a substream, and the third joins the transformed substreams back into their original order. The performance bottleneck is the transformation phase, which is why that stage is fully parallelized. The Map/Reduce jobs and a descriptor file specifying the order in which the jobs should be piped together are packaged in a cloud archive file; this archive is all the platform needs to run a job.
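The three-job pipeline can be sketched in plain Python; the real jobs are Hadoop Map/Reduce tasks running over HDFS, and the `upper()` call merely stands in for the VideoToon cartoon transformation. All names here are illustrative.

```python
def split(stream, n):
    """Job 1: cut the stream into n numbered substreams."""
    size = max(1, len(stream) // n)
    chunks = [stream[i:i + size] for i in range(0, len(stream), size)]
    return list(enumerate(chunks))

def transform(numbered):
    """Job 2: process each substream independently; this is the
    bottleneck stage, so in the demo it runs fully in parallel."""
    return [(i, chunk.upper()) for i, chunk in numbered]  # stand-in effect

def join(numbered):
    """Job 3: reassemble substreams in their original order."""
    return "".join(chunk for _, chunk in sorted(numbered))

out = join(transform(split("abcdef", 3)))
```

Because each substream carries its index through the transform stage, the join can restore the original order no matter which substream finishes first, which is what makes the middle stage safe to parallelize.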
During the transformation phase we can watch the split video substreams as they are being processed.
While the job is running we can monitor its performance, which depends on the overall demand in the system as well as on our own spending rate. The price dynamics of the different resources, our capacity over time, and our spending history can all be monitored easily to determine whether reconfiguration is necessary.
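As a sketch of the reconfiguration decision, the following tracks our cluster share over time at a fixed spending rate as total demand grows; the totals and the threshold are invented, and the proportional-share formula is an assumption about the allocator.

```python
def share(my_rate, total_bids):
    """Assumed proportional-share rule: our fraction of capacity."""
    return my_rate / total_bids

demand_over_time = [4.0, 8.0, 16.0]        # invented totals of all bids
history = [share(2.0, d) for d in demand_over_time]

# At a fixed spending rate our share shrinks as demand grows; flag
# when it drops below an (arbitrary) threshold so we can raise the rate.
needs_reconfig = any(s < 0.2 for s in history)
```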
Once the join phase is complete we can download and watch the resulting video stream. If the result is not satisfactory, a new job can be submitted with different algorithm parameters, or individual pieces of the video can be processed with different parameters. In that sense the platform supports both batch-like and more interactive processing. The key point is that the resources can be reconfigured to meet the user's specific demand at any time.
To summarize, we were able to quickly test our resource allocation system at large scale with real application deployments solving real-world problems. We invite other researchers to use the test bed for their projects as well.
NASA Goddard Space Flight Center Image by Reto Stöckli (land surface, shallow water, clouds). Enhancements by Robert Simmon (ocean color, compositing, 3D globes, animation). Data and technical support: MODIS Land Group; MODIS Science Data Support Team; MODIS Atmosphere Group; MODIS Ocean Group Additional data: USGS EROS Data Center (topography); USGS Terrestrial Remote Sensing Flagstaff Field Center (Antarctica); Defense Meteorological Satellite Program (city lights).