(Readers may find this
press handout useful while reading this piece.)
In 2003, HP Laboratories demonstrated the flexibility and power of its utility rendering service, codenamed Frame Factory, through collaborative work with 422 on the production of The Painter animation.
But the HP Labs research team wanted to take that project further: to use the same approach to build a bigger, better service for multiple customers, one that would also act as a real test-bed for a number of HP Labs research technologies.
The answer: build a prototype utility computing platform - the Service Utility - on which the 10 participating animation companies can each operate their own replicated copy of the next generation version of Frame Factory. The Service Utility platform is capable of running different services for multiple customers simultaneously.
For the SE3D showcase the platform will only be running Frame Factory - 10 services, one for each of the animation companies participating. But the platform could equally well run other services, such as HP Labs' experimental gene sequencing system, codenamed Gene Factory, or a future financial or engineering service.
The Service Utility integrates a number of experimental components - the results of HP Labs research projects focused on technology that could be at the heart of a future utility data centre. Together they form a platform that manages, allocates and controls the banks of processors and storage used by each Frame Factory service. It deals with resource failures, and restores, cleans and redistributes nodes as required. The initial server pool will have more than 100 servers, each with two processors.
Experimental services such as Frame Factory have been built to help identify the Service Utility's requirements and test its performance.
Research underpinning the Service Utility platform
A key experimental technology for the platform is an extensible market-based resource allocation system codenamed Sumatra.
In the future world of utility computing, customers will pay for the service and the computing resources - processors, storage, networking and so on - that they use. Even a massive data centre with tens of thousands of processors will have only a finite pool of resources to share among its customers. With multiple customers, each could try to reserve, and hold on to, as many processors as they can 'just in case' - even if they don't actually require them.
One way to control demand - and also predict and manage fluctuations - is with automated markets. What will happen in a real situation, for instance, if a film studio suddenly needs many extra nodes to render its next animation, or if a laboratory calls on more processors to sequence a new virus in an emergency? It is Sumatra's job to regulate demand through a variety of techniques.
Sumatra can host and run different kinds of markets in which the animator participants can acquire computing resources. A Futures market will sell processor time to the animators several days in advance, allowing them to reserve rendering time just when they think they will need it. A Vickrey Spot market will operate for participants who need more processors at short notice - say 30 minutes in advance.
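Sumatra's actual market protocols are not spelled out here, but the Vickrey spot market's core idea can be sketched in a few lines of Python. In a multi-unit Vickrey-style (uniform second-price) auction, the highest bidders win the available processor slots and every winner pays the highest losing bid, which gives bidders an incentive to bid what the resources are truly worth to them. The function and studio names below are hypothetical, not Sumatra's real interface:

```python
def vickrey_spot_allocate(bids, capacity):
    """Allocate `capacity` identical processor slots via a sealed-bid
    Vickrey-style auction: the top bidders win, and each winner pays
    the highest losing bid (a uniform second-price rule)."""
    # bids: list of (bidder, amount) pairs
    ranked = sorted(bids, key=lambda b: b[1], reverse=True)
    winners = ranked[:capacity]
    losers = ranked[capacity:]
    clearing_price = losers[0][1] if losers else 0
    return [(bidder, clearing_price) for bidder, _ in winners]

result = vickrey_spot_allocate(
    [("studioA", 12), ("studioB", 9), ("studioC", 7), ("studioD", 5)],
    capacity=2)
# studioA and studioB win; both pay 7, the highest losing bid
```

Because winners pay the highest losing bid rather than their own, overstating demand 'just in case' costs a bidder nothing extra to avoid - truthful bidding is the safe strategy.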
Another experimental market is Network of Favours, in which the processors are divided equally between the participants, and those that require more can borrow from others with resources to spare. The borrowers would then be expected to lend their own spare rendering resources to other users when required.
The Tycoon market, developed by another HP Labs research group, is able to reallocate resources rapidly depending on what users are prepared to spend at any given moment.
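Tycoon's central idea - resources continuously follow spending - can be illustrated with a simple proportional-share calculation: each user's slice of the processor pool is proportional to their current bid, so shares shift the moment anyone changes what they are prepared to spend. This is a minimal sketch under that assumption, not Tycoon's real interface:

```python
def tycoon_share(bids, total_cpus):
    """Proportional-share allocation in the spirit of Tycoon: each
    user receives a fraction of the pool proportional to their
    current spending rate, so allocations track bids over time."""
    total_bid = sum(bids.values())
    return {user: total_cpus * bid / total_bid
            for user, bid in bids.items()}

shares = tycoon_share(
    {"studioA": 6.0, "studioB": 3.0, "studioC": 1.0}, total_cpus=100)
# studioA gets 60 CPUs, studioB 30, studioC 10
```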
The animators will use "computing credit" to buy access to resources. In a business context animators would purchase such credit from the service supplier with real money but for the purposes of the experiment they are being given a finite amount of free computing credit, although they have a strong incentive to spend it wisely.
Sumatra interfaces with the Service Utility's Resource Manager to handle the actual allocation of resources to users based on the results of the market mechanisms. The Resource Manager configures the Service Utility's processors, and is able to keep track of what processors are available - or have ceased to be available for whatever reason - through a research technology called Anubis. This is a small program that sits on every processor and sends an "I'm here" message to the Resource Manager and also to all other processors, so that it is clear to all which of them, and how many, are available at any time.
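The heartbeat mechanism that Anubis uses to track liveness can be sketched as follows. This is an illustration only - the class and method names are hypothetical, and the real Anubis is fully distributed rather than a single monitor - but it shows the basic rule: a processor that has not said "I'm here" recently enough is presumed unavailable.

```python
import time

class HeartbeatMonitor:
    """Minimal heartbeat-based liveness tracking: each node reports
    'I'm here' periodically; a node whose last report is older than
    `timeout` seconds is presumed to have failed."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.last_seen = {}  # node_id -> time of last heartbeat

    def heartbeat(self, node_id, now=None):
        self.last_seen[node_id] = now if now is not None else time.time()

    def available_nodes(self, now=None):
        now = now if now is not None else time.time()
        return {n for n, t in self.last_seen.items()
                if now - t <= self.timeout}

mon = HeartbeatMonitor(timeout=5.0)
mon.heartbeat("node1", now=100.0)
mon.heartbeat("node2", now=103.0)
# At t=107, node1's last report (t=100) is over 5s old: presumed failed
```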
Anubis can therefore report a failure in one or more of the processors, allowing the Service Utility platform to recover the information on the affected devices and distribute it to other nodes. Anubis is fully decentralised, meaning it can detect and respond to a wide variety of problems, and provide a consistent picture of what is happening to every resource in the system.
In the future world of utility computing, contractual agreements between the customer and the service provider, called Service Level Agreements (SLAs), will be automatically enforced. An SLA might state that a job has to be finished by a certain time, to a particular standard and with guaranteed security. If the SLA is not met the provider would face penalties.
What if a group of processors fails during rendering, affecting several participants? Some of them may have stricter SLAs than others, with tighter guarantees. HP Labs is using the platform to test an experimental technology called Management by Business Objectives (MBO). MBO, a key component of the Resource Manager, has a reasoning engine that assesses all the SLAs, including the various penalties that the service provider faces. If all goes well then MBO is not needed but should something unexpected happen, for example a hardware failure, then MBO decides automatically which customer should receive a reduced level of service in favour of one that has tighter guarantees and higher levels of compensation.
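The kind of trade-off MBO's reasoning engine makes can be sketched with a toy decision rule: when a failure leaves too few processors to honour every SLA, degrade the customer whose contract carries the lowest penalty, keeping the tighter, more expensive guarantees intact. The data model and penalty figures below are invented for illustration; the real MBO engine weighs richer business objectives than this:

```python
def choose_customer_to_degrade(slas, cpus_lost):
    """Pick the customer to receive reduced service after a failure
    removes `cpus_lost` processors: among customers with enough
    allocated CPUs to absorb the loss, choose the one whose SLA
    penalty rate is lowest, minimising the provider's exposure."""
    # slas: {customer: {"penalty_per_cpu_hour": ..., "allocated": ...}}
    candidates = [c for c, s in slas.items()
                  if s["allocated"] >= cpus_lost]
    return min(candidates, key=lambda c: slas[c]["penalty_per_cpu_hour"])

slas = {
    "studioA": {"penalty_per_cpu_hour": 50, "allocated": 20},  # tight SLA
    "studioB": {"penalty_per_cpu_hour": 5,  "allocated": 15},  # looser SLA
}
# studioB absorbs the shortfall: its penalty rate is ten times lower
```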
When processors are reallocated they are cleaned of data so that the previous user's sensitive data cannot be accidentally accessed by the next customer.
High levels of IT security are of crucial importance if customers are to have confidence in utility computing, where they could be sharing resources with companies that are in direct competition with them. The Service Utility platform has novel forms of Secure Storage, developed by HP Labs researchers, that give each user a unique cryptographic key. This key is the only means of access to their part of the platform's storage, and it is managed securely and transparently by the Service Utility.
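The access model - one key per user, and that key as the sole route to the user's data - can be sketched as below. This is purely illustrative: it is not HP Labs' Secure Storage design and not production cryptography, just a toy store that refuses to release an object unless the caller presents the same key it was written under.

```python
import hashlib
import hmac
import secrets

class SecureStore:
    """Toy per-user-key storage: objects are bound to the fingerprint
    of the key they were written with, and reads with any other key
    are rejected."""

    def __init__(self):
        self._objects = {}  # name -> (key_fingerprint, data)

    @staticmethod
    def new_user_key():
        return secrets.token_bytes(32)  # unique key per user

    @staticmethod
    def _fingerprint(key):
        return hashlib.sha256(key).hexdigest()

    def put(self, key, name, data):
        self._objects[name] = (self._fingerprint(key), data)

    def get(self, key, name):
        fp, data = self._objects[name]
        # constant-time comparison to avoid leaking the fingerprint
        if not hmac.compare_digest(fp, self._fingerprint(key)):
            raise PermissionError("key does not grant access to " + name)
        return data

store = SecureStore()
alice_key = SecureStore.new_user_key()
store.put(alice_key, "frame_0001", b"rendered pixels")
# store.get(...) with any other user's key raises PermissionError
```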
Another project that will be incorporated into the platform is the Trust Record, which will take a snapshot of the utility and generate audit reports about the data centre, and its performance, for all the users. Not only will this allow customers to build trust in the utility, it will also support the requirements of corporate governance.
All of these platform components are deployed and managed using an HP Labs technology called SmartFrog (Smart Framework for Object Groups). SmartFrog describes distributed software systems as collections of cooperating components, and then activates and manages them. Its core has been released as Open Source for developers.
Project Frame Factory: stress-testing the platform
But what about Frame Factory, the experimental utility rendering service that runs on top of the Service Utility platform? Frame Factory is just one example of the kind of service that could operate on a utility computing platform.
By running 10 individual Frame Factory services - one for each of the participants - for the duration of the animation showcase, the HP Labs researchers are stress-testing their prototype platform to see how it performs.
For the users, sitting at their computers in their animation companies, accessing Frame Factory through its Web portal is simple. The service is virtualised: they have no need to worry about where the storage and the servers they have been allocated actually are. They send content to be rendered over a standard broadband internet connection. They can check on the progress of the work and then pull the rendered frames back to their own systems when they are complete.
However, behind the virtualisation there is a lot going on. Like the Service Utility platform, Frame Factory has a number of research components from HP Labs. Plus, at its heart, it has the Maya 3D rendering application from Alias®.
HP Labs components include the Asset Store, which stores the wire-frame animation content received from the client, and the Service Manager, which distributes the content to the Processing Nodes allocated by the Service Utility for conversion into completed, rendered frames. The Asset Store allows the clients to download the completed content from Frame Factory over a secure, encrypted link, and manages multiple versions of content over time.
The source content to be rendered can grow to be many gigabytes in size, but usually only a fraction of this changes each time the user wants to upload their latest work for rendering. So a distributed versioned filestore, codenamed Elephant Store, is being used to help by only transferring the parts of files that have changed. It is even smart enough to know whether a file has simply been renamed and avoids uploading redundant information. Each completed upload of the source content is treated as a unique version.
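The change-detection step can be sketched with content hashing: if each file is identified by a hash of its bytes, anything the server already holds - including a file that has merely been renamed - transfers no data. The function name and interface below are hypothetical, not Elephant Store's real API:

```python
import hashlib

def plan_upload(local_files, remote_hashes):
    """Decide which files need transferring: hash each local file's
    content and skip anything the server already stores, so a renamed
    file (same bytes, new name) uploads nothing."""
    to_upload, already_stored = [], []
    for name, content in local_files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in remote_hashes:
            already_stored.append(name)  # unchanged, or merely renamed
        else:
            to_upload.append(name)
    return to_upload, already_stored

remote = {hashlib.sha256(b"scene v1").hexdigest()}  # held by the server
local = {
    "shot01_final.ma": b"scene v1",   # renamed copy of stored content
    "shot02.ma": b"new scene",        # genuinely new work
}
to_upload, already_stored = plan_upload(local, remote)
# only shot02.ma needs to cross the wire
```

A real versioned store would hash at block rather than whole-file granularity so that a small edit to a multi-gigabyte file transfers only the changed blocks, but the principle is the same.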
This allows the storage of large amounts of data online, giving the animator access to a comprehensive historical record of their work and the ability to check and revert to earlier versions of frames if necessary.
As with the Service Utility platform, SmartFrog describes, assembles, launches and manages the components to form the Frame Factory service. SmartFrog uses service templates so that, once described, a service can be launched repeatedly.