by Jamie Beckett
HP Labs helped invent HP's Utility Data Center, an adaptive IT
infrastructure that dynamically allocates computing resources to meet
changing needs. Now the lab is putting the data center to work as
one of the most sophisticated research facilities around, a testbed
for future utility computing work and other compute-intensive projects
such as bioinformatics and quantum physics.
At the same time, the Utility Data Center (UDC) has become HP Labs' computing infrastructure, increasing the number of servers in the data centers of its two largest labs from 150 to 400 -- without adding staff or requiring additional space.
John Sontag, the HP Labs research manager who oversees the installations, recently discussed the technology behind the UDC, ongoing research in utility computing and HP Labs' vision for planetary computing, in which globally connected resources are allocated dynamically and securely on demand.
» What is the Utility Data Center testbed?
» What are the primary research issues?
» Why use the UDC as a production environment for HP Labs?
» What is the goal of this work?
» Can you talk about the HP Labs technology at work in the UDC?
» What applications are you exploring?
» What is the Utility Data Center testbed?

The testbed is two things, really. First, it's a state-of-the-art research lab for our work in planetary computing. At the same time, we're using it as a production environment for HP Labs.
Planetary computing is our long-term vision of how we tie all the computation of a company or a state or a nation or even the world together as an interchangeable utility. The Utility Data Center (UDC) is a step toward thinking about what the generation facilities for that utility will look like.
The testbed is the ultimate research playground: state-of-the-art technology building on all the things we've invented before. It's reconfigurable virtually, so you can try out things you couldn't try out in any other way, and you can change it from day to day. We can do large-scale experiments using hundreds of systems in ways that were impractical before, because we couldn't put systems together fast enough.
For the first time, we can also include partners from the outside; by putting this in a utility framework, it becomes accessible over the Internet.
» What are the primary research issues?

The big questions we're looking at are: how interchangeable can that computational facility be made, how does it federate across the many instances of the UDC, and how do we construct a software management structure that can take full advantage of that?
The work going on here in our labs in Palo Alto, California (USA) and in Bristol (UK) aims to understand the nature of these generation facilities: how they'll change over time, how we'll virtualize resources and how we'll provision them.
This is a largely unexamined space. Just by consolidating things into one room, you learn something. Two years ago, the room we have in Palo Alto had 50 computers; by the end of the year, it'll have around 400. If you look at the next generation of servers, we probably could get 5,000 servers in here in three to five years. And by the end of the decade -- if you consider some of the work being done by HP Labs around systems architectures -- we probably could fit 75,000 to 100,000 servers into that same 2,200- to 2,300-square-foot space. We're going to get more and more dense.
We're working with a number of collaborators on these projects -- CERN (the world's largest particle physics center), CITRIS (a partnership between four University of California campuses, California industry and the State of California) and PlanetLab (a university-based testbed for developing, deploying and accessing planetary-scale services) -- looking at how to distribute computing work around the world and what issues this raises.
» Why use the UDC as a production environment for HP Labs?

By having a production environment in the same place as our research lab, we can measure its functions in new ways. We can examine how the cooling and the power in the room work, how well the UDC functions under different loads, and what the network traffic looks like.
There's nothing like using what you built to learn what you like and don't like about it. We're our own first customer. Right now, we've installed the first instance of the UDC product HP announced in November 2001; it can be reconfigured dynamically, simplifying data center management by consolidating and standardizing IT resources and automating data center processes.
On top of that, we're using our prototype of the utility controller (software that dynamically allocates computing resources) and we're developing the next-generation UDC, all in the same physical space.
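The allocate-on-demand idea behind the utility controller can be sketched in a few lines of Python. This is a toy illustration only; the class and method names are invented for this sketch and are not HP's actual software.

```python
# Toy sketch of a utility controller's core idea: a shared pool of servers
# that can be carved out for a service on demand and returned for reuse.

class UtilityPool:
    def __init__(self, servers):
        self.free = set(servers)     # machines available for allocation
        self.allocated = {}          # service name -> set of machines

    def allocate(self, service, count):
        """Carve `count` servers out of the shared pool for a service."""
        if count > len(self.free):
            raise RuntimeError(f"only {len(self.free)} servers free")
        grant = {self.free.pop() for _ in range(count)}
        self.allocated.setdefault(service, set()).update(grant)
        return grant

    def release(self, service):
        """Return a service's servers to the pool for other work."""
        self.free |= self.allocated.pop(service, set())

pool = UtilityPool([f"node{i:03d}" for i in range(400)])
pool.allocate("render-farm", 100)    # e.g. "100 computers for the weekend"
pool.allocate("bioinformatics", 50)
pool.release("render-farm")          # resources flow back to the shared pool
print(len(pool.free))                # 350 servers free again
```

The point of the sketch is the lifecycle: resources are interchangeable, granted quickly, and reclaimed the moment a project ends, rather than being bought and wired for one purpose.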
» What is the goal of this work?

It's all about the convergence toward the computing utility. This is why we started working a long time ago on what became the architecture for the Itanium Processor Family; it gave us a common processor architecture that can run our three operating systems -- HP-UX, Linux and Windows.
Now we're working on a structure that can support our technical computing, our traditional commercial computing and the things that are coming at us, things like the GRID and the expansion of digital publishing. We're creating the utility computing infrastructure needed to support all those things in an interchangeable way.
» Can you talk about the HP Labs technology at work in the UDC?

The biggest part of this has to do with the virtualization and virtual wiring work that was done in HP Labs in 1999 and 2000. That is really the genesis of how you make all these facilities sharable. That work is now the UDC's Utility Controller Software, which simplifies the design, allocation and billing of IT resources for applications and services.
HP's OpenView software monitors UDC operation, a necessary element in assessing the health of the system.
Once you have that, you can start adding on the other technologies we have. You can bring in Intel Itanium processor family systems, which are derived from chip architecture designed at HP Labs. The work that people are doing around Smart Cooling -- which enables efficient cooling of high-density data centers -- is unique to HP and HP Labs. Smart Cooling has been critical to achieving the density of systems both in Palo Alto and Bristol.
Our research team in the Bristol (UK) lab is using technology they developed called SmartFrog to create the next generation of service specification and deployment technologies that will drive data-center automation to the next level. (The Palo Alto, Calif., lab will implement it later.)
By being able to capture the specifications of a service in an electronic form, it is possible to both capture the design of a service and automate the deployment and adaptation of services. Today, that information is held ad hoc in the heads of the IT people of every organization. Learning and sharing best practices across the community that provides these services is very difficult. SmartFrog gives us a way to capture that knowledge and share it amongst all of the practitioners in an area of expertise.
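The idea of capturing a service's design in electronic form can be illustrated with a short sketch. The dictionary schema and the `deploy()` function below are invented for this illustration; real SmartFrog uses its own declarative notation and runtime, not Python.

```python
# Toy illustration: once a service's design is captured as data, the same
# description can drive every deployment identically, instead of living
# ad hoc in the heads of an organization's IT staff.

web_service = {
    "name": "catalog-frontend",
    "components": [
        {"role": "load-balancer", "count": 1},
        {"role": "web-server",    "count": 4},
        {"role": "database",      "count": 2},
    ],
}

def deploy(spec):
    """Walk the specification and plan the provisioning of each component.

    In a real system this step would configure machines, wire networks and
    start software; here it just returns the plan, to show that deployment
    is derived mechanically from the captured description."""
    plan = []
    for component in spec["components"]:
        for i in range(component["count"]):
            plan.append(f"{spec['name']}/{component['role']}-{i}")
    return plan

print(deploy(web_service))  # seven provisioning steps, one per instance
```

Because the specification is data rather than tribal knowledge, it can be shared, versioned and adapted -- which is the point made above about spreading best practices across a community of practitioners.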
The UDC touches nearly every part of the technology that we have in storage, computation and networking. At some point, all this work will be brought into play here.
» What applications are you exploring?

We're going to try to meet the needs of a wide variety of application classes and industry segments: rich media; interaction-intensive applications like Frame Factory (which dramatically speeds the rendering process for digital animators) and the distribution of that media to mobile communication devices; compute-intensive applications such as scientific and technical computing; and communication-intensive applications such as multicast media and wide-area file systems. We're also going to lend out resources to some of our collaborators so we can understand how fast we can free the resources and get them running.
What this does is help customers and partners handle shifting business priorities. We've been finding that there are people in every organization who wish they had a whole lot of computers to try something out. But because it is risky, and because it takes months to get things approved, bought and built, they'll never do it.
Most organizations that we talk to tell us that the time from when they see a new business priority or opportunity to when they actually get the computing resources to implement plans is at least four months. Those sorts of boundaries tend to constrain your thinking. I'd like to be able to say to customers, "You want 100 computers for this weekend? No problem."