A few weeks ago I completed a series of posts describing the ways that cloud computing will change the way we utilize virtual machines and operating systems. The very heart and soul of software systems design is being challenged by the decoupling of infrastructure architectures from the software architectures that run on them.
Over the last few weeks, I've been slowly trying to get a grip on what the state of the union is with respect to software "packaging" architectures in cloud computing environments. Specifically, I've been focusing on infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) offerings, and the enabling infrastructure that will handle application deployment to these services in the future. How will they evolve to make deployment and operations as simple as possible?
My search started innocently enough. After writing the "big rethink" series, I formed a theory that there are really only two interface points that IaaS and PaaS services need to standardize:
The management interfaces that enable a wide variety of tools to monitor and manipulate the resources and services being offered.
The "unit of delivery" that includes the software to be hosted and any supporting data, configuration, and policy required to allow that software to work.
The former is well covered, with a large number of efforts attempting either to be the sole vehicle for cloud management or to map heterogeneous options to a single interface.
The "unit of delivery" interface, however, is far behind its management brethren when it comes to concerted efforts to provide a standard. There is OVF (the Open Virtualization Format), which the Distributed Management Task Force, a standards body, is developing in part as a server-centric packaging format for IaaS applications. However, OVF still requires developers and administrators to build an image from the ground up (or to build on top of an image provided by others), including configuring the operating system, any management and security utilities, and the virtual machines themselves.
The more I explore this question, in light of the "big rethink," the more I think there is an opportunity to simplify cloud computing through changing the focus from infrastructure to applications. Specifically, I think there are some advantages to a uniform description of an application, its configuration, and its operational requirements, that can be used to describe any software deliverable to the cloud, whether meant for IaaS or PaaS.
The diagram below describes my vision in a nutshell:
The package could be an archive file of some kind, or it could be some other association of files (such as a source control file system). The four elements displayed above are:
Metadata describing the manifest of the package itself, and any other metadata required for processing the package such as the spec version, application classification, etc. The manifest should describe enough that the receiving cloud infrastructure could decide if it was an acceptable package or not.
The bits that make up the software and data being delivered. This can be in just about any applicable format, I think, including an OVF file, a VHD, a TAR file or whatever else works. Remember, the manifest would describe the format the bits are delivered in--e.g. "vApp" or "RoR app" or "AMI" or "OVF," or whatever--and the cloud environment could decide if it could handle that format or not.
An appropriate deployment and/or configuration description, or pointers to the appropriate descriptions. I've always thought of this as a Puppet configuration, a Chef recipe, or something similar, but it could simply be a pointer to a JEE deployment descriptor in a WAR file provided in the "bits" section.
The deployment/configuration section must contain the information required to successfully get the application up and running in the target cloud environment, beyond what is contained in the bits themselves. This could potentially include a lot of information, such as required server and storage configurations, required network connections to services the app depends on, and potentially things like acceptable pricing and/or billing terms.
The information could be proprietary to a single vendor, but in the interest of some level of portability, I would hope we would see some more generalized standards for each application classification.
Orchestration and service level policies required to handle the automated run-time operation of the application bits. Again, I would hope to see some standards appear in this space, but this section should allow for a variety of ways to declare the required information.
Examples of what I would expect to find in this section are spot pricing limits (if needed), service level metrics and limits, information or code describing how the system should respond to increases or decreases in load, etc.
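To make the four elements a little more concrete, here is a minimal sketch of how such a package might be modeled. Everything in it is an assumption for illustration only: the class names, the field names (`spec_version`, `classification`, `bits_format`), the sample values, and the choice of JSON are hypothetical, not part of any existing standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Manifest:
    """Metadata a receiving cloud would read first (hypothetical fields)."""
    spec_version: str    # version of the packaging spec itself
    classification: str  # application classification, e.g. "web-app"
    bits_format: str     # format of the payload, e.g. "OVF", "AMI", "RoR app"


@dataclass
class CloudPackage:
    """The four elements of the package described above."""
    manifest: Manifest
    bits: List[str]  # paths or pointers to the software and data
    deployment: Dict[str, object] = field(default_factory=dict)  # deployment/configuration
    policies: Dict[str, object] = field(default_factory=dict)    # orchestration and service levels

    def to_json(self) -> str:
        # asdict() recursively converts the nested Manifest as well.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# A hypothetical Rails application packaged for delivery.
package = CloudPackage(
    manifest=Manifest(spec_version="0.1", classification="web-app", bits_format="RoR app"),
    bits=["app/"],
    deployment={"descriptor": "chef/recipes/web.rb", "depends_on": ["database"]},
    policies={"max_spot_price_usd_per_hour": 0.10, "scale_out_cpu_percent": 75},
)

print(package.to_json())
```

Only the outer structure and the manifest are typed here; the deployment and policy sections are left as free-form dictionaries, since their contents would likely vary by vendor and by application classification.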
I don't expect the specific contents of the package to be uniform, just the overall structure and the manifest itself. Because of this, it is important to point out that this application packaging is not about portability, but rather about packaging, inventory, and interpretation. You would use these files to consistently store all types of cloud deliverables in a format interpretable by a standardized inventory system, to digitally "ship" the deliverables to any arbitrary cloud service that supports the packaging standard, and to allow the cloud vendor to decide if and how it can support the needs of the application.
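The "interpretation" step, in which a cloud service reads the manifest and decides whether it can accept the package, might look something like the sketch below. The function name `evaluate_manifest`, the manifest keys, and the two example services are all hypothetical; a real service would presumably also inspect the deployment and policy sections before accepting anything.

```python
from typing import Dict, List, Tuple


def evaluate_manifest(
    manifest: Dict[str, str],
    supported_spec_versions: List[str],
    supported_formats: List[str],
) -> Tuple[bool, List[str]]:
    """Return (accepted, reasons); reasons explains any rejection."""
    reasons = []
    spec = manifest.get("spec_version")
    fmt = manifest.get("bits_format")
    if spec not in supported_spec_versions:
        reasons.append("unsupported spec version: {!r}".format(spec))
    if fmt not in supported_formats:
        reasons.append("unsupported bits format: {!r}".format(fmt))
    return (not reasons, reasons)


# Hypothetical manifest for a Rails application.
manifest = {"spec_version": "0.1", "classification": "web-app", "bits_format": "RoR app"}

# Two hypothetical services with different capabilities.
iaas_result = evaluate_manifest(manifest, ["0.1"], ["OVF", "AMI", "VHD"])
paas_result = evaluate_manifest(manifest, ["0.1"], ["RoR app"])

print("IaaS service:", iaas_result)
print("PaaS service:", paas_result)
```

The same package is offered to both services unchanged; each one decides for itself, which is the sense in which the format is about interpretation rather than portability.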
All of which leads to a simple question: why would anyone want or need this form of application packaging? Here are my thoughts on that:
It lets customers build an inventory of all cloud (and, in reality, non-cloud) application components in a format that makes automated deployment to a wider variety of cloud vendors theoretically possible, and packages all deployment and runtime automation parameters with the application code for change management purposes.
It would allow cloud vendors to begin to accept applications from competing environments using the same core platform or infrastructure without giving up the ability to add differentiated services, configuration, or orchestration features. This would be extremely beneficial in the PaaS market, where common use of open-source platforms means that there is some level of code portability, and where each vendor's service offerings are what differentiate it.
It would greatly aid the open-source community in creating a simple, consistent way to describe complex applications to folks looking for software alternatives. Without this approach, the open-source provider is required to either build a virtual appliance with their code, or to require the end user to do all of the "heavy lifting" of application installation into an IaaS environment.
Clearly this is an outline of a vision, not a standard that is under way or a "rough consensus and running code" demonstration of that vision. Why not keep this to myself and build a business around it? Because such a packaging format would have to be open and standard, and I'm hoping some of you will get inspired to explore the idea further.
What do you think? What works, doesn't work, or is missing for you?
A special thanks to Heroku's Oren Teich and the Clouderati on Twitter for their contributions and challenges to this idea.