As reported on CNET, Amazon Web Services has announced a new pricing option that lets its customers take advantage of spare capacity within the EC2 infrastructure at variable, supply-and-demand-driven pricing.
The news has taken the cloud community by storm. For some, it represents the beginning of a long-anticipated move to market pricing for core IT infrastructure services.
While there is some truth to the importance of AWS spot pricing to the history of cloud computing, let's keep things in perspective: this pricing is set by Amazon, not any market. We are a long way from a true commodity market for any form of cloud computing service.
Before I go any further, let's review how the feature works:
- Each customer sets a maximum price he or she is willing to pay for "spot instances."
- Amazon sets a "spot price" for instances hour-by-hour, based on available supply and demand.
- Customers pay whatever the spot price is up to their maximum price. So, if someone bids $0.07/hour, and the spot price is $0.05/hour, the person pays $0.05/hour.
- If the spot price exceeds the customer's maximum price, the customer's instances are terminated.
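To make those rules concrete, here is a minimal Python sketch of how a single hour might be settled. The names `settle_hour` and `SpotOutcome` are my own, and charging nothing for an hour in which the instance is terminated is an assumption for illustration, not something Amazon's announcement spells out.

```python
from dataclasses import dataclass


@dataclass
class SpotOutcome:
    running: bool         # does the instance keep running this hour?
    hourly_charge: float  # what the customer pays for this hour, in USD


def settle_hour(max_bid: float, spot_price: float) -> SpotOutcome:
    """Apply the spot rules described above to one hour (hypothetical helper)."""
    if spot_price > max_bid:
        # Spot price exceeds the customer's maximum: instance is terminated.
        # Assumption: no charge is modeled for the terminated hour.
        return SpotOutcome(running=False, hourly_charge=0.0)
    # Otherwise the customer pays the spot price, not the bid.
    return SpotOutcome(running=True, hourly_charge=spot_price)
```

With the numbers from the example, `settle_hour(0.07, 0.05)` keeps the instance running at $0.05 for the hour, while a spot price of $0.09 against the same bid terminates it.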
Spot pricing is the third EC2 pricing option, joining existing on-demand and reserved instance options. The first two options targeted two critical use cases for cloud computing: reserved instances for mission-critical apps where capacity must always be available to meet demand, and on-demand pricing for just about everything else.
However, the success of both options likely left Amazon with a big problem: excess capacity. The success of reserved instances means that Amazon has to keep around enough capacity to guarantee that it can handle any spike in demand that might come along. The success of on-demand pricing means that Amazon has to build out new capacity fast enough to stay ahead of the voracious demand curve.
So, what to do? Enter spot pricing. Amazon's new pricing is an incredibly creative way to encourage consumption of unused data center capacity, by providing that capacity at clearance sale prices on the condition that Amazon can take it back at a moment's notice. For the right kind of applications, it's a true win-win situation.
Why not profit from what would otherwise be a liability?
Note, however, that this feature is not market-based pricing. Amazon determines the spot price and can raise that price enough to gain back capacity at will, at no real cost to itself. There is no competition. There is no commoditization. There is just consumption of what is not being used.
The truth is, real commoditization of infrastructure services--or any other cloud service, for that matter--isn't in the best interest of Amazon or any other service provider.
Regardless, commoditization can't happen without open standards that allow easy portability and interoperability of data and code, as well as security, control, service-level assurance and compliance systems. Those standards are coming, but it is impossible to predict when they will arrive. I only hope Amazon embraces them when they do.
In the meantime, we can watch with admiration how the success of Amazon Web Services allows it to explore the future of IT with the enthusiastic help of a customer base that truly benefits from each success. I can't wait to see how customers choose to take advantage of spot pricing.
One of the most interesting aspects of the weeks leading up to and including this year's VMworld was the incredible innovation in cloud-computing service offerings for enterprises--especially in the category of infrastructure as a service. A variety of service providers are stepping up their cloud offerings, and giving unprecedented capabilities to their customers' system administrators.
In this category, enterprises are most concerned about security, control, service levels, and compliance; what I call the "trust" issues. Most of the new services attempt to address some or all of these issues head on. Given that this is the infancy of enterprise cloud computing, I think these services bode well for what is coming in the next year or two.
Here is a brief analysis of the offerings that recently caught my eye:
Amazon Web Services Virtual Private Cloud: There is no doubt that the smart people at Amazon continue to innovate at a breathtaking pace. The last three years have seen a whirlwind of new and upgraded services, ranging from storage and server capacity, to payment processing and content delivery.
Amazon's new Virtual Private Cloud offering is just another example of how they listen to their customers when they build solutions. Not so much unique and innovative, as a near perfect execution of a simple solution to a raft of thorny problems, Amazon's VPC service is essentially a powerful VPN gateway which allows Amazon services to be added to the customer's network.
Now, this doesn't directly address security, compliance, or service levels, but it gives enterprise customers a level of control over network configuration that was previously unavailable from Amazon, which in turn gives the customer greater latitude to address those issues.
Savvis "Project Spirit": Available in beta "by the end of this year," Savvis's Project Spirit adheres to a "Virtual Private Data Center (VPDC)" concept very similar to the Virtual Data Center vision espoused by Sun. In a video providing an overview of the service, Savvis indicates that Project Spirit provides three tiers of service, each with an increasing set of capabilities and improved quality of service (QoS).
The video demonstrates wizard-based provisioning and drag-and-drop resource topology design, both of which are similar to features from GoGrid and Sun, though perhaps a little more aligned with the latter than the former.
What I like about Project Spirit is its sense of configurability; something that I think has been missing from many IaaS offerings to date.
Terremark vCloud Express: Terremark is one of the first out of the gate with a basic "one server at a time" offering based on VMware's vCloud Express infrastructure. Targeted at the same users who find Amazon's EC2 so easy to use, the service is meant as a simple, low-risk way for customers to acquire compute capacity.
In a video recorded at VMworld, Simon West, Terremark's VP of marketing, demonstrates provisioning a server in the service. Like other services in its class, it focuses on allowing you to select a server image from a menu of possibilities, click a button, and boot the resulting server in a few minutes. Pricing starts at $0.036/hr for a 1 "VPU," 0.5GB server, but as Chris Flex of Citrix Systems notes in a blog post, Terremark charges differently than Amazon, so the lower CPU price does not necessarily mean cheaper overall operating costs.
Terremark's new service complements its existing Enterprise Cloud service, which is targeted at larger, more sophisticated infrastructure needs.
OpSource Cloud: Hosting vendor OpSource is taking a more network-centric approach toward cloud definition, similar to the "subnets" that Amazon allows customers to create in its VPC offering. The OpSource cloud is in pre-beta now, with an October target for "public release." When the OpSource team demonstrated their user interface to me, they showed me a metaphor that begins with the definition of a "network," which is isolated through custom routing capabilities at the OpSource data centers.
Each network comes with eight public IP addresses (more can be added), and you can add resources such as servers, storage, and firewalls as you see fit. You can also create as many networks as you'd like for each account.
Obviously, there are many more offerings like these in the market today. However, it is interesting to note that the common theme here seems to be security, either through "isolation" via networking, and/or through the availability of enterprise-class firewalls, load balancers, and the like. The expansion of virtual data center offerings is also interesting, as I think it shows the early growth of what will likely be the true enterprise cloud-computing space.
Access control and user account management were a little sketchy in most of the services I saw, although some showed real promise.
However, one has to wonder, as application architectures adjust to cloud computing, how much longer they are going to be tightly coupled to data center architectures. At what point will it no longer be advantageous for application owners to define infrastructure in terms of servers, storage, and security devices?
That being said, the independence of distributed applications from underlying architecture is a long way off, even from the enterprise perspective. I expect that by this time next year, we will see a stable of very strong enterprise public cloud offerings, with support for various compliance standards, sophisticated networking, and cloud-centric security services and technologies.
This is just the beginning of a long evolution, folks.
On the third anniversary of its Elastic Compute Cloud launch, Amazon Web Services late Tuesday announced a new service, the Virtual Private Cloud.
Targeted at customers with existing IT investments, the Virtual Private Cloud (VPC) service provides a way for companies to create a logically separated set of Elastic Compute Cloud (EC2) instances and a secure VPN connection to their own networks.
Jeff Barr, Amazon Web Services strategist, said in a blog that the service requires three elements: a VPC instance, an IPSec VPN gateway, and a block of IP addresses provided by the customer. The VPC's address space can range from 16 addresses (known to network administrators as a /28 address range) to 16,384 addresses (a /18 address range), and the addresses can be divided up into subnets to further partition traffic.
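For readers who don't live in CIDR notation, the address counts above can be checked with Python's standard `ipaddress` module. This is only a sketch: the 10.0.0.0 blocks and the choice of /24 subnets are hypothetical examples of mine, not anything taken from Amazon's documentation.

```python
import ipaddress


def address_count(cidr: str) -> int:
    """Return how many addresses a CIDR block contains."""
    return ipaddress.ip_network(cidr).num_addresses


# Smallest and largest VPC ranges mentioned above (example blocks are hypothetical).
smallest = address_count("10.0.0.0/28")
largest = address_count("10.0.0.0/18")

# Dividing the largest range into subnets to partition traffic;
# /24 is an arbitrary illustrative subnet size.
vpc = ipaddress.ip_network("10.0.0.0/18")
subnets = list(vpc.subnets(new_prefix=24))
```

Here `smallest` and `largest` come out to the 16 and 16,384 addresses Barr describes, and the /18 block splits into 64 subnets of 256 addresses each.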
All Internet-bound traffic is routed through the customer's network and outbound security systems before reaching the public network, Barr said.
Amazon.com Chief Technology Officer Werner Vogels described in a blog Amazon's vision for the service:
(CIOs) have bought into the cloud as a target for a significant portion of their services, as the benefits are too obvious to ignore, and most expect that their transition will be a continuous process. They would accelerate the adoption of cloud services if they could access a form of cloud that would give them the best of both worlds: the flexibility and cost-effectiveness of accessing a virtually infinite pool of resources without owning it, while being able to integrate those resources into their existing datacenter environments such that they could continue to leverage existing investments in their management and control infrastructure...
We have developed Amazon Virtual Private Cloud (Amazon VPC) to allow our customers to seamlessly extend their IT infrastructure into the cloud while maintaining the levels of isolation required for their enterprise management tools to do their work.
Not all Amazon Web Services capabilities are supported in Amazon VPC at the start, such as Amazon EC2 security groups, DevPay AMIs, and Internet-facing IP addresses. The VPN service has been tested with equipment from Cisco Systems and Juniper Networks.
VPC pricing is based on a $0.05 hourly charge for VPN access, plus a cost for data transfer into and out of the connection, ranging from $0.10/GB to $0.17/GB. Charges for other Amazon Web Services, including Amazon EC2, are billed separately at Amazon's standard rates.
In conjunction with the well-attended Interop and Enterprise Cloud Summit conferences in Las Vegas this week, cloud infrastructure and service vendor 3TERA announced the 3TERA AppStore, an online portal containing a variety of "cloud ready" components for use on their AppLogic platform. This is the latest commercialization of cloud image stores, and another example of how cloud computing enables marketplaces that were difficult or impossible to build before.
One of the earliest examples of this trend comes from none other than Amazon, which provided a commercial payment system (called DevPay) for their Amazon Machine Image store some time ago. What has come of that experiment is in fact an amazing adoption rate, even among some of the biggest software system companies in the business, including Oracle and IBM.
Others are working to catch up in this space as well. I know that at least one of the major IaaS providers and one of the cloud management vendors are working to make a play in this space. Getting people to buy pre-configured, pre-architected enterprise software to run in various cloud platforms is going to be an integral part of the cloud experience, and probably a very profitable one.
The 3TERA offering focuses, of course, on their platform, but AppLogic is one of a few platforms that take a true "virtual data center" approach to the problem. Sun's new cloud is another. My gut tells me that these guys have a bit of an advantage when it comes to "packaging" enterprise apps for the clouds, as they can easily include network and storage with the server architecture in one SKU.
Initial AppStore partners include CohesiveFT, Layer 7 Technologies, SOASTA, Tap In Systems, and Zeus Technology.
The debate about the validity of internal cloud implementations has raged on for some time now, with some claiming that cloud computing and wholly owned infrastructure don't mix, and others pointing out that applying "on demand," "at scale," and "multitenant" to enterprise IT data centers offers unique advantages to those who have already made that investment. It has been difficult, however, to do an objective comparison of the two approaches--until now.
The announcement on Thursday of Amazon's new Hadoop-based Elastic MapReduce service, combined with the introduction of a commercial Hadoop distribution from start-up Cloudera, means that we finally have a reasonable means of watching which directions enterprise IT prefers. Let me explain.
Amazon's service is a simplified, prepackaged Hadoop implementation that can be leveraged by anyone with an Amazon account. The Amazon Web Services blog describes it as follows:
Today we are rolling out Amazon Elastic MapReduce. Using Elastic MapReduce, you can create, run, monitor, and control Hadoop jobs with point-and-click ease.
You don't have to go out and buy scads of hardware. You don't have to rack it, network it, or administer it. You don't have to worry about running out of resources or sharing them with other members of your organization. You don't have to monitor it, tune it, or spend time upgrading the system or application software on it.
You can run world-scale jobs anytime you would like, while remaining focused on your results. Note that I said jobs (plural), not job. Subject to the number of EC2 (Elastic Compute Cloud) instances you are allowed to run, you can start up any number of MapReduce jobs in parallel. You can always request an additional allocation of EC2 instances here.
Processing in Elastic MapReduce is centered around the concept of a Job Flow. Each Job Flow can contain one or more steps. Each step inhales a bunch of data from Amazon S3, distributes it to a specified number of EC2 instances running Hadoop (spinning up the instances if necessary), does all of the work, and then writes the results back to S3.
Each step must reference application-specific "mapper" and/or "reducer" code (Java JARs or scripting code for use via the Streaming model). We've also included the Aggregate Package with built-in support for a number of common operations such as Sum, Min, Max, Histogram, and Count. You can get a lot done before you even start to write code!
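If you have never seen "mapper" and "reducer" code, here is a minimal word-count sketch in Python in the spirit of the Streaming model mentioned above. It is my own illustration, not Amazon's sample code: the function names are hypothetical, and `run_locally` only imitates on one machine the sort-and-group step that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby
from typing import Dict, Iterable, Iterator, Tuple


def mapper(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(sorted_pairs: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Sum the counts for each word; expects pairs already sorted by word."""
    for word, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def run_locally(lines: Iterable[str]) -> Dict[str, int]:
    """Stand-in for the framework: map, sort by key, then reduce."""
    return dict(reducer(sorted(mapper(lines))))
```

In a real job flow, the mapper and reducer would read from and write to the streams Hadoop hands them, with the input and output living in S3 as described in the quote.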
Cloudera, on the other hand, provides a Hadoop build that you can deploy wherever you wish:
Cloudera's Distribution for Hadoop is based on the most recent stable version of Apache Hadoop. It includes some useful patches back-ported from future releases, as well as improvements we have developed for our support customers.
Cloudera's Distribution includes everything you need to configure and deploy Hadoop using standard Linux system administration tools.
Here's what I'm thinking: enterprise IT is looking at an entirely new class of applications that take advantage of MapReduce to process very large sets of both structured and unstructured data for things like predictive analysis, sorting/sequencing, and data mining. Both commercial Hadoop offerings meet the demand for a platform to simplify the development and operation of these applications. The primary difference is the where, not so much the what.
That is exactly what will make the competition between the two offerings so compelling to watch. Let me break it down for you:
Will the requirement to own and operate hardware work against Cloudera? What makes the Amazon offering so groundbreaking (and it will prove to be historic, in my opinion) is that it is now possible for anyone with a need to analyze large data sets to do so simply for the cost of data storage plus processing time. (Note that the use of Elastic MapReduce adds a nominal cost to the EC2 instances that run the jobs.)
Where "grid computing" was once the playground of large enterprises and academic institutions that could justify the cost of building out the hardware, Amazon makes it possible for even individuals to run such jobs for a few tens or hundreds of dollars.
Cloudera, on the other hand, requires that the hardware be available to install it on. That either means existing server capacity, new hardware (which greatly adds to the cost, and can only be justified for continuous Hadoop use), or leased capacity. The latter starts to look a lot like Amazon's service.
Will Amazon's requirements to use S3 work against it? There are three reasons why I see it might:
- The commonly cited concern about data security outside of corporate firewalls. (Even if the perception is wrong, the perception exists.)
- The cost of data transfer to and from the S3 service--currently as high as 17 cents per gigabyte.
- The cost of storage of both the raw data and the aggregate results--currently as high as 15 cents per gigabyte a month.
It should be rightly noted that if you already rely on S3 to store your data sets to be processed, this is a great deal. However, if you have to upload terabytes or even petabytes of data to be combed through by MapReduce, then this could get quite pricey on its own, and existing infrastructure might serve the purpose well. If you are going to leave the data up there permanently--and update it regularly--the cost of Amazon's service should be weighed against the cost of owning and operating that storage yourself in your existing facilities.
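To put rough numbers on that, here is a back-of-the-envelope sketch in Python using the upper-bound rates cited above. It assumes a single upload, charges everything at the top rate, and ignores the size of the results and any volume discounts, so treat it as a ceiling rather than a quote; the function name is hypothetical.

```python
TRANSFER_USD_PER_GB = 0.17       # upper-bound transfer rate cited above
STORAGE_USD_PER_GB_MONTH = 0.15  # upper-bound storage rate cited above


def s3_cost_estimate(dataset_gb: float, months_stored: int) -> dict:
    """Estimate one-time upload cost plus storage cost, in USD (assumptions above)."""
    transfer = dataset_gb * TRANSFER_USD_PER_GB
    storage = dataset_gb * STORAGE_USD_PER_GB_MONTH * months_stored
    return {"transfer": transfer, "storage": storage, "total": transfer + storage}


# Roughly 10 terabytes kept in S3 for a year.
estimate = s3_cost_estimate(dataset_gb=10_000, months_stored=12)
```

For that example the upload works out to about $1,700 and a year of storage to about $18,000, which is why the comparison against storage you already own matters once the data set stays put.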
Will the so-called "barrier of exit" stand up? I'm not even arguing that the choice will be based solely on the comparative costs to the business. In fact, what I am interested in is the extent to which business units and departments will simply bypass IT altogether to build and run their own jobs in Amazon Elastic MapReduce.
If IT maintains a valuable service using existing facilities and computing investments, then Cloudera will likely do fine. If not, then Amazon stands to be the overwhelmingly dominant commercial Hadoop implementation.
I should also note that running a Hadoop instance is not the same thing as cloud computing in and of itself. An internal Cloudera implementation is not necessarily an internal cloud, though if operated "on demand," "at scale," and with multitenancy, it certainly qualifies as a cloud.
I will be watching this space closely for the next year or two. I have a feeling that Amazon will do fine, regardless, as there are many possible implementations that would benefit from a completely public cloud implementation. The real test is probably how much opportunity Cloudera finds within enterprise data centers.
Cloudera also has much more competition from the free downloads of Hadoop than Amazon has, in my opinion, as it faces a more traditional open-source competitive landscape.
Is your company looking at MapReduce for a new generation of data-mining applications? If so, what will you choose: the public, external cloud implementation of Hadoop from Amazon Web Services, or the wholly owned, internal implementation of the same from Cloudera?
Updated to include links to Opencloudmanifesto.org.
As widely discussed since Wednesday night's leak of its existence, the Open Cloud Manifesto--originally authored by IBM--has been released for public consumption.
This had been a difficult weekend for the document, first outed by Microsoft's Steven Martin and then leaked in its entirety by my Overcast co-host, Geva Perry, the next day.
The discussion of the document has been muted, in part because the document is not a standards declaration or contract attached to any action or entity. Instead, it serves as a simple statement of principles that almost any cloud participant would agree with--at least publicly. However, the process in which it was brought into existence has been debated ferociously and may signify a changing of the guard in the standards world.
What is perhaps more interesting, however, is the list of signatories to the document. The list below is official as of Monday morning, according to my contact at IBM:
IBM
Sun Microsystems
VMware
AT&T
Telefonica
Cisco Systems
EMC
SAP
Advanced Micro Devices
Elastra
rPath
Juniper Networks
Red Hat
Hyperic
Akamai
Novell
Sogeti
Rackspace
RightScale
GoGrid
Aptana
CastIron
EngineYard
Eclipse
SOASTA
F5
LongJump
NC State
Enomaly
Nirvanix
OMG
Computer Science Corp.
Boomi
Reservoir
Appistry
Heroku
Note that the "big four" of cloud computing, Amazon.com, Microsoft, Google and Salesforce.com, are not signatories. However, several major players are on it, including my employer, Cisco--as well as EMC, Sun, VMware, and a host of key start-ups and established vendors throughout the industry.
There is a Cloud Computing Interoperability Forum meeting scheduled for Monday night, in conjunction with Cloud Expo in New York City, at which many, if not all, of the signatories, along with several companies that refused to sign (including Microsoft), will gather to talk about the future of cloud standards.
This could either be a historic meeting--or the final nail in the Manifestogate coffin.
The document itself is available on Scribd, or as a PDF from the official Opencloudmanifesto.org site or Perry's Thinking Out Cloud blog.
Cloud computing is the first major IT market disruption that has taken place in the world of open source software, "the wisdom of crowds" and the community collaboration revolution of Web 2.0. The concept of the cloud is trying to grow and evolve in an atmosphere in which technologists expect input on the technology they are being asked to rely on, and IT management expects input on the strategies they are being asked to adopt.
Never has that fact been more evident than in the events that have taken place over the last two days. The leaking of the Open Cloud Manifesto is a life lesson in the way that things will never be the same again.
To recap, the buzz began Wednesday night when Microsoft's Steve Martin intentionally leaked the existence of a diatribe created originally by IBM--an Open Cloud Manifesto. The industry proclamation is being supported by a laundry list of cloud service providers and members of the Cloud Computing Interoperability Forum. You can read the document on my Overcast co-host Geva Perry's Thinking Out Cloud blog.
Since that leak, there has been a steady flow of news, retorts and excited commentary. Remember, the manifesto hasn't even been officially announced yet (look for that news to break on Monday morning)--so everything you've read so far has been pretty much who isn't participating and why.
Let me disclose right now that I was not involved in the creation of the document, nor in planning for its release, but I have been fully briefed through my employer, Cisco Systems, and the CCIF and have read the document. I planned to post my thoughts along with the others on Monday morning, and I'll still cover it in some depth at that time. For now, though, I just want to explore what I learned the last two days. (Just a quick reminder that the opinions expressed here are entirely my own, and not my employer's.)
It's an opinion piece, not a standards proposal.
As several people have noted, this is a big deal about something that doesn't set anything in stone, either technically or legally.
Those who have publicly stated that they won't sign have the most to lose.
Microsoft and Amazon are the two cloud powerhouses that have publicly declared they will not sign the document at this time. Amazon has a huge existing install base that most other IaaS providers would like a piece of, and Microsoft is trying to hold on to an exceedingly large customer base of its own. Why should either agree to work on top-down standards to threaten that?
It's probably a bad idea to release even an industry opinion piece without public commentary.
IBM, et al, left the door open for Microsoft to label the entire effort as "closed" by trying to rush to a declaration of success without allowing any public community or industry input whatsoever. Big mistake, in my opinion, because open source software has changed the game forever for technical initiatives.
If the drivers of this initiative had simply announced that the Manifesto draft was agreed to by the same list of companies, but was open for public commentary before being finalized, the Microsoft post would have looked silly. In fact, there is still time to declare exactly that.
It's what follows that is important here.
The most important quote from the day, for me, is the following from one of the CNET reports:
That said, Martin said Microsoft would like to be a part of the dialogue. He noted that the company was subsequently invited to a meeting of some cloud-computing participants to take place on Monday as part of a cloud-computing conference.
"We have accepted that invitation and we will participate," Martin said. "If there is meaningful dialogue, it is something we will want to play a role in. Hopefully we will use that as a chance to restart that conversation."
The productiveness of that meeting (and, I'm guessing, the civility) will say a lot about what will come of the manifesto. It's great that a large number of companies have (apparently) signed on to express their commitment to open cloud environments, but the actual actions initiated at that meeting--including organization, financial/people commitment, etc.--will go a long way to establishing what they can accomplish.
That being said, let me also note that I'm not convinced that a top-down formal standards approach will do anything other than repeat the mixed success of the WS-* efforts to date. Amazon's EC2 and S3 APIs are already de facto standards (see EUCALYPTUS and Sun's Cloud Compute Service), and Sun and GoGrid have also opened up their APIs in the hope they take some or all of the management standards pie. Already, businesses are out there figuring out some basic interoperability between cloud providers that matter to them: RightScale and their competitors are attacking server image portability in interesting ways, and Salesforce.com has full integration from Force.com to Amazon AWS and Facebook.
So, in the end, this declaration is a good thing in that it shows that the industry has learned that open is good. However, in the end it might not do much more than that, and we might have all gotten into a tizzy over yet another expression of what could be in cloud computing.
There were two very interesting pieces of news to come out in the last week related to the availability of relational databases in the cloud. One involved a start-up you have almost certainly never heard of, and the other involves a major player in on-premise database products.
The first was an announcement to the crowd at "Whose Cloud is It Anyway?"--a "roundtable and meet-up" sponsored by TechCrunch, held Friday on Microsoft's Mountain View, Calif., campus.
(Charles Cooper has more on the "roundtable" portion of the program. My favorite part of the afternoon was the fun comment by Salesforce.com CEO Marc Benioff; he noted the irony of hosting a cloud-computing meeting at the facilities of the vendor most disrupted by the trend.)
During the "pitch" section of the afternoon, Justin Santa Barbara of start-up FathomDB announced that the company has released to beta testing a sort of virtual managed hosting service for "standard relational databases" running on Amazon.com's Elastic Compute Cloud, or EC2, service. (There is a video of the afternoon's pitches; FathomDB starts at about 49:30.)
The start-up's current service simply allows someone to get a basic relational database management system, or RDBMS, instance (initially MySQL) up and running in minutes under its management, with services including creation, monitoring, and backup.
On their blog today, Rackspace's cloud division, Mosso, shows off a study in which they compared the costs and performance of Amazon Web Services' S3 storage service and CloudFront Content Delivery Network (CDN) against Mosso's combination of CloudFiles and their partnership with CDN provider Limelight Networks. The blog post presents five common use cases, and compares the cost of CloudFiles/Limelight with the Amazon offerings, both with and without Amazon's support option.
I spent some time on the phone yesterday with Mosso co-founder, Jonathan Bryce, and Senior Cloud Architect for Rackspace's cloud division, Erik Carlin, discussing what they found. The short-short version is that, for the five use cases they analyzed, they claim (not surprisingly) that Mosso beats Amazon's offerings in simplicity, cost and performance, especially when support is taken into account.
The original title of this post was going to be "Why isn't Google App Engine successful?" You see, I've been frustrated of late at the lack of follow-up press about the PaaS offering since Google's announcement about it last April. I was beginning to think that no one but a few Facebook application providers were using it, which makes it kind of irrelevant for the enterprise.
Compare Google's coverage to that of Amazon Web Services. Since its announcement in July 2002, the various services contained under the AWS umbrella have received a steady stream of press and accolades. Much of that is due to marketing (and the phenomenal technology evangelism program Amazon put into place), but part of it is also that successful start-ups are passing on their own success stories independent of Amazon.
Two quick examples of this are SmugMug and Animoto. Both are stories that were originally broadcast by the customers themselves, and then evangelized by Amazon. Almost everyone in the "cloud-o-sphere" knows about these guys as a result. In fact, Animoto's story is the most prevalent case study of the value of elasticity in Web applications today.
So, where is the Google equivalent? I've heard about a few Facebook widgets being developed on App Engine (and that is sort of cool), but I certainly haven't heard any other type of start-up trumpet the importance of App Engine to their success. Furthermore, there are zero examples of non-Web businesses using App Engine to change the nature of their IT processes. (See Eli Lilly's story for an AWS counterpoint.)
So, all of this might lead you to believe I'm anti-App Engine, or at least not confident that it is important except as a PaaS example. And until yesterday, you would be right. However, I spent the day yesterday at the Cloud Connect conference, hosted at the Computer History Museum in Mountain View, Calif. Google was much more visible here (in part because they were a "platinum sponsor"), and perhaps more importantly, the "how to" sessions they hosted Wednesday afternoon were packed by interested developers and technologists.