Hybrid Cloud With Nimbula and Amazon Web Services

 

Executive Summary

Hybrid cloud and federation to Amazon’s AWS has long been a feature of Nimbula Director and a core part of Nimbula’s strategy.  Nimbula Director specifically provides a single interface for self-service access to public and private resources that unifies workflows, permissions, audit, all the while protecting corporate assets residing in public cloud from misuse by those who no longer require access to them.  The recent news and discussion on API compatibility is interesting to be sure, but practically speaking, is much less important than the actual hybrid cloud functionality provided by IaaS platforms.

Introduction

The recent announcement between Eucalyptus and Amazon about their API compatibility has confirmed the adoption of hybrid clouds. But does API compatibility really provide enterprise ready hybrid cloud on its own?

Having a single API is helpful to anyone integrating software on top of private and public cloud – although supporting a few REST APIs in a modular fashion is not terribly challenging for most programmers.  But what does API compatibility do in terms of helping enterprises manage multiple clouds end to end – the authorization model, audit capabilities, credential management, etc…?

This post describes Nimbula’s view on what is required for enterprise hybrid cloud and how we meet these needs with Nimbula Director and have been doing so since we released version 1.0 in March of 2011.

What Is Needed In A Cloud API For The Enterprise

An obvious question is why does Nimbula have a different API from AWS in the first place?   Isn’t AWS the de facto standard in IaaS APIs?  The answer is that Nimbula chooses to not constrain its capabilities based on the feature set of a single cloud provider.  Nimbula’s API exposes our differentiated functionality – functionality that we believe is required for adoption of cloud throughout an enterprise. Some of this functionality includes:

  • An enterprise identity and authorization system
    Nimbula Director offers a multi-tenancy model where each tenant has an administrator that can manage the tenant’s users and groups.   Groups are hierarchical collections of users defined so that permissions need not be assigned on a user by user basis.  Each action on each object can be delegated or not to any user or group inside or outside the tenant as collaboration requires.  Furthermore, the system is flexible enough to allow multiple layers of delegation (e.g. from cloud admin to tenant admin to team lead to end user). This provides security and flexibility in a way that closely matches enterprise needs.
  • Quality of service tagging of equipment and workloads
    Nimbula Director allows tagging of compute and storage equipment based on its capabilities.  For example, cloud administrators can tag some servers as “>2GHz CPU” or some storage as “DR protected”.  These tags can then be made accessible to various tenants, users, and groups via the permissions system.   In this way, tenants of the cloud can decide who can use what type of equipment based on the known needs and value of the work of the end users.   This keeps down costs as less critical work can be pushed to less capable equipment without sacrificing performance for critical work which can be directed to more capable equipment.
  • Quotas & accounting
    Users and groups can be given permission to charge their work against one or more accounts and draw against tenant, personal, or project quotas depending on what they are working on with each provisioning activity.  The separation of permissions, billing, and budgeting and the management of all three at a fine grained level allows for real enterprise management of cloud activity and resources.  It is also crucial for making self-service IT work within the bounds of enterprise accounting practices.
  • Flexible networking model
    While Nimbula Director offers a flat layer 3 network option with an extremely robust set of network services like self-service firewall, NAT, and DNS, enterprise and government customers sometimes need layer 2 networking as well.  Nimbula Director allows the allocation and deallocation of layer 2 networks to tenants via the permissions system.  End users can use one or more of these layer 2 networks with their instances as needed.
  • Cross-cloud orchestration
    Nimbula Director offers the ability to define, launch, monitor, and heal an application that spans multiple sites and clouds.  This includes management of instances, storage volumes, persistent IP’s, and network security rules between the various locations.

Hybrid cloud requirements

Given a private cloud with one API and one set of capabilities and a public cloud with another, how do customers rationalize the two to get a seamless experience across both while getting the proper benefits from each?  At Nimbula, we believe there are four key requirements, all of which are implemented by Nimbula Director.

The basic implementation mechanism for all four requirements is the layering of our enterprise cloud API and our authorization and permission system over the public cloud API and proxying commands from our API to our federation engine, to the public cloud.

Keeping public cloud credentials from end users

The end users of an enterprise hybrid cloud should never have access to the credentials of the public cloud.  When an end user leaves their employer, or simply switches to a team that is not authorized for public cloud, they should no longer be able to access the enterprises’ public cloud resources.

Nimbula Director manages this by giving the enterprise cloud admin the ability to setup the credentials to the public cloud and determine which tenants, users, and groups are allowed to indirectly leverage those credentials.  When end users use the public cloud, they do so by proxying all requests through the Nimbula Director API and federation engine.  In this way, they do not own the public cloud resources, the enterprise does.  The end users use only the resources allowed to them by Nimbula Director’s enterprise grade permissions system and only for as long as the business allows.

Single security model

In an enterprise hybrid cloud model, there is a need to restrict which cloud users can use the public cloud and which cannot.  The cloud and tenant administrators need to be able to restrict access to specific users and groups, adding them when they are on projects requiring public resources and removing them when that access is no longer required.  This security model must be unified across public and private resources to keep the process simple and secure.

Nimbula Director’s authorization system overlays on top of public cloud sites and objects so that end users are granted access to public and private resources in exactly the same way.  Administrators can use Nimbula Director’s permissions system to achieve the benefits of its granularity, flexibility, and selective collaboration for public and private cloud resources.

Single audit trail

In enterprise hybrid cloud, there should be one way across public and private sites to see who has permissions on what and who granted those permissions, who has provisioned which resources, and what calls have been made with what results.

Nimbula Director, by virtue of proxying all public and private commands through its API is able to provide this single point of audit to enterprise customers.

Single set of end user workflows

End users should have a single way to access cloud resources, be they public or private.  They should be able to use the same workflows for maintaining and managing images, launching and deprovisioning instances, and managing public IPs and storage resources across providers.

Nimbula Director, by virtue of proxying all public and private commands through its API is able to provide this single set of workflows to enterprise end users.  It is only a matter of changing a single flag in a command to direct the work to a different site or provider.

Conclusion

Nimbula believes that we provide the most complete hybrid cloud story of any IaaS platform provider.   Our more enterprise focused functionality and API along with our ability to proxy those capabilities over public clouds like AWS make for a complete enterprise hybrid cloud story that can be taken advantage of today.  API compatibility is nice and makes for great press releases, but it’s all about the functionality and whether your cloud has it or not – APIs can be rationalized with simple shims, but platform functionality does not materialize without support from the underlying platform.  When mapping out your hybrid cloud strategy, make sure to look for what will support your enterprise business needs in the long term rather than focusing on short term convenience.

Implementation Matters

At the recent Cloud Connect event in Santa Clara, a new theme for debate emerged: Open. From the tone of the conversations, it feels that “Cloud” has been talked about enough and the new buzzword for conversations is “Open”. While it is certainly a good point to discuss, I want to reflect on what seems to matter most to our customers: software that works out of the box as advertised, implementation and delivery of the promised features! We want to provide our customers with an Amazon EC2-like infrastructure cloud and believe we are delivering on it!

We have worked with a number of customers over the past few months and as they have purchased and deployed our software, here are the points they mentioned as important to them.

  • Easy infrastructure cloud deployment: visit the nimbula.com website, download software, load it and it works! No need to speak with someone in sales or go through a gatekeeper. Some of our customers came out to us when they were ready to place their order and only had a few questions left. They ran the evaluation and bake-off against other platforms on their own, referring to our extensive technical documentation! Who else gives you such an easy and direct access with a complete high quality set of documentation?
  • Complete end-to-end automation, right from installation to deployment: Nimbula Director’s API has enabled our customers to automate a great number of things and gain precious time. We decided early on to put more effort in a complete API than a shiny UI. Our customers have valued that choice and it has been great hearing from them: one of them, using their Nimbula Director private cloud for development and testing of new applications claims close to 90% time savings for one test suite alone.
  • Solid security model: Nimbula Director delivers a security model that allows for very secure collaboration between tenants. A recent project in the Government space with a very security conscious customer helped us validate our model and demonstrate our claims.
  • No hardware shopping list: Nimbula Director works on commodity hardware and you do not need to purchase specific equipment to get a good experience and be able to deploy it. We harden the operating system and hypervisor (shipped as part of our software) to bring out the best possible performance and security on any hardware platform. As a matter of fact, one of our customers shared the hardware they had bought and I could not recognize any of the names: pure commodity white box vendors.

Now back to the debate: Open is clearly important and I believe that means more than just Open Source. Open is also about platforms which have openness to extensibility and manageability. It is about giving a choice to the customer: use our software as long as we provide you with a great product and customer service. If you are unsatisfied or think another platform is better, you should be free to migrate and not be locked into our software. I do not think the debate is about open source or not. It is about value to customers in the long run.

But let’s not dwell on it because while debates are good and generate noise and buzz, there is much work to do. We want to keep our head down and focus on what we do best: innovate and deliver a great software experience that works out of the box. We are focused on strong core guiding principles for our development, our co-founder speaks best of them.

So don’t take my word on our software quality, give it a spin!

Choosing A Cloud Software Partner

This is the third in a series of four articles discussing infrastructure as a service (IaaS) clouds. The series started at basic level setting and we will now begin diving progressively deeper. The topics for the series are:

1. Cloud 101
    – What is cloud
    – What value should cloud provide
    – Public, private, and hybrid cloud
    – Starting on a cloud project
2. Application taxonomy, what belongs in the cloud, and why
3. What you should look for in cloud infrastructure software
4. Evaluating different approaches to cloud infrastructure software

The concepts section describes architectural design points you should ask vendors about to make sure that the they are thinking like a true cloud provider and not simply cloud-washing older technology to try to be relevant in a new world. Keep in mind that these concepts are about infrastructure management in general, not just compute. You should think about the storage, network, and power aspects of your cloud in the same way.

The specific functionality section lists cloud features to check for.  If too many of these are missing, the cloud value proposition will not be delivered.

Core Concepts and Philosophy

Scale

In a cloud, scale is the key to long-term success. The number of nodes and instances, simultaneous connections to the management system, the networking and security features, etc. all need to scale. For each and every exciting and valuable feature a cloud vendor touts, you need to ask, “Can I have tens of thousands of those? What is the experience at that scale? When, if ever, does the scale impact the end user and how they do their work?”

While one can deploy a small-scale cloud, if a cloud is successful, it will become a single pool of capacity for an entire organization or even multiple organizations. In fact, the larger clouds scale, the more cost savings and value they generate since you start to see the benefits of  “the law of large numbers”. If you do not build for scale from the very beginning, you will hit a wall and need to create separately managed clouds. This will force end users to decide what workloads go on what clouds, thus the frictionless self-service model is broken. Furthermore, capex benefits will be lost as you are forced to overprovision each cloud fragment rather than benefiting from a single pool hosting many applications with offsetting resource consumption curves.

For example, in a non-cloud deployment, the datacenter management system is used by datacenter admins only. If it can only deal with tens of simultaneous connections and is limited to one or two nodes, there is no problem since the administrative team is relatively small. However, in a cloud, since a large number of end users drive the management system directly via self-service workflows, the management system requires a whole new level of scale.

Automation

Automation is the key for allowing end users to do their own work and also for lowering datacenter operation costs. Make sure there is a proper degree of automation for both application lifecycle operations and infrastructure operations.

The core principle for end user operations is that no end user task should ever trigger work on the datacenter administrator side, not even a single mouse click approval. There is still a high degree of control and protection required, but these controls must be implemented as up front policies where the right groups of people delegate the right privileges to the right consumers. Furthermore, there need to be audit trails so that one can show that the policies constrained people to the proper activities. However, none of this changes the fact that manual approval processes by central admins on regular daily end user operations cannot work in a cloud model.

The core principle for datacenter operations is that the cloud should be self-discovering, self-organizing, self-monitoring, and self-healing. Anyone that sells you a complete zero-touch datacenter today is certainly exaggerating, but you should check the features they have to make sure this philosophy is followed where technically feasible. Where manual intervention is needed, make sure that this intervention is required only for infrequent up front tasks and never for frequent operations that happen on a frequent basis.

As an example, initial cloud configuration and network setup may be items that require significant up front planning and hours to days of setup, but regularly growing the cloud deployment by adding nodes must require no more time and effort than it takes to rack the systems and plug them in.

Identity, Permissions and Delegation

Clouds need to understand who each user is, what groups they belong to and what customer or tenant their work is billed to. Each operation on each object needs to check the identity of the actor against the permissions system to make sure that the operation is allowed. Delegation then needs to be possible – from cloud admin to customer admin to end users and groups, and possibly between separate end users and groups. Without a strong concept of identity, permissions and delegation, your cloud will only scale to a single tenant and will never fully interoperate well with other clouds, thereby limiting the long-term benefit you derive from the system. Like scale and automation, this is a core design choice.

If cloud vendors do not have proper permissions systems for their objects or are lacking a way to delegate permissions through multiple levels, they are not thinking like a cloud vendor. The result will be trouble down the road as end users wind up having to place tickets to acquire permissions driving a heavyweight approval process where the owner of the resource and the end user’s management team need to be consulted.

Openness and Choice

Openness and choice mean that you have:

  • Independence at each layer: Your different cloud components are not locked in from end to end. A choice at one layer does not dictate a choice at another unrelated area.

    • Your choice of end-user self-service workflow management should never dictate your hypervisor or other core infrastructure component.
    • Equally importantly, your private cloud software should never dictate the choice of public clouds to which can federate. Your end-user provisioning interface should work on your private cloud infrastructure, any public cloud using the same cloud software, and even any public cloud that uses competing or homegrown cloud software. Having to present different interfaces to your end users for clouds using different cloud infrastructure components is not open.
  • Complete and open APIs: Your cloud vendor should have extensive APIs. At the very least the APIs should cover everything provided in the UI. This will allow customized workflows at both the infrastructure level and the end-user level.
  • Extensible components: Your cloud vendor should use open and extensible components where possible.  Open source, where anyone can insert code at any point, is the extreme example of this principle. In non-open source systems, there are ways to introduce more controlled, but still extremely flexible extensibility models.  For example, major components can be general purpose enough that customers can add in other ecosystem products readily, as with the Linux domain 0 model for hypervisors. Alternately, APIs can be made to be robust and complete enough so that most conceivable useful integrations are possible such as the case with Windows APIs. This makes a big difference as you try to augment your cloud with best of breed 3rd party cloud management products.
  • Standards: Your cloud vendor should take advantage of open standards where possible where those standards do not unduly constrain innovation.

Without openness and choice you risk vendor lock-in and the high cost that comes from not being able to have a meaningful option to replace an infrastructure component. Technology lock-in slows down the rate at which you get new features you request from your vendor. A limited ecosystem and an inability to augment your cloud with the latest and greatest offerings from companies both new and established or from the open source community further limits your ability to improve your cloud over time. Lastly, limited choice in public clouds to which you can federate may force you into a cloud with the wrong feature set or that is too expensive.

Key Functionality

When evaluating the base functionality described here, make sure to bring in the philosophies above for each. Make sure that every feature below is implemented with scale, automation, permissions and delegation, and openness in mind.

In the spirit of openness, it is key to recognize that it is not required, nor even desirable, for all the features here to come from the cloud software vendor. Cloud software vendors should be able to present you with an ecosystem that helps fulfill the requirements below.

Self-Service Developer and Deployment Workflows

  • This is the core of the concept of cloud. End users need a way to do the following on their own:
  • Manage their images, update, and version them.
  • Publish images to a selected community for use in deployment workflows.
  • Deploy images and configure the following runtime parameters:
    • The number of instances
    • The images used
    • The placement policy
    • The network connectivity
    • The application configuration
    • The resource allocation
    • The storage to mount
  • Scale applications up and down and retire them when their usefulness has ended.

Reliability and Scale of the Management System

With cloud, when we think about the management system, we’re not just talking about basic monitoring. We’re also talking about the whole datacenter control system – how workloads are deployed, managed, and retired. In a non-cloud datacenter, the management system is for datacenter admins only. If it can only deal with tens of simultaneous connections and is limited to one or two nodes creating a single point of failure compromising reliability, there is no problem since the administrative team is relatively small. However, in a true cloud, with end users driving the management system directly in self-service workflows, a whole new level of scale is required.

The management system of a cloud needs to be able to scale to handle thousands of simultaneous connections.  Furthermore, it can never be down. It needs to be self-monitoring and self-healing. When and if a management node is lost, the remaining management nodes need to continue operation, and the lost management node needs to be replaced from the remaining equipment in the cloud so that the degraded state is resolved. All of this needs to happen automatically without impact to the end user or intervention on the part of the administrator.

Multi-Tenancy and Networking

For clouds to be of use to anything more than the smallest organizations, robust and secure multi-tenancy separation is required. This involves a great deal of network functionality.

Some customers will require traditional separation at layer 2 like VLANs. Those networks need to be managed flexibly and securely.

  • The cloud needs to allow for the layer 2 network to be exposed to large sets of nodes without difficult configuration.
  • The cloud needs to make sure that there is a security system governing access to each layer 2 network so that only the right workloads from the right customers are placed on a given network.
  • The Layer-2 network should provide the equivalent of a broadcast domain and support all traffic types (unicast, muliticast and broadcast). It should also support IPv6 and any other layer 3 protocol, not just IPv4.

Other customers, who do not want to be limited to the scale and manageability of layer 2 networks and that do not need a broadcast domain may choose to have a more modern and flexible large flat cloud network with an integrated distributed firewall providing the isolation between customers’ workloads. This distributed firewall service needs:

  • To have a central configuration repository that ultimately informs the separation between workloads created by different customers as well as between workloads within individual customers.
  • To be configurable by the infrastructure administrator, the individual customer administrators, and even the end users who need to control access to their own work product within their organizations.
  • To have its configuration be independent of workloads and IPs – adding and removing workloads must not cause reconfiguration of the distributed firewall service.
  • To execute in a distributed manner that avoids network bottlenecks.
  • To be independent of server, building, network vendor, network topology, and even geography.

Storage Management

Like compute, storage needs to be aggregated into large pools for access by end users so that they are hidden from the details of the different storage devices and what objects are placed on what storage device.

Unlike compute, there are widely varying capabilities and prices for different storage devices, so some aggregation system is needed to create pools with different service levels where customers can decide the capabilities they need and are wiling to pay for to store their storage objects. This customer decision then drives automated pool selection and ultimately, device selection.

Like with all other cloud resources, the end-user created storage objects need to be created through self-service workflows without administrator interaction, but must also be governed by a robust permission and delegation system governing which storage can be used by which users.

The storage objects created by end-users in those pools need to be managed independently of the instances that mount and access them. This way, creating, updating or deleting workloads does not affect the core information the customer needs to preserve over time. Workloads that create data can be killed, redeployed from an updated template and reattached to storage without impact to the storage object itself. Also, storage objects should be able to be cloned or snapshotted for use by future instances or for rollback processes without any interaction with the running instance accessing it.

Billing and Chargeback

Core to the economic model of cloud is the ability to have end customers either pay for their usage or at the very least to understand their impact on datacenter costs. To that end, there needs to be complete metering APIs and a chargeback or showback system for the cloud.

Hands-Off Infrastructure Management

Management of the physical infrastructure should be as low touch as possible. This includes many aspects:

  • Installation of the nodes: Node installation becomes a frequent operation in a big, fast-growing, and/or mature cloud where parts need to be replaced regularly. Manually installing or configuring servers will be too expensive and error-prone in this world. The only proper experience is for the servers to be racked and connected, then powered on – and nothing else. The cloud needs to auto-discover the server, install it, and make it ready to accept workloads.
  • Intelligent workload placement: Workloads should be automatically, and without administrator involvement, placed such that they are:
    • Loosely packed enough that bottlenecks and performance problems are not generated as dealing with those problems reactively will be problematic at scale.
    • Tightly packed enough that hardware, power, and cooling are not wasted.
    • Strategically placed so that related workloads cohabitate for enhanced inter-workload communication and that redundant workloads are separated to eliminate single points of failure for the service.
    • Placed based on constraints such as the requirement to be on a node with GPUs or a node that is certified PCI compliant.
  • Capacity tracking: There needs to be cloud-wide tracking of resources so that the datacenter operators are aware of cloud capacity and when they need to acquire more hardware.
  • Isolating and retiring equipment: All systems should have a lifetime and a health status associated with them (due to length of maintenance contract, expected lifetime of component parts, and/or length of lease). When that lifetime is exceeded or when the part is failing or has failed, it is automatically isolated from the cloud and flagged for replacement. The cloud should be aware of datacenter layout so that administrators never have a problem locating the equipment at replacement time.
  • Managing planned and unplanned downtimes within the datacenter: If you generally deploy cloud-ready applications (see last article in this series), most datacenter events should be transparent to the end users of the services. Scale-out applications can be scaled up to repopulate lost instances and chunks of huge compute jobs can be automatically respun. However, downtimes associated with persistent data need to be managed as well as compute or network downtimes that affect any of your more monolithic applications. The datacenter should recover whatever it can on its own, and for what it can’t, end users need to be alerted to upcoming planned downtime or recent unplanned downtime and made capable of adjusting their workload deployments accordingly.

Federation Across Clouds

To provide a single interface for all end-users, a cloud must hide distinctions between different datacenters, geographies and providers. There should be one end-user experience for deploying to anywhere in the cloud – public, private, or hybrid. While end users, due to compliance reasons, may need to dictate placement policy in terms of location or provider, it should be a policy component of their work within a single experience – it is not acceptable for there to be a private cloud experience and completely separate public cloud experience.

It is critical that the choice of public clouds to federate to is not forced by the cloud software provider, that would be too limiting. By not allowing the customer to pick the best possible provider at the right cost, cloud deployment will become unnecessarily expensive.

Federation features need to be as follows:

  • From a single user interface, resources can be deployed and managed across multiple sites and providers.
  • Each site and provider can be allowed or denied on a per customer or per user/group basis.
  • When accessing public clouds that are tied to shared credential and billing information, like private keys and credit card numbers, the end user must have that information hidden from them. They don’t need to know it and they must not be able to take it with them when they change jobs or roles.
  • Identity is preserved across sites and providers so that:
    • Users are permitted access to resources at each site according to their specific permissions.
    • Bills from public cloud providers can be itemized by department, user, and project.
  • There needs to be a single audit trail showing who did what activity in what clouds.

When a cloud management system follows these rules, multiple sites within an organization can be managed as one. Furthermore, hybrid cloud becomes a reality with public cloud becoming a viable part of the IT toolkit, not a bootleg process hidden from the visibility of those most trained and responsible for keeping services safe and secure.

Conclusion

Enterprise IT departments and service providers have no shortage of choices today for cloud infrastructure software. But, for an organization doing a significant deployment, the list of requirements above can help separate the serious contenders from less mature or well thought through products.

When writing RFP’s don’t just fall back on the same enterprise management or virtualization platform requirements or you will wind up with the same old infrastructure. Start with your traditional requirements, but make sure to add in critical new requirements around what is really needed to take the next step and have a real cloud today!

What Is Best Suited For The Cloud

This is the second in a series of four articles discussing infrastructure as a service (IaaS) clouds. The series started at basic level setting and we will now begin diving progressively deeper. The topics for the series are:

1. Cloud 101
- What is cloud
- What value should cloud provide
- Public, private, and hybrid cloud
- Starting on a cloud project
2. Application taxonomy, what belongs in the cloud, and why
3. What you should look for in cloud infrastructure software
4. Evaluating different approaches to cloud infrastructure software

Cloud is obviously a serious transformation for your datacenter, but that transformation does not need to be far off or futuristic. If you pick the right applications owned by the right users with the right needs and if you partner with the right cloud software provider, cloud and its many benefits are achievable today.

When first deploying a cloud, it is critical to choose the right applications to move to the cloud initially. Given that cloud is about allowing business units to manage their own computing needs, it follows that the ideal place for cloud is where the business units:

  • Expect self-service.
  • Are tolerant of incorporating provisioning logic into their day-to-day work.
  • Have variable compute needs for the tasks that generate the need for frequent provisioning and de-provisioning activities.
  • Have multiple tasks that cause context switching in the customers’ work, with different tasks acquiring and yielding compute resources over time.
  • Have workloads that, even at maximum interaction, do not cause complex resource contention issues on shared resources so that as much or as little can be deployed as necessary with little if any forethought or calculation.

If you break down the types of workloads in a datacenter, you get three major types:

  • Traditional, monolithic, and stateful client/server apps – things like Exchange Servers and traditional databases
  • Scale-out load balanced apps with disposable stateless instances
  • Batch type computing jobs that can be decomposed into small chunks of compute and storage and distributed across a pool – things like Hadoop, Monte Carlo simulations, business analytics, and media processing and conversion

For each class of application, there are dev/test deployments and production deployments. This gives us a simple six-way classification of what runs in a datacenter that we can use to select our cloud candidates.

The following chart shows which workloads are good for early cloud adoption and what should be dealt with later on in the process. Beneath the chart is some explanation and justification for this assessment.

(Gray represents near-term opportunity and orange represents something that should be sent to the cloud later on)

Scale-out Load Balanced Apps with Disposable Instances

This type of application does not rely on any one instance of software being able to grow to use tremendous amounts of compute resource. Rather it assumes that each instance can only do so much and that additional power will be supplied by adding more instances. For this sort of application to work the data and other application state must be driven out of the application itself to another location.This allows instances to be created and destroyed with no data loss or outage to the end user. The worst problem caused by failure of an instance is the need for the end user to retry their operation. When the operation is retried, it finds a live instance and completes without difficulty.

Generally, the application requires greater or lesser instances over time as load grows and shrinks. This creates a need for dynamism in the deployment with low turn-around time on provisioning operations. As load grows, the application must respond in a short period of time. Either the application administrator or some auto-scaling management system needs to be able to make the required changes without a ticket to the infrastructure team. When load is high, new instances must be spawned to handle the increased demands. When load drops, instances should be deleted to reclaim compute resources for more useful activity. Because the instances are stateless, without sufficient load that needs to be serviced, no instance has an inherent reason to remain persistently deployed.

Scale-out applications are also a great fit with the datacenter operations models required to manage a massive cloud.  Cloud scale datacenters need to assume that any piece of equipment can break at any time with the application being resilient and able to hide the failure from the end user. Scale-out applications accomplish this by spreading many instances across nodes, datacenters, or even geographies. This makes it much simpler for the infrastructure to be managed – failures become a capacity issue which can be managed in aggregate on a periodic basis, not end-user outage issues that need to be addressed immediately and individually.

More generally, this type of application is the way of the future in general. Due to current systems architectures, we are seeing more and more cheaper cores distributed across many commodity nodes rather than the massive scaling up of individual servers. The only way to really get applications to scale is to build for many smaller instances teaming up together. Some programming infrastructures take this to a logical conclusion. Node.js, a language increasingly used for scale-out web applications, refuses to give a program access to more than one core regardless of how many cores are in a system. Should the developer need more power, they need to increase the number of node.js application instances.

Scale-out applications and cloud are such a good match because they have the same goals – elastically, scale up and down in response to load using only the resources actually needed, distributing applications across hosts and sites to better tolerate systems outages without end user impact and better conformation to modern system architecture.

Batch Applications

Batch applications are decomposable to smaller compute and storage packages. They are good in both test/dev and production for similar reasons as the scale-out applications. As jobs are launched, the number of instances depends on how fine grained you can chunk the job. Given different sizes of data sets for separate runs, the number of instances needed for each run will vary. Therefore, statically provisioning instances doesn’t make sense. Furthermore, there are likely to be completely different applications in such an environment that will need to run at different times. Clouds are perfect for repurposing an infrastructure, ramping up one application while ramping down another without having to do a massive retooling of the physical infrastructure. Like with the last case, the infrastructure administrators should not decide how many and when to deploy instances. Activity needs to be driven by the business that is actually running the jobs and deriving benefit from the output.

Traditional IT Applications

Traditional IT applications, vertically scaled, stateful and monolithic are not good for clouds in their production deployments. These applications tend to be custom built for a particular load in mind, deployed once, expected to have each instance stay running persistently, and are only touched for upgrade. These types of applications are not a natural fit for the cloud paradigm. They do not accommodate dynamic scaling by simply adding more instances. As a result there is a reduced need for end-user self-service.  Also, since they are monolithic, any performance or availability problem must be addressed in the one or two instances that make up the application with a deep understanding of the impact of the underlying infrastructure. As a result, the burden of management remains on the infrastructure administrator and cannot shift the application administrator. This breaks the operational model of cloud where the datacenter administrator needs to be removed from the details of the running of applications.

However, traditional IT applications are very well suited to cloud when the service is development, integration, and testing of the applications. For this function, application owners need to deploy new instances of server software, update them, make new templates, try out their work and iterate. Furthermore, mass numbers of clients will need to be deployed for load and scale testing. The development and testing process generates the dynamism in workload resource needs as well as the requirement for self-service on the part of the IT developer that make a cloud an ideal solution.

Pick the Right Applications and Move to Cloud Today!

Too many times cloud is pitched as an evolutionary technology. In many cases, this is because the vendor making this pitch is already managing a legacy application stack for the customer and sees no reason for a radical shift.

Since these legacy applications do not accommodate elasticity and do not tolerate the more unpredictable availability of any single server that the cloud datacenter operations model implies, true clouds are limited in the benefits they can provide and cause a loss of SLA that is unacceptable to the end users of the legacy applications.

Of course, legacy applications will not go away any time soon and we acknowledge that it takes tremendous time and effort to move to a new programming paradigm. But, the technology is here today and the benefits have been made obvious – scale, resiliency, efficiency. The success stories of companies like Netflix and Zynga are well known. All that is needed is the will to move in that direction.

For enterprises and service providers that leverage modern applications development process, cloud is not an evolution at all – cloud is the best and most obvious way forward for development, testing and mass deployment of their applications.

Pick your target applications and get started today!