The Cloud Ecosystem

Nimbula’s co-founder and VP of Products, Willem van Biljon, spoke at the recent Cloud Connect event in Santa Clara. Here are some of the points Willem made during his talk. The video of the full talk is available online at http://bcove.me/6fllnnzg

Building a proper cloud, whether it is a private or public cloud, is more than buying and implementing a product. It is a rather complex architecture with many interrelated pieces that need to be considered. Ultimately, it is about a whole bunch of things that need to work together.

So, what is involved to make this work?

  • Compute and Storage hardware
  • Networking infrastructure
  • A Cloud Operating System, something that will make all of the infrastructure accessible to the outside world. 
  • On top of that, the various services that people are going to need (PaaS, SaaS, etc.)
  • Alongside we need some management infrastructure, billing, external storage or compute resources, etc.

So overall, it is a pretty large ecosystem and many vendors and products come into play.

The Infrastructure as a Service (IaaS) layer provides the software that gives control of the hardware layer, just like a traditional operating system but across a large set of hardware. The issues we think are important are:

  • Scale: lessons learned from large scale matter at any scale. Large properties like Google, Amazon or Yahoo learned lessons that we can apply to all data centers
  • Automation: low cost implies low human touch
  • Resource management: who gets what
  • Permission / policy management: who can get what

If we look at the hypervisor, the first lesson is that the hypervisor is not the Cloud OS. It is an essential component, but not all of it. In particular, it does not provide resource management across multiple machines. The hypervisor market is rapidly maturing and one should not build applications or a cloud architecture that rely on a specific hypervisor. 

Large enterprises have shown that commodity hardware can lower costs. The magic is in the software, not the hardware: design the application for commodity hardware and you can dramatically lower costs.

In the network, as applications are no longer bound to specific servers, the topology no longer defines security. The network security now needs to be configured automatically and managed dynamically. 

How do I federate to other people’s cloud – whether private or public? There are a number of key challenges around the API, the identity that I need to present, the data that I need to move and the application environment in which the virtual machine will execute. Of all of these, identity is probably the main challenge to address. 

Billing is about getting money back for the resources that are consumed. It generally breaks down into three elements: first, you need to be able to properly measure and meter what is used; second, to assign proper rates to the various resource elements; and finally, to generate a bill. The important element is finding and assigning the appropriate rate for a given resource: that is where data is transformed into business value.
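
The three billing steps described above (meter, rate, bill) can be sketched roughly as follows. This is only an illustration, not any vendor's billing engine: the resource names, the rate values, and the `UsageRecord` and `generate_bill` helpers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UsageRecord:
    """One metered observation: who used how much of which resource."""
    tenant: str
    resource: str      # hypothetical resource names, e.g. "cpu_hours"
    quantity: Decimal


# Step 2: assign a rate to each resource element (illustrative values only).
RATES = {
    "cpu_hours": Decimal("0.10"),
    "storage_gb_month": Decimal("0.05"),
}


def generate_bill(records, rates):
    """Step 3: turn metered usage into a total charge per tenant."""
    totals = defaultdict(Decimal)  # Decimal() starts each tenant at 0
    for rec in records:
        totals[rec.tenant] += rec.quantity * rates[rec.resource]
    return dict(totals)


# Step 1: metered usage, as a metering system might report it (made-up data).
usage = [
    UsageRecord("marketing", "cpu_hours", Decimal("100")),
    UsageRecord("marketing", "storage_gb_month", Decimal("200")),
    UsageRecord("research", "cpu_hours", Decimal("40")),
]
bill = generate_bill(usage, RATES)
```

Keeping the rate table separate from the metering data reflects the point made above: the measurements are raw data, and the business decision lives in the rates.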

There is a massive amount of data on enterprise systems today and there is an equally massive opportunity to re-architect that storage to use cheaper systems. There is no simple, one-size-fits-all answer. The key is balance: figure out where you need today’s high-end enterprise storage and where you need the lower-cost, highly scalable newer storage systems.

So in conclusion, the cloud ecosystem has many components and many issues per component. We believe that one should start by focusing on the key issues per component and find the right answer for each part.

Taking Advantage of Public and Private Clouds Requires the Right Cloud Management Software

Cloud computing is just a few years old, but already has given rise to two separate approaches and architectures; one public, like Amazon’s Web services, the other private, usually inside a corporate data center. Computer users assigned to business units are attracted to the direct access and easy provisioning of the public cloud, since servers can be up and running in a few minutes. IT organizations, on the other hand, value the security and control they associate with private clouds, and worry about the proliferation of public cloud instances and its potential impact on corporate data and security policies. It’s a familiar tug-of-war.

Successful businesses have lately come to realize that both public and private clouds have advantages, and want to be able to use both of them when appropriate. Consider Intuit: the software company does the load testing for its online TurboTax program on servers at Amazon; because real customer data is not being used, there are no regulatory or privacy issues. However, once the software is made available to the public it runs on Intuit’s on-premises machines, as one would expect for information of such a sensitive nature.

Being able to move between public and private clouds in this manner requires the right kind of cloud management software, a true “Cloud Operating System” that doesn’t take a one-size-fits-all approach to cloud architecture. Instead, it must make use of, when appropriate, the growing number of cloud technologies the marketplace is accepting.

In a properly designed Cloud Operating System, an application runs in either the public or the private cloud depending on the application itself, in connection with company policies. These policies might involve, for example, the kinds of data the application uses, or the extent to which the application is mission-critical to the organization.

The actual placement of an individual application’s workload in either the public or private cloud should occur automatically and transparently to end users. Be they in IT or in business units, users should concern themselves only with choosing the proper policy for the workload. Cloud management software should then take over, determining where precisely in the public-private cloud ecosystem the program will run.
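
One way to picture policy-driven placement is as a small mapping from a workload's policy to a target cloud. The sketch below is an illustration under assumed names: the `Policy` fields, the `Target` values, and the rule inside `place` are hypothetical and do not describe the behavior of any particular product.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    PRIVATE = "private"
    PUBLIC = "public"


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy attributes a user chooses for a workload."""
    handles_sensitive_data: bool
    mission_critical: bool


def place(policy: Policy) -> Target:
    """Assumed rule: sensitive or mission-critical workloads stay on-premises."""
    if policy.handles_sensitive_data or policy.mission_critical:
        return Target.PRIVATE
    return Target.PUBLIC


# Load testing with synthetic data may run publicly; production data stays private.
load_test = Policy(handles_sensitive_data=False, mission_critical=False)
production = Policy(handles_sensitive_data=True, mission_critical=True)
print(place(load_test).value, place(production).value)
```

The user only picks the policy; the placement decision itself is made by the software, which is the separation of concerns described above.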

This means that, to be effective, Cloud Operating System software needs to shield users from the multitude of different command systems they currently need to master to move between public and private clouds. Instead the software must present a unified user experience, with the same authorization, access control and interfaces regardless of the workload’s final destination. Users can focus on their workload needs using credentials set up centrally by IT. That protects the enterprise from employees disclosing their credentials to others, or worse, taking them with them when they leave the organization.

A Cloud Operating System must also give users a painless way to move data and applications back and forth between public and private clouds. That’s a seemingly straightforward task, but one whose current complexity routinely leads to lengthy and unexpected delays in what IT workers had assumed was going to be a straightforward migration process.

So how might these hybrid public-private architectures play out in an enterprise? Traditional mission-critical ERP programs are less likely to migrate to new cloud infrastructures just yet. That’s because these programs have strict requirements for stability and fault tolerance and their data is subject to stringent regulatory and compliance regimes. In addition, the programs themselves do not require the constant changing and updating that can occur so easily in a cloud environment. ERP customers are much more concerned about keeping the programs running stably than they are with making daily adjustments to the underlying infrastructure. While mission-critical workloads won’t be the first ones that IT will move to cloud infrastructures, they will clearly be candidates for the private cloud in the second phase of cloud adoption.

By contrast, programs built on new generations of Web-based development environments, such as Ruby on Rails, are perfect candidates for internal clouds right away. Whether you are in a development and test environment or beginning work with a new Platform as a Service or Software as a Service offering, Cloud Operating System technologies will bring a new level of agility and flexibility to your organization. You can scale your infrastructure as fast as you can stack racks of hardware without having to bother with the lengthy server provisioning cycles once associated with IT deployment.

Of course, you can also use third party cloud resources like Amazon to complement your own infrastructure when doing so makes sense. Intuit used the cloud for testing; some companies move to the cloud to meet seasonal demands, or to run one of the many commercial SaaS offerings becoming available. Cloud management software can transform the public cloud from a rogue resource snuck in the back door by business units trying to circumvent IT into a viable business tool, properly integrated into an enterprise’s systems.

There are a few more things that IT managers need to be aware of when choosing cloud management software besides its ability to handle both public and private clouds. Has the software been designed from the ground up to deal with the complexities of today’s computing environments or are those features bolted-on as an afterthought to software initially designed simply to set up virtual machines? How much does it automate the time-consuming, repetitive manual tasks often associated with creating and configuring virtual machines? And can it scale up as effortlessly as modern IT operations are discovering they need to?

IT managers will need to deal with those issues, too, as they make a decision about cloud management software. But at the very least, they need to make sure that when they ask a cloud management vendor if they are public or private, the answer they hear back is “Yes.”

2011 Prediction – Clearer Skies Ahead as Vendors Deliver on the Promise of Cloud Computing

The word cloud was everywhere in the high tech industry in 2010. The incredible rise of Amazon’s public cloud offering and their success stories drew record interest from customers and technology providers alike. We saw everyone in the latter group start to “cloudify” their marketing. If you did not have a cloud strategy, you risked falling behind. The race to “cloud” created a lot of confusion and there were very few, if any, true cloud computing deployments beyond Amazon’s success stories.

As enterprise early adopters wanted to bring these benefits to their infrastructure, they started looking under the covers of the generally available private cloud offerings on the market, from startups as well as from the established virtualization and management leaders. They found that those vendors couldn’t deliver on the promise of cloud computing because their solutions were not designed and built for cloud requirements and scale.

In 2011, I believe we’ll see new innovative vendors deliver private cloud solutions built from the ground up to deliver cloud benefits to enterprise customers. This will help “clear” the skies, and what were just trends, ideas or initiatives this year will start becoming real and tangible in 2011. As a result, we’ll see an acceleration of cloud computing deployment and usage beyond the current Web 2.0 world in traditional enterprise and service provider data centers.

This will be a perfect opportunity for IT to turn the tables and become an engine of innovation again. Cloud computing technologies will help with the management of data center infrastructure, which has become one of the top challenges in the enterprise, and in turn allow IT to focus on delivering new applications to the line of business. While virtualization has already brought a lot of efficiency gains to IT, there are still a number of missing pieces needed in order for IT to be more agile and to be seen as a builder of competitive advantage rather than a cost center.

One of the first major steps in that direction will come from automation. Current virtualization implementations still require numerous manual steps and that is neither efficient nor scalable. Automation is the next logical step and eliminates human errors. Automation should start from the moment you start installing the infrastructure and allow it to scale up as fast as you can stack equipment. A few minutes of manual interaction per machine results in loss of efficiency and an increase in the potential for human errors as your infrastructure grows.

But automating the build up of infrastructure should only be the first step. As you manage the infrastructure and build applications on top of it, automation will keep gaining foothold so that tedious, error-prone but well understood processes can be achieved with the maximum efficiency possible.

And this efficiency will increasingly be achieved on commodity hardware and software. In 2006, Alessandro Perilli covered the launch of Amazon’s Xen-powered virtual data center on demand, Amazon’s public cloud offering, and highlighted that he had expected VMware to be the first to launch such a service and not Amazon. But Amazon innovated with a new approach and they were not the only ones doing so. Over the past years, giants like Google, Facebook and others have demonstrated that you can build and deliver world-class applications and services at very large scale without brand name hardware or expensive hypervisors.

This movement has started entering the enterprise world and I expect it to pick up momentum in 2011. As the price of the base hypervisor is rapidly declining, with some being free altogether, customers are more and more comfortable running various offerings in their data centers. One size does not fit all and one should use the hypervisor most suited to the use and application being built on top of it.

As organizations start using the same building blocks as the major public cloud providers, the move towards true hybrid clouds will become a greater reality in 2011. Public clouds have demonstrated the business benefits of cloud computing in terms of efficiency, scalability and agility. Those benefits can be achieved in great part on private infrastructure using private cloud offerings. IT can look to bring greater amounts of flexibility and agility behind their firewall and empower their internal business customers. But not every application will be required to run on the IT infrastructure and in some cases, the use of public cloud infrastructure will make more sense from an economic or architecture perspective.

This will create a co-existence model where IT can pick and choose which applications should run on their traditional core systems, which should run on a new breed of cloud enabled infrastructure behind the firewall and which should be moved to the public cloud. This hybrid model will allow an unprecedented level of elasticity.

Although initial interest in the cloud was primarily driven by cost savings, other aspects of the cloud promise have been picking up steam and I expect them to dominate the reasons for adoption and deployment through 2011. The level of innovation enabled by private and hybrid cloud technologies will allow IT to build and deliver better applications with virtually unlimited capacity, using third party resources when required. Moving beyond association with cost, IT will be associated with innovation again, bringing more competitive advantages to their organization. 

Choosing the Right Enterprise Cloud Solution for Your Organization

Cloud computing is here to stay and many organizations are under pressure to move towards this powerful new technology. Yet, concerns around moving into the cloud are very real. Complex and time consuming deployment, security risks, nightmarish application migration scenarios and buggy and immature private cloud management offerings are just some of the barriers to mainstream enterprise adoption of the cloud.

Enterprises want the same benefits of agility, automation and scale demonstrated by public cloud services such as Amazon EC2 behind their firewall. Although the number of private and hybrid cloud solutions is increasing rapidly, discerning which solution is best suited to your organization can be difficult. Here we attempt to provide some pointers for choosing the right enterprise cloud management solution for your organization: one that is reliable, will meet your business needs and allow you to focus on innovation.

What is an Enterprise Cloud Solution?

An Enterprise Cloud Solution promises to convert your static data center into nimble compute capacity that is further enhanced by seamless integration with other on- and off-premise clouds. Essentially, the Enterprise Cloud Solution offers a cost-effective way to meet your organization’s computing requirements by optimizing the utilization of your current infrastructure and increasing all-round system efficiency by automating resource provisioning and reducing the need for human interaction.

The key benefits of an Enterprise Cloud Solution, offering access to both on- and off-premises compute capacity, include:

  • Flexible compute capacity promotes innovation and allows organizations to rapidly respond to changing business demands while focusing on their core-competencies.
  • Infrastructure cost savings - Corporate servers are estimated to run below 15% capacity. An Enterprise Cloud Solution promises to dramatically increase infrastructure utilization rates especially where load is variable, eliminating the need for hardware purchases that tackle only peak demand and avoiding over-provisioning of resources.
  • Operational cost savings - By automating manual processes, Enterprise Cloud Solutions reduce the demand for administration and support.

Why not just use Public Cloud Services?

An Enterprise Cloud Solution offers the benefits of flexible, responsive compute capacity both behind the firewall and in the public cloud domain. A good Enterprise Cloud Solution should provide access to both local private infrastructure and off-premise cloud services with a uniform interface and integrated authentication. Although public cloud services provide unique opportunities for certain types of compute, particularly those with unpredictable load, there are many advantages to deploying an in-house private cloud able to link to external clouds when necessary. Customers should not have to choose one over the other but be able to implement a flexible solution that allows them to utilize both as required.

The benefits of having access to local on-premise cloud capacity as opposed to solely accessing public cloud services for elastic compute needs include:

  • Security – It may be preferable to keep sensitive services, applications and data behind the firewall, instead of exposing these to the risks associated with outsourcing compute capacity and storage to an external vendor.
  • Performance – Access speeds between compute instances over a local network are generally much faster than access to a public cloud over the Internet, where speed is limited by a provider’s bandwidth or latency. Ensuring compute and data are physically close together avoids performance degradation, especially in large scale systems.
  • Service Disruption – Technology upgrades scheduled at the most convenient and cost effective time for the public cloud operator could have serious implications for your services if they coincide with high demand.
  • Regulatory – Data flows are becoming more global but privacy laws are local. Deploying systems across regions can become problematic when regional regulations differ.
  • Internal Resource Accounting – A good Enterprise Cloud Solution will facilitate the monitoring and metering of resource consumption by various business units within the organization, allowing for consumption based intra-organizational billing.
  • Sunk Cost – Many large corporations have invested heavily in private data centers, most of which run under 15% utilization. With the spend on data centers being roughly split in thirds across operations, hardware and power, a solution that dramatically increases hardware utilization while decreasing operational overheads becomes very cost effective.
  • Data Longevity – Keeping data in a public cloud for long periods can be costly. Where data longevity is a key system requirement, having a private cloud component with local storage as part of your cloud solution can be cost effective.

Assessing your On-Premise Cloud Needs

There are many aspects to consider when selecting a solution for your enterprise private cloud needs. Below we outline these and the questions that need to be satisfactorily answered when considering a cloud solution’s suitability for enterprise use.

Security – Authentication and Authorization
Ideally, an Enterprise Cloud Solution should provide fine-grained authorization supporting multi-tenancy.

Authentication should ideally be integrated with existing user services to provide hassle free user management and the efficient reuse of existing corporate user databases. An ideal solution should support the sophisticated authentication and authorization necessary to provide multi-tenancy which allows multiple customers, groups and users to co-exist in isolation from each other or share resources on a single site. The ability to create users and groups whose allowable actions can be determined by policy in the form of rigorously enforced permissions, is essential for robust security, controlled access to on- and off-premise resources and for monitoring and billing.
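
To make the idea of policy-based, multi-tenant authorization concrete, here is a minimal sketch. It assumes a deny-by-default model in which permissions are granted to groups and scoped to a tenant's resource path; the `Permission` structure, the path naming scheme, and the action names are hypothetical and not taken from any specific product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    """Grants one group one action on resources under a tenant-scoped prefix."""
    group: str
    action: str            # illustrative action names, e.g. "launch"
    resource_prefix: str   # illustrative path scheme, e.g. "/acme/dev/"


def is_allowed(user_groups, action, resource, permissions):
    """Deny by default; allow only if some permission covers the request."""
    return any(
        p.group in user_groups
        and p.action == action
        and resource.startswith(p.resource_prefix)
        for p in permissions
    )


PERMISSIONS = [
    Permission("/acme/developers", "launch", "/acme/dev/"),
    Permission("/globex/ops", "launch", "/globex/"),
]

# A developer in tenant "acme" may launch in acme's dev area,
# but not in another tenant's namespace.
alice_groups = {"/acme/developers"}
can_launch_own = is_allowed(alice_groups, "launch", "/acme/dev/web01", PERMISSIONS)
can_launch_other = is_allowed(alice_groups, "launch", "/globex/db01", PERMISSIONS)
```

In practice the group memberships would come from the existing corporate directory rather than being defined in the cloud software itself, which is the integration point discussed above.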

Some specific questions to be asked:

  • What security is present?
  • Is this security robust?
  • Does the security afford tight enough control?
  • Does authentication integrate with existing user services?
  • Is fine-grained policy based authorization supported?
  • Is multi-tenancy supported?
  • Does the security satisfy any industry-specific laws and regulations by which your business must abide?

Ease of Use

Installation

Complex installation and setup can be a very real barrier to entering cloudspace. As the number of physical machines (nodes) in your underlying infrastructure grows, it becomes increasingly important to avoid a per-node installation setup and intensive manual configuration. Make sure your Enterprise Cloud Solution has a largely automated installation with minimal configuration and dynamic resource discovery. Ideally the underlying software should have the ability to discover resources and automatically install and configure new infrastructure.

Specific questions to be asked:

  • How quick and easy is it to deploy a site?
  • Is the base operating system for each node bundled in the solution you are considering?
  • Can you simply plug new nodes into your infrastructure and automatically have these resources installed, configured and available in your cloud environment?
  • Is operating system install for nodes fully automated?
  • Can nodes install in under 15 minutes?
  • Is installation independent of Internet connectivity?
  • Are security keys distributed at install time?
  • Is the site install of the node farm automated?

Site Administration and Maintenance

Ideally, an Enterprise Cloud Solution automates most of the manual tasks associated with static data center administration and management and provides powerful tools to handle those tasks requiring human interaction. This reduces operational overhead and saves on staff retraining time as administration and maintenance of the cloud should be a largely hands-off affair.

Specific questions to be asked:

  • How quick and easy is it to manage the site?
  • Is site management largely automated?
  • Is scripting and API support present?
  • Is it easy to get technical personnel up to speed with the requirements of the cloud solution with respect to architecture, implementation and operation?

Interface

Various interfaces need to be available for interacting with the cloud in different contexts, such as a command line interface (CLI), a graphical user interface (GUI) and an application programming interface (API) for programmatic interaction. These interfaces need to be easy to use and consistent, providing the same, comprehensive set of functionality.

Specific questions to be asked:

  • What are the various interfaces for interacting with the cloud and how usable are these?
  • How consistent are these interfaces? Is the same set of functionality supported throughout?
  • Can the Enterprise Cloud Solution interface integrate seamlessly with public cloud offerings?

Migration

Re-engineering applications to work on a new platform can dramatically escalate the cost and time required to get your business up and running in the cloud, so being able to easily migrate existing applications is a key requirement when choosing an Enterprise Cloud Solution. Network and storage constraints are the main differentiators when considering migration.

Specific questions to be asked:

  • How convenient is it to migrate existing enterprise applications into the cloud?
  • Does the solution support an architecture similar to those used by enterprise applications?
  • What guest operating systems are supported?
  • Is regular SQL RDBMS storage supported?
  • Can compute and storage be attached and detached easily to facilitate flexible movement of instances?
  • Is seamless access to your existing SAN and NAS infrastructure supported?
  • Is a familiar networking environment supported?
  • Are layer 2 access, multicast and non-TCP protocols like IPsec supported?
  • Are all the required versions of your application operating system supported?
  • Are your existing software licenses portable to the cloud?

Integration

Integration with existing services and system management policies allows for time-saving and hassle-free reuse of existing systems.

Specific questions to be asked:

  • How easy is it to integrate existing systems management policies and functions into the cloud?
  • Does the system integrate with existing Directory Services?

Flexibility

Ideally an Enterprise Cloud Solution should provide dynamic workload allocation that optimizes infrastructure utilization and provide flexible instance management that supports customization to specific business needs. Controlled access to external compute capacity should also be supported to make additional capacity available when necessary.

Access to External Compute Capacity

Access to external compute capacity allows customers to burst into other on- and off-premise clouds subject to load demands, flexibly providing extra compute capacity when necessary.

Specific questions to be asked:

  • Is federated authentication with external on- and off-premise clouds supported?
  • What level of control is available when accessing external compute capacity?
  • Is the interface to external clouds transparent i.e. is it the same interface used to access local resources?

Instance Management

Being able to easily launch and terminate instances empowers innovation as it eliminates the manual restrictions involved in static systems deployment. The ability to customize the instance deployment environment allows customers to fine tune how their instances are launched in line with business needs.

Specific questions to be asked:

  • How easy is it to deploy and terminate instances?
  • Is it possible to customize aspects of instance deployment to specific business needs?
  • Is it possible to specify network-locality relationships between launched instances? This is useful if, for example, you require that two instances be launched on the same physical machine to facilitate inter-instance communication, or that instances be launched on different clusters to try to guarantee the highest level of reliability even if there are data center failures.
  • Is it possible to define custom instance types or shapes that define elements needed to run an instance?
    • number of CPUs
    • amount of memory
    • special requirements (e.g. GPUs)
    • attached storage (e.g. Fibre Channel interface to a SAN)
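
The shape and locality questions above can be illustrated with a small data model. This is a sketch under assumptions: the `Shape` and `Locality` types, their field names, and the `satisfies` check are hypothetical, chosen only to mirror the elements listed above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Shape:
    """A custom instance shape: the elements needed to run an instance."""
    name: str
    cpus: int
    memory_gb: int
    gpus: int = 0                             # special requirement
    attached_storage: Tuple[str, ...] = ()    # e.g. a SAN interface label


@dataclass(frozen=True)
class Locality:
    """Where an instance must run relative to an already placed instance."""
    same_node_as: Optional[str] = None
    different_cluster_from: Optional[str] = None


def satisfies(locality, candidate, placements):
    """candidate and the placements values are (cluster, node) pairs."""
    if locality.same_node_as is not None:
        if placements[locality.same_node_as] != candidate:
            return False
    if locality.different_cluster_from is not None:
        if placements[locality.different_cluster_from][0] == candidate[0]:
            return False
    return True


gpu_large = Shape("gpu-large", cpus=8, memory_gb=64, gpus=2,
                  attached_storage=("fc-san",))
placements = {"web01": ("cluster-a", "node-3")}   # made-up existing instance

colocated = Locality(same_node_as="web01")            # favors communication
spread = Locality(different_cluster_from="web01")     # favors reliability
```

Here `satisfies(colocated, ("cluster-a", "node-3"), placements)` accepts only the node that already hosts `web01`, while `spread` rejects every node in `cluster-a`.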

Dynamic Workload Allocation

Efficient dynamic workload allocation is a key requirement for a utility grade Enterprise Cloud Solution and replaces the hands-on resource provisioning of the static data center paradigm. Intelligent placement should efficiently allocate resources to instances based on user specified parameters, such as number of CPUs and amount of RAM required. Ideally you want to be able to dynamically place any instance shape on any node, according to policy, instead of having to hard configure nodes for particular instance types or shapes. Hard configuring nodes makes it very difficult to accommodate shifts in demand from one shape to another.
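
As a rough illustration of dynamic placement, the sketch below places a request on any node with enough free CPU and RAM instead of reserving nodes for fixed shapes. The best-fit rule used here is just one possible strategy, picked for the example; the `Node` type and `place` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpus: int
    free_ram_gb: int


def place(nodes: List[Node], cpus: int, ram_gb: int) -> Optional[Node]:
    """Best fit: choose the node that fits and leaves the least spare capacity."""
    candidates = [
        n for n in nodes if n.free_cpus >= cpus and n.free_ram_gb >= ram_gb
    ]
    if not candidates:
        return None  # no node can host this request right now
    chosen = min(
        candidates,
        key=lambda n: (n.free_cpus - cpus, n.free_ram_gb - ram_gb),
    )
    # Reserve the resources on the chosen node.
    chosen.free_cpus -= cpus
    chosen.free_ram_gb -= ram_gb
    return chosen


nodes = [
    Node("node-1", free_cpus=16, free_ram_gb=64),
    Node("node-2", free_cpus=4, free_ram_gb=16),
]
small = place(nodes, cpus=2, ram_gb=8)      # fits both; node-2 is the tighter fit
large = place(nodes, cpus=8, ram_gb=32)     # only node-1 has room
too_big = place(nodes, cpus=32, ram_gb=8)   # nothing fits
```

Because no node is tied to a particular shape, a shift in demand from small to large instances needs no reconfiguration; the same pool simply fills differently.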

Specific questions to be asked:

  • Is the dynamic workload allocation model robust, providing optimized rates of infrastructure utilization?
  • Does the system place workloads on the optimal node?
  • How flexible and efficient is placement?
  • Can you dynamically place any instance shape on any node, according to policy?
  • Can the parameters be specified by the user?

Scaling

To be able to support deployment of utility grade systems, an Enterprise Cloud Solution should scale from a small cluster to hundreds of thousands of physical machines without performance degradation.

Specific questions to be asked:

  • How scalable is the solution?
  • What are the operational overhead implications associated with scaling?
  • How is performance impacted with scale?

Reliability

At large scale, failures are the norm, so being able to automatically deal with failures dramatically reduces operational burden. Ideally an Enterprise Cloud Solution should be self-healing and self-organizing with no single points of failure. Sophisticated failover mechanisms should be employed to ensure system integrity and resilience. Failover management should be completely automated to ensure no service interruption.

Specific questions to be asked:

  • Are there single points of failure in the solution’s architecture?
  • What redundancy mechanisms are in place?
  • Is failover management automated?

Monitoring and Metering

Ideally an Enterprise Cloud Solution should record all system requests, incidents and events, creating a rich audit trail. The monitoring system should be able to integrate with external analysis software.

Specific questions to be asked:

  • Does the monitoring and metering provided by the system support the needs of your organization?
  • Can monitoring be integrated with your existing monitoring infrastructure e.g. is SNMP supported?

In Summary

With the number of Enterprise Cloud Solutions on the market increasing and the implications of moving to the cloud representing a substantial investment in product procurement and staff training and time, it is critical that solution choice be carefully considered. Thorough evaluation of Enterprise Cloud Solutions against the indicators outlined above will ensure the best chance of your organization enjoying the true benefits of public cloud infrastructure behind the firewall.