General



14 Mar 11

How the Meek Shall Inherit The Data Center, Change The Way We Build and Deploy Applications, And Kill the Public Cloud Virtualization Market

The tiny ant. Capable of lifting up to 50 times its body weight, an ant is an amazing workhorse with by far the highest “power to weight” ratio of any living creature. Ants are also among the most populous creatures on the planet. They do the most work as well – a bit at a time, ants can move mountains.

Atom chips (and ARM chips too) are the new ants of the data center. They are what power our smartphones, tablets and an ever-growing range of consumer electronics devices. They are now very fast, but surprisingly thrifty with energy – giving them the highest ratio of computing power to energy consumed of any microprocessor.

I predict that significantly more than half of new data center compute capacity deployed in 2016 and beyond will be based on Atoms, ARMs and other ultra-low-power processors. These mighty mites will change much about how application architectures will evolve too. Lastly, I seriously believe that the small, low-power server model will eliminate the use of virtualization in a majority of public cloud capacity by 2018. The impact in the enterprise will be initially less significant, and will take longer to play out, but in the end it will be the same result.

So, let’s take a look at this in more detail to see if you agree.

This week I had the great pleasure to spend an hour with Andrew Feldman, CEO and founder of SeaMicro, Inc., one of the emerging leaders in the nascent low-power server market. SeaMicro has had quite a great run of publicity lately, appearing twice in the Wall Street Journal related to their recent launch of their second-generation product – the SM10000-64 based on a new dual-core 1.66 GHz 64-bit Atom chip created by Intel specifically for SeaMicro.

SeaMicro: 512 Cores, 1TB RAM, 10 RU

Note – the rest of this article is based on SeaMicro and their Atom-based servers.  Calxeda is another company in this space, but uses ARM chips instead.

These little beasties, taking up a mere 10 rack units of space (out of 42 in a typical rack), pack an astonishing 256 individual servers (512 cores), 64 SATA or SSD drives, up to 160 Gbps of external network connectivity (16 x 10GigE), and 1.024 TB of DRAM. Further, a SeaMicro system uses ¼ of the power and ¼ of the space, and costs a fraction of the price, of a similar amount of capacity in a traditional 1U configuration. Internally, the 256 servers are connected by a 1.28 Tbps “3D torus” fabric modeled on the IBM Blue Gene/L supercomputer.

The approach to using low-power processors in a data center environment is detailed in a paper by a group of researchers out of Carnegie Mellon University. In this paper they show that clusters built using a FAWN (“Fast Array of Wimpy Nodes”) approach are, overall, “substantially more energy efficient than conventional high-performance CPUs” at the same level of performance.

The Meek Shall Inherit The Earth

A single rack of these units would boast 1,024 individual servers (1 CPU per server), 2,048 cores (total of 3,400 GHz of compute), 4.1TB of DRAM, and 256TB of storage using 1TB SATA drives, and communicate at 1.28Tbps at a cost of around half a million dollars (< $500 per server).
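The rack-level numbers above can be reproduced from the per-chassis specs quoted earlier. Here is a minimal sketch of that arithmetic; the constant and function names are mine, and the $500,000 rack price is simply the "around half a million dollars" figure taken at face value, not a quoted list price.

```python
# Per-chassis figures as quoted in this post for the SM10000-64.
RACK_UNITS_PER_CHASSIS = 10
SERVERS_PER_CHASSIS = 256
CORES_PER_SERVER = 2
CLOCK_GHZ = 1.66
DRAM_GB_PER_SERVER = 4
DRIVES_PER_CHASSIS = 64
DRIVE_TB = 1                 # 1TB SATA drives

# Assumptions for illustration only.
RACK_UNITS_AVAILABLE = 42    # "typical rack"
RACK_PRICE_USD = 500_000     # "around half a million dollars"


def rack_totals():
    """Return the totals for one rack filled with as many chassis as fit."""
    chassis = RACK_UNITS_AVAILABLE // RACK_UNITS_PER_CHASSIS
    servers = chassis * SERVERS_PER_CHASSIS
    cores = servers * CORES_PER_SERVER
    return {
        "chassis": chassis,
        "servers": servers,
        "cores": cores,
        "aggregate_ghz": cores * CLOCK_GHZ,
        "dram_tb": servers * DRAM_GB_PER_SERVER / 1000,
        "storage_tb": chassis * DRIVES_PER_CHASSIS * DRIVE_TB,
        "usd_per_server": RACK_PRICE_USD / servers,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

Four chassis fit in 42U with 2U to spare, which is where the 1,024 servers and the sub-$500 per-server figure come from.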

$500/server – really? Yup.

Now, let’s briefly consider the power issue. SeaMicro saves power through a couple of key innovations. First, they’re using these low power chips. But CPU power is typically only 1/3 of the load in a traditional server. To get real savings, they had to build custom ASICs and FPGAs to get 90% of the components off of a typical motherboard (which is now the size of a credit card, with 4 of them on each “blade”). Aside from capacitors, each motherboard has only three types of components – the Atom CPU, DRAM, and the SeaMicro ASIC. The result is 75% less power per server. Google has stated that, even at their scale, the cost of electricity to run servers exceeds the cost to buy them. Power and space consume >75% of data center operating expense. If you save 75% of the cost of electricity and space, these servers pay for themselves – quickly.

If someone just gave you 256 1U traditional servers to run – for free – it would be far more expensive than purchasing and operating the SeaMicro servers.

Think about it.

Why would anybody buy traditional Xeon-based servers for web farms ever again? As the saying goes, you’d have to pay me to take a standard server now.

This is why I predict that, subject to supply chain capacity, more than 50% of new data center servers will be based on this model in the next 4-5 years.

Atoms and Applications

So let’s dig a bit deeper into the specifics of these 256 servers and how they might impact application architectures. Each has a dual-core 1.66GHz 64-bit Intel Atom N570 processor with 4GB of DRAM. These are just about ideal Web servers and, according to Intel, offer the highest performance per watt of any Internet workload processor they’ve ever built.

They’re really ideal “everyday” servers that can run a huge range of computing tasks. You wouldn’t run HPC workloads on these devices – such as CAD/CAM, simulations, etc. – or a scale-up database like Oracle RAC. My experience is that 4GB is actually a fairly typical VM size in an enterprise environment, so it seems like a pretty good all-purpose machine that can run the vast majority of traditional workloads.

They’d even be ideal as VDI (virtual desktop infrastructure) servers, where literally every running Windows desktop would get its own dedicated server. Cool!

Forrester’s James Staten, in a keynote address at CloudConnect 2011, recommended that people write applications that use many small instances when needed vs. fewer larger instances, and aggressively scale down (e.g. turn off) their instances when demand drops. That’s the best way to optimize economics in metered on-demand cloud business models.

So, with a little thought there’s really no need for most applications to require instances that are larger than 4GB of RAM and 1.66GHz of compute. You just need to build for that.
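To make "build for that" a little more concrete, here is a rough sketch of the sizing rule behind many small instances: run just enough of them to cover current demand, and shrink the pool when demand drops. The function name, the requests-per-second capacity figure and the minimum of one instance are all hypothetical, not something from Forrester or SeaMicro.

```python
import math


def instances_needed(load_rps, capacity_rps_per_instance, min_instances=1):
    """Smallest number of small instances that covers the current load.

    load_rps and capacity_rps_per_instance are hypothetical
    requests-per-second figures; min_instances keeps the service
    reachable even when there is no traffic at all.
    """
    if capacity_rps_per_instance <= 0:
        raise ValueError("capacity_rps_per_instance must be positive")
    needed = math.ceil(load_rps / capacity_rps_per_instance)
    return max(min_instances, needed)


if __name__ == "__main__":
    # Demand over a made-up day: quiet overnight, busy at midday.
    for load in (0, 120, 950, 1001, 300):
        print(load, "rps ->", instances_needed(load, 100), "instances")
```

With small instances the pool tracks demand in fine steps, so you pay for very little idle capacity; with a few large instances each step up or down is much coarser.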

And databases are going this way too. New and future “scale out” database technologies such as ScaleBase, Akiban, Xeround, dbShards, TransLattice, and (at some future point) NimbusDB can actually run quite well in a SeaMicro configuration, just creating more instances as needed to meet workload demand. The SeaMicro model will accelerate demand for scale-out database technologies in all settings – including the enterprise.

In fact, some enterprises are already buying SeaMicro units for use with Hadoop MapReduce environments. Your own massively scalable distributed analytics farm can be a very compelling first use case.

This model heavily favors Linux due to the far smaller OS memory footprint as compared with Windows Server. Microsoft will have to put Windows Server on a diet to support this model of data center or risk a really bad TCO equation. SeaMicro is adding Windows certification soon, but I’m not sure how popular that will be.

If I’m right, then it would seem that application architectures will indeed be impacted by this – though in the scheme of things it’s probably pretty minor and in line with current trends in cloud.

Virtualization? No Thank You… I’ll Take My Public Cloud Single Tenant, Please!

SeaMicro claims that they can support running virtualization hosts on their servers, but for the life of me I don’t know why you’d want to in most cases.

What do you normally use virtualization for? Typically it’s to take big honking servers and chunk them up into smaller “virtual” servers that match application workload requirements. For that you pay a performance and license penalty. Sure, there are some other capabilities that you get with virtualization solutions, but these can be accomplished in other ways.

With small servers being the standard model going forward, most workloads won’t need to be virtualized.

And consider the tenancy issue. Your 4GB 1.66GHz instance can now run on its own physical server. Nobody else will be on your server impacting your workload or doing nefarious things. All of the security and performance concerns over multi-tenancy go away. With a 1.28 Tbps connectivity fabric, it’s unlikely that you’ll feel other tenants’ impact at the network layer either. SeaMicro claims 12x more available bandwidth per unit of compute than traditional servers. Faster, more secure, what’s not to love?

And then there’s the cost of virtualization licenses. According to a now-missing blog post on the Virtualization for Services Providers blog (thank you Google) written by a current employee of the VCE Company, the service provider (VSPP) cost for VMware Standard is $5/GB per month. On a 4GB VM, that’s $240 per year – or, at $720 over three years, roughly 150% of the cost of the SeaMicro node! (VMware Premier is $15/GB, but in fairness you do get a lot of incremental functionality in that version.) And for all that you get a decrease in performance from having the hypervisor between you and the bare metal server.
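The license math is easy to check. This is a small sketch using the $5 and $15 per GB per month prices cited above and a node cost derived from the half-million-dollar rack; the helper name and the node cost constant are my own assumptions for illustration.

```python
def license_cost_usd(price_per_gb_month, vm_gb, months):
    """Total hypervisor license cost for one VM over a period."""
    return price_per_gb_month * vm_gb * months


# Assumption: node cost is the ~$500,000 rack divided by its 1,024 servers.
NODE_COST_USD = 500_000 / 1024

standard_per_year = license_cost_usd(5, 4, 12)
standard_3yr = license_cost_usd(5, 4, 36)
premier_3yr = license_cost_usd(15, 4, 36)

if __name__ == "__main__":
    print("Standard, 1 year: $", standard_per_year)
    print("Standard, 3 years: $", standard_3yr)
    print("Premier, 3 years: $", premier_3yr)
    print("Standard 3yr / node cost:", round(standard_3yr / NODE_COST_USD, 2))
```

Over three years the Standard license alone comes to about one and a half times the price of the hardware it would run on.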

Undoubtedly, Citrix (XenServer), RedHat (KVM), Microsoft (Hyper-V) and VMware will find ways to add value to the SeaMicro equation, but I suspect that many new approaches may emerge that make public clouds without the need for hypervisors a reality. As Feldman put it, SeaMicro represents a potential shift away from virtualization towards the old model of “physicalization” of infrastructure.

The SeaMicro approach represents the first truly new approach to data center architectures since the introduction of blades over a decade ago. You could argue – and I believe you’d be right – that low-power super-dense server clusters are a far more significant and disruptive innovation than blades ever were.

Because of the enormous decrease in TCO represented by this model, as much as 80% or more overall, it’s fairly safe to say that any prior predictions of future aggregate data center compute capacity are probably too low by a very wide margin. Perhaps even by an order of magnitude or more, depending on the price-elasticity of demand in this market.

Whew! This is some seriously good sh%t.

It’s the dawn of a new era in the data center, where the ants will reign supreme and will carry on their backs an unimaginably larger cloud than we had ever anticipated. Combined with hyper-efficient cloud operating models, information technology is about to experience a capacity and value-enablement explosion of Cambrian proportions.

What should you do? Embrace the ants as soon as possible, or face the inevitable Darwinian outcome.

The ants go marching one by one, hurrah, hurrah…

——————

(c) 2011 CloudBzz / TechBzz Media, LLC.  All rights reserved.  This post originally appeared at http://www.cloudbzz.com/seamicro-atom-and-the-ants/. You can follow CloudBzz on Twitter @CloudBzz.

Filed under: General,Vendors






25 Feb 11

A couple months back I had a chance to catch up with Pat O’Day, CTO at BlueLock. They are a cloud provider headquartered in Indianapolis with two data centers (a primary and a backup), and also cloud capabilities on Wall Street and in Hong Kong for specific customers.

BlueLock has been a vCloud service provider for the past year and has taken an enterprise IT-centric approach to their cloud services. They are not going after the SMB web hosting market, and don’t want to sell to everybody. Their primary focus is on mid-tier enterprises looking for a provider that will deliver cloud in a way that integrates with customer environments – what you might expect from a managed services provider.

Initially they just provided private clouds, really just dedicated VMware environments with a vCenter front end. Their clouds now are still mostly private, with the user able to control what level of multi-tenancy they want. They do this through three models:

- Pay as you go multitenant
- Reserved multitenant at a lower cost
- Committed single-tenant dedicated infrastructure

For multi-tenant users they implemented vCloud Director as the UI. When showing this to their customers, they got feedback that Director was too unfamiliar when compared to vCenter. This gave them the idea to create a plug-in to vCenter that would allow VMware administrators to control their cloud resources.

Their plug-in was enabled by the fact that vCloud Director provides a full implementation of the vCloud API. This model has proven to be very popular with their customers. It was also very innovative.

In addition to starting and stopping cloud instances, users can move applications to BlueLock’s cloud and back again. As O’Day explained it, a vCenter administrator can create vApps from workloads running in their data center and use vCenter to deploy it up to the cloud – and to repatriate it again if necessary.

Contrast this with most cloud providers. Some, like Amazon and Rackspace, require you to package up your applications and move them to the cloud with a lot of manual processing. Amazon now can import VMDKs, but that only gets you instances – not whole apps. Other service providers, including most who target the enterprise, have “workload onboarding” processes that generally require IT to package up their VMware images and let the provider manage the import. Sometimes this is free, sometimes there may be an onboarding charge. BlueLock’s approach makes it easy and under the control of IT for workloads and data to be migrated in both directions.

VMware recently announced vCloud Connector to perform essentially the same function. But to my knowledge BlueLock remains one of the few – if not the only – production clouds with this type of capability deployed.

While we all love to cite Amazon’s velocity of innovation, BlueLock has shown that even smaller providers can deliver very innovative solutions based on listening closely to customer requirements. While most people out there today are just talking about hybrid clouds, BlueLock is delivering.

Filed under: Case Studies,General,Vendors






17 Feb 11

Time to talk about cloud stacks again.  No, not that there are too many (though there are), but rather the one-track mind that many IT buyers I encounter have with respect to cloud.  Some end users I have spoken with in the past few weeks are looking to implement private clouds, and they are actively evaluating “cloud stacks” from a limited number of vendors (mainly their strategic suppliers).

I’ve asked about their process for their private cloud initiatives, and specifically how far along they were on their top-down requirements analysis and documentation.  The reply I typically received was more than a little bit surprising.  While it varies, it typically goes something like this…

“We have not created a high-level analysis of requirements.  We’re evaluating vendor solutions and will pick the best one.”

If IT leaders haven’t thought through their vision for private cloud, translated that into capabilities requirements, and aligned their business users, how can they possibly choose a technology?  And why do they think that all they need is a cloud stack?

Is the dog wagging the tail, or is it the other way around?  In this case, it’s clearly the tail wagging the dog, as buyers get bombarded with vendor cries of “buy this cloud now and it will solve world hunger and peace.”

Cloud projects should follow a flow that definitively answers a VERY LONG set of key questions.  Here are JUST A FEW of them:

  1. What are the strategic objectives for my cloud program?
  2. How will my cloud be used?
  3. Who are my users and what are their expectations and requirements?
  4. How should/will a cloud model change my data center workflows, policies, processes and skills requirements?
  5. How will cloud users be given visibility into their usage, costs and possible chargebacks?
  6. How will cloud users be given visibility into operational issues such as server/zone/regional availability and performance?
  7. What is my approach to the service catalog?  Is it prix fixe, a la carte, or more like value meals?  Can users make their own catalogs?
  8. How will I handle policy around identity, access control, user permissions, etc?
  9. What are the operational tools that I will use for event management & correlation, performance management, service desk, configuration and change management, monitoring, logging, auditability, and more?
  10. What will my vCenter administrators do when they are no longer creating VMs for every request?
  11. What will the approvers in my process flows today do when the handling of 95% of all future requests is policy driven and automated?
  12. What levels of dynamism are required regarding elasticity, workload placement, data placement and QoS management across all stack layers?
  13. Beyond a VM, what other services will I expose to my users?
  14. How will I address each of the key components such as compute, networking, structured & object storage, virtualization, security, automation, self-service, lifecycle management, databases and more?
  15. What are the workloads I expect to see in my cloud, and what are the requirements for these workloads to run?

This list goes on for several pages, and more.  If you have not done this level of analysis, you are not ready to evaluate a cloud stack.  Sure, you can research and hear from vendors – often a good way to educate yourself and prompt new thinking about the concepts above.  However, customers should stay away from asking for POCs, asking for pricing from vendors, setting budgets, and all of the other dance routines we face in the procurement process.

The unfortunate truth is that most vendors don’t want you to do this because it slows down the sales cycle.  But I’m going to quote the carpenter’s axiom here:

Measure twice, cut once!

Would you build a house without a vision, rendering and architectural blueprints?  Of course not.  However, the other unfortunate truth I am finding is that too many customers I see are falling into this trap.  They have the cart way out in front of the horse.

They are practicing “Ready! Fire!  Aim!”  I’m sure we all can guess how that will work for them…

Filed under: General






15 Feb 11

David Linthicum writes in a post on InfoWorld today about “How VCs are leading us down the wrong path for cloud computing.” In this, he gives three reasons for this premise, but provides little in the way of substantiation for this position. We started a twitter discussion but David suggested I respond in a post – here it is. Regarding definition challenges, which David points out, I will focus on the tools and solutions for Infrastructure as a Service – since that seems to be what he’s getting at.

David Linthicum’s “Cloud computing investor mistake No. 1: Assume a sustainable business model”

For this mistake he cites the financial ROI of cloud from the perspective of IT. Is cloud cheaper than internally-delivered IT? Maybe yes, but often no. At least not on the surface. Assume that some fairly large % of your infrastructure needs are static (i.e. the opposite of elastic), and that some chunk of this is moving along nicely. Will the same VMs, which are very stable and always on, be less expensive in a cloud? Probably not. But that’s not even the point. And if you believe that most of the value of cloud is how it supports innovation, then the ROI discussion gets a lot stickier. But that’s not the point either.

This point is very focused on external clouds from service providers. Here is the key point David makes:

“it’s clear to me that the high subscription fees many cloud computing providers charge have to fall over time as the prices for enterprise hardware and software fall as well. That means the expected ROI may not be there for cloud computing — and even if it’s there at first, it could degrade significantly over time.”

I don’t follow.  If subscription prices fall at the same rate as SW and HW, then wouldn’t any ROI calculation stay the same?  Actually not.  In fact, the cloud model gets even better due to the labor arbitrage of moving from 50-100 servers per admin to thousands in the cloud (assuming public cloud – several hundred in the private cloud) while the cost of labor increases.  I was at a customer this week who quotes > 6 months to provision a physical server, and several weeks for a virtual server.  This is due to manual processing of work orders that have an average SLA of 3 weeks.  3 weeks????!!!!   Now, let’s fire up that same server on a public cloud with a nice portal, even one integrated with access management (etc.), in 5 minutes, with no manual processing.  Or this can be on a private cloud with the same process model.
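A back-of-the-envelope way to see the labor arbitrage: divide the cost of one admin by the number of servers that admin can look after. The servers-per-admin ratios come from the paragraph above; the $120,000 fully loaded admin cost is a made-up number purely for illustration, as is the function name.

```python
def admin_cost_per_server(admin_cost_usd, servers_per_admin):
    """Annual admin labor cost attributable to a single server."""
    if servers_per_admin <= 0:
        raise ValueError("servers_per_admin must be positive")
    return admin_cost_usd / servers_per_admin


# Hypothetical fully loaded annual cost of one admin (not from any source).
ADMIN_COST_USD = 120_000

if __name__ == "__main__":
    scenarios = {
        "traditional IT (50 per admin)": 50,
        "traditional IT (100 per admin)": 100,
        "private cloud (several hundred)": 300,
        "public cloud (thousands)": 2000,
    }
    for label, ratio in scenarios.items():
        cost = admin_cost_per_server(ADMIN_COST_USD, ratio)
        print(f"{label}: ${cost:,.0f} per server per year")
```

Whatever salary you plug in, the per-server labor cost falls in direct proportion to the ratio, and the gap widens as the cost of labor rises.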

Lastly, how is this a problem with VCs?  Few VCs I know want to invest in capital-intensive service provider businesses.  Most new clouds are coming from existing service providers, not VCs.  The biggest cloud provider is also the world’s largest online retailer – Amazon.  Many of these guys know how to compete in commodity markets.  Again, how is this an instance of VCs leading us down the wrong path?

David Linthicum’s “Cloud computing investor mistake No. 2: Have little understanding of the technology in the context of an emerging market”

Funding innovation in emerging markets is the business of venture capital. Admittedly, the flock behavior referenced by David is a challenge. It was with the Internet, open source, Java funds, etc. A few early investors make bold bets and some win – then a lot of VCs will start dropping money into the new new thing. Some will be winners. Most will fail. That’s the nature of venture investing. As Bob Davoli at Sigma stated last year in Mass High Tech...

“Do you know how much money people are going to lose in the cloud?” he asked. “Billions.”

Okay, so VC investments pan out less than 50% of the time (or whatever). How is it their fault if an IT buyer doesn’t appreciate the risks of buying new technologies and bets wrong? I got an email from a very widely known “cloud” startup (maybe now a growth company) the other day responding to a request I made about his interest in another piece of new technology I came across. His response? “We don’t feel comfortable deploying untested technology into production.” In other words, let me know when it’s proven and I’ll look at it. This wasn’t some grizzled IT veteran – the guy’s a Silicon Valley entrepreneur in his 30s (and I bet 80% of you use his innovation). Some IT shops know that taking a risk can pay off, but the smart ones know how to mitigate the risks of new technologies and avoid too much reliance on a business that might fold before their next round is funded.

David Linthicum’s “Cloud computing investor mistake No. 3: Buy into cloud-washed technology”

Cloud washing. It’s everywhere (“Cloud Power!” – ugh!). Often it’s from big established tech companies. Sometimes it’s from older VC-backed companies who reposition themselves for the cloud to “catch the wave.” Perhaps even some new startups with technology that really is not very cloudy do this too. Who cares? Honestly, if you’re an IT buyer and you purchase kit from your vendor just because it has the word cloud on the data sheet, you suck. Truly suck. You should be fired you suck so bad.

Honestly, shouldn’t you start with an understanding of the business problem you’re trying to solve, how that translates into feature and technology requirements, and then search the market for the solution that best meets your real needs? Who gives a flying donkey that “cloud” is not in the name of the product? Are you really that easy to fool? Don’t blame VCs because some people are too lazy or unskilled to do their job properly.

I guess this leads to David’s concluding line:

“Go cloud — but do it wisely, not by following the VC flock.”

That’s good advice in any market.  Go cloud – but use your judgement, analysis and common sense.  In other words, don’t be some vendor’s puppet.  Okay, on this we agree.

(c) 2011 CloudBzz – this post originally appeared on CloudBzz.com.  For more of John’s thoughts on the cloud, visit CloudBzz.com.

Filed under: General






31 Jan 11

As I’ve written about previously, there are many tools in the market for building clouds – whether private or public. There are too many, in fact, and it will be hard to see most of them still around after the next five years. BMC is in a very strong position with both enterprises and large managed service providers – and I would expect them to be one of the survivors based on scale and reach, if for no other reason.

BMC has been hard at work on their Cloud Lifecycle Management (CLM) offering and recently announced their 1.5 release. CLM is a solution built on top of a good chunk of the BMC tools suite – including BladeLogic, Atrium CMDB, Atrium Orchestrator, and more. It’s an approach similar to most of the large IT automation vendors – IBM, CA and HP included – and we have used this model at Unisys as well.

I got a chance to catch up with one of BMC’s cloud product marketers, Lilac Shoenbeck. According to Lilac, CLM 1.5 is a substantial upgrade, with the primary focus on making it easier for cloud administrators to configure and manage their cloud – including the service catalog and rules. They are also focused on providing out-of-the-box functionality for key cloud use-cases: dev/test, big data analytics, Web hosting, etc.

One of the core principles is around service catalog management. Rather than specifying a fixed and inflexible catalog, BMC likened this to an ice cream parlor where you have a set of base (cone), middle (ice cream) and top (toppings) components and allow the user a fair bit of latitude in creating their environment.

CLM is not a pure play cloud environment like Eucalyptus or Cloud.com, which means it comes with a set of tradeoffs to support the legacy BMC tool set.  This gives them a lot of powerful technology underneath, but anytime you need to integrate a bunch of stuff under the covers – especially stuff that came through acquisition – it makes for a fair bit of complexity.  The pure-play guys have more maneuvering room for innovation – which is how they like it.  However, the pure-play tools don’t run in a vacuum and there can be substantial work to integrate them into existing automation environments.

CLM does provide a high level of automation for both physical and virtual environments, and has the advantage of a large enterprise sales force to bring it to market.  Tight integration with Cisco switches and UCS, deep integration with a leading CMDB, and an extensive hardware support matrix are all positives for CLM.  I believe they also ship today with support for multiple hypervisors (VMware, Xen, Hyper-V, KVM).

Another area they tout is around business service management and workload placement.  Once a workload is placed into the cloud, rules can be used to move it, scale it, etc. based on business transaction performance and other factors.  There’s a fair bit of work to get this right, so time will tell if they have gotten it working well.

It is missing some key elements right now – most notably a public cloud API out of the box (you can write your own against the internal APIs of CLM). It’s also not open source – which I have also written about – and is going to be fairly complex to set up initially. The user portal is also fairly basic and targeted at enterprise users, not the Web/SMB market, though you can use their internal APIs to create your own portal if desired.

Pricing was not disclosed, but they do have both usage-based and perpetual license-based models. The usage-based pricing is particularly key to the service provider space, though apparently some enterprises are also using this model. You’d expect BMC to be priced quite a bit higher than the pure-play market, though I am led to believe that they can be very aggressive to win deals.

BMC CLM is a credible and reasonably well positioned offering from a traditional ITA tools vendor.  If you are already a big BladeLogic user, or you want a cloud solution from a mainstream data center automation tools vendor, CLM is a strong offering.  If you tend towards open source tools in your data center and are focused on leveraging new innovations, this might not be the best fit.

Filed under: General






3 Jan 11

A new year is often a time for reflection on the past and pondering the future.  2010 was certainly a momentous year for cloud computing.  It brought an explosion of tools for creating clouds, a global investment rush by service providers, a Federal “cloud first” policy, and more.  But in the words of that famous Bachman-Turner Overdrive song – “You ain’t seen nothin’ yet!”

In fact, I’d suggest that in terms of technological evolution, we’re really just in the Bronze Age of cloud.  I have no doubt that at some point in the not too distant future, today’s cloud services will look as quaint as an historical village with no electricity or running water.  The Wired article on AI this month is part of the inspiration for what comes next.  After all, if a computer can drive a car with no human intervention, why can’t it run a data center?

Consider this vision of a future cloud data center.

The third of four planned 5 million square foot data centers quietly hums to life.  In the control center, banks of monitors show data on everything from number of running cores, to network traffic to hotspots of power consumption.  Over 100,000 ambient temperature and humidity sensors keep track of the environmental conditions, while three cooling towers vent excess heat generated by the massively dense computing and storage farm.

The hardware, made to exacting specifications and supplied by multiple vendors, uses liquid coolant instead of fans – making this one of the quietest and most energy-efficient data centers on the planet.  The 500U racks reach 75 feet up into the cavernous space, though the ceiling is yet another 50 feet higher where the massive turbines draw cold air up through the floors.  Temperature is relatively steady as you go up the racks due to innovative ductwork that vents cold air every 5 feet as you climb.

Advanced robots wirelessly monitor the 10GBps data stream put off by all of the sensors, using their accumulated “knowledge and experience” to swap out servers and storage arrays before they fail. Specially designed connector systems enable individual pieces or even blocks of hardware to be snapped in and out like so many Lego blocks – no cabling required.  All data moves on a fiber backbone at multiple terabytes per second.

On the data center floor, there are no humans.  The PDUs, cooling systems and even the robots themselves are maintained by robots – or shipped out of the data center into an advanced repair facility when needed.  In fact, the control center is empty too – the computers are running the data center.  The only people here are in the shipping bay, in-boarding the new equipment and shipping out the old and broken, and then only when needed.  Most of these work for the shippers themselves.  The data center has no full-time employees.  Even security and access control for the very few people allowed on the floor for emergencies is managed by computers attached to iris and handprint scanners.

The positioning and placement of storage and compute resources makes no sense to the human eye.  In fact, it is sometimes rearranged by the robots based on changing demands placed on the data center – or changes that are predicted based on past computing needs.  Often this is based on private computing needs of the large corporate and government clients who want (and will pay for) increased isolation and security.  The bottom line – this is optimized far beyond what a logical human would achieve.

Tens of millions of cores, hundreds of exabytes of data, no admins.  Sweet.

The software automation is no less impressive.  Computing workloads and data are constantly optimized by the AI-based predictive modeling and management systems.  Data and computing tasks are both considered to be portable – one moving to the other when needed.  Where large data is required, the compute tasks are moved to be closer to the data.  When only a small amount of data is needed, it will often make the trip to the compute server.  Of course, latency requirements also play a part.  A lot of the data in the cloud is maintained in memory — automatically based on demand patterns.
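The "move the compute or move the data" decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Task` fields, the `place_workload` function, the 50 GB threshold and the 10 Gbps link speed are all hypothetical values chosen for the example, not details of any real scheduler.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    data_gb: float          # how much data the task needs to read
    max_latency_ms: float   # how long the task can wait for its data


def place_workload(task: Task, link_gbps: float = 10.0,
                   move_threshold_gb: float = 50.0) -> str:
    """Return 'move_compute' or 'move_data' (hypothetical policy)."""
    # Estimated time to ship the data to the compute server over the link.
    transfer_ms = task.data_gb * 8 / link_gbps * 1000
    # Large data sets, or transfers that would blow the latency budget,
    # pull the compute task toward the data instead.
    if task.data_gb > move_threshold_gb or transfer_ms > task.max_latency_ms:
        return "move_compute"
    # Small data sets make the trip to the compute server.
    return "move_data"


print(place_workload(Task("nightly-analytics", 5000, 60_000)))
print(place_workload(Task("profile-lookup", 0.01, 100)))
```

A real placement engine would also weigh cost, current load and predicted demand; the point here is only that data size and latency budget together drive the direction of the move.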

The security AI is in a constant and all-out running battle with the bots, worms and viruses targeting the data center.  All server images are built with agents and monitoring tools to track anomalies and attack patterns that are constantly updated.  Customers can subscribe to various security services and the image management system automatically checks for compliance. Most servers are randomly re-imaged throughout the day based on the assumption that the malware will eventually find a way to get in.

Everything is virtualized – servers, storage, networking, data, databases, application platforms, middleware and more.  And it’s all as a service, with unlimited scale-out (and scale-in) of all components.  Developers write code, but don’t install or manage most application infrastructure and middleware components.  It’s all there and it all just works.

Component-level failure is assumed and has no impact on running applications.  Over time, as the AI learns, reliability of the software infrastructure underlying any application exceeds 99.999999%.
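To put that figure in perspective, here is a small back-of-the-envelope calculation. It assumes the percentage is read as availability over a non-leap year, which is my interpretation for the sake of the example; the function name is made up.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of downtime per year allowed by a given availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("five nines", 99.999), ("eight nines", 99.999999)]:
    print(f"{label}: {downtime_seconds_per_year(pct):.2f} seconds per year")
```

Five nines allows a little over five minutes of downtime a year. Eight nines allows roughly a third of a second.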

Everything is controllable through APIs, of course.  And those APIs are all standards-based so tools and applications are portable among clouds and between internal data centers and external clouds.

All application code and data is geographically dispersed so even the failure of this mega data center has a minimal impact on applications.  Perhaps a short hiccup is experienced, but it lasts only seconds before the applications and data pick up and keep on running.

Speaking of applications, this cloud data center hosts thousands of SaaS solutions for everything from ERP, CRM and e-commerce to analytics and business productivity. Horizontal and vertical applications too.  All are exposed through Web services APIs so new applications – mashups – can be created that combine them and the data in interesting new use cases.  The barriers between IaaS, PaaS and SaaS are blurred and operationally barely exist at all.

All of this is delivered at a fraction of the cost of today’s IT model.

Large data center providers using today’s automation methods and processes are uncompetitive. Many are on the verge of going out of business and others are merging in order to survive.  A few are going into higher-level offerings – creating custom solutions and services.

The average enterprise data center budget is 1/10th of what it used to be. Only the applications that are too expensive to move or otherwise unsuitable for cloud deployment are still managed in-house, by an ever-dwindling pool of IT operations specialists (everybody else has been retrained in cloud governance and management, or found other careers to pursue).  Everything else is either a SaaS app or otherwise cloud-hosted.

Special-purpose clouds within clouds are easily created on the fly, and just as easily destroyed when no longer needed.

The future of the cloud data center is AI-managed, highly optimized, and incredibly powerful at a scale never before imagined.  The demand for computing power and storage continues to grow at ever increasing rates.  Pretty soon, the data center described above will be considered commonplace, with scores or even hundreds of them sprinkled around the globe.

This is the future – will you be ready?

Follow me on twitter for more cloud conversations: http://twitter.com/cloudbzz

Notice: This article was originally posted at http://CloudBzz.com by John Treadway.

(c) CloudBzz / John Treadway

Filed under: General







22 Nov 10

Ok, I know that this is dangerous.  Randy is a very smart guy and he has a lot more experience on the public cloud side than I probably ever will.  But I do feel compelled to respond to his recent “Elasticity is NOT #Cloud Computing  …. Just Ask Google” post.

On many of the key points – such as elasticity being a side-effect of how Amazon and Google built their infrastructure – I totally agree.  We have defined cloud computing in our business in a similar way to how most patients define their conditions – by the symptoms (runny nose, fever, headache) and not the underlying causes (caught the flu because I didn’t get the vaccine…). Sure, the result of the infrastructure that Amazon built is that it is elastic, can be automatically provisioned by users, scales out, etc.  But the reasons they have this type of infrastructure are based on their underlying drivers – the need to scale massively, at a very low cost, while achieving high performance.

Here is the diagram from Randy’s post.  I put it here so I can discuss it, and then provide my own take below.

My big challenge with this is how Randy characterizes the middle tier.  Sure, Amazon and Google needed unprecedented scale, efficiency and speed to do what they have done.  How they achieved this is through the tactics, tools and methods exposed in the middle tier.  The cause and the results are the same – scale because I need to, efficient because it has to be.  These are the requirements.  The middle layer here is not the results – but the method chosen to achieve them.  You could successfully argue that achieving their level of scale with different contents in the grey boxes would not be possible – and I would not disagree.  Few need to scale to 10,000+ servers per admin today.

However, I believe that what makes an infrastructure a “cloud” is far more about the top and bottom layers than about the middle.  The middle, especially the first row above, impacts the characteristics of the cloud – not its definition.  Different types of automation and infrastructure will change the cost model (negatively impacting efficiency).  I can achieve an environment that is fully automated from bare metal up, uses classic enterprise tools (BMC) on branded (IBM) heterogeneous infrastructure (within reason), and is built with the underlying constraints of assumed failure, distribution, self-service and some level of over-built environment.  And this 2nd grey row is the key – without these core principles I agree that what you might have is a fairly uninteresting model of automated VM provisioning.  Too often, as Randy points out, this is the case.  But if you do build to these row 2 principles…?

Below I have switched the middle tier around to put the core principles as the hands that guide the methods and tools used to achieve the intended outcome (and the side effects).

The core difference between Amazon and an enterprise IaaS private cloud is now the grey “methods/tools” row.  Again, I might use a very different set of tools here than Amazon (e.g. BMC, et al).  This enterprise private cloud model may not be as cost-efficient as Amazon’s, or as scalable as Google’s, but it can still be a cloud if it meets the requirements, core principles and side effects components.  In addition, the enterprise methods/tools have other constraints that Amazon and Google don’t have at such a high priority: internal governance and risk issues, the fact that I might have regulated data, or perhaps that I already have a very large investment in the processes, tools and infrastructure needed to run my systems.

Whatever my concerns as an enterprise, the fact that I chose a different road to reach a similar (though perhaps less lofty) destination does not mean I have not achieved an environment that can rightly be called a cloud.  Randy’s approach of dev/ops and homogeneous commodity hardware  might be more efficient at scale, but it is simply not the case that an “internal infrastructure cloud” is not cloud by default.

Filed under: General







17 Nov 10

A recurring challenge I have with a lot of enterprise vendor “cloud” solutions I get briefed on is that they seem to be designed and built without any real understanding of how and why customers are actually using the cloud today.  I suspect in most cases that this results from the fact that the people building these solutions have NEVER EVER used Amazon, Rackspace, or any other mainstream public cloud offering.

Chris Hoff points out his suspicion of this scenario in his frank assessment of the recently released FedRAMP documentation.

I’m unclear if the folks responsible for some of this document have ever used cloud based services, frankly.

When you gather together a group of product managers, architects, developers and self-styled strategists who have never used a public cloud, and ask them to design a cloud solution, more often than not their offering will not be a cloud solution (or any other kind of solution that customers want).  It’s not that these people are lacking in intelligence.  Rather, they lack the context provided through experience.  Oh, and many large enterprise vendors suck at the very basics of the “customer development process.”  So not only will their solution not be cloudy, it will be released to the market without them knowing this basic piece of information.

Filed under: General







8 Nov 10

Despite the NoSQL hype, traditional relational databases are not going away any time soon. In fact, based on continued market evolution and development, SQL is very much alive and doing well.

I won’t debate the technical merits of SQL vs. NoSQL here, even if I were qualified to do so. Both approaches have their supporters, and both types of technologies can be used to build scalable applications. The simple fact is that a lot of people are still choosing to use MySQL, PostgreSQL, SQL Server and even Oracle to build their SaaS/Web/Social Media applications.

When choosing a SQL option for your cloud-based solution, there are typically three approaches as outlined below. One note – this analysis applies to “mass market clouds” and not the enterprise clouds from folks like AT&T, Savvis, Unisys and others. At that level you often can get standard enterprise databases as a managed service.

  1. Install and Manage – in this “traditional” model the developer or sysadmin selects their DBMS, creates instances in their cloud, installs it, and is then responsible for all administration tasks (backups, clustering, snapshots, tuning, and recovering from a disaster). Evidence suggests that this is still the leading model, though that could soon change. This model provides the highest level of control and flexibility, but often puts a significant burden on developers who must (typically unwillingly) become DBAs with little training or experience.
  2. Use a Cloud-Managed DBaaS Instance – in this model the cloud provider offers a DBMS service that developers just use. All physical administration tasks (backup, recovery, log management, etc.) are performed by the cloud provider and the developer just needs to worry about structural tuning issues (indices, tables, query optimization, etc). Generally your choice of database is MySQL, MySQL, and MySQL – though a small number of clouds provide SQL Server support. Amazon RDS and SQL Azure are the two best known in this category.
  3. Use an External Cloud-Agnostic DBaaS Solution – this is very much like the cloud-managed DBaaS, but adds the value of cloud independence – at least in theory. In the long run you might expect to be able to use an independent DBaaS to provide multi-cloud availability and continuous operations in the event of a cloud failure. FathomDB and Xeround are two such options.

Here’s a chart summarizing some of the characteristics of each model:

In my discussions with most of the RDBMSaaS providers I have found that user acceptance and adoption are very high. When I spoke with Joyent a couple of months ago I was told that “nearly all” of their customers who spend over $500/month with them use their DBaaS solution. And while Amazon won’t give out such specifics, I have heard from them (both corporate and field people) that adoption is “very robust and growing.” The exception was FathomDB, which was launched at DEMO2010. They seem not to have gained much traction, but I don’t get the sense they are being very aggressive. When I spoke with one of their founders I learned they were working on a whole new underlying DBMS engine that would not even be compatible with MySQL. In any event, they have only a few hundred databases at this point. Xeround is still in private beta.

The initial DBaaS value proposition of reducing the effort and cost of administration is worth something, but in some cases it might be seen to be a nice-to-have vs. a need-to-have. Inevitably, the DBaaS solutions on the market will need to go beyond this to performance, scaling and other capabilities that will be very compelling for sites that are experiencing (or expect to experience) high volumes.

Amazon RDS, for instance, just added the ability to provision read replicas for applications with a high read/write ratio. Joyent has had something similar to this since last year when they integrated Zeus Traffic Manager to automatically detect and route query strings to read replicas (your application doesn’t need to change for this to work).
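The general idea of routing queries to read replicas can be sketched in a few lines. This is a simplified illustration of read/write splitting, not how Zeus Traffic Manager or Amazon RDS actually implements it; the `ReadWriteRouter` class and the host names are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send read-only statements to replicas and everything else to the primary."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql: str) -> str:
        statement = sql.lstrip().lower()
        # Locking reads need the latest data, so they stay on the primary.
        if statement.startswith(self.READ_PREFIXES) and "for update" not in statement:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT name FROM users WHERE id = 7"))
print(router.route("UPDATE users SET name = 'Ann' WHERE id = 7"))
```

Because replicas typically lag the primary slightly, a read issued right after a write may see stale data, which is why this approach suits read-heavy applications best.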

Xeround has created an entirely new scale-out option with an interesting approach that alleviates many of the trade-offs of the CAP Theorem. And ScaleBase is soon launching a “database load balancer” that automatically partitions and scales your database on top of any SQL database (at least eventually – MySQL will be first, of course, but plans include PostgreSQL, SQL Server and possibly even Oracle). My friends at Akiban are also innovating in the MySQL performance space for cloud/SaaS applications.
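For readers unfamiliar with automatic partitioning, here is a minimal sketch of the simplest form of it, hash-based sharding. I don’t know how ScaleBase or Xeround implement their products, so treat this purely as a generic illustration; the `shard_for` function and the shard names are made up.

```python
import hashlib


def shard_for(key: str, shards: list[str]) -> str:
    """Pick a shard for a row key using a stable hash (hypothetical scheme)."""
    # hashlib gives the same digest on every run, unlike the built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


shards = ["mysql-0", "mysql-1", "mysql-2"]
for customer in ("customer:42", "customer:43", "customer:44"):
    print(customer, "->", shard_for(customer, shards))
```

The catch with plain modulo hashing is that adding a shard remaps most keys, which is one reason a partitioning layer that handles rebalancing for you is attractive.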

Bottom line, SQL-based DBaaS solutions are starting to address many (though not all) of the leading reasons why developers are flocking to NoSQL solutions.

All of this leads me to the following conclusions – I’m interested if you agree or disagree:

  • Cloud-based DBaaS options will continue to grow in importance and will eventually become the dominant model. Cloud vendors will have to invest in solutions that enable horizontal scaling and self-healing architectures to address the needs of their bigger customers. While most clouds today do not offer an RDS-equivalent, my conversations with cloud providers suggest that may soon change.
  • Cloud-Independent DBaaS options will grow but will be a niche as most users will opt for the default database provided by their cloud provider.
  • The D-I-Y model of installing/managing your own database will eventually also become a niche market where very high scaling, specialized functionality or absolute control are the requirements. For the vast majority of applications, RDBMSaaS solutions will be both easier to use and easier to scale than traditional install/manage solutions.

At some point in the future I intend to dive more into the different RDBMSaaS solutions and compare them at a feature/function level. If I’ve missed any – let me know (I’ll update this post too).

Other Cloud DBMS Posts:

Amazon RDS vs. SQL Azure: The birth of the DBMS Utility

Amazon Adds Consistency to SimpleDB

Databases and Cloud Computing Roundup

Filed under: General







22 Oct 10

At Interop this week I met with Doug Oathout, VP of Converged Infrastructure at HP.  It’s often been very frustrating trying to figure out if HP really has a cloud strategy and is poised to compete in this market.  While nobody would claim that HP is delivering any clarity on cloud right now, it sounds like they might be moving down the path a bit and a more comprehensive strategy might someday emerge.

What Doug talked about first was the economic value of a converged infrastructure (naturally).  In this regard they are positioning against Cisco and the broader VCE Coalition with particular emphasis on openness vs. the more prescriptive VCE approach (any hypervisor vs. VMware only, automation tooling that crosses into legacy environments, etc.).  Cisco might say that the downside of supporting that level of openness is complexity and increased cost.  We’ll let them duke that out, but it’s clear that a market that used to be fragmented (storage, servers, networking, etc. sold by different vendors and integrated at the customer) has tilted towards more integrated and verticalized infrastructures that result in far fewer components and much less work to deploy.  I had to wonder if there was an opportunity for someone to do the same thing with commodity gear targeting the mass-market service provider space.

As for cloud offerings, there seem to be only three at the moment (at least that I was able to learn about in this meeting).

The first is private clouds built from their Matrix converged infrastructure and Cloud Service Automation (CSA) tools bundle (an integrated set of Opsware and other tools).  I guess I’d characterize this as similar to IBM’s CloudBurst circa 2009 and Unisys’ Secure Private Cloud, but with a weaker story on cloudy capabilities such as support for multi-tenancy, scaling out and more.  It’s the “cloud-in-a-box” approach.

Their second cloud offering is a quick-start service (“CloudStart”) to roll out a simple “cloud in a box” solution on customer premises in 30 days. Obviously that’s kind of a bunch of hype, because the process changes, integrations, etc. you need to do to really drive value out of an enterprise cloud program take many months of deep effort.

Their third area is not really a defined offering.  They are doing services around some other cloud technologies, most notably Eucalyptus.  This is natural given the deficiencies in cloud functionality with their CSA-based approach.

Notably absent are any offerings out of their former EDS managed services unit.  Doug mentioned a Matrix Online offering for standing up short-term infrastructure blocks for testing purposes, but it’s not a cloud, isn’t even multi-tenant, and requires HP labor to do the provisioning.  Like I said, not a cloud (if it even exists – I can’t find it on the HP site).

Meanwhile, it seems like IBM is not putting as much emphasis on the CloudBurst approach anymore, instead focusing on their Smart Business Development & Test public cloud offering.  Sources tell me that this offering is doing quite well and several months ago there were tweets about them having run out of capacity.  HP currently has no such offering.

The takeaway for me was that HP is making inching progress in a couple of areas of their business, but no discernible progress on delivering a comprehensive, aligned and compelling enterprise cloud story to the market.  Looks like we’ll be waiting for a bit longer…

Filed under: General,Vendors
