Canonical Voices

Posts tagged with 'deployment'

mark

As we move from “tens” to “hundreds” to “thousands” of nodes in a typical data centre we need new tools and practices. This hyperscale story – of hyper-dense racks with wimpy nodes – is the big shift in the physical world which matches the equally big shift to cloud computing in the virtualised world. Ubuntu’s popularity in the cloud comes in part from being leaner, faster, more agile. And MAAS – Metal as a Service – is bringing that agility back to the physical world for hyperscale deployments.

Servers used to aspire to being expensive. Powerful. Big. We gave them names like “Hercules” or “Atlas”. The bigger your business, or the bigger your data problem, the bigger the servers you bought. It was all about being beefy – with brands designed to impress, like POWER and Itanium.

Things are changing.

Today, server capacity can be bought as a commodity, based on the total cost of compute: the cost per teraflop, factoring in space, time, electricity. We can get more power by adding more nodes to our clusters, rather than buying beefier nodes. We can increase reliability by doubling up, so services keep running when individual nodes fail. Much as RAID changed the storage game, this scale-out philosophy, pioneered by Google, is changing the server landscape.

In this hyperscale era, each individual node is cheap, wimpy and, by historical standards for critical computing, unreliable. But together, they’re unstoppable. The horsepower now resides in the cluster, not the node. Likewise, the reliability of the infrastructure now depends on redundancy, rather than heroic performances from specific machines. There is, as they say, safety in numbers.

We don’t even give hyperscale nodes proper names any more – ask “node-0025904ce794”. Of course, you can still go big with the cluster name. I’m considering “Mark’s Magnificent Mountain of Metal” – significantly more impressive than “Mark’s Noisy Collection of Fans in the Garage”, which is what Claire will probably call it. And that’s not the knicker-throwing kind of fan, either.

The catch to this massive multiplication in node density, however, is in the cost of provisioning. Hyperscale won’t work economically if every server has to be provisioned, configured and managed as if it were a Hercules or an Atlas. To reap the benefits, we need leaner provisioning processes. We need deployment tools to match the scale of the new physical reality.

That’s where Metal as a Service (MAAS) comes in. MAAS makes it easy to set up the hardware on which to deploy any service that needs to scale up and down dynamically – a cloud being just one example. It lets you provision your servers dynamically, just like cloud instances – only in this case, they’re whole physical nodes. “Add another node to the Hadoop cluster, and make sure it has at least 16GB RAM” is as easy as asking for it.
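To make the “just ask for it” idea concrete, here is a minimal sketch driving the present-day MAAS CLI from Python. The profile name is an assumption, the memory constraint is expressed in megabytes, and the CLI sub-commands have changed across MAAS releases, so treat this as illustrative rather than as the canonical workflow.

    # Illustrative only: allocate a machine with at least 16 GB RAM and deploy it.
    # Assumes a recent MAAS CLI already logged in under the hypothetical profile "admin".
    import json
    import subprocess

    PROFILE = "admin"  # hypothetical MAAS CLI profile name

    # Ask MAAS for any available machine with at least 16 GB of RAM (mem is in MB).
    machine = json.loads(subprocess.check_output(
        ["maas", PROFILE, "machines", "allocate", "mem=16384"]
    ))

    # Deploy an operating system onto it; the node can then join, say, a Hadoop cluster.
    subprocess.check_call(
        ["maas", PROFILE, "machine", "deploy", machine["system_id"]]
    )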

With a simple web interface, you can add, commission, update and recycle your servers at will. As your needs change, you can respond rapidly, by adding new nodes and dynamically re-deploying them between services. When the time comes, nodes can be retired for use outside the MAAS.

As we enter an era in which ATOM is as important in the data centre as XEON, an operating system like Ubuntu makes even more sense. Its freedom from licensing restrictions, together with the labour-saving power of tools like MAAS, makes it cost-effective, finally, to deploy and manage hundreds of nodes at a time.

Here’s another way to look at it: Ubuntu is bringing cloud semantics to the bare metal world. What a great foundation for your IAAS.

Szilveszter Farkas

In this blog entry I would like to describe the deployment strategies we use at the different stages of our development process. The stages are the following:

  1. local development
  2. QA
  3. staging (+ QA)
  4. production (+ QA)

For local development everyone is welcome to use their preferred setup, but most of us rely on virtualenv. Since we maintain a large number of different projects, having a separate environment for each of them makes our lives a lot easier. We also have to make sure that we stay aligned with the production environment, which is in some cases still Python 2.5-based, though we’re currently in transition to 2.6.
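As a purely illustrative sketch of that setup (project names and paths are invented), one isolated environment per project can be created against the interpreter version used in production:

    # Hypothetical helper: one virtualenv per project, pinned to the production interpreter.
    # Assumes the virtualenv tool is installed; project names are made up.
    import subprocess

    PROJECTS = ["shop", "payments", "catalogue"]  # illustrative project names
    PRODUCTION_PYTHON = "python2.6"               # match the interpreter used in production

    for project in PROJECTS:
        subprocess.check_call(
            ["virtualenv", "--python", PRODUCTION_PYTHON, "envs/%s" % project]
        )
        # Each project then installs its own dependencies into its environment, e.g.
        #   envs/<project>/bin/pip install -r <project>/requirements.txt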

If a feature or bugfix is ready to be QA’d, we deploy the application to an Amazon EC2 instance. Our team mate, Łukasz Czyżykowski, wrote a collection of extensions to Fabric that provide a few useful helpers (e.g. making it very easy to use private PPAs). With a few dozen lines of simple Python code, we can deploy the whole application to a running EC2 instance. We also use EC2 to QA all the features and bugfixes targeted at a release together before deploying to staging, so that if there is an issue, we can re-deploy very quickly (during the next two stages, QA is mainly about testing for regressions).
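The in-house Fabric extensions mentioned above aren’t public, so the following is only a generic Fabric 1.x sketch of that kind of task; the host, PPA, package and service names are all hypothetical.

    # Generic Fabric 1.x sketch; this is not the actual in-house extension code.
    from fabric.api import env, sudo, task

    env.hosts = ["ubuntu@ec2-203-0-113-10.compute-1.amazonaws.com"]  # hypothetical QA instance

    @task
    def deploy(ppa="ppa:example-team/qa", package="example-app"):
        # Register the (normally private) PPA and refresh the package index.
        sudo("add-apt-repository -y %s" % ppa)
        sudo("apt-get update")
        # Install or upgrade the application package built in the PPA.
        sudo("apt-get install -y %s" % package)
        # Restart the application; the service name is an assumption.
        sudo("service example-app restart")

Running something like fab deploy against the QA instance then pulls whatever was last built in the PPA onto the box.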

The staging and production environments are identical from the deployment process perspective. We simply create a binary Debian package from our application: Launchpad’s PPA feature makes the build process a breeze. The main reason we decided to go with Debian packages is that we can also specify system level dependencies, not only Python packages (and of course there’s some dogfooding involved since the company supports Ubuntu). This also requires that all of the team members have packaging skills, so we had several training sessions, and a two-day online sprint where we packaged lazr.restful and all of its dependencies which were not available in Ubuntu 8.04 (around 30 packages, half of them backports, half of them new packages – thanks to our hard-working team members, these are available for Ubuntu 10.04 as well).
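To make the system-level dependency point concrete: in a Debian package those dependencies live alongside the Python ones, in the Depends field of debian/control. The fragment below is purely hypothetical, with invented package and dependency names:

    # Hypothetical fragment of debian/control for an imaginary "example-app" package.
    Package: example-app
    Architecture: all
    Depends: ${misc:Depends},
             python (>= 2.5),
             python-django,
             postgresql-client,
             memcached
    Description: Example Django application built in a Launchpad PPA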

For configuration we don’t use Django’s built-in settings mechanism, but a custom solution that will be open sourced in the near future (one more reason to keep an eye on our blog). It consists of two components. schemaconfig is responsible for parsing the config files, which are INI-style but have some extra features, like layering, typing, and support for data structures such as lists and dictionaries (we looked around at existing solutions and stole a little bit from everywhere to put together one that fits us best). django-settings is the glue between schemaconfig and Django’s settings, so in the end we still use django.conf.settings. One of the biggest problems with our previous setup was that it was very prone to human error, which caused unexpected deployment issues between staging and production. The layering and the non-Python style of the config files solve this, making them easily manageable both by us and by IS (our operations team).
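schemaconfig itself hadn’t been released when this was written, so the snippet below is only a rough illustration of the layering idea using the standard library’s ConfigParser: a base file supplies defaults, and an environment-specific file read afterwards overrides them. File names, sections and keys are all invented.

    # Rough illustration of layered INI-style configuration; not schemaconfig itself.
    from ConfigParser import SafeConfigParser  # "configparser" on Python 3

    parser = SafeConfigParser()
    # Later files override earlier ones, so staging.cfg only needs to state
    # the deltas from base.cfg; this is the same idea as schemaconfig's layering.
    parser.read(["config/base.cfg", "config/staging.cfg"])

    DEBUG = parser.getboolean("django", "debug")
    DATABASE_HOST = parser.get("database", "host")
    # Basic typing via the parser; schemaconfig layers richer types
    # (lists, dictionaries) on top of plain INI values.
    CACHE_TIMEOUT = parser.getint("cache", "timeout")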

Watch this blog for more about schemaconfig and other exciting projects and articles!
