Canonical Voices

What Barry Warsaw talks about

Posts tagged with 'canonical'

Barry Warsaw

So, now all the world knows that my suggested code name for Ubuntu 12.10, Qwazy Quahog, was not chosen by Mark.  Oh well, maybe I'll have more luck with Racy Roadrunner.

In any case, Ubuntu 12.04 LTS is to be released any day now so it's time for my semi-annual report on Python plans for Ubuntu.  I seem to write about this every cycle, so 12.10 is no exception.  We've made some fantastic progress, but now it's time to get serious.

For Ubuntu 12.10, we've made it a release goal to have Python 3 only on the desktop CD images.  The usual caveats apply: Python 2.7 isn't going away; it will still probably always be available in the main archive.  This release goal also doesn't affect other installation CD images, such as server, or other Ubuntu flavors.  This relatively modest goal, then, only affects packages for the standard desktop CD images, i.e. the alternative installation CD and the live CD.

Update 20120425: To be crystal clear,  if you depend on Python 2.7, the only thing that changes for you is that after a fresh install from the desktop CD on a new machine, you'll have to explicitly apt-get install python2.7.  After that, everything else will be the same.

This is ostensibly an effort to port a significant chunk of Ubuntu to Python 3, but it really is a much wider, Python-community driven effort.  Ubuntu has its priorities, but I personally want to see a world where Python 3 rules the day, and we can finally start scoffing at Python 2 :).

Still, that leaves us with about 145 binary packages (and many fewer source packages) to port.  There are a few categories of packages to consider:

  • Already ported and available.  This is the good news, and covers packages such as dbus-python.  Unfortunately, there aren't too many others, but we need to check with Debian and make sure we're in sync with any packages there that already support Python 3 (python3-dateutil comes to mind).
  • Upstream supports Python 3, but it is not yet available in Debian or Ubuntu.  These packages should be fairly easy to port, since we have pretty good packaging guidelines for supporting both Python 2 and Python 3.
  • Packages with better replacements for Python 3.  A good example is the python-simplejson package.  Here, we might not care as much because Python 3 already comes with a json module in its standard library, so code that depends on python-simplejson and is required for the desktop CD should be ported to use the stdlib json module.  python-gobject is another case where porting is a better option, since pygi (gobject-introspection) already supports Python 3.
  • Canonical is the upstream.  Many packages in the archive, such as python-launchpadlib and python-lazr.restfulclient are developed upstream by Canonical.  This doesn't mean you can't or shouldn't help out with the porting of those modules, it's just that we know who to lean on as a last resort.  By all means, feel free to contribute to these too!
  • Orphaned by upstream.  These are the most problematic, since there's essentially no upstream maintainer to contribute patches to.  An example is python-oauth.  In these cases, we need to look for alternatives that are maintained upstream, and open to porting to Python 3.  In the case of python-oauth, we need to investigate oauth2, and see if there are features we're using from the abandoned package that may not be available in the supported one.
  • Unknowns.  Well, this one's the big risky part because we don't know what we don't know.
We need your help!  First of all, there's no way I can personally port everything on our list, including both libraries and applications.  We may have to make some hard choices to drop some functionality from Ubuntu if we can't get it ported, and we don't want to have to do that.  So here are some ways you can contribute:
  • Fill in the spreadsheet with more information.  If you're aware of an upstream or Debian port to Python 3, let us know.  It may make it easier for someone else to enable the Python 3 version in Debian, or to shepherd the upstream patch to landing on their trunk.
  • Help upstream make a Python 3 port available.  There are lots of resources available to help you port some code, from quick references to in-depth guides.  There's also a mailing list (and Gmane newsgroup mirror) you can join to get help, report status, and have other related discussions.  Some people have asked Python 3 porting questions on StackOverflow, using the tags #python, #python-3.x, and #porting.
  • Join us on the #python3 IRC channel on Freenode.
  • Subscribe to the python-porting mailing list.
  • Get packages ported in Debian.  Once upstream supports Python 3, you can extend the existing Debian package to expose this support into Debian.  From there, you or we can make sure that gets sync'd into Ubuntu.
  • Spread the word!  Even if you don't have time to do any ports yourself, you can help publicize this effort through social media, mailing lists, and your local Python community.  This really is a Python-wide effort!
Python 3.3 is scheduled to be released later this year.  Please help make 2012 the year that Python 3 reaches critical mass!

 -----------------------------

On a more personal note, I am also committed to making Mailman 3 a Python 3 application, but right now I'm blocked on a number of dependencies.  Here is the list of dependencies from the setup.py file, and their statuses.  I would love it if you'd help get these ported too!
Of course, these are only the direct dependencies.  Others that get pulled in include:


Read more
Barry Warsaw

sbuild is an excellent tool for locally building Ubuntu and Debian packages.  It fits into roughly the same problem space as the more popular pbuilder, but for many reasons, I prefer sbuild.  It's based on schroot to create chroot environments for any distribution and version you might want.  For example, I have chroots for Ubuntu Oneiric, Natty, Maverick, and Lucid, Debian Sid, Wheezy, and Squeeze, for both i386 and amd64.  It uses an overlay filesystem so you can easily set up the primary snapshot with whatever packages or prerequisites you want, and the individual builds will create a new session with an overlaid temporary filesystem on top of that, so the build results will not affect your primary snapshot.  sbuild can also be configured to save the session depending on the success or failure of your build, which is fantastic for debugging build failures.  I've been told that Launchpad's build farm uses a customized version of sbuild, and in my experience, if you can get a package to build locally with sbuild, it will build fine in the main archive or a PPA.

Right out of the box, sbuild will work great for individual package builds, with very little configuration or setup.  The Ubuntu Security Team's wiki page has some excellent instructions for getting started (you can stop reading when you get to UMT :).
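
For reference, once you've created a chroot (e.g. with mk-sbuild from ubuntu-dev-tools, or sbuild-createchroot), a typical one-off build is just a single command; the .dsc name here is of course made up:

% sbuild -d oneiric -A mypackage_1.0-1.dsc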

One thing that sbuild doesn't do very well, though, is help you build a stack of packages.  By that I mean, when you have a new package that itself has new dependencies, you need to build those dependencies first, and then build your new package based on those dependencies.  Here's an example.

I'm working on bug 832864 and I wanted to see if I could build the newer Debian Sid version of the PySide package.  However, this requires newer apiextractor, generatorrunner, and shiboken packages (and technically speaking, debhelper too, but I'm working around that), so you have to arrange for the chroot to have those newer packages when it builds PySide, rather than the ones in the Oneiric archive.  This is something that PPAs do very nicely, because when you build a package in your PPA, it will use the other packages in that PPA as dependencies before it uses the standard archive.  The problem with PPAs though is that when the Launchpad build farm is overloaded, you might have to wait several hours for your build.  Those long turnarounds don't help productivity much. ;)

What I wanted was something like the PPA dependencies, but with the speed and responsiveness of a local build.  After reading the sbuild manpage, and "suffering" through a scan of its source code (sbuild is written in Perl :), I found that this wasn't really supported by sbuild.  However, sbuild does have hooks that can run at various times during the build, which seemed promising.  My colleague Kees Cook was a contributor to sbuild, so a quick IRC chat indicated that most people create a local repository, populating it with the dependencies as you build them.  Of course, I want to automate that as much as possible.  The requisite googling found a few hints here and there, but nothing to pull it all together.  With some willful hackery, I managed to get it working.

Rather than post some code that will almost immediately go out of date, let me point you to the bzr repository where you can find the code.  There are two scripts: prep.sh and scan.sh, along with a snippet for your ~/.sbuildrc file to make it even easier.  sbuild will call scan.sh first, but here's the important part: it calls that outside the chroot, as you (not root). You'll probably want to change $where though; this is where you drop the .deb and .dsc files for the dependencies.  Note too, that you'll need to add an entry to your /etc/schroot/default/fstab file so that your outside-the-chroot repo directory gets mapped to /repo inside the chroot.  For example:

# Expose local apt repository to the chroot
/home/barry/ubuntu/repo    /repo    none   rw,bind  0 0
An apt repository needs a Packages and Packages.gz file for binary packages, and a Sources and Sources.gz file for the source packages.  Secure APT also requires a Release and Release.gpg file signed with a known key.  The scan.sh file sets all this up, using the apt-ftparchive command.  The first apt-ftparchive call creates the Sources and Sources.gz file.  It scans all your .dsc files and generates the proper entries, then creates a compressed copy, which is what apt actually "downloads".  The tricky thing here is that without changing directories before calling apt-ftparchive, your outside-the-chroot paths will leak into this file, in the form of Directory: headers in Sources.gz.  Because that path won't generally be available inside the chroot, we have to get rid of those headers.  I'm sure there's an apt-ftparchive option to do this, but I couldn't find it.  I accidentally discovered that cd'ing to the directory with the .dsc files was enough to trick the command into omitting the Directory: headers.
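
To give you a feel for it, the Sources step boils down to something like this (a simplified sketch of what scan.sh does, not the script itself; $where is the outside-the-chroot repo directory mentioned above):

% cd $where
% apt-ftparchive sources . > Sources
% gzip -9c Sources > Sources.gz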

The second call to apt-ftparchive creates the Packages and Packages.gz files.  As with the source files, we get some outside-the-chroot paths leaking in, this time as path prefixes to the Filename: header value.  Again, we have to get rid of these prefixes, but cd'ing to the directory with the .deb files doesn't do the trick.  No doubt there's some apt-ftparchive magical option for this too, but sed'ing out the paths works well enough.
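
Again as a sketch, the Packages step looks roughly like this; the sed expression is just one way to strip the leading paths from the Filename: headers:

% cd $where
% apt-ftparchive packages . > Packages
% sed -i 's|^Filename: .*/|Filename: |' Packages
% gzip -9c Packages > Packages.gz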

The third apt-ftparchive call creates the Release file.  I shamelessly stole this from the security team's update_repo script.  The tricky part here is getting Release signed with a gpg key that will be available to apt inside the chroot.  sbuild comes with its own signing key, so all you have to do is specify its public and private keys when signing the file.  However, because the public file from
/var/lib/sbuild/apt-keys/sbuild-key.pub
won't be available inside the chroot, the script copies it to what will be /repo inside the chroot.  You'll see later how this comes into play.
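
The Release step, roughly sketched (the exact gpg incantation and the private key path are my assumptions; see the script for the real details):

% cd $where
% apt-ftparchive release . > Release
% gpg --no-default-keyring \
      --keyring /var/lib/sbuild/apt-keys/sbuild-key.pub \
      --secret-keyring /var/lib/sbuild/apt-keys/sbuild-key.sec \
      -abs -o Release.gpg Release
% cp /var/lib/sbuild/apt-keys/sbuild-key.pub .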

Okay, so now we have the repository set up well enough for sbuild to carry on.  Later, before the build commences, sbuild will call prep.sh, but this script gets called inside the chroot, as the root user.  Of course, at this point /repo is mounted in the chroot too.  All prep.sh needs to do is add a sources.list.d entry so apt can find your local repository, and it needs to add the public key of the sbuild signing key pair to apt's keyring.  After it does this, it needs to do one more apt-get update.  It's useful to know that at the point when sbuild calls prep.sh, it's already done one apt-get update, so this does add a duplicate step, but at least we're fortunate enough that prep.sh gets called before sbuild installs all the build dependencies.  Once prep.sh is run, the chroot will have your overriding dependent packages, and will proceed with a normal build.
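
Conceptually, prep.sh doesn't need much more than the following (a sketch, run as root inside the chroot; the sources.list.d file name is arbitrary):

# Point apt at the local repository mounted at /repo, trust its key, refresh.
echo "deb file:/repo ./" > /etc/apt/sources.list.d/local-repo.list
echo "deb-src file:/repo ./" >> /etc/apt/sources.list.d/local-repo.list
apt-key add /repo/sbuild-key.pub
apt-get update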

Simple, huh?

Besides getting rid of the hackery mentioned above, there are a few things that could be done better:
  • Different /repo mounts for each different chroot
  • A command line switch to disable the /repo
  • Automatically placing .debs into the outside-the-chroot repo directory

Anyway, it all seems to hang together.  Please let me know what you think, and if you find better workarounds for the icky hacks.
 

Read more
Barry Warsaw

TL;DR: Ubuntu 12.04 LTS will contain only Python 2.7 and 3.2, while Ubuntu 11.10 will contain Python 3.2, 2.7 and possibly 2.6, but possibly not.

Last week, I attended the Ubuntu Developer Summit in Budapest, Hungary. These semi-annual events are open to everyone, and hundreds of people participate both in person and remotely. Budapest's was called UDS-O, where the 'O' stands for Oneiric Ocelot, the code name for Ubuntu 11.10, which will be released in October 2011. This is where we did the majority of planning for what changes, new features, and other developments you'll find in the next version of Ubuntu. UDS-P will be held at the end of the year in Orlando, Florida and will cover the as yet unnamed 12.04 release, which will be a Long Term Support release.

LTS releases are special, because we make longer guarantees for official support: 3 years on the desktop and 5 years on the server. Because of this, we're making decisions now to ensure that 12.04 LTS is a stable, confident platform for years to come.

I attended many sessions, and there is a lot of exciting stuff coming, but I want to talk in some detail about one area that I'm deeply involved in. What's going to happen with Python for Oneiric and 12.04 LTS?

First, a brief summary of where we are today. Natty Narwhal is the code name for Ubuntu 11.04, which was released back in April and is the most recent stable release. It is not an LTS though; the last LTS was Ubuntu 10.04 Lucid Lynx, released back in April 2010. In Lucid, the default Python (i.e. /usr/bin/python) is 2.6 and Python 2.7 is not officially supported or available. Python 3.1 is available for Lucid, but not installed by default.

In Natty, the default Python is 2.7 with 2.6 still being officially supported. This means that you can have both Python 2.6 and 2.7 on your Natty machine, and where possible, packages were built for both Python versions. Where this was not possible, you'll almost always find a package for Python 2.7 instead of 2.6. Natty also has Python 3.2 and 3.1 available, with 3.2 being the default.

Two more bits of background are useful to know. In Ubuntu (inherited from Debian, where most packages are initially developed), we separate Python 2 support and Python 3 support into separate "stacks", meaning entirely separate binary packages even if the source packages are the same. This has many benefits, including allowing a system administrator to install only the Python 2 stack, or only the Python 3 stack if they want. It also makes our eventual transition to Python 3 much easier, because packages don't need to be renamed. So for example, if you see a package named "python-foo" you know this is the Foo package for Python 2. You might also see a "python3-foo" which would be the Python 3 version of Foo.

Also, many packages are built for all supported versions in a particular stack. So for example, if we want to make Foo available for both Python 2.6 and 2.7, we'll include support for both in a single python-foo package. Pure Python source code is generally easily shared, so this reduces the duplication (more on this in another blog posting); however, extension modules, which are usually implemented in C, must be compiled twice, and both shared libraries must be included in the same binary package. This means that if we support a package Example, which contains an extension module, for both Python 2.6 and 2.7, the binary package will contain two shared libraries, effectively doubling the disk consumption for extension module support.
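
To make that concrete, here's roughly what you'd see if you asked dpkg about a hypothetical python-foo package that ships an extension module for both versions (the package name and file names are purely illustrative):

% dpkg -L python-foo | grep '\.so$'
/usr/lib/python2.6/dist-packages/_foo.so
/usr/lib/python2.7/dist-packages/_foo.so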

Keep all that in mind as I describe what comes next!

To understand our plans for Oneiric, it's first useful to explain our goals for the next LTS, since we'll be using 11.10 as a transitional cycle. For 12.04 LTS, we want to support just one Python 2 version and just one Python 3 version. Because Python 2.6 is in security-fix only mode in upstream Python, we want to drop support for it in 12.04 LTS. This will also allow us to reclaim some space on the installation CDs because we won't need to include extension modules compiled for both Python 2.6 and 2.7. Last cycle we calculated the savings at about 10MiB, which is not insignificant on a standard CD.

For 12.04 LTS, the only Python 3 version we want to support is Python 3.2. Our thinking here is that there really isn't much code out there that depends on Python 3 yet, and Python 3.2 has many very useful features that make it (IMO) the first Python 3 to start basing production quality code on. We're going to put our money where our mouth is here, and I'll write more on that later too.

The decision to drop Python 3.1 support for 12.04 LTS is, as far as I know, completely uncontroversial, so this will happen in Oneiric. And because Python 3.3 will not be released before 12.04 LTS, we will be making that change very soon, so as to provide the longest possible period of stabilization and porting between now and April 2012. If you've been holding off on developing for Python 3, now is a great time to jump in!

Dropping Python 2.6 is somewhat more controversial for several reasons. First, in Ubuntu, we rely very heavily on Debian for the majority of packages, and we strongly encourage our developers to submit patches and new packages in Debian first, with requests for syncing to Ubuntu once they're available in Debian. My take on this relationship is that, because Ubuntu has strictly timed releases while Debian has a "release-when-ready" policy, we can often use Ubuntu's development cycle to blaze a trail (sometimes on the bleeding edge ;) but it's always critical to ensure that wherever possible, Debian contains the authoritative versions of our packages. Now that Debian has released Squeeze and is working on its Wheezy release, it's time for us to push our Ubuntu changes back into Debian, and work on getting the latest upstream versions into Debian, while syncing back to Ubuntu. For this reason, we just don't want to get too far ahead of Debian in our Python support. Our plan therefore is to continue to support Python 2.6 until Debian has completed their transition to Python 2.7 as the default version (they already support both, but Python 2.6 is still the default).

Our timeline therefore is to make a final decision on Python 2.6's fate for Oneiric by feature freeze. If Debian still hasn't completed their transition by then, we'll keep Python 2.6 for Oneiric and drop it as soon as the archive opens for 12.04 LTS. This should be pretty low risk for us, and it helps us better align ourselves with Debian, which is always a good thing! If you feel so inclined, you can help by working on some of the blocker bugs for Debian's transition to Python 2.7, as we will also be doing.

Another reason to be cautious about dropping Python 2.6 is because many of the services in our own data center are not yet ported to Python 2.7. Probably the biggest of such services is Launchpad. Our data center machines always run the previous LTS, and Lucid does not have Python 2.7, so this makes for a kind of Catch-22 for the Launchpad team. To address this, we've created a quasi-official PPA into which we'll backport Python 2.7 and many dependent modules. The Launchpad team can then use this PPA to work on their own port to Python 2.7 in plenty of time for 12.04 LTS. Anybody else out there who wants to do the same can also use our PPA, and if they need additional modules backported, they can create their own PPA which depends on ours for the base support.

So, that's what's happening, and why. Feedback is of course invited here, on the ubuntu-devel mailing list, or to me directly. If you want to follow along, you can take a look at the blueprint describing these changes, and more.

In the next articles, I plan to discuss how we're going to get to Python 3 only on the Ubuntu CDs, and how we're going to help with the migration to dh_python2. Cheers!

Read more
Barry Warsaw

What We Do

My friends and family often ask me what I do at my job. It's easy to understand when my one brother says he's a tax accountant, but not so easy to explain the complex world of open source software development I live in. Sometimes I say something to the effect: well, you know what Windows is, and you know what the Mac is right? We're building a third alternative called Ubuntu that is free, Linux-based and in most cases, much better. Mention that you won't get viruses and it can easily breathe new life into that old slow PC you shudder to turn on, and people at least nod their heads enthusiastically, even if they don't fully get it.

I've been incredibly fortunate in my professional career, to have been able to share the software I write with the world for almost 30 years. I started working for a very cool research lab with the US Federal government while still in high school. We had a UUCP connection and were on the early Arpanet, and because we were funded by the US taxpayer, our software was not subject to copyright. This meant that we could share our code with other people on Usenet and elsewhere, collaborate with them, accept their suggestions and improvements, and hopefully make their lives a little better, just as others around the world did for us. It was free and open source software before such terms were coined.


I've never had a "real job" in the sense of slaving away in a windowless cube writing solely proprietary software that would never see the light of day. Even the closed source shops I've worked at have been invested somehow in free software, and with varying degrees of persuasion, have both benefited from and contributed to the free and open source ecosystem. Thus, in many ways, my current position at Canonical feels like the perfect fit and ultimate destination for my background, skills, and passion. Canonical is open source to its core. Its central mission, as articulated by our founder Mark Shuttleworth, is "to bring free software to the widest possible audience, powered by services rather than licenses, in tune with a world that was moving to services as the core economic model of a digital world."

To me, the free and open source ethos goes much deeper than just the software we write. It's about collaboration, community, sharing, learning, teaching, and having a truly positive impact on the world. It's about empowering individuals to realize their full potential, to give them the opportunity to build a merit based reputation, to carve out their own areas of interest and expertise, and relate that to the larger society, to know that they make a difference, and that their opinions and contributions matter. Open source is about having the courage of your convictions, but embracing humility to admit when you're wrong and someone else has a better idea. To encourage and actively seek out consensus, but also to cultivate a thoughtful and compassionate process for making the hard decisions when consensus can't be reached. It's about spreading enthusiasm and rallying others to your side sometimes, and at other times humbly and joyfully embracing other points of view.

I could go on with all the mushy goodness, but let's look at a few areas where the work I do for Canonical directly contributes to the broader free and open source ecosystem.

Python

I've been a core Python developer since about 1995, and have served several times as release manager for major new versions. As part of my job on the Ubuntu Platform Foundations team, I'm keenly concerned with issues involving Python's deployment on Ubuntu and its upstream ancestor Debian. One of the distinctions between Ubuntu/Debian and other Linux distributions is that Ubuntu/Debian often provides more than one Python version in a release. For example, in Natty Narwhal, it's likely that Ubuntu will officially support Python 2.6, 2.7, 3.1, and 3.2. The reason for this is that it does take some porting effort for applications to upgrade their version of Python, and this eases the transition for those applications. But it's complicated by the fact that upstream Python doesn't really support multiple installed versions out of the box. Recent work I've done on PEP 3147 and PEP 3149 improves this situation greatly, by allowing multiple versions of Python to more peacefully coexist. As is typical, I try to push the changes needed for the Ubuntu platform as far upstream as possible, because that benefits the most users of the open source software. In addition, a huge number of users get their Python interpreter from their operating system vendor (this is at least the case on all Linux variants as well as Mac OS X), so work done to improve the Python experience on Ubuntu directly and positively impacts users on Debian and other Linux distributions, as well as the general Python community at large.

GNU Mailman and Launchpad

I've been the lead developer for GNU Mailman since the late 90's, and I've been overwhelmed to see it become the predominant free software mailing list manager. When I was working on the Launchpad team at Canonical, my primary responsibility for the first few years was to integrate mailing lists with Launchpad, and of course Mailman was the obvious choice for underlying technology. At first, Launchpad was closed source software, but the intent was always to release it as open source, and it was with much joy that we saw Launchpad released under a free software license in the summer of 2009. It was not a design goal of the original Mailman software to be easily integrated with external systems, so a good bit of the work I did on Launchpad at the time directly benefited upstream Mailman users by making it easier for ourselves, and others, to do this type of integration. Now that I'm no longer on the Launchpad team, I work on Mailman 3 exclusively in my spare time. My experiences with that integration effort had a direct influence on the architecture of Mailman 3, and I have a goal of swapping the current Mailman 2.1 technology in Launchpad with Mailman 3. All the relevant work to do this is fed back into the upstream project, to the benefit of the larger free software community, since many other people also want a better way to integrate mailing list functionality with their web sites.

My Mailman and Launchpad work also saw many spin-offs of helpful utilities and libraries which can be used by the much wider community of open source developers. The lazr suite of Python libraries has applicability outside Launchpad and I use many of them in my extra-curricular projects, evangelizing them along the way. Many GNU Mailman utilities have also been spun off into their own libraries, and these are now or will soon be available on the Debian and Ubuntu platforms, just as they are now available on the Python Package Index. The work I'm doing, along with my Canonical and Ubuntu colleagues to reduce the barriers to opportunistic participation in projects is key to this ecosystem. We're making it much easier for others to find and participate in the projects that interest them, at whatever level they choose.

UDD

Along those lines, as part of my work on Ubuntu Platform Foundations, I've become quite enthusiastic about technology we're calling Ubuntu Distributed Development. When I moved to the Platform team, I knew next to nothing about the art of packaging, which is how upstream projects are turned into installable and manageable chunks that people can put on their Ubuntu (and Debian) machines. Packaging has a rich tradition, but it's pretty esoteric and to most people who don't package, it's a black art. Even now that I know how to package software, I still think it's way too magical. UDD is a set of tools and procedures that aim to bring packaging to the masses, by reducing the magic and exposing the important parts in much more familiar tools. Every open source developer sooner or later (hopefully much sooner!) learns how to use a revision control system. Such systems allow developers to manage their code in a principled and rigorous way, by keeping detailed records about the changes to their software over time. Revision control systems are an absolutely fundamental and required tool in the open source developer's toolbox.

The Bazaar distributed revision control system was written by Canonical, but is now a community-driven free software project. Its ease of use, superior quality, and high degree of flexibility and extensibility make it a very attractive choice for projects looking to use the next generation in revision control systems. Bazaar also integrates very nicely with Launchpad. UDD is a set of extensions to bring packaging to the Bazaar suite, so that the very tools that upstream software developers use dozens of times a day are also the tools they will use to create packages for making their software available to the mass of Ubuntu users. I've been using Bazaar for years now, and have written a few plugins and patches. I'll soon be helping to lead an effort to improve the UDD workflows and more widely expose the benefits of UDD to more and more Ubuntu developers.

Platform

I love my work on Platform because it allows me to have a small hand in lots of different upstream free and open source projects. After so many years of hacking on such projects of all different stripes, I'm fairly good at looking at and understanding good open source code, debugging and fixing problems, and interacting with the upstream communities to get those patches pushed higher up the stack. Bug trackers, mailing lists, IRC, and revision control systems are the core technologies needed for this work, and I'm comfortable interacting on all of them. Part of our job as Platform developers is, in my opinion, to engage the upstream projects, so that the fixes and changes we need to make in order to provide the absolute best experience to Ubuntu users are available even to those folks who don't get their software from Ubuntu. This to me is the true meaning of being a free and open source developer for a company whose mission is to make free and open source available to the mass of computer users. Our aim is not just to make software that competes on price, or liberty, but also on quality and "sex appeal" - we want our users to flock to Ubuntu not just because it costs nothing, but because it's so compelling that you can't help but love it.

Read more
Barry Warsaw

I'm doing some work these days on trying to get Python 2.7 as the default Python in the next version of Ubuntu, Maverick Meerkat (10.10). This work will occasionally require me to break my machine by installing experimental packages. That's a good and useful thing because I want to test various potentially disruptive changes before I think about unleashing them on the world. This is where virtual machines really shine!


To be efficient, I need a really fast turnaround from known good state, to broken state, back to known good state. In the past, I've used VMware Fusion on my Mac to create a VM, then take a live snapshot of the disk before making my changes. It was really easy then to revert to the last known good snapshot, try something else and iterate.

But lately Fusion has sprouted a nasty habit of freezing the host OS, such that a hard reboot is necessary. This will inevitably cause havoc on the host, by losing settings, trashing mail, corrupting VMs, etc. VMware can't reproduce the problem but it happens every time to me, and it hurts, so I'm not doing that any more :).

Back to my Lucid host and libvirt/kvm and the sanctuary of FLOSS. It's really easy to create new VMs, and there are several ways of doing it, from virt-manager to vmbuilder to straight up kvm (thanks Colin for some recipes). The problem is that none of these are exactly fast to go from bare metal to working Maverick VM with all the known good extras I need (like openssh-server and bzr, plus my comfortable development environment).

I didn't find a really good fit for vmbuilder or the kvm commands, and I'm not smart enough to use the libvirt command line tools, but I think I've figured out a hack using virt-manager that will work well enough.

1. Create a disk for the baseline VM (named 'scars' in my case :) manually
% qemu-img create -f qcow2 scars.qcow2 20G

2. Create the baseline VM using virt-manager
* I use dhcp internally, so I give this thing a mac address, assign it 1GB of RAM and 1 processor.
* For storage, I tell it to use the scars.qcow2 file I created above
* Boot from the maverick ISO of your choice, install everything you want, and get your development environment in place
* Shut this machine down

3. Clone your baseline VM
* In the virt-manager Manager window, right click on your baseline VM and select Clone
* You will not be given an opportunity to select a disk or a mac address, so for now just go with the defaults.
* Do not start your clone

4. Create an 'overlay' disk that is backed by your baseline disk.
% qemu-img create -f qcow2 -b scars.qcow2 scars.ovl

5. Edit your clone
* Delete the disk given to your clone by default
* Create a new virtio storage that points to scars.ovl
* Delete the nic given to your clone by default
* Create a new virtio network device with the mac address of your baseline. You'll get a warning about a mac address collision, but this can be ignored (see below).

6. Boot your clone

At this point you'll have a baseline which is your known good system, and a clone/overlay which you can break to your heart's content. When it's time to iterate back to a known good state, shut down your clone, delete the overlay disk, and create a new one from the baseline qcow2 disk. This is pretty fast, and your turn around time is not much more than the time it takes to shutdown one machine and boot another. It actually feels a lot faster by the wall clock than Fusion ever was to snapshot and restore.
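
In other words, resetting to the known good state is just a matter of (reusing the file names from step 4):

% rm scars.ovl
% qemu-img create -f qcow2 -b scars.qcow2 scars.ovl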

One downside is that you cannot run both VMs at the same time. I think mostly this is because of the MAC address collision, but also because creating the overlay requires that both machines be powered off.

The other downside seems to be that if you want to update your known good baseline, say by installing more packages or apt-get update/upgrade, you will have to recreate your overlay disk for your next experiment. Changes to the underlying disk do not seem to propagate to the overlay automatically. Maybe that's intentional; I can't find much documentation on it. (Note too that the manpage for qemu-img does not describe the -b option.)

I guess the last downside is that I spent way too much time trying to figure all this out. The Googles were not a lot of help but did give me the qemu-img clue. But at least now you don't have to! :)

Read more
Barry Warsaw

My friend Tim is working on a very cool Bazaar-backed wiki project and he asked me to package it up for Ubuntu. I'm getting pretty good at packaging Python projects, but I always like the practice because each time it gets a little smoother. This one I managed to package in about 10 minutes so I thought I'd outline the very easy process.


First of all, you want to have a good setup.py, and if you like to cargo cult, you can start with this one. I highly recommend using Distribute instead of setuptools, and in fact the former is what Ubuntu gives you by default. I really like adding the distribute_setup.py which gives you nice features like being able to do python setup.py test and many other things. See lines 18 and 19 in the above referenced setup.py file.

The next thing you'll want is Andrew Straw's fine stdeb package, which you can get on Ubuntu with sudo apt-get install python-stdeb. This package is going to bootstrap your debian/ directory from your setup.py file. It's not perfectly suited to the task (yet, Andrew assures me :), but we can make it work!

These days, I host all of my packages in Bazaar on Launchpad, which is going to make some of the following steps really easy. If you use a different hosting site or a different version control system, you will have to build your Ubuntu package using more traditional means. That's okay, once you have your debian/ directory, it'll be fairly easy (but not as easy as described here). If you do use Bazaar, you'll just want to make sure you have the bzr-builddeb plugin. Just do sudo apt-get install bzr-builddeb on Ubuntu and you should get everything you need.

Okay, so now that you have the requisite packages and a setup.py, let's build us a deb and upload it to our personal package archive so everyone on Debian and Ubuntu can easily try it out.

First, let's create the debian directory. Here's the first little icky bit:

% python setup.py --command-packages=stdeb.command sdist_dsc

Notice that this leaves us with a deb_dist/ directory, not the debian/ directory we want. The latter is in there, just buried a bit. Let's dig it out:

% mv deb_dist/wikkid-0.1/debian .
% rm -rf deb_dist
% bzr add debian
% bzr commit -m'Debianize'

Note that "wikkid-0.1" will be replaced by the name of your package. In order to build the .deb package, you need an "orig.tar.gz" file. Packaging sort of assumes that you've got an original upstream tarball somewhere and you're just adding the necessary Debian goo to package the thing. In this case, we don't have an upstream tarball, although we could easily create one, and upload it to the Cheeseshop or Launchpad or wherever. However, that just slows us down so let's skip that for now! (Aside: if you do have an upstream tarball somewhere, you'll want to add a debian/watch which points to it; that'll eliminate the need to do the next step, by downloading the tarball instead).

Let's create the tarball right now and copy it to where the following step will expect it:

% python setup.py sdist
% mv dist/Wikkid-0.1.tar.gz ../wikkid_0.1.orig.tar.gz

Here's the second icky bit. Building a Debian source package imposes a very specific naming convention on the tarball. Wikkid's setup.py happens to build a tarball with an incompatible name, while the sdist command leaves it in a place where the next step can't find it. The rename just gets everything into the proper place. YMMV.

Now we can build the Debian source package. It's the source package that we'll upload to our Launchpad PPA. Launchpad will then automatically (if we've done everything right) build the binary package from the uploaded source package, from which Ubuntu and Debian users can easily install.

Oops! Before we do this, please edit your debian/changelog file and change unstable to lucid. You should also change the version number by adding a ~ppa1 to the end of it. Yeah, more ickiness.
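
If you have devscripts installed, dch can make both of those changes in one go; something along these lines should work (it adds a new changelog entry rather than editing the existing one, which is fine for a PPA upload, and -b allows the lower ~ppa1 version):

% dch -b -v 0.1-1~ppa1 -D lucid "Rebuild for my PPA"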

Alright now we're ready to build our source package:

% bzr bd -S

Now let's upload it (assuming you've enabled a PPA):

% cd ..
% dput ppa:barry/python wikkid_0.1-1~ppa1_source.changes

That's it! If you've done everything successfully, you'll have the package in your PPA in 5 minutes or so. Then anybody who's added your PPA can just apt-get install wikkid (or whatever your package is called).
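
For example, on Lucid that would be roughly:

% sudo add-apt-repository ppa:barry/python
% sudo apt-get update
% sudo apt-get install wikkid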

I do hope to work with the appropriate developers to make some of the ickiness go away. Please do contact me if you want to help!

Addendum (2010-06-10)

Let's say you publish your tarball on the Cheeseshop or Launchpad, and you don't want to have to build a different tarball locally in order to package it. Here's what I think works:

Create a debian/watch file that points to the download location you publish to. If your package is not yet available in Debian or Ubuntu, then use this command to build your source package:

bzr bd -S -- -sa

The bit at the end tells the Debian packaging primitives to include your tarball when your source package is uploaded.  The debian/watch file is used to download your published tarball and automatically rename it to the required .orig.tar.gz name.  When you dput your package, your tarball will be uploaded too, and everything should build properly.

Oh, and don't forget to look carefully at the lintian output. Try to make this as clean as possible. The Debian and Ubuntu packaging guides can help here.

Addendum 2 (2010-06-10)

Andrew Straw has added a debianize command to his stdeb package, which makes things much nicer. With this you can create the debian/ directory right next to your setup.py. AFAIK, this version of stdeb isn't released yet, so you need to install his git head in a virtualenv, and it has a few minor buglets, but it does seem like the best-of-breed solution. I'll post another article with a more detailed follow up later.

Read more
Barry Warsaw

Gentoo No More

Today I finally swapped my last Gentoo server for an Ubuntu 10.04 LTS server. Gentoo has served me well over these many years, but with my emerge updates growing to several pages (meaning, I was waaaay behind on updates with almost no hope of catching up) it was long past time to switch. I'd moved my internal server over to Ubuntu during the Karmic cycle, but that was a much easier switch. This one was tougher because I had several interdependent externally facing services: web, mail, sftp, and Mailman.


The real trick to making this go smoothly was to set up a virtual machine in which to install, configure and progressively deploy the new services. My primary desktop machine is a honkin' big i7-920 quad-core Dell with 12GB of RAM, so it's perfectly suited for running lots of VMs. In fact, I have several Ubuntu, Debian and even Windows VMs that I use during my normal development of Ubuntu and Python. However, once I had the new server ready to go, I wanted to be able to quickly swap it into the real hardware. So I purchased a 160GB IDE drive (since the h/w it was going into was too old to support SATA, but still perfectly good for a simple Linux server!) and a USB drive enclosure. I dropped the new disk into the enclosure, mounted it on the Ubuntu desktop and created a virtual machine using the USB drive as its virtio storage.

It was then a pretty simple matter of installing Ubuntu 10.04 on this USB drive-backed VM, giving the VM an IP address on my local network, and installing all the services I wanted. I could even register the VM with Landscape to easily keep it up-to-date as I took my sweet time doing the conversion. There were a few tricky things to keep in mind:

  • I use a port forwarding border router to forward packets from my static external IP address to the appropriate server on my internal network. As I prepared to move each service, I first shut the service off on the old server, twiddled the port forwarding to point to a bogus IP, then tested the new service internally before pointing the real port forward to the new service. This way for example, I had reasonably good confidence that my SMTP server was configured properly before loosing the fire hose of the intarwebs on it.
  • I host several domains on my server so of course my Apache uses NameVirtualHosts. The big downside here is that the physical IP address is used, so I had to edit all the configs over to the temporary IP address of the VM, then back again to the original IP of the server, once the switch was completed.
  • My old server used a fairly straightforward iptables configuration, but in Ubuntu, UFW seems to be the norm. Again, I use IP addresses in the configuration, so these had to be changed twice during the migration.
  • /etc/hosts and /etc/hostname both had to be tweaked after the move since while living in a VM, the host was called something different than when in its final destination. Landscape also had to be reconfigured (see landscape-config(8)).
You get the picture. All in all, just tedious if not very difficult. One oddness was that when the machine was a Gentoo box, the ethernet port was eth0 but after the conversion to Ubuntu, it became eth1. Once I figured that out, it was easy to fix networking. There were a few other little things like updates to my internal DNS, pointing my backup server to the new locations of the web and Mailman data on the new server (Ubuntu is more FHS-compliant than my ancient Gentoo layout), and I'm sure a few more that I forgot to take notes on.

Still, the big lesson here was that by bouncing the services to a USB-drive backed VM, I was able to fairly easily drop the new disk into the old server for a quick and seamless migration to an entirely new operating system.

Read more