Canonical Voices


I just uploaded Apport 2.1 to Quantal. A big change in that version is that the whole code now works with both Python 2 and 3, except for the launchpadlib crash database backend (as we do not yet have a python3-launchpadlib package).

I took some care that apport report objects get along with both strings (unicode type in Python 2) and byte arrays (str type in Python 2) in values, so most package hooks should still work. However, now is the time to check whether they also work with Python 3, to make the impending transition to Python 3 easier.

However, you need to watch out if you use projects or scripts which directly use python-apport to process reports: The load(), write(), and write_mime() methods now require the passed file descriptors to be open in binary mode. You will get an exception otherwise.

A common pattern so far has been code like

  report = apport.Report()
  report.load(open('myfile.crash'))

This needs to be changed to

  report = apport.Report()
  with open('myfile.crash', 'rb') as f:
      report.load(f)

The “with” context is not strictly required, but it takes care of timely closing the files again. This avoids ResourceWarning spew when you run this in test suites or enable warnings.
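The binary-mode requirement can be illustrated without Apport itself. The following is a minimal sketch using only the standard library; write_record() is a hypothetical stand-in for a method that writes bytes and is not part of python-apport. It shows why a text-mode file object fails under Python 3 while a binary-mode one works.

```python
import os
import tempfile


def write_record(f, key, value):
    """Hypothetical stand-in for a report writer: emits one 'Key: value' line as bytes."""
    f.write(key.encode('UTF-8') + b': ' + value.encode('UTF-8') + b'\n')


def try_write(mode):
    """Write one record to a temporary file opened in `mode`.

    Returns None on success, or the TypeError raised by the file object.
    """
    fd, path = tempfile.mkstemp(suffix='.crash')
    os.close(fd)
    try:
        # the "with" block closes the file even if writing fails
        with open(path, mode) as f:
            write_record(f, 'ProblemType', 'Crash')
        return None
    except TypeError as e:
        return e
    finally:
        os.unlink(path)
```

Under Python 3, try_write('wb') succeeds, while try_write('w') returns the TypeError raised when bytes are written to a text-mode file.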


The first Beta of the upcoming PostgreSQL 9.2 was released yesterday (see announcement). Your humble maintainer has now created packages for you to test. Please give them a whirl, and report any problems/regressions that you may see to the PostgreSQL developers, so that we can have a rock solid 9.2 release.

Remember, with the postgresql-common infrastructure you can use pg_upgradecluster to create a 9.2 cluster from your existing 8.4/9.1 cluster and run them both in parallel without endangering your data.

For Debian the packages are currently waiting in the NEW queue; I expect them to go into experimental in a day or two. For Ubuntu 12.04 LTS you can get packages from my usual PostgreSQL backports PPA. Note that you need at least postgresql-common version 0.130, which is available in Debian unstable and the PPA now.

I (or rather, the postgresql-common test suite) found one regression: Upgrades do not keep the current value of sequences, but reset them to their default value. I reported this upstream and will provide updated packages as soon as this is fixed.


This announcement comes very late (a week after release), but better late than never.

The first PyGObject 3.3 series release is now out, with lots of yummy fixes and improvements. Dieter, Sebastian, and I went through a round of bugzilla spring cleaning to clean up old bugs, fix simple bugs, and apply good patches that were waiting, so as a result the patch queue is now almost empty and PyGObject works better than ever.

There was also quite some work on the test suite: it became a lot stricter and more robust, and now also enforces PEP8 compatibility and the absence of pyflakes errors in the code.

One small but handy new feature is that the freeze_notify() and handler_block() methods are now context managers, i. e. they automatically call the corresponding thaw_notify()/handler_unblock() at the end of the with statement in an exception-safe way. (#672324)
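The exception-safe behaviour described here can be sketched in pure Python. This is only an illustration of the pattern with a toy Notifier class (a hypothetical name); it is not how PyGObject implements it, and unlike the real method this toy version only freezes when the with block is entered.

```python
from contextlib import contextmanager


class Notifier:
    """Toy object that mimics a freeze/thaw notification counter."""

    def __init__(self):
        self.freeze_count = 0
        self.log = []

    def thaw_notify(self):
        self.freeze_count -= 1
        self.log.append('thaw')

    @contextmanager
    def freeze_notify(self):
        self.freeze_count += 1
        self.log.append('freeze')
        try:
            yield self
        finally:
            # runs even if the body of the "with" block raises
            self.thaw_notify()


def demo():
    """Raise inside the block and show that the thaw still happens."""
    obj = Notifier()
    try:
        with obj.freeze_notify():
            raise ValueError('something went wrong inside the block')
    except ValueError:
        pass
    return obj.freeze_count, obj.log
```

Even though the block raises, the counter is back at zero afterwards, which is the guarantee that a manual freeze/thaw pair does not give you for free.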

There are almost no API changes in this release, so it should work fine with GNOME 3.4 and applications developed with pygobject 3.2. The one exception is the removal of the GObject.get_data() and GObject.set_data() methods. They were prone to errors and crashes as they are not safely bindable, and in Python you can and should just use normal Python object attributes instead.

Complete list of changes:

  • GSettings: allow extra keyword arguments (Giovanni Campagna) (#675105)
  • pygtkcompat: Correct Userlist module use (Jose Rostagno) (#675084)
  • Add release-news make rule (Martin Pitt)
  • Add “make check.nemiver” target (Martin Pitt)
  • Test flags and enums in GHash values (Martin Pitt) (#637466)
  • tests: Activate test_hash_in and apply workaround (Martin Pitt) (#666636)
  • Add special case for Gdk.Atom array entries from Python (Martin Pitt) (#661709)
  • test_gdbus: Call GetConnectionUnixProcessID() with correct signature (Martin Pitt) (#667954)
  • Add test case for Gtk.ListStore custom sort (Martin Pitt) (#674475)
  • GTK overrides: Add missing keyword arguments (Martin Pitt) (#660018)
  • Add missing override for TreeModel.iter_previous() (Martin Pitt) (#660018)
  • Drop obsolete drag method conversions (Martin Pitt) (#652860)
  • tests: Replace deprecated assertEquals() with assertEqual() (Martin Pitt)
  • Plug tiny leak in constant_info_get_value (Paolo Borelli) (#642754)
  • Fix len_arg_index for array arguments (Bastian Winkler) (#674271)
  • Support defining GType properties from Python (Martin Pitt) (#674351)
  • Handle GType properties correctly (Bastian Winkler) (#674351)
  • Add missing GObject.TYPE_GTYPE (Martin Pitt)
  • Fix for Python 3 (Martin Pitt)
  • Make callback exception propagation test stricter (Martin Pitt) (#616279)
  • Add context management to freeze_notify() and handler_block(). (Simon Feltman) (#672324)
  • Add support for GFlags properties (Martin Pitt) (#620943)
  • Wrap GLib.Source.is_destroyed() method (Martin Pitt) (#524719)
  • Fix error message when trying to override a non-GI class (Martin Pitt) (#646667)
  • Fix segfault when accessing __grefcount__ before creating the GObject (Steve Frécinaux) (#640434)
  • Do not bind gobject_get_data() and gobject_set_data() (Steve Frécinaux) (#641944)
  • Add test case for multiple GLib.MainLoop instances (Martin Pitt) (#663068)
  • Add a ccallback type which is used to invoke callbacks passed to a vfunc (John (J5) Palmieri) (#644926)
  • Regression test: marshalling GValues in GHashTable (Alberto Mardegan) (#668903)
  • Update .gitignore (Martin Pitt)
  • Fix “distcheck” and tests with out-of-tree builds (Martin Pitt)
  • Add a pep8 check to the makefile (Johan Dahlin) (#672627)
  • PEP8 whitespace fixes (Johan Dahlin) (#672627)
  • PEP8: Remove trailing ; (Johan Dahlin) (#672627)
  • tests: Replace deprecated Python API (Martin Pitt)
  • Fail tests if they use or encounter deprecations (Martin Pitt)
  • Do not run tests in two phases any more (Martin Pitt)
  • test_overrides: Find local gsettings schema with current glib (Martin Pitt)
  • Add GtkComboBoxEntry compatibility (Paolo Borelli) (#672589)
  • Correct review comments from Martin (Johan Dahlin) (#672578)
  • Correct pyflakes warnings/errors (Johan Dahlin) (#672578)
  • Make tests fail on CRITICAL logs, too, and apply to all tests (Martin Pitt)
  • Support marshalling GI_TYPE_TAG_INTERFACE (Alberto Mardegan) (#668903)
  • Fix warnings on None values in added tree/list store rows (Martin Pitt) (#672463)
  • pygtkcompat test: Properly clean up PixbufLoader (Martin Pitt)


Half a year ago I blogged about the changed expectations and processes to improve the quality of the development release, which we discussed at the UDS in Orlando: A promise that we don’t break the development version, regressions are not to be tolerated, acceptance criteria for Canonical upstreams. For that we introduced the Stable+1 team, actually did some reversions of broken packages, our QA team set up rigorous daily installation image and upgrade tests, and the code development process for Unity and related projects was changed to enforce buildability and passing automatic tests with each and every change to trunk.

To be honest I was still a tad sceptical back then when this was planned. These were a lot of changes for one cycle, the stable+1 team was a considerable resource investment (starting with three people fulltime in the first few months), and not least our friends in the DX team felt thwarted because they had to sit down for a long time developing tests, and then change their habits and practices for development.

So was all that effort worth it?


Just a random sample of goodness that this brought:

  • It was nice to not have to sit down for an hour every couple of days to figure out how to get back my desktop after the daily dist-upgrade bricked it.
  • Unity, compiz, and friends were remarkably stable. I still remember the previous cycles where every new version got differently crashy, broke virtual workspaces, and what not. The worst thing that happened this cycle is eternally breaking keybindings (or changing them around), but at least those usually had obvious workarounds.
  • As a result of those, I think we had at least one, maybe two orders of magnitude more testers of the daily development release than in previous cycles. So we got a lot of good bug reports and also patch contributions for smaller issues in Precise which we otherwise would not have discovered.
  • The daily dist-upgrade tests tremendously helped to uncover packaging problems which would break real-world upgrades out there by the dozens. It took months to fix the hardest one: upgrading 10.04 LTS to 12.04 LTS with all universe packages offered in software-center. This beast takes 13 hours to run, so nobody really did manual tests like that in the past cycles.
  • Due to the daily automatic CD image builds we dramatically reduced both the cost of fixing regressions as well as the emergency hackathons during milestone preparations. It is a lot easier to unbreak e. g. LVM setup or OEM install modes on our images when the regression happened just a day before than discovering it two days before a milestone is due, as again nobody tests these less common modes very often.
  • So as a result, I really think the investments into QA and the stable+1 teams already paid off twofold by giving us more time to work on the less critical fixes, avoiding lots of user frustration about broken upgrades, and generally making the daily development a lot more enjoyable. Or, as Rick Spencer puts it: Velocity, velocity, velocity!

    Despite these improvements, there are still some improvements I’m looking forward to in the next cycles: Thanks to Colin Watson we can now use -proposed as a proper staging area, and used this feature rather extensively in the past month. From my point of view, 90% of the remaining daily dist-upgrade failures were due to packages building on different architectures at vastly different times, or failing on some, but not all architectures (“arch skew”). This is something you cannot really predict or guard against as a developer when you upload large and potentially harmful packages directly to the development release, so uploading them to the staging area and letting everything build there will reduce the breakage to zero. This was successfully demonstrated with Unity, GTK, and other packages where arch skew pretty much always causes people to hose their desktop, as well as daily CD images not working.

    I’m also looking forward to combining the staging area with lots of automatic tests against reverse dependencies (e. g. testing the installer against a new GTK or pygobject before it lands), something we have only just dipped our toes into.

    I can’t imagine how we were ever able to develop our new releases the old way. :-)

    Precise Pangolin^W^WUbuntu 12.04, I’m proud of you! Go out and amaze people!


I just released a new pygobject version 3.1.92, for this week’s GNOME 3.3.92. This was my first-ever GNOME release (yay!), so please bear with me.

One highlight of this release is the new pygtkcompat module, contributed by Johan Dahlin. It provides backwards compatibility to pygtk far beyond what the Gtk overrides do, and also includes some shims for the old static webkit, gudev, and other modules. You can, and have to, enable them individually:

import gi.pygtkcompat

# enable "gobject" and "glib" modules
gi.pygtkcompat.enable()
# enable the "gtk" module
gi.pygtkcompat.enable_gtk(version='3.0')

import glib
import gtk

Now you can use gtk.Window(), glib.timeout_add() etc. as before, and these will be transparently converted into their modern GI counterparts. Please note that this is still in its infancy, and also mostly meant to ease the porting to GI. It’s not something we’ll keep forever.

Thanks to Michel Dänzer this release now also works properly on big-endian machines.

I mostly worked on fixing the calls of methods which take a list of GValues as arguments, such as Gtk.ListStore.insert_with_valuesv() and similar functions, and made the override API for tree models (append() etc. with providing row data) atomic wrt. the signals it sends out.

I want to thank Johan and Paolo for the nice teamwork with reviewing each other’s patches. That’s open source at its best!

Complete list of changes:

  • Correct Gtk.TreePath.__iter__ to work with Python 3 (Johan Dahlin)
  • Fix test_everything.TestSignals.test_object_param_signal test case (Martin Pitt)
  • Add a PyGTK compatibility layer (Johan Dahlin)
  • pygtkcompat: Remove first argument for get_origin() (Johan Dahlin)
  • Fix to work with Python 3 (Martin Pitt)
  • GtkViewport: Add a default values for the adjustment constructor parameters (Johan Dahlin)
  • GtkIconSet: Add a default value for the pixbuf constructor parameter (Johan Dahlin)
  • PangoLayout: Add a default value for set_markup() (Johan Dahlin)
  • Gtk[HV]Scrollbar: Add a default value for the adjustment constructor parameter (Johan Dahlin)
  • GtkToolButton: Add a default value for the stock_id constructor parameter (Johan Dahlin)
  • GtkIconView: Add a default value for the model constructor parameter (Johan Dahlin)
  • Add a default value for column in Gtk.TreeView.get_cell_area() (Johan Dahlin)
  • Atomic inserts in Gtk.{List,Tree}Store overrides (Martin Pitt)
  • Fix Gtk.Button constructor to accept use_stock parameter (Martin Pitt)
  • Correct bad rebase, remove duplicate Window (Johan Dahlin)
  • Add bw-compatible arguments to Gtk.Adjustment (Johan Dahlin)
  • GtkTreePath: make it iterable (Johan Dahlin)
  • Add a default argument to TreeModelFilter.set_visible_func() (Johan Dahlin)
  • Add a default argument to Gtk.TreeView.set_cursor (Johan Dahlin)
  • Add a default argument to Pango.Context.get_metrics() (Johan Dahlin)
  • Fix double-freeing GValues in arrays (Martin Pitt)
  • Renamed “property” class to “Property” (Simon Feltman)
  • Fix Python to C marshalling of GValue arrays (Martin Pitt)
  • Correct the Gtk.Window hierarchy (Johan Dahlin)
  • Renamed getter/setter instance attributes to fget/fset respectively. (Simon Feltman)
  • Add Gtk.Arrow/Gtk.Window constructor override (Johan Dahlin)
  • Fix marshalling to/from Python to work on big endian machines. (Michel Dänzer)
  • Use gi_cclosure_marshal_generic instead of duplicating it. (Michel Dänzer)
  • Override Gtk.TreeView.get_visible_range to fix return (René Stadler)
  • Plug memory leak in _is_union_member (Paolo Borelli)
  • tests: Split TestInterfaces into separate tests (Sebastian Pölsterl)
  • README: Update current maintainers (Martin Pitt)


Part of our efforts to reduce power consumption in Ubuntu is to provide an easy tool to hunt down which programs and devices are to blame for inordinate power consumption. powertop’s interactive mode is pretty good for this if you are sitting in a train and want to tweak some knobs to max out battery life, but we need something more reproducible and noninteractive for developers who want to file proper bug reports.

So I wrote a little script power-usage-report which calls fatrace for measuring file access activity from programs, and powertop-1.13 to measure process and device wakeups, cleans up and sorts their output, and generates a report which is appropriate to attach to bug reports, send around, put into Jenkins for measuring daily progress, etc. It is now part of fatrace version 0.4, so today’s Precise upgrades will have it.

The output has several sections for disk access (which prevent the disk from spinning down), wakeups (causing CPU power usage), and device activity. Disk/wakeups are sorted in descending order by process:

$ sudo power-usage-report
Measurement will begin in 5 seconds. Please make sure that the
computer is idle, i. e. do not press keys, start or operate programs, and that
programs are not busy with active tasks other than the one you want to examine.
Starting measurement for 60 seconds...
Measurement complete. Generating report...
======= unity-panel-ser: 5 file access events ======
/usr/share/zoneinfo/UTC: 1 reads
/usr/share/zoneinfo/posix/Europe/Berlin: 1 reads
/etc/localtime: 3 reads

======= gnome-settings-: 1 file access events ======
/etc/fstab: 1 reads

======= telepathy-gabbl: 1 file access events ======
/home/martin/.cache/wocky/caps/caps-cache.db: 1 reads

====== Wakeups ======
  30,9% ( 52,0)   compiz
  16,3% ( 27,4)   [iwlwifi] 
  12,5% ( 21,0)   [i915] 
   3,7% (  6,3)   [ahci] 
   2,3% (  3,9)   swapper/3
   1,2% (  2,0)   gvfs-afc-volume

====== Devices ======
An audio device is active 100,0% of the time:
hwC0D0 Conexant CX20585 

Recent USB suspend statistics
Active  Device name
100,0%	USB device 1- : USB Mouse (A4Tech)
100,0%	/sys/bus/usb/devices/1-
100,0%	USB device 1-1.5.4 : Kinesis Keyboard Hub (PI Engineering)
  0,0%	USB device 1-1.5.2 : USB2.0 Hub Controller (NEC Corporation)


You can redirect output to a file, of course. The top header (“Starting measurement..” etc.) will go to stderr and thus not be part of the redirected output.


Part of our efforts to reduce power consumption is to identify processes which keep waking up the disk even when the computer is idle. This already resulted in a few bug reports (and some fixes, too), but we only really just began with this.

Unfortunately there is no really good tool to trace file access events system-wide. powertop claims to, but its output is both very incomplete, and also wrong (e. g. it claims that read accesses are writes). strace gives you everything you do and don’t want to know about what’s going on, but is per-process, and attaching strace to all running and new processes is cumbersome. blktrace is system-wide, but operates at a way too low level for this task: its output has nothing to do any more with files or even inodes, just raw block numbers which are impossible to convert back to an inode and file path.

So I created a little tool called fatrace (“file access trace”, not “fat race” :-) ) which uses fanotify, a couple of /proc lookups and some glue to provide this. By default it monitors the whole system, i. e. all mounts (except the virtual ones like /proc, tmpfs, etc.), but you can also tell it to just consider the mount of the current directory. You can write the log into a file (stdout by default), and run it for a specified number of seconds. Optional time stamps and PID filters are also provided.

$ sudo fatrace
rsyslogd(967): W /var/log/auth.log
notify-osd(2264): O /usr/share/pixmaps/weechat.xpm
compiz(2001): R device 8:2 inode 658203

It shows the process name and pid, the event type (Read, Write, Open, or Close), and the path. Sometimes it’s not possible to determine a path (usually because it’s a temporary file which already got deleted, and I suspect mmaps as well); in that case it shows the device and inode number, and such programs then need closer inspection with strace.

If you run this in gnome-terminal, there is an annoying feedback loop, as gnome-terminal causes a disk access with each output line, which then causes another output line, ad infinitum. To fix this, you can either redirect output to a file (-o /tmp/trace) or ignore the PID of gnome-terminal (-p `pidof gnome-terminal`).

So to investigate which programs are keeping your disk spinning, run something like

  $ sudo fatrace -o /tmp/trace -s 60

and then do nothing until it finishes.
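Once a trace has been collected, a few lines of Python are enough to see which processes are the busiest. This is a rough sketch that assumes the default line format shown above ("name(pid): EVENT path", without the optional time stamps); the function name summarize() and the regular expression are my own, not part of fatrace.

```python
import re
from collections import Counter

# process name, PID, event letters, then the path (or "device X:Y inode N")
LINE_RE = re.compile(
    r'^(?P<proc>.+)\((?P<pid>\d+)\): (?P<events>[A-Z]+) (?P<target>.+)$')


def summarize(lines):
    """Count file access events per process name, busiest process first."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # silently skip lines that do not look like trace entries
            counts[m.group('proc')] += 1
    return counts.most_common()
```

To use it on the trace file from above, open /tmp/trace and pass the file object to summarize(); it returns a list of (process name, event count) pairs.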

My next task will be to write an integration program which calls fatrace and powertop, and creates a nice little report out of that raw data, sorted by number of accesses and process name, and all that. But it might already help some folks as it is right now.

The code lives in bzr branch lp:fatrace (web view), you can just run make and sudo ./fatrace. I also uploaded a package to Ubuntu Precise, but it still needs to go through the NEW queue. I also made a 0.1 release, so you can just grab the release tarball if you prefer. Have a look at the manpage and --help, it should be pretty self-explanatory.


PackageKit has a “WhatProvides” API for mapping distribution independent concepts to particular package names. For example, you could ask “which packages provide a decoder for AC3 audio files?”

$ pkcon what-provides  "gstreamer0.10(decoder-audio/ac3)"
Installed   	gstreamer0.10-plugins-good-	GStreamer plugins from the "good" set
Available  	gstreamer0.10-plugins-ugly-0.10.18-3ubuntu4.amd64	GStreamer plugins from the "ugly" set

This is the kind of question your video player would ask the system if it encounters a video it cannot play. In reality they of course use the D-BUS or the library API, but it’s easier to demonstrate with the PackageKit command line client.

PackageKit provides a fair number of those concepts; I recently added LANGUAGE_SUPPORT for packages which provide dictionaries, spell checkers, and other language support for a given language or locale code.

However, PackageKit’s apt backend does not actually implement a lot of these (only CODEC and MODALIAS), and aptdaemon’s PackageKit compatibility API does not implement any. That might be because their upstreams do not know enough how to do the mapping for a particular distro/backend, because doing so involves distro specific code which should not go into upstreams, or simply because of the usual chicken-egg problem of app developers rather doing their own thing instead of using generic APIs.

So this got discussed between Sebastian Heinlein and me, and voila, there it is: it is now very easy to provide Python plugins for “what-provides” to implement any of the existing types. For example, language-selector now ships a plugin which implements LANGUAGE_SUPPORT, so you can ask “which packages do I need for Chinese in China” (i. e. simplified Chinese)?

$ pkcon what-provides "locale(zh_CN)"
Available   	firefox-locale-zh-hans-10.0+build1-0ubuntu1.all	Simplified Chinese language pack for Firefox
Available   	ibus-sunpinyin-2.0.3-2.amd64            	sunpinyin engine for ibus
Available   	language-pack-gnome-zh-hans-1:12.04+20120130.all	GNOME translation updates for language Simplified Chinese
Available   	ttf-arphic-ukai-0.2.20080216.1-1.all    	"AR PL UKai" Chinese Unicode TrueType font collection Kaiti style

Rodrigo Moya is currently working on implementing the control-center region panel redesign in a branch. This uses exactly this feature.

In Ubuntu we usually do not use PackageKit itself, but aptdaemon and its PackageKit API compatibility shim python-aptdaemon.pkcompat. So I ported that plugin support for aptdaemon-pkcompat as well, so plugins work with either now. Ubuntu Precise got the new aptdaemon (0.43+bzr769-0ubuntu1) and language-selector (0.63) versions today, so you can start playing around with this now.

So how can you write your own plugins? This is a trivial, although rather nonsense example:

from packagekit import enums

def my_what_provides(apt_cache, provides_type, search):
    # handle codec requests (and "any"); decline everything else
    if provides_type in (enums.PROVIDES_CODEC, enums.PROVIDES_ANY):
        return [apt_cache["gstreamer-moo"]]
    raise NotImplementedError('cannot handle type ' + str(provides_type))

The function gets an apt.Cache object, one of enums.PROVIDES_* and the actual search type as described in the documentation (the dummy example above does not actually use it). It then decides whether it can handle the request and returns a list of apt.package.Package objects (i. e. values in an apt.Cache map), or raises a NotImplementedError otherwise.

You register the plugin through Python pkg-resources in your setup.py (this needs setuptools):



You can register arbitrarily many plugins; they will all be called and their resulting package lists joined.
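To make the “all plugins are called and their results joined” behaviour concrete, here is a self-contained sketch of how a caller might combine several such functions. Everything in it is hypothetical: the dispatcher name, the string constants standing in for enums.PROVIDES_*, and the plain dict standing in for apt.Cache. Dropping duplicates and raising when no plugin feels responsible are my assumptions, not necessarily what PackageKit or aptdaemon do.

```python
CODEC = 'codec'              # stand-in for enums.PROVIDES_CODEC
LANGUAGE = 'language'        # stand-in for enums.PROVIDES_LANGUAGE_SUPPORT


def codec_plugin(cache, provides_type, search):
    if provides_type == CODEC:
        return [cache['gstreamer-moo']]
    raise NotImplementedError('cannot handle type ' + str(provides_type))


def second_codec_plugin(cache, provides_type, search):
    if provides_type == CODEC:
        return [cache['gstreamer-moo'], cache['gstreamer-baa']]
    raise NotImplementedError('cannot handle type ' + str(provides_type))


def language_plugin(cache, provides_type, search):
    if provides_type == LANGUAGE:
        return [cache['language-pack-moo']]
    raise NotImplementedError('cannot handle type ' + str(provides_type))


def call_what_provides_plugins(plugins, cache, provides_type, search):
    """Call every plugin and join the results; skip plugins that decline."""
    packages = []
    handled = False
    for plugin in plugins:
        try:
            result = plugin(cache, provides_type, search)
        except NotImplementedError:
            continue  # this plugin does not know the requested type
        handled = True
        for pkg in result:
            if pkg not in packages:  # assumption: drop duplicates
                packages.append(pkg)
    if not handled:
        raise NotImplementedError('no plugin handles ' + str(provides_type))
    return packages
```

With a toy cache that contains the three package names used above, a CODEC query returns the two codec packages once each, and the language plugin is skipped.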

All this will hopefully help a bit to push distro specifics to the lowest possible levels, and use upstream friendly and distribution agnostic APIs in your applications.


Suppose you install Ubuntu and select a language other than English (it’s known to happen!). This will install the general and the GNOME language packs, translated LibreOffice help, and so on. Now, install a KDE package or GIMP. You’ll notice that the new application is not translated and has no help available for your language. The next time you open the language selector from control-center it would tell you that you are missing some language support and offer to install it, but this has been pretty hard to discover, and we really can do better.

Today’s language-selector upload provides an aptdaemon plugin which automatically marks corresponding language support packages (translated help, dictionaries, spell checker modules, and translations themselves) for installation for any newly installed package, for all languages that are configured on your system.

For example, I have German and English locales on my system, and no KDE packages. Before, installing GIMP got me just that:

$ aptdcon -i gimp
The following NEW package will be installed (1):
gimp
Now it automatically installs the corresponding localized help:

$ aptdcon -i gimp
The following NEW packages will be installed (4):
gimp gimp-help-common gimp-help-de gimp-help-en

I am using aptdcon here as it points out the effect better than software-center doing all this in the background, but both use aptdaemon, so the effect will be the same.

Likewise, installing the first KDE-ish package will automatically install the KDE language packs:

$ aptdcon -i kate
The following NEW packages will be installed (71):
kate kate-data [...] kdelibs5-data [...] language-pack-kde-de language-pack-kde-en [...]

This is now possible because I rewrote the check-language-support logic from scratch; the old code was very slow, hard to read and a nightmare to maintain, and also depended on a lot of data files. The new code is very fast (figuring out all missing language support packages for all installed packages for all available locales takes 8 ms on my system), and has full test coverage.

While the check-language-support program still works (I rewrote it using the new API), it is easier and probably a lot faster to just use the new API now, e. g. in our Ubiquity installer.

Say goodbye to this 2.5 year old bug!


On my 8 hour train ride to Budapest last Sunday I finally worked on making libxklavier introspectable. Thanks to Sergey’s fast review the code now landed in trunk. I sent a couple of refinements to the bug report still, but those are mostly just icing on the cake, the main functionality of getting and setting keyboard layouts is working nicely now (see the example script).


I’m the release engineer in charge of Precise Alpha 1, which is currently being prepared. I must say, this has been a real joy! The new QA paradigm and strategy and the new Stable+1 maintenance team have already achieved remarkable results:

  • The archive consistency reports like component-mismatches, uninstallability, etc. now appear about 20 minutes earlier than in oneiric.
  • CD image builds can now happen 30 minutes earlier after the publisher start, and are much quicker now due to moving to newer machines. We can now build an i386 or amd64 CD image in 8 minutes! Currently they still need to wait for the slow powerpc buildd, but moving to a faster machine there is in progress. These improvements lead to much faster image rebuild turnarounds.
  • Candidate CDs now get automatically posted to the new ISO tracker as soon as they appear.
  • Whenever a new Ubuntu image is built (daily or candidate), they automatically get smoke-tested, so we know that the installer works under some standard scenarios and produces an install which actually boots.
  • Due to the new discipline and the stable+1 team, we had working daily ISOs pretty much every day. In previous Alphas, the release engineer(s) pretty much had to work fulltime for a day or two to fix the worst uninstallability etc., all of this now went away.

All this meant that as a release engineer almost all of the hectic and rather dull work like watching for finished ISO builds and posting them or getting the archive into a releasable state completely went away. We only had to decide when it was a good time for building a set of candidate images, and trigger them, which is just copy&pasting some standard commands.

So I could fully concentrate on the interesting bits like actually investigating and debugging bug reports and regressions. As the Law of Conservation of Breakage dictates, taking away work from the button pushing side just caused the actual bugs to be much harder and earned us e. g. this little gem which took Jean-Baptiste, Andy, and me days to even reproduce properly, and will take much more to debug and fix.

In summary, I want to say a huge “Thank you!” to the Canonical QA team, in particular Jean-Baptiste Lallement for setting up the auto-testing and Jenkins integration, and the stable+1 team (Colin Watson, Mike Terry, and Mathieu Trudel-Lapierre in November) for keeping the archive in such excellent shape and improving our tools!


Apport and the retracer bot in the Canonical data center have provided server-side automatic closing of duplicate crash report bugs for quite a long time. As we have only kept Apport crash detection enabled in the development release, we got away with this as bugs usually did not get so many duplicates that they became unmanageable. Also, the number of duplicates provided a nice hint to how urgent and widespread a crash actually was.

However, it’s time to end that era and provide something better now:

  • This probably caused a lot of frustration when a reporter of the crash spent time, bandwidth, and creativity to upload the crash data and create a description for it, only to find that it got closed as a duplicate 20 minutes later.
  • Some highly visible crashes sometimes generated up to a hundred duplicates in Launchpad, which was prone to timeouts, and needless catch-up by the retracers.
  • We plan to have a real crash database soon, and eventually want to keep Apport enabled in stable releases. This will raise the number of duplicates that we get by several orders of magnitude.
  • For common crashes we had to write manual bug patterns to avoid getting even more duplicates.

So with the just released Apport 1.90 we introduce client-side duplicate checking. From now on, when you report a crash, you are likely to see “We already know about this” right away, without having to upload or type anything, and you will get directed to the bug page. You should mark yourself as affected and/or subscribe to the bug, both to get a notification when it gets fixed, and also to properly raise the “hotness” of the bug to bubble up to developer attention.

For the technically interested, this is how we detect duplicates for the “signal” crashes like SIGSEGV (as opposed to e. g. Python crashes, where we always have a fully symbolic stack trace):
As we cannot rely on symbolic stack traces, and do not want to force every user to download tons of debug symbols, Apport now falls back to generating a “crash address signature” which combines the absolute addresses of the (non-symbolic) stack trace and the /proc/pid/maps mapping to a stack of libraries and the relative offsets within those, which is stable under ASLR for a given set of dependency versions. As the offsets are specific to the architecture, we form the signature as combination of the executable name, the signal number, the architecture, and the offset list. For example, the i386 signature of bug looks like this:


As library dependencies can change, we have more than one architecture, and the faulty function can be called from different entry points, there can be many address signatures for a bug, so the database maintains an N:1 mapping. In its current form the signatures are taken as-is, which is much more strict than it needs to be. Once this works in principle, we can refine the matching to also detect duplicates from different entry points by reducing the part that needs to match to the common prefix of several signatures which were proven to be a duplicate by the retracer (which gets a fully symbolic stack trace).
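The idea of turning absolute addresses into library-relative offsets can be sketched in a few lines of Python. This is an illustration only: the function names, the “:” separator, and the “path+offset” notation are assumptions of mine and not necessarily the format Apport uses. The input is a list of absolute stack addresses plus text in /proc/pid/maps format.

```python
def parse_maps(maps_text):
    """Parse /proc/pid/maps style text into (start, end, path) tuples."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        start, end = (int(x, 16) for x in fields[0].split('-'))
        regions.append((start, end, fields[5]))
    return regions


def address_signature(exe, signal, arch, stack, maps_text):
    """Build a signature from executable, signal, architecture and offsets.

    Returns None if an address cannot be attributed to a mapped file.
    """
    regions = parse_maps(maps_text)
    # lowest mapped address of each file serves as its load base
    base = {}
    for start, end, path in regions:
        base[path] = min(start, base.get(path, start))

    frames = []
    for addr in stack:
        for start, end, path in regions:
            if start <= addr < end:
                frames.append('%s+%x' % (path, addr - base[path]))
                break
        else:
            return None  # address outside all mappings
    return ':'.join([exe, str(signal), arch] + frames)
```

Because only the offsets within each library enter the signature, the same crash produces the same string when ASLR loads the libraries at different addresses. The N:1 mapping could then be as simple as a dictionary from signature strings to bug numbers.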

The retracer bots now export the current duplicate/address signature database in an indexed text format, from which Apport clients can quickly check whether a bug is known.

For the Launchpad crash database implementation we actually check if the bug is readable by the reporter, i. e. it is private and the reporter is in a subscribed team, or the bug is public; if not, we let them report the bug anyway and duplicate it later through the existing server-side retracer, so that the reporter has a chance of getting subscribed to the bug. We also let the bug be filed if the currently existing symbolic stack trace is bad (tagged as apport-failed-retrace) or if a developer wants a new symbolic stack trace with the current libraries (tagged as apport-request-retrace).
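Those rules boil down to a small decision function. The following is a sketch only: the function name and parameters are hypothetical, and just the two tag names come from the description above.

```python
# Tags that ask for a fresh report even though the bug is already known
RETRACE_TAGS = {"apport-failed-retrace", "apport-request-retrace"}


def should_file_anyway(bug_is_public, reporter_is_subscribed, tags):
    """Return True if a crash matching a known bug should still be filed."""
    # A private bug is only readable if the reporter is in a subscribed team
    readable = bug_is_public or reporter_is_subscribed
    if not readable:
        # Reporter cannot see the bug: file it and let the server-side
        # retracer mark it as a duplicate later.
        return True
    # Readable bug: only file again if a new stack trace is needed
    return bool(RETRACE_TAGS & set(tags))
```

Only when this returns False would the client show the “We already know about this” message and point to the existing bug page.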

As this is a major new feature, I decided that it’s time to call this Apport 2.0. This is the first public beta towards it, thus called 1.90. With Apport’s test-driven and agile development the version numbers do not mean much anyway (the retracer bots in the data center always just run trunk, for example), so this is as good a time as any to reset the rather large “.26” minor version that we are at right now.

Read more

12.04: Testing FTW

I arrived back home in Augsburg, from last week’s Ubuntu Developer Summit in Orlando, FL. As this is a quality/LTS cycle, we pretty much already knew in advance what to do (bug fixing, bug fixing, some boot speed, and did I mention bug fixing?), but still we had many highly interesting and exciting sessions this time, not so much about what we are going to do, but how we are going to build 12.04.

So far our common practice has been to toss everything new into the development release until Feature Freeze and then try to clean up most of the fallout. Many other developers and I have always called for more time to fix long-standing bugs and for not introducing breakage in the first place. It seems that now with 12.04, Ubuntu/Canonical are actually getting serious about it.

(Any resemblance to that postcard from the Kennedy Space Center which I went to last Sunday is of course absolutely unintended and purely coincidental :-) ).

The mission statement is now to have working ISOs, stable → development upgrades, and intra-development upgrades every day, plus quick and regular cleanup of uninstallable packages, component-mismatches, NBS etc., backed by a new “stable +1” team staffed by three people on a rotational shift.

The QA team is now setting up daily automatic smoke testing of the installer and of other packages which have tests. For the latter we’ll convert some packages to DEP-8, the proposed format for running autopkgtest (I’ll do udisks, postgresql-common, pygobject, apport, and jockey soon).

We’ll try to put uploads which might break something (like new libraries) into a staging area first, against which we can run test suites of reverse dependencies before they land in the new release. As doing this on a large scale still requires infrastructure to be created, we’ll only exercise it for a few packages by uploading to precise-proposed first, but this has a high potential for extension.

We want to commit to fixing major breakage within 3 hours of development time, or otherwise revert the faulty package to the previous version (unless that aggravates problems, such as file conflicts).

Finally, for Canonical upstreams we are introducing “acceptance criteria”, which will hopefully significantly raise the quality and lower the regressions of each Unity etc. release.

So, the mission is clear. In practice we’ll probably have to make some real-life concessions, and Murphy’s law dictates that there still will be some breakage, but we can learn from that as we go.

Let’s build 12.04 LTS!

Read more

Just took the plunge, using the excellent bandwidth and local mirror at UDS:

$ lsb_release -irc
Distributor ID: Ubuntu
Release: 12.04
Codename: precise

Nothing blew up in my face, so it seems today is a good day to die^Wupgrade.

Read more

On a rather calm ten-hour flight to Orlando I once again did some pygobject, udisks, and Apport hacking (it’s scary how productive one can be when not constantly being interrupted by IRC, email, etc.). One of the more visible changes amongst these was finally fixing a five-year-old five-digit bug to integrate apport-retrace into the GUI, now that it does not potentially wreck your installation any more.

If the apport-retrace package is installed, the crash detail dialog will show a new “Examine locally” button:

Apport crash detail dialog

After clicking this, you can choose what to do exactly:

Retrace action dialog

I know this dialog is not a beauty, as it’s implemented using the ui_question_choice() API which is used by package hooks. That makes it work for all available UIs (GTK, KDE, CLI), though, and it can easily be extended to have more actions. And if you get this far and want to see stack traces, you are used to looking at eye-bleeding gibberish anyway.

Presumably the most useful (and default) action is to download all the debug symbols, open a Terminal, and put you into a GDB session with all these, and the core dump loaded, so that you can poke around the crashed program state with all symbols available.

But you can also run gdb without downloading debug symbols, or just update the .crash report file with a fully symbolic stack trace.

This works just as well in apport-cli, but not yet in the KDE version: Someone needs to implement the equivalent of the apport-gtk implementation in apport-kde and kde/bugreport.ui, i. e. show an “Examine locally” button if self.can_examine_locally() is true, and add an appropriate ui_run_terminal() method (which should be fairly similar to the GTK one, just with Qt/KDEish terminal emulators). But as Kubuntu does not currently use Apport (and also because I didn’t have all the dependencies installed on my laptop) I did not yet do this. Please catch me on IRC/mail/merge proposal if you want to work on this. If you look at the above commit, the changes to the GtkBuilder file look huge, but that’s only because I haven’t touched it for ages and the current Glade shuffled the elements quite a bit; it just adds the button to the dialog.
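For anyone picking this up, the terminal part could look roughly like the following. This is a standalone sketch and not the GTK implementation: the helper names and the candidate list are assumptions, and the real ui_run_terminal() method may have a different signature.

```python
import shutil
import subprocess

# Candidate emulators and the option each one uses to run a command.
# This list is illustrative only; a real KDE port may prefer others.
TERMINALS = [("konsole", "-e"), ("xterm", "-e")]


def find_terminal(which=shutil.which):
    """Return [path, option] for the first installed emulator, or None."""
    for program, option in TERMINALS:
        path = which(program)
        if path:
            return [path, option]
    return None


def run_in_terminal(command, which=shutil.which):
    """Run command (a list of arguments) in a new terminal window."""
    prefix = find_terminal(which)
    if prefix is None:
        raise RuntimeError("no terminal emulator found")
    return subprocess.call(prefix + list(command))
```

The `which` parameter only exists so that the lookup can be tested without any of these programs installed; the button could then be hidden whenever find_terminal() returns None.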

For now this is all sitting in trunk, I’ll do a new upstream release and Ubuntu precise upload soon.

Happy debugging!

Read more

Seven years ago, Ubuntu 4.10 “The Warty Warthog” was announced. A huge congrats to the community, Canonical, and especially Mark for getting so far from “there” to “here”.

This brings back old memories of my first conference in Oxford in August, the great-great-grandfather to what is UDS these days. Back then, there was no company, no Launchpad, no Blueprints, no work items, no detailed plans, just a bunch of ideas, BoFs, and this was a third of the entire crowd:

Warty Hack Room

Back then we worked on the famous TRLS technology (“Totally Rad Laptop Support”) and were proud when we got the ThinkPads to suspend once. During that conference I wrote pmount to provide automatic mounting of USB sticks in a safe manner. Those were the days… :-)

But I can also safely say that there are some things that haven’t changed. Even though both the community and the company (which changed away from recently) have grown by two orders of magnitude since then, we still have the same serious attitude, stern look, and formal attire as we had back then:

We are professionals, really!

Read more

I just finished reading Gestatten, Elite (it went quickly, I only started yesterday). Basically there was nothing really new in it that one had not somehow already known or suspected. But the well-researched and well-documented vehemence with which the upper class seals itself off, perpetuates itself as a kind of new aristocracy, and undermines the much-invoked merit principle was still quite shocking to me.

We saw one of the “elite” schools that the book examines, Schloss Neubeuern, on our summer bike tour. I was impressed by the building, and back then I also thought “Man, I might have felt at home at a school like that”. But after reading this I am very glad that I did not end up there.

Read more

Hot on the heels of the PostgreSQL 9.1.0 release I am happy to announce that the final version is now packaged for Debian unstable, the current Ubuntu development version “Oneiric”, and also in my Ubuntu backports PPA for Ubuntu 10.04 LTS, 10.10, and 11.04.

Enjoy trying out all the cool new features like built-in synchronous replication, per-column collation settings for correctly handling international strings, or even finer-grained access control for large environments. Please see the detailed explanation of the new features.

As already announced a few days ago, 9.0 is gone from Ubuntu 11.10, as it is still only a development version and not an LTS. 9.1 will be the version which the next 12.04 LTS will support, so this slightly reduces the number of major upgrades Ubuntu users will need to do. However, 9.0 will still be available in Debian unstable and backports, and the Ubuntu backports PPA for a couple of months to give DB administrators some time to migrate.

Read more

PostgreSQL 9.1 has had its first release candidate out for some two weeks without major problem reports, so it’s time to promote this more heavily. If you use PostgreSQL, now is the time to try it out and report problems.

We always strive to minimize the number of major versions which we have to support. They not only mean more maintenance for developers, but also more upgrade cycles for the users.

9.0 has not been in any stable Debian or Ubuntu release, and 9.1 final will be released soon. So we recently updated the current Ubuntu development release for 11.10 (“oneiric”) to 9.1. In Debian, the migration from 8.4/9.0 to 9.1 is making good progress, and there is not much which is left until postgresql-9.0 can be removed.

Consequently, I also removed 9.0 from my PostgreSQL backports PPA, as there is nothing left to backport it from. However, that mostly means that people will now set up installations with 9.1 instead of 9.0; it won’t magically make your already installed 9.0 packages go away. They will just be marked as obsolete in the postgresql-common debconf note.

If you want to build future 9.0 packages yourself, you can do this based on the current branch: bzr branch lp:~pitti/postgresql/debian-9.0, get the new upstream tarball, name it accordingly, add a new changelog entry with a new upstream version number, and run bzr bd to build the package (you need to install the bzr-builddeb package for this).

Update 2011-09-09: As I got a ton of pleas to continue the 9.0 backports for a couple of months, and to keep it in Debian unstable for a while longer, I have put them back now. I also updated the removal request in Debian to point out that I’m mainly interested in getting 9.0 out of testing. I don’t mind maintaining it in unstable for a couple more months. My dear, I had no idea that my backports PPA was that popular!

Read more

The tool to reprocess an Apport crash report to produce a symbolic stack trace, apport-retrace, has been pretty hard to use on a developer system so far: It either installed the packages from the crash report, plus its debug symbol packages (“ddebs”) into the running system (which frequently caused problems like broken dependencies), or it required setting up a chroot and using apport-chroot with fakechroot and fakeroot.

I’m happy to announce that with Apport 1.22, which landed in Oneiric yesterday, this has now become much easier: In the default mode it just calls gdb on the report’s core dump, i. e. it expects that all the necessary packages are already installed and will complain about the missing ones. But with the new --sandbox/-S mode, it will just create a temporary directory, download and unpack packages there, and run gdb with some magic options to consider that directory a “virtual root”. These options were not available when this stuff was first written, which is why it used to be so complicated with fakechroots, etc. Now this does not need any root privileges, chroot() calls, etc.
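To give an idea of what such a “virtual root” invocation can look like, here is a rough sketch that builds a gdb command line for an unpacked sandbox directory. It is an illustration under assumptions, not a copy of what apport-retrace runs: the function name and directory layout are made up, and which gdb settings Apport uses is assumed here (“set solib-absolute-prefix” and “set debug-file-directory” are standard gdb settings that serve this purpose).

```python
import os


def sandbox_gdb_command(sandbox, executable, core):
    """Build a gdb argv that treats the sandbox directory as a virtual root.

    sandbox:    directory into which the packages were unpacked
    executable: path of the crashed program as recorded in the report
    core:       path to the extracted core dump
    """
    sandbox = os.path.abspath(sandbox)
    return [
        "gdb", "--batch",
        # Look for detached debug symbols inside the sandbox
        "--ex", "set debug-file-directory %s/usr/lib/debug" % sandbox,
        # Resolve shared libraries relative to the sandbox, not the host
        "--ex", "set solib-absolute-prefix " + sandbox,
        # Load the sandboxed copy of the program and the core dump
        "--ex", "file %s%s" % (sandbox, executable),
        "--ex", "core-file " + core,
        # Print a full backtrace and exit
        "--ex", "bt full",
    ]
```

The resulting list can be passed to subprocess.call(); nothing in it needs root privileges, because gdb only reads files from the unpacked directory.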

As it only downloads and installs the bare minimum, and does not involve any of the dpkg/apt overhead (maintainer scripts, etc.), it has also become quite a lot faster. That’s how the apport retracers were able to dig through a backlog of about a thousand bugs in just a couple of hours.

So now, if you locally want to retrace or investigate a crash, you can do

   $ apport-retrace -s -S system /var/crash/_usr_bin_gedit.1000.crash

to get the stack traces on stdout, or

   $ apport-retrace -g -S system /var/crash/_usr_bin_gedit.1000.crash

to be put into a gdb session.

If you do this regularly, it’s highly recommended to use a permanent cache dir, where apt can store its indexes and downloaded packages: Use -C ~/.cache/apport-retrace for this (or the long version --cache).

You can also use this to reprocess crashes for a different release than the one you are currently running, by creating a config directory with an appropriate apt sources.list.

The manpage has all the details. (Note that at the time of this writing, the online copy still has the old version; use the local one instead.)

Enjoy, and let me know how this works for you!

Read more