Canonical Voices

Posts tagged with 'qa'

pitti

I just released Apport 2.5 with a bunch of new features and some bug fixes.

By default you cannot report bugs and crashes against packages from PPAs, as they are not Ubuntu packages. Some packages like Unity or UbuntuOne define their own crash database which reports bugs against the project instead. This has been a bit cumbersome in the past, as these packages needed to ship a /etc/apport/crashdb.conf.d/ snippet. This has now become much easier: package hooks can define a new crash database directly (#551330):

def add_info(report, ui):
    if determine_whether_to_report_to_upstream:
        report['CrashDB'] = '{ "impl": "launchpad", "project": "picsaw" }'

(Documented in package-hooks.txt)

Apport now also looks for package hooks in /opt (#1020503) if the executable path or a file in the package is somewhere below /opt (it tries all intermediate directories).

With these two, we should have much better support for filing bugs against ARB packages.

This version also finally drops the usage of gksu and moves to PolicyKit. Now we only have one package left in the default install (update-notifier) which uses it. Almost there!

Read more
pitti

A few weeks ago I wrote about my new role as an upstream QA engineer. I have now officially been in that role since June. As expected, I had (and still have) some backlog from my previous Desktop engineer role, but I have had plenty of time to work on automatic tests and some test technology now. If you are interested in the daily details, you can look at the ramblings on my G+ page; in a nutshell, I worked on integration tests for udisks2 (mostly upstream now), a mock polkit API, and a small enhancement of the scsi_debug kernel module. On the distro QA side I got the integration tests of udisks2, upower, PostgreSQL, Apport, and ubuntu-drivers-common working and added to our Jenkins autopkgtest runner, where they are executed whenever the particular package or any of its dependencies gets updated. This has already uncovered a surprising number of actual bugs, so I’m happy that the system is starting to prove useful after the initial hump of getting the tests to run properly in that environment.

In that previous blog post I mentioned that Canonical will hire a second person for an upstream QA engineer role. I am pleased that the job posting is now online. So if you are familiar with how the Linux plumbing and desktop stacks work, are fanatical about testing, like to work with the Linux plumbing, GNOME, and other FOSS communities, know your way around jhbuild, Jenkins, and similar technologies, and would like to explore new possibilities like applying static code checking or creating APIs to fake hardware, please have a look at the role description! Please feel free to contact me on IRC (pitti on Freenode) or by email if you have further questions about the role.

Read more
pitti

As I wrote two weeks ago, I consider the QA related changes in Ubuntu 12.04 a great success. But while we will continue and even extend our efforts there, this is not where it ends: it’s great to have the feedback cycle between “break it” and “notice the bug” reduced from potentially a few months to one day in many cases, but wouldn’t it be cool to reduce that to a few minutes, and also put the machinery right where stuff really happens — into the upstream trunks? What if, for every commit to PyGObject, GTK, NetworkManager, udisks, D-BUS, telepathy, gvfs, etc., we immediately built and tested all reverse dependencies and told the committer about regressions?

I have had the desire to work on automated tests in Linux Plumbing and GNOME for quite a while now. Also, after 8 years of doing distribution work on packaging and processes (tech lead, release engineering/management, stable release updates, etc.) I wanted to shift my focus towards technology development. So I’ve been looking for a new role for some time now.

It seems that time is finally here: at the recent UDS, Mark announced that we will extend our QA efforts to upstream. I am very happy that in two weeks I can move into a role to make this happen: developing technology to make testing easier, working with our key upstreams to set up test suites and reporting, and doing some general development in areas that are near and dear to my heart, like udev/systemd, udisks, pygobject, etc. This work will follow the upstream conventions for infrastructure and development policies. In particular, it is not bound by Canonical copyright license agreements.

I have a bunch of random ideas about what to work on, such as:

  • Making it possible/easier to write tests with fake hardware. E. g. the upower integration tests that I wrote a while ago contain some code to create a fake sysfs tree; that code should really move into udev itself, become available from C and through introspection, and be greatly extended. Also, it’s currently not possible to simulate a uevent that way; that’s something I’d like to add. Right now you can only set up /sys, start your daemon, and check the state after the coldplugging phase.
  • Interview some GNOME developers about what kinds of bugs/regressions/code they have most trouble with and what/how they would like to test. Then write a test suite with a few working and one non-working case (bugzilla should help with finding these), discuss the structure with the maintainer again, and find some ways to make the tests radically simpler by adding/enhancing the API available from gudev/glib/gtk, etc. E. g. in the tests for apport-gtk I noticed that while it’s possible to do automatic testing of GUI applications, it is still way harder than it should be. I guess that’s the main reason why there are hardly any GUI tests in GNOME?
  • I’ve heard from several people that it would be nice to be able to generate some mock wifi/ethernet/modem adapters to be able to automatically test NetworkManager and the like. As network devices are handled specially in Linux, not in the usual /dev and sysfs manner, they are not easy to fake. It probably needs a kernel module similar to scsi_debug, which fakes enough of the properties and behaviour of particular network card devices to be useful for testing. One could certainly provide a pipe or a regular bridge device at the other end to actually talk to the application through the fake device. (NB this is just an idea, I haven’t looked into details at all yet.)
  • For some GUI tests it would be much easier if there were a very simple way of providing mocks/stubs for D-BUS services like udisks or NetworkManager, instead of having to set up the actual daemons, coerce them into some corner-case behaviour, and needing root privileges for the test because of that. There seems to be some prior art in Ruby, but I’d really like to see this in D-BUS itself (perhaps a new library there?), and/or have this in GDBus, where it would even be useful for Python or JavaScript tests through gobject-introspection. (A toy sketch of this idea follows after this list.)
  • There are nice tools like the Clang static code analyzer out there. I’d like to play with those and see how we can use them without generating a lot of false positives.
  • Robustify jhbuild to be able to keep up with building everything largely unattended. Right now you need to blow away and rebuild your tree way too often, due to brittle autotools or undeclared dependencies, and if we want to run this automatically in Jenkins it needs to be able to recover by itself. It should be able to keep up with the daily development, automatically starting build/test jobs for all reverse dependencies for a module that just has changed (and for basic libraries like GLib or GTK that’s going to be a lot), and perhaps send out mail notifications when a commit breaks something else. This also needs some discussion first, about how/where to do the notifications, etc.
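
As a toy illustration of the mock D-BUS service idea mentioned above, here is a minimal fake “disks” service on the session bus, written with dbus-python. All names (com.example.FakeDisks, ListDevices) are made up for illustration; this is not any real daemon’s API, just a sketch of the approach.

   import dbus
   import dbus.service
   from dbus.mainloop.glib import DBusGMainLoop
   from gi.repository import GLib

   class FakeDisks(dbus.service.Object):
       # hypothetical interface and method, standing in for a real daemon's API
       @dbus.service.method('com.example.FakeDisks', out_signature='as')
       def ListDevices(self):
           return ['/dev/fake0', '/dev/fake1']

   DBusGMainLoop(set_as_default=True)
   bus = dbus.SessionBus()
   # keep references so the bus name and exported object stay alive
   name = dbus.service.BusName('com.example.FakeDisks', bus)
   server = FakeDisks(bus, '/com/example/FakeDisks')
   GLib.MainLoop().run()

A test could start this in a separate process and point the application under test at the session bus, without needing the real daemon or root privileges.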

Other ideas will emerge, and I hope lots of you have your own ideas about what we can do. So please talk to me! We’ll also look for a second person to work in that area, so that we have some capacity and also the possibility to bounce ideas and code reviews between each other.

Read more
pitti

I just uploaded Apport 2.1 to Quantal. A big change in that version is that the whole code now works with both Python 2 and 3, except for the launchpadlib crash database backend (as we do not yet have a python3-launchpadlib package).

I took some care that apport report objects handle both strings (unicode type in Python 2) and byte arrays (str type in Python 2) in values, so most package hooks should still work. However, now is the time to check whether they also work with Python 3, to make the impending transition to Python 3 easier.

However, watch out if you have projects or scripts which use python-apport directly to process reports: the open(), write(), and write_mime() methods now require the passed file descriptors to be open in binary mode. You will get an exception otherwise.

A common pattern so far has been code like

  report = apport.Report()
  report.load(open('myfile.crash'))

This needs to be changed to

  report = apport.Report()
  with open('myfile.crash', 'rb') as f:
      report.load(f)

The “with” context is not strictly required, but it takes care of closing the file again promptly, which avoids ResourceWarning spew when you run this in test suites or enable warnings.
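
The same applies when writing a report back out; a minimal sketch (the file names are just examples):

  import apport

  report = apport.Report()
  with open('myfile.crash', 'rb') as f:    # binary mode required for load()
      report.load(f)
  # ... inspect or modify the report here ...
  with open('updated.crash', 'wb') as f:   # write() needs a binary file, too
      report.write(f)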

Read more
pitti

Half a year ago I blogged about the changed expectations and processes to improve the quality of the development release which we discussed at the UDS in Orlando: a promise that we don’t break the development version, regressions are not to be tolerated, and acceptance criteria for Canonical upstreams. For that we introduced the Stable+1 team and actually reverted some broken packages, our QA team set up rigorous daily installation image and upgrade tests, and the code development process for Unity and related projects was changed to enforce buildability and passing automatic tests with each and every change to trunk.

To be honest I was still a tad sceptical back then when this was planned. These were a lot of changes for one cycle, the stable+1 team was a considerable resource investment (starting with three people fulltime in the first few months), and not least our friends in the DX team felt thwarted because they had to sit down for a long time developing tests, and then change their habits and practices for development.

So was all that effort worth it?

One word: OMGCRYOUTLOUDYES!!!!

Just a random sample of goodness that this brought:

  • It was nice to not have to sit down for an hour every couple of days to figure out how to get my desktop back after the daily dist-upgrade bricked it.
  • Unity, compiz, and friends were remarkably stable. I still remember the previous cycles where every new version got differently crashy, broke virtual workspaces, and what not. The worst thing that happened this cycle was keybindings eternally breaking (or being shuffled around), but at least those usually had obvious workarounds.
  • As a result of those, I think we had at least one, maybe two orders of magnitude more testers of the daily development release than in previous cycles. So we got a lot of good bug reports and also patch contributions for smaller issues in Precise which we otherwise would not have discovered.
  • The daily dist-upgrade tests tremendously helped to uncover packaging problems which would break real-world upgrades out there by the dozens. It took months to fix the hardest one: upgrading 10.04 LTS to 12.04 LTS with all universe packages offered in software-center. This beast takes 13 hours to run, so nobody really did manual tests like that in the past cycles.
  • Due to the daily automatic CD image builds we dramatically reduced both the cost of fixing regressions and the emergency hackathons during milestone preparations. It is a lot easier to unbreak e. g. LVM setup or OEM install modes on our images when the regression happened just a day before than when it is discovered only two days before a milestone is due, as again nobody tests these less common modes very often.
  • So as a result, I really think the investment into QA and the stable+1 team has already paid off twofold by giving us more time to work on the less critical fixes, avoiding lots of user frustration about broken upgrades, and generally making the daily development a lot more enjoyable. Or, as Rick Spencer puts it: Velocity, velocity, velocity!

Despite these improvements, there are still some things I’m looking forward to in the next cycles: thanks to Colin Watson we can now use -proposed as a proper staging area, and we have used this feature rather extensively in the past month. From my point of view, 90% of the remaining daily dist-upgrade failures were due to packages building on different architectures at vastly different times, or failing on some, but not all architectures (“arch skew”). This is something you cannot really predict or guard against as a developer when you upload large and potentially harmful packages directly to the development release, so uploading them to the staging area and letting everything build there will reduce the breakage to zero. This was successfully demonstrated with Unity, GTK, and other packages where arch skew pretty much always causes people to hose their desktop, as well as daily CD images not working.

I’m also looking forward to combining the staging area with lots of automatic tests against reverse dependencies (e. g. testing the installer against a new GTK or pygobject before it lands), something we have just barely dipped our toes into.

I can’t imagine how we were ever able to develop our new releases the old way. :-)

Precise Pangolin^W^WUbuntu 12.04, I’m proud of you! Go out and amaze people!

Read more
pitti

Part of our efforts to reduce power consumption in Ubuntu is to provide an easy tool to hunt down which programs and devices are to blame for inordinate power consumption. powertop’s interactive mode is pretty good for this if you are sitting in a train and want to tweak some knobs to max out battery life, but we need something more reproducible and noninteractive for developers who want to file proper bug reports.

So I wrote a little script, power-usage-report, which calls fatrace to measure file access activity from programs and powertop-1.13 to measure process and device wakeups, cleans up and sorts their output, and generates a report which is appropriate to attach to bug reports, send around, put into Jenkins for measuring daily progress, etc. It is now part of fatrace version 0.4, so today’s Precise upgrades will have it.

The output has several sections for disk access (which prevents the disk from spinning down), wakeups (causing CPU power usage), and device activity. Disk accesses and wakeups are sorted in descending order by process:

$ sudo power-usage-report
Measurement will begin in 5 seconds. Please make sure that the
computer is idle, i. e. do not press keys, start or operate programs, and that
programs are not busy with active tasks other than the one you want to examine.
Starting measurement for 60 seconds...
Measurement complete. Generating report...
======= unity-panel-ser: 5 file access events ======
/usr/share/zoneinfo/UTC: 1 reads
/etc/timezone:
/usr/share/zoneinfo/posix/Europe/Berlin: 1 reads
/etc/localtime: 3 reads

======= gnome-settings-: 1 file access events ======
/etc/fstab: 1 reads

======= telepathy-gabbl: 1 file access events ======
/home/martin/.cache/wocky/caps/caps-cache.db: 1 reads

====== Wakeups ======
  30,9% ( 52,0)   compiz
  16,3% ( 27,4)   [iwlwifi] 
  12,5% ( 21,0)   [i915] 
   3,7% (  6,3)   [ahci] 
   2,3% (  3,9)   swapper/3
   1,2% (  2,0)   gvfs-afc-volume
[...]

====== Devices ======
An audio device is active 100,0% of the time:
hwC0D0 Conexant CX20585 

Recent USB suspend statistics
Active  Device name
100,0%	USB device 1-1.5.4.4 : USB Mouse (A4Tech)
100,0%	/sys/bus/usb/devices/1-1.5.4.2
100,0%	USB device 1-1.5.4 : Kinesis Keyboard Hub (PI Engineering)
  0,0%	USB device 1-1.5.2 : USB2.0 Hub Controller (NEC Corporation)

[...]

You can redirect output to a file, of course. The top header (“Starting measurement..” etc.) will go to stderr and thus not be part of the redirected output.

Read more
pitti

I’m the release engineer in charge of Precise Alpha 1, which is currently being prepared. I must say, this has been a real joy! The new QA paradigm and strategy and the new Stable+1 maintenance team have already achieved remarkable results:

  • The archive consistency reports like component-mismatches, uninstallability, etc. now appear about 20 minutes earlier than in oneiric.
  • CD image builds can now start 30 minutes earlier after the publisher run, and are much quicker now due to moving to newer machines. We can now build an i386 or amd64 CD image in 8 minutes! Currently they still need to wait for the slow powerpc buildd, but moving to a faster machine there is in progress. These improvements lead to much faster image rebuild turnarounds.
  • Candidate CDs now get automatically posted to the new ISO tracker as soon as they appear.
  • Whenever a new Ubuntu image is built (daily or candidate), it automatically gets smoke-tested, so we know that the installer works under some standard scenarios and produces an install which actually boots.
  • Due to the new discipline and the stable+1 team, we had working daily ISOs pretty much every day. In previous Alphas the release engineer(s) had to work fulltime for a day or two to fix the worst uninstallability etc.; all of this has now gone away.

All this meant that, as a release engineer, almost all of the hectic and rather dull work, like watching for finished ISO builds and posting them or getting the archive into a releasable state, went away. We only had to decide when it was a good time for building a set of candidate images, and trigger them, which is just copy&pasting some standard commands.

So I could fully concentrate on the interesting bits like actually investigating and debugging bug reports and regressions. As the Law of Conservation of Breakage dictates, taking away work from the button-pushing side just caused the actual bugs to be much harder, and earned us e. g. this little gem which took Jean-Baptiste, Andy, and me days to even reproduce properly, and will take much more to debug and fix.

In summary, I want to say a huge “Thank you!” to the Canonical QA team, in particular Jean-Baptiste Lallement for setting up the auto-testing and Jenkins integration, and the stable+1 team (Colin Watson, Mike Terry, and Mathieu Trudel-Lapierre in November) for keeping the archive in such excellent shape and improving our tools!

Read more
pitti

Apport and the retracer bot in the Canonical data center have provided server-side automatic closing of duplicate crash report bugs for quite a long time. As we have only kept Apport crash detection enabled in the development release, we got away with this: bugs usually did not get so many duplicates that they became unmanageable. Also, the number of duplicates provided a nice hint as to how urgent and widespread a crash actually was.

However, it’s time to end that era and provide something better now:

  • The server-side approach probably caused a lot of frustration when a reporter of the crash spent time, bandwidth, and creativity to upload the crash data and create a description for it, only to find that it got closed as a duplicate 20 minutes later.
  • Some highly visible crashes sometimes generated up to a hundred duplicates in Launchpad, which was prone to timeouts and caused needless catch-up work for the retracers.
  • We plan to have a real crash database soon, and eventually want to keep Apport enabled in stable releases. This will raise the number of duplicates that we get by several magnitudes.
  • For common crashes we had to write manual bug patterns to avoid getting even more duplicates.

So with the just released Apport 1.90 we introduce client-side duplicate checking. From now on, when you report a crash, you are likely to see “We already know about this” right away, without having to upload or type anything, and you will get directed to the bug page. You should mark yourself as affected and/or subscribe to the bug, both to get a notification when it gets fixed and to properly raise the “hotness” of the bug so that it bubbles up to developer attention.

For the technically interested, this is how we detect duplicates for the “signal” crashes like SIGSEGV (as opposed to e. g. Python crashes, where we always have a fully symbolic stack trace):
As we cannot rely on symbolic stack traces, and do not want to force every user to download tons of debug symbols, Apport now falls back to generating a “crash address signature” which combines the absolute addresses of the (non-symbolic) stack trace and the /proc/pid/maps mapping to a stack of libraries and the relative offsets within those, which is stable under ASLR for a given set of dependency versions. As the offsets are specific to the architecture, we form the signature as combination of the executable name, the signal number, the architecture, and the offset list. For example, the i386 signature of bug looks like this:

/usr/bin/rhythmbox:11:i686:/usr/lib/libgstpbutils-0.10.so.0.24.0+c284:/usr/lib/i386-linux-gnu/libgobject-2.0.so.0.3000.0+3337a:/usr/lib/i386-linux-gnu/libgobject-2.0.so.0.3000.0+8e0

As library dependencies can change, we have more than one architecture, and the faulty function can be called from different entry points, there can be many address signatures for a bug, so the database maintains an N:1 mapping. In its current form the signatures are taken as-is, which is stricter than it needs to be. Once this works in principle, we can refine the matching to also detect duplicates from different entry points, by reducing the part that needs to match to the common prefix of several signatures which were proven to be duplicates by the retracer (which gets a fully symbolic stack trace).
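
A rough sketch of how such a signature could be put together (a simplified illustration, not Apport’s actual code):

   def address_signature(executable, signal, arch, frames):
       # frames: list of (library, relative_offset) pairs, resolved from the
       # raw stack addresses and the /proc/pid/maps library mapping
       offsets = ':'.join('%s+%x' % (lib, off) for lib, off in frames)
       return '%s:%s:%s:%s' % (executable, signal, arch, offsets)

   # yields strings of the same shape as the rhythmbox example above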

The retracer bots now export the current duplicate/address signature database to http://people.canonical.com/~ubuntu-archive/apport-duplicates in an indexed text format, from where Apport clients can quickly check whether a bug is known.

For the Launchpad crash database implementation we actually check if the bug is readable by the reporter, i. e. it is private and the reporter is in a subscribed team, or the bug is public; if not, we let him report the bug anyway and duplicate it later through the existing server-side retracer, so that the reporter has a chance of getting subscribed to the bug. We also let the bug be filed if the currently existing symbolic stack trace is bad (tagged as apport-failed-retrace) or if a developer wants a new symbolic stack trace with the current libraries (tagged as apport-request-retrace).

As this is a major new feature, I decided that it’s time to call this Apport 2.0. This is the first public beta towards it, thus called 1.90. With Apport’s test-driven and agile development the version numbers do not mean much anyway (the retracer bots in the data center always just run trunk, for example), so this is as good a time as any to reset the rather large “.26” minor version that we are at right now.

Read more
pitti

12.04: Testing FTW

I arrived back home in Augsburg from last week’s Ubuntu Developer Summit in Orlando, FL. As this is a quality/LTS cycle, we pretty much already knew in advance what to do (bug fixing, bug fixing, some boot speed, and did I mention bug fixing?), but still we had many highly interesting and exciting sessions this time, not so much about what we are going to do, but how we are going to build 12.04.

So far our common practice has been to toss everything new into the development release until Feature Freeze and then try to clean up most of the fallout. Many other developers and I have long been asking for more time to fix long-standing bugs and to not introduce breakage in the first place. It seems that now, with 12.04, Ubuntu/Canonical are actually getting serious about it.

(Any resemblance to that postcard from the Kennedy Space Center which I went to last Sunday is of course absolutely unintended and purely coincidental :-) ).

The mission statement is now to have working ISOs, stable → development, and daily intra-development upgrades every day, quick and regular cleanup of uninstallable packages, component-mismatches, NBS etc., backed by a new “stable+1” team staffed by three people on a rotational shift.

The QA team is now setting up daily automatic smoke-testing of the installer and other packages which have tests. For the latter we’ll convert some packages to DEP-8, the proposed format for running autopkgtest (I’ll do udisks, postgresql-common, pygobject, apport, and jockey soon).

We’ll try to put uploads which might break something (like new libraries) into a staging area first, against which we can run the test suites of reverse dependencies before they land in the development release. As doing this on a large scale still requires infrastructure to be created, we’ll only exercise it for a few packages by uploading to precise-proposed first, but this has a high potential for extension.

We want to commit to fixing major breakage within 3 hours of development time, or otherwise revert the faulty package to the previous version (unless that aggravates problems, such as file conflicts).

Finally, for Canonical upstreams we are introducing “acceptance criteria”, which will hopefully significantly raise the quality and lower the regressions of each Unity etc. release.

So, the mission is clear. In practice we’ll probably have to make some real-life concessions, and Murphy’s law dictates that there still will be some breakage, but we can learn from that as we go.

Let’s build 12.04 LTS!

Read more
pitti

On a rather calm ten-hour flight to Orlando I once again did some pygobject, udisks, and Apport hacking (it’s scary how productive one can be when not constantly being interrupted by IRC, email, etc.). One of the more visible changes amongst these was finally fixing a five-year-old five-digit bug to integrate apport-retrace into the GUI, now that it does not potentially wreck your installation any more.

If the apport-retrace package is installed, the crash detail dialog will show a new “Examine locally” button:

Apport crash detail dialog

After clicking this, you can choose what to do exactly:

Retrace action dialog

I know this dialog is not a beauty, as it’s implemented using the ui_question_choice() API which is used by package hooks. That makes it work for all available UIs (GTK, KDE, CLI), though, and it can easily be extended to have more actions. And if you get this far and want to look at stack traces, you are used to looking at eye-bleeding gibberish anyway...

Presumably the most useful (and default) action is to download all the debug symbols, open a terminal, and put you into a GDB session with all of these and the core dump loaded, so that you can poke around the crashed program’s state with all symbols available.

But you can also run gdb without downloading debug symbols, or just update the .crash report file with a fully symbolic stack trace.

This works just as well in apport-cli, but not yet in the KDE version: someone needs to port the equivalent of the apport-gtk implementation to apport-kde and kde/bugreport.ui, i. e. show an “Examine locally” button if self.can_examine_locally() is true, and add an appropriate ui_run_terminal() method (which should be fairly similar to the GTK one, just with Qt/KDEish terminal emulators). But as Kubuntu does not currently use Apport (and also because I didn’t have all the dependencies installed on my laptop) I have not done this yet. Please catch me on IRC/mail/merge proposal if you want to work on this. If you look at the above commit, the changes to the GtkBuilder file look huge, but that’s only because I haven’t touched it for ages and the current Glade shuffled the elements quite a bit; it just adds the button to the dialog.

For now this is all sitting in trunk; I’ll do a new upstream release and an Ubuntu precise upload soon.

Happy debugging!

Read more
pitti

The tool to reprocess an Apport crash report to produce a symbolic stack trace, apport-retrace, has been pretty hard to use on a developer system so far: it either installed the packages from the crash report, plus their debug symbol packages (“ddebs”), into the running system (which frequently caused problems like broken dependencies), or it required setting up a chroot and using apport-chroot with fakechroot and fakeroot.

I’m happy to announce that with Apport 1.22, which landed in Oneiric yesterday, this has now become much easier: in the default mode it just calls gdb on the report’s core dump, i. e. it expects that all the necessary packages are already installed and will complain about the missing ones. But with the new --sandbox/-S mode, it will just create a temporary directory, download and unpack packages there, and run gdb with some magic options to consider that directory a “virtual root”. These options weren’t available back when this stuff was first written, which is why it used to be so complicated with fakechroots, etc. Now this does not need any root privileges, chroot() calls, etc.

As it only downloads and installs the bare minimum, and does not involve any of the dpkg/apt overhead (maintainer scripts, etc.), it has also become quite a lot faster. That’s how the apport retracers were able to dig through a backlog of about a thousand bugs in just a couple of hours.

So now, if you locally want to retrace or investigate a crash, you can do

   $ apport-retrace -s -S system /var/crash/_usr_bin_gedit.1000.crash

to get the stack traces on stdout, or

   $ apport-retrace -g -S system /var/crash/_usr_bin_gedit.1000.crash

to be put into a gdb session.

If you do this regularly, it’s highly recommended to use a permanent cache dir, where apt can store its indexes and downloaded packages: Use -C ~/.cache/apport-retrace for this (or the long version --cache).
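
For example, combining the gdb mode from above with a persistent cache directory:

   $ apport-retrace -g -S system -C ~/.cache/apport-retrace /var/crash/_usr_bin_gedit.1000.crash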

You can also use this to reprocess crashes for a different release than the one you are currently running, by creating a config directory with an appropriate apt sources.list.

The manpage has all the details. (Note that at the time of this writing, manpages.ubuntu.com still has the old version — use the local one instead.)

Enjoy, and let me know how this works for you!

Read more
brendandonegan

Every well-seasoned tester knows the advantages and disadvantages that manual/semi-manual and automatic tests have when compared to each other. A manual test is easy to create: just a few simple words and you have your test. Automatic tests allow you to (almost) fire and forget about them, with your only concern being the PASS/FAIL at the end. A semi-manual test is a funny hybrid of the two, usually only used in a situation where a fully automated test is almost physically impossible (e.g. verifying screenshots and tests involving peripherals). Manual tests are not good in situations where the same test must be run many times across a large number of configurations. This is exactly what we have in hardware certification, where we must run tests across ~100 systems on a very regular basis. To this end we’ve been taking the opportunity this development cycle to update some of our older tests to be more automated.

One of the tests that I updated was one which would cycle through the available resolutions on the system (using the xrandr tool) and request the tester to verify that they all looked okay with no graphical corruption. This is the sort of test that is fine when someone is running the tests on a one-off basis; it’s not so good when one tester needs to supervise 50+ systems during a certification run. One of the main problems is that it causes too much context switching, with the tester constantly needing to keep an eye on all the systems to see if they’ve reached this test yet. Obviously, it being a graphical test, it’s difficult to do fully automated verification, so a compromise needed to be reached. The solution I came up with was to integrate screen capture into the test and then upload these screens in a tgz file as an attachment with the test submission. All going well, the tester can sit down at their own computer, go through the screens and confirm they’re okay. In fact the person verifying the screens doesn’t even need to be in the lab! The task can be distributed amongst any number of people, anywhere in the world.
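
A simplified sketch of that approach (the real test lives in the certification tooling; the fixed mode list and the tool choices here, like ImageMagick’s import, are just for illustration):

   import os
   import subprocess
   import tarfile
   import tempfile

   workdir = tempfile.mkdtemp()
   shots = []
   # illustrative fixed list; the real test would query xrandr for available modes
   for mode in ['1024x768', '800x600']:
       subprocess.check_call(['xrandr', '-s', mode])
       path = os.path.join(workdir, 'screen-%s.png' % mode)
       subprocess.check_call(['import', '-window', 'root', path])
       shots.append(path)

   # pack the captured screens for attaching to the test submission
   with tarfile.open(os.path.join(workdir, 'screens.tgz'), 'w:gz') as tar:
       for path in shots:
           tar.add(path, arcname=os.path.basename(path))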

Another test that looked like a prime candidate for automation was one for testing the functioning of the wireless card before and after suspending the system. Previously the test case was:

- Disconnect the wireless interface.
- Reconnect and ensure you’re online.
- Suspend the system.
- Repeat the first two steps.

This was all specified to be done manually. I am currently updating this test to use nmcli to make sure a connection can be made, then disconnect and reconnect just as would happen if the tester did the steps manually using nm-applet. The one thing I haven’t got down pat yet is connecting to a wireless network where a connection didn’t exist before. This step may be optional, as it could be expected that the tester will do this manually at some point during the setup of the tests and we can trust a connection to be available already. This will mean this test has gone from manual to fully automated, and should hopefully shave a significant number of minutes off the whole test run!
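
A minimal sketch of the nmcli-based approach (the connection name is a made-up example; the real test would discover it, and the suspend step is left out here):

   import subprocess

   def cycle_wireless(connection_id='my-wifi'):
       # tear the known connection down and bring it back up, as nm-applet would
       subprocess.check_call(['nmcli', 'con', 'down', 'id', connection_id])
       subprocess.check_call(['nmcli', 'con', 'up', 'id', connection_id])

   cycle_wireless()
   # ... suspend and resume the system here ...
   cycle_wireless()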

Saving time on our existing tests will allow us to introduce new tests where appropriate, so we’re able to provide even more thorough certification testing.


Read more
pitti

Hot on the heels of the announcement of the second 9.1 Beta release, there are now packages for it in Debian experimental, and backports for Ubuntu 10.04 LTS, 10.10, and 11.04 in my PostgreSQL backports for stable Ubuntu releases PPA.

Warning for upgrades from Beta 1: The on-disk database format has changed since Beta 1. So if you already have the Beta 1 packages installed, you need to pg_dumpall your 9.1 clusters (if you still need them) and pg_dropcluster all 9.1 clusters before the upgrade. I added a check to the pre-install script to make the postgresql-9.1 package fail early on upgrade if you still have existing 9.1 clusters, to avoid data loss.
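
The upgrade steps boil down to something like this (a cluster named “main” is assumed here):

   $ pg_dumpall --cluster 9.1/main > pg9.1-beta1-backup.sql   # only if you still need the data
   $ pg_dropcluster --stop 9.1 main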

Read more
brendandonegan

Hug a Bug

When I worked at the Symbian Foundation, as part of the (Symbian) Bug Squad activities that I helped run, we would (try to) have regular get-togethers on IRC where the community would come together and work on something in particular. Mainly this was getting triaging done. We didn’t have the benefit of a lot of experience, so this would be done in something of an ad-hoc way, with everyone discussing each bug’s status and priority until we reached a conclusion.

Now that I’m at Canonical and trying to participate heavily in Ubuntu’s Bug Squad activities, it’s comforting to know that something similar goes on here (maybe we were subconsciously influenced by it?). It also happens to be on the same day of the week (Thursday). I’m of course referring to Hug Days, which are co-ordinated by the QA team. I’ve been involved in them over the last few weeks as a participant (rather than an organiser) and I find the structure to be very good and very accessible. Quite simply, there is a list of bugs with different statuses (New, Confirmed or Incomplete) and simple instructions on what to do with each bug. New bugs need to be either Confirmed or set to Incomplete if you find you need to ask the reporter for extra details to be able to reproduce the bug. Confirmed bugs need to be revisited and checked to make sure the bug is still happening, leading to the bug either being Triaged, or set back to Incomplete if it’s not happening and you need the reporter to reconfirm. Lastly, Incomplete bugs should be checked for a response from the reporter to the information request. If they gave the necessary info then the bug should be Confirmed; if not, a follow-up question should be asked and the bug left as Incomplete.

Some tools that are handy to have to assist with the process of going through all these bug reports and updating them correctly are the Hug Day tools, which semi-automate the process of ‘closing’ Hug Day bugs (they aren’t being closed as bugs, but as tasks on the Hug Day), as well as the Firefox Launchpad Improvements, which are useful not just for Hug Days but for any bug work. The improvements include canned bug comments for common scenarios, such as when an inexperienced bug filer has provided little info on the bug and you need to ask them to provide simple steps to reproduce it.

Each Hug Day is based on a particular package (which helps to focus the effort) and this week’s Hug Day is on Nautilus, Ubuntu’s file browser. I have been and will be participating in this as much as I can, so if you decide to participate then say hi on Freenode IRC #ubuntu-bugs, where there are lots of knowledgeable Ubuntu people waiting to help out newcomers with the task at hand. See you there!


Read more
pitti

Two weeks ago, PostgreSQL announced the first beta version of the new major 9.1 version, with a lot of anticipated new features like synchronous replication or better support for multilingual databases. Please see the release announcement for details.

Due to my recent move and the Ubuntu Developer Summit it took me a while to package them for Debian and Ubuntu, but here they are at last. I uploaded postgresql-9.1 to Debian experimental; currently the packages are sitting in the NEW queue, but I’m sure our tireless Debian archive admins will get to them in a few days. I also provided builds for Ubuntu 10.04 LTS, 10.10, and 11.04 in my PostgreSQL backports for stable Ubuntu releases PPA.

I provided full postgresql-common integration, i. e. you can use all the usual tools like pg_createcluster, pg_upgradecluster etc. to install 9.1 side by side with your 8.4/9.0 instances, attempt an upgrade of your existing instances to 9.1 without endangering the running clusters, etc. Fortunately this time there were no deprecated configuration options, so pg_upgradecluster does not actually have to touch your postgresql.conf for the 9.0 → 9.1 upgrade.

They pass upstream’s and postgresql-common’s integration test suites, so they should work reasonably well. But please let me know about anything that doesn’t, so that we can get them in perfect shape in time for the final release.

I anticipate that 9.1 will be the default (and only supported) version in the next Debian release (wheezy), and will most likely be the one shipped in the next Ubuntu LTS (in 12.04). It might be that the next Ubuntu release 11.10 will still ship with 9.0, but that pretty much depends on how many extensions get ported to 9.1 by feature freeze.

Read more
pitti

Apport has provided built-in support for automatically identifying and marking duplicate bug reports for normal signal crashes as well as Python crashes. However, we have more kinds of bug reports submitted through Apport which could benefit from automatic duplicate detection: X.org GPU freezes, package installation failures, kernel oopses, or gcc internal compiler errors, i. e. pretty much everything that gets reported automatically these days.

The latest Apport 1.20 (which also just hit current Ubuntu Natty) now allows package hooks to set a special field DuplicateSignature, which abstracts the concept for other kinds of bug reports where Apport doesn’t do automatic duplicate detection. This field should both uniquely identify the problem class (e. g. “XorgGPUFreeze”) and the particular problem, i. e. contain variables which tell this instance apart from different problems. Aside from these requirements, the value can be any free-form string; Apport only treats it as an opaque value. It doesn’t even need to be ASCII-only or a single line, but for easier human inspection I recommend that.

So your package hook could do something like

   report['DuplicateSignature'] = 'XorgGPUFreeze: instruction %s regs:%s:%s:%s' % (
                     current_instruction, regs[0], regs[1], regs[2])

or

    report['DuplicateSignature'] = 'PackageFailure: ' + log.splitlines()[-1]

This is integrated into Apport’s already existing CrashDatabase class, which maintains a signature → master bug mapping in a SQLite database. So far these signatures were the crash signatures (built from the executable name, the signal number, and the topmost 5 function names of the stack trace). As usual, if an incoming report defines a duplicate signature (from the crash stack trace or from DuplicateSignature), the first one will become the master bug, and all subsequent reports will automatically get closed as duplicates in Launchpad.
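
Conceptually, the lookup amounts to something like this toy mapping (greatly simplified compared to the real SQLite-backed CrashDatabase):

   known_signatures = {}   # duplicate signature -> master bug number

   def check_duplicate(bug_id, signature):
       master = known_signatures.get(signature)
       if master is None:
           known_signatures[signature] = bug_id
           return None      # first report: becomes the master bug
       return master        # later reports: closed as duplicates of master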

Thanks to Bryce Harrington, who already came up with using this in the latest Intel X.org graphics driver for GPU hangs!

Read more

Thanks to the hard work of Brian Murray we now have a section for Unity on http://status.qa.ubuntu.com

And here it is for the package level:

You’ll notice the “bitesize” section; more on this next week.

Read more
pitti

PostgreSQL 9.0 with a whole lot of new features and improvements is nearing completion. The first release candidate was just announced.

As with the beta versions, I uploaded RC1 to Debian experimental again. If you want to test/use them on Ubuntu 10.04 (Lucid Lynx), you can get packages from my “PostgreSQL backports for stable Ubuntu releases” PPA. Please let me know if you need them for other releases.

Just for the record, both Debian 6.0 “Squeeze” and Ubuntu 10.10 “Maverick Meerkat” will release with and officially support 8.4 only, as 9.0 came too late for the feature freezes of both. Also, it will take quite some time to update all the packaged extensions to 9.0. As usual, 9.0 will be provided as official backports for both Debian and Ubuntu.

Happy testing!

Read more
pitti

The Debian import freeze is behind us, the first rush of major changes went into Maverick, and the dust has now settled a bit. Thus it’s time to turn some attention back to crashes and quality in general.

This morning I created maverick chroots for the Apport retracers, and they are currently processing the backlog. I also uploaded a new Apport package which now enables crash reporting by default again.

Happy segfaulting!

Read more
pitti

PostgreSQL did microrelease updates three weeks ago: 8.4.3, 8.3.10, and 8.1.20 are the ones relevant for Debian/Ubuntu. There haven’t been reports about regressions in Debian or the upstream lists so far, so it’s time to push these into stable releases.

The new releases are in Lucid Beta-2, and hardy/jaunty/karmic-proposed. If you are running PostgreSQL, please upgrade to the proposed versions and give feedback to LP #557408.

Updates for Debian Lenny are prepared as well, and await release team ack.

On a related note, I recently fixed quite a major problem in pg_upgradecluster in postgresql-common 106: it did not copy database-level ACLs and configuration settings (Debian #543506). Fixing this required some re-engineering of the upgrade process. It’s all thoroughly covered by test cases, but practical feedback would be very welcome! Remember, if anything goes wrong, the cluster of the previous version is still intact and untouched, so you can run upgrades as many times as you like and only pg_dropcluster the old one when you’re completely satisfied with the upgrade.

Thanks,

Martin

Read more