Canonical Voices

What Michael Hudson talks about

Michael Hudson-Doyle

I wanted to use PostgreSQL‘s array_agg function from Django to help bound the number of queries a complex page makes.

I managed to make it work, but it was a bit painful. I’d love to hear from anyone else that’s managed to make it work, especially if you didn’t have to hack as hard as me :-)

First, let’s define some models for us to play with:

from django.db import models

class Location(models.Model):
    name = models.CharField(max_length=1024)

class Hurricane(models.Model):
    year = models.IntegerField()
    name = models.CharField(max_length=1024)
    location = models.ForeignKey(Location)
 

The problem I want to solve is: for a given Location, show the hurricanes that occurred for the last ten years in which there were hurricanes.

In this simplified case, this is probably quite easy to solve without being as fancy as I am about to be, but the thing that makes it non-trivial is the bit at the end: displaying from the last ten years is easy (“WHERE YEAR > 2002“) but there may be years where there were not hurricanes. The trick I want to use is to GROUP ON year, use array_agg to collect the ids of the hurricanes for each year, and use LIMIT to ensure I only get 10 years worth. Then I can gather the Hurricane ids up in Python code and issue another query to fetch them all in a single query. I’m aiming for something like this:

hurr=> SELECT year, array_agg(id) FROM hurr_hurricane
hurr-> WHERE hurr_hurricane.location_id = 1
hurr-> GROUP BY year ORDER BY year LIMIT 10;
 year | array_agg
------+-----------
 2004 | {2,1}
 2006 | {3}
 2007 | {4}
(3 rows)
 

I’ve never quite gotten my head around Django’s support for aggregates, but it’s not hard to get Django to emit a query of the right shape:

>>> qs = Hurricane.objects.filter(
...     location=loc).values('year').annotate(Sum('id'))[:10]
>>> print qs.query
SELECT "hurr_hurricane"."year", SUM("hurr_hurricane"."id") AS "id__sum"
FROM "hurr_hurricane" WHERE "hurr_hurricane"."location_id" = 1
GROUP BY "hurr_hurricane"."year", "hurr_hurricane"."year"
ORDER BY "hurr_hurricane"."year" LIMIT 10
>>> pprint(list(qs))
[{'id__sum': 3, 'year': 2004},
 {'id__sum': 3, 'year': 2006},
 {'id__sum': 4, 'year': 2007}]
 

We don’t want to SUM though, we want to array_agg. I don’t really know to what extent it’s actually supported, but googling can find enough of a clue on how to use custom aggregates with Django. Culting the approprate cargo leads to:

from django.db.models.sql.aggregates import Aggregate as SQLAggregate
from django.db.models import Aggregate

class SQLArrayAgg(SQLAggregate):
    sql_function = 'array_agg'

class ArrayAgg(Aggregate):
    name = 'ArrayAgg'
    def add_to_query(self, query, alias, col, source, is_summary):
        klass = SQLArrayAgg
        aggregate = klass(
            col, source=source, is_summary=is_summary, **self.extra)
        query.aggregates[alias] = aggregate
 

and then:

>>> qs = Hurricane.objects.filter(
...     location=loc).values('year').annotate(ArrayAgg('id')).order_by('year')[:10]
>>> print qs.query
SELECT "hurr_hurricane"."year", array_agg("hurr_hurricane"."id") AS "id__arrayagg"
FROM "hurr_hurricane" WHERE "hurr_hurricane"."location_id" = 1
GROUP BY "hurr_hurricane"."year", "hurr_hurricane"."year"
ORDER BY "hurr_hurricane"."year" ASC LIMIT 10
 

Yay! Except:

>>> list(qs)
[]
 

Huh? This sent me around the houses for ages and ages, but it turns out that there is an easy way of seeing the problem:

>>> for hurricane in qs:
...    print hurricane
Traceback (most recent call last):
    ...
    ...
File "/srv/lava/.cache/eggs/Django-1.4.1-py2.7.egg/django/db/models/sql/query.py", line 316, in convert_values
    return connection.ops.convert_values(value, field)
File "/srv/lava/.cache/eggs/Django-1.4.1-py2.7.egg/django/db/backends/__init__.py", line 843, in convert_values
    return int(value)
TypeError: int() argument must be a string or a number, not 'list'
 

(Something in executing list(qs) is swallowing the exception – I presume this is a bug, I’ll file it soon if it’s unreported.)

So let’s look at where the exception comes from (django/db/backends/__init__.py, Django version 1.4.1 below):

def convert_values(self, value, field):
    """Coerce the value returned by the database backend into a consistent type that
    is compatible with the field type.
    """
    internal_type = field.get_internal_type()
    if internal_type == 'DecimalField':
        return value
    elif internal_type and internal_type.endswith('IntegerField') or internal_type == 'AutoField':
        return int(value)
    elif internal_type in ('DateField', 'DateTimeField', 'TimeField'):
        return value
    # No field, or the field isn't known to be a decimal or integer
    # Default to a float
    return float(value)
 

Django appears to assume that all aggregates return numeric types. This is a bit annoying (another bug to file?), but there is a slimy hack we can pull: DecimalFields are assumed to be the correct type already and do not get converted. So here’s the final, working, array_agg support:

from django.db.models import DecimalField

class SQLArrayAgg(SQLAggregate):
    sql_function = 'array_agg'

class ArrayAgg(Aggregate):
    name = 'ArrayAgg'
    def add_to_query(self, query, alias, col, source, is_summary):
        klass = SQLArrayAgg
        aggregate = klass(
            col, source=source, is_summary=is_summary, **self.extra)
        aggregate.field = DecimalField() # vomit
        query.aggregates[alias] = aggregate
 

And to prove it works:

>>> pprint(list(qs))
[{'id__arrayagg': [2, 1], 'year': 2004},
 {'id__arrayagg': [3], 'year': 2006},
 {'id__arrayagg': [4], 'year': 2007}]
 

Yay! Now: does it really have to be this hard?

Read more
Michael Hudson-Doyle

The Ubuntu developers and, because Linaro started out with a fork of Ubuntu’s processes, the Linaro developers track much of their work in blueprints on Launchpad and work items in those blueprints.  We’ve even built a fairly sophisticated web site that tracks work item completion in a manager friendly way.

Editing these work items is a fairly tedious job however.  The syntax is fairly simple but still easy to get wrong, and fiddling around in a textarea just to change one work item to DONE is not very friendly.  So, after a challenge laid down by James, I’ve built a Greasemonkey script that adds a button to blueprint pages which opens an editor that lets you easily change the status or and add work items to the blueprint.

In action, it looks like this (it uses the LP JavaScript widgets, so it almost feels like a native part of Launchpad):

There are probably a few bugs, and certainly things it doesn’t do (off the top of my head: track the milestone a work item is allocated to, let you delete or reorder work items or allow you to change the assignee of a workitem with a nice person picker) but I think it will save us all a bit of time every day even in its current state.

The script is now part of the launchpad-gm-scripts project on Launchpad.  I don’t really know the best way of installing greasemonkey scripts yet.  Grabbing the branch, opening it in Nautilus and dragging it into a firefox window worked for me.

Chrome/chromium has a similar extension, but it hasn’t been tested there.  If you do, I’d love to know if it works :)  Please submit bug reports or merge proposals on LP if you find problems!

Read more
Michael Hudson-Doyle

It seems like it’s taken a long time to get all the pieces hooked up, but I’m very happy to report that finally you can see the results of testing an android build in LAVA directly in the build page!  If you go to a build page for a build that is automatically tested such as https://android-build.linaro.org/builds/~linaro-android/panda/#build=250 (after a little while for some ajax to happen), you should see a table listing the test runs we ran and the summary of the pass/fail counts:

It may not seem all that earth shattering, but there have been many bits and pieces that needed to be put together to get this to work:

  • We needed to build the scheduler, so that we could submit this job to run on the first Panda board to become available.
  • We needed to build the infrastructure to allow the job to be submitted with some measure of security from a remote host.
  • We needed to work on the build system scripts to submit the job without disclosing the authorization token.
  • We needed to add views to the scheduler and dashboard that present data in a ajax-friendly way.
  • We needed to work on the build system frontend to make uses of these views and format the data.
  • And finally, we needed to test and deploy all of these changes.

So I think I’m justified in being happy to have this finally working in production :-)  Of course, it’s just a start: we want to build similar facilities for the other working groups to use, if nothing else.

Read more
Michael Hudson-Doyle

New stuff in LAVA 2011.08

We’ve just deployed the 2011.08 release of the LAVA components to http://validation.linaro.org.  It feels like we’re getting to the point where we can start working on the really interesting features now, which is nice after what feels like months of building infrastructure.  While the official release highlights are a good list of what happened, here is a summary from my point of view of the changes from what was previously deployed:

lava-dashboard

  • I added a put_ex method for uploading a bundle and getting the URL of the bundle in response rather than just its SHA1.
  • The main change though is a shiny new image status page.

lava-dispatcher

  • I changed the dispatcher to print out the URL of the result bundle (on a specific fd, so the scheduler can cleanly see it in amongst all the useful but noisy stuff the dispatcher usually outputs)
  • Le Chi Thu worked on being able to define test suites to run outside the lava-dispatcher/lava-test source code.
  • There were also sundry changes to make things more reliable.

lava-scheduler

  • Many changes here – I spent most of the month hacking on the scheduler
  • Record the link to the results as output by the dispatcher and show it on the job page.
  • I made the scheduler daemon reconnect to the database if the connection fails or drops.  This means we could finally change to starting the daemon on boot, without having to care if postgres was fully initialized or not.
  • I changed the way the dispatcher is run to allow the scheduler daemon to be restarted while there are running jobs.
  • Jobs can now be cancelled while they are in the submitted or running states.
  • There is now a simple device view, and admins can ask for no more jobs to be submitted to a board.

lava-server

  • The main changes here were to allow the other LAVA components to provide content to put on the front page.

lava-test

  • The work to support out-of-tree tests had an impact on lava-test, too
  • We added support for the smem, bootchart and xrestop tests, as well as the usual reliability fixes.
  • The bootchart test for a while managed to overwrite the kernel for the master/recovery partition of the SD cards, which was a problem, especially when the new kernel had broken ethernet support…

What’s on for next month?  Well from my side, it’s likely to be still around ‘joining the dots’, starting with enabling the android team to see the results of a test run on the build page.  I think other likely developments are to be making our story for running tests on android a bit clearer, refactoring the way configuraton is done for the dispatcher (and making debian packages for it), supporting more hardware in the lab, putting the final pieces together to allow daily testing of the kernel working group output, documentation, email notifications of various things, and I’m sure various things that I’ve forgotten or haven’t cropped up yet.  It’s going to be another busy month!

Read more
Michael Hudson-Doyle

A few weeks ago now, most of the Linaro engineers met at “Linaro Connect”, the new name for our get-together.  Linaro bootstrapped its processes by borrow heavily from Ubuntu, including the “two planning meetings, two hacking meetings” pattern. Over the last year though it’s become clear that this isn’t totally appropriate for Linaro and while we’re sticking to the same number of meetings, 4 a year, each meeting now has the same status and will be a mix of planning and hacking.  Based on a sample size of 1, this seems to be a good idea – last week’s meeting was excellent.  Very intense, which is why I never got around to blogging during the event, but also very productive.

The validation team had a dedicated hacking room, and on Monday we set up a “mini-Lab” that we could run tests on.  This took a surprisingly (and pleasingly) short amount of time, although we weren’t as neat about the cabling as we are in the real lab:

The main awkwardness in this kind of setup where you are connecting to the serial ports via USB rather than a console server is that the device names of the usb serial dongles is not predictable, and so naming boards becomes a challenge.  Dave worked out a set of hacks to mostly make this work, although I know nothing about the details.

Now that a few weeks have passed I can’t really remember what we did next :)  There was a lot of hacking and a lot talking.  These are some things I remember:

  • I spent some time talking to the Android developers about getting the results of the tests to display on the build page. Luckily there were no new surprises and I managed to come up with a plan for getting this to work (have the process that runs the tests and uploads the bundle to the dashboard print out the URL to the result bundle and have the lava scheduler read this and record the link).
  • We all talked to the kernel team about how to test their work on an automated basis.
  • I talked to Michael Hope about the toolchain builds that are currently done in his basement, although we mostly deferred that conversation until after the event itself.
  • There was a lot of talk about making the front page of the validation server show something more useful.
  • I implemented a prototype for replacing QATracker with something that could guide a user through manual tests and upload the results directly to the dashboard.
  • We talked to ARM about possibly using some of the LAVA components we have built for their internal testing,
  • There was talk about the practicalities of using the LAVA lab to measure the effect of power management changes.

I’m sure there was lots of other stuff, but this should give some impression of how much went on!

Read more
Michael Hudson

This is the first of a hopefully weekly series of posts describing the work my team is doing.  This means that this post is probably mostly background about the team’s goals, but in the coming weeks I intend to outline what we’ve done in the past week and plans for the next week.

We’re all about validation obviously – telling whether the code the other Linaro engineers are producing “works” in whatever sense that means.  It could be a simple compile or boot test for the kernel, testing whether the code produced by gcc is smaller or faster, whether a kernel scheduler change reduces power consumption for a certain workload, or many other things.

Beyond simple validation though, what we’re really about is automated validation.  We want to build and test the kernel on all supported boards every day.  We want to build and test proposed android changes in gerrit before they are landed, and the same for the gcc work.

We have built up a validation lab in Cambridge – the boards from the Linaro members we want to test on, but also Cyclades serial console servers, routers, and a few servers.  It looks a bit like this:

The thing that makes our task more complicated than “just install jenkins” is the hardware we run on, of course, and the fact that for many of our tests we need to boot a custom kernel on said hardware.  We’ve written a program (called “lava-dispatcher” or just “the dispatcher”) that knows how to install a custom hwpack and root filesystem by manipulating a board over a serial link, another (“lava-scheduler”) that handles incoming requests to run tests and runs the dispatcher as appropriate and yet another (“lava-dashboard”, aka “launch-control”) that displays the results from the tests.  We’ve also built up a number of infrastructure projects that help us run these main three, and command line clients for most things.  You can see all the code at https://code.launchpad.net/lava and the running instance is at http://validation.linaro.org/lava-server.

So, what are we working on this week?  The main areas are to improve the UI of the scheduler – currently it runs jobs, but is very opaque about what they are doing, improving the front page to make it clearer to the uninitiated what validation is happening and improving the reliability of the dispatcher.  We’re also hard at work “joining the dots” so that the daily builds of Android that are already being produced can be tested daily, and have the build output and test results all visible from the same page.

Next week I’ll be at Linaro Connect in the UK, but I’ll try to update here on what we get done this week and what our plans are for the Connect.

Read more
Michael Hudson

As part of the infrastructure team for Linaro, I’m working on various kinds of build service — particular image builds and package builds.  Both of these involve running things like ‘apt-get update’ and ‘debootstrap’ a lot, and for as I’m in New Zealand and the servers these commands pull from are in London, a caching proxy between the board and the internet makes things go a _lot_ faster.

On Michael Hope‘s recommendation I installed polipo which seems to mostly work, although it seems to fall over whenever I sleep my laptop, which is a bit lame.  I have it configured to listen on 10.0.0.1 and answer requests from 127.0.0.1 and 10.0.0.0/24 (so I can point programs running on my laptop at it if I want/to test it’s working).  My config file looks like this:

proxyAddress = 10.0.0.1
allowedClients = 127.0.0.1, 10.0.0.0/24
cacheIsShared = false
dnsUseGethostbyname = happily

The only non-trivial thing to figure out was the dnsUseGethostbyname thing — this is required for hostnames in /etc/hosts to be resolved by the proxy, something that matters when the board is trying to download files from “librarian.launchpad.dev”.

Then it just remains to set http_proxy to http://10.0.0.1:8123/ in .bashrc on the board and allow the http_proxy environment variable to be passed though sudo by adding the following line using visudo:

Defaults        env_keep += "http_proxy"

Then just enjoy the better performance of the package manipulations (after the first time, at least).

Read more
Michael Hudson

Setting up users on my xM

This is easy stuff — and not at all ARM specific — but I never remember this sort of thing.  I set up my user account with:

# adduser mwh
# adduser mwh sudo

Then I made a ssh key, added it to my account on Launchpad and put the ssh key from my laptop into .ssh/authorized_keys.

Read more
Michael Hudson

Setting up networking on a Beagle xM

Setting up networking on the Beagle xM is a bit of a pain.  The ethernet port is wired up in some weird way via the USB subsystem (so the interface ends up being called “usb0″) and there’s no MAC address in the EEPROM on the card, so it gets a random MAC address every time you bring the interface up, which makes making DHCP do what you want something of a chore.

What follows is what I’ve done.  It may well not be anything like optimal, but it seems to work.  I access the internet here at the office via wifi, and wanted to share the connection to the xM via my laptop’s ethernet port.  There’s a wiki page about this but for whatever reason the network manager method of connection sharing didn’t work at all for me.

1. Choose a name for the board (I chose “bitty” for some reason).

2. Configure dhclient on the board to send this as its hostname when asking for a lease by adding “send host-name “bitty”;” to /etc/dhcp3/dhclient.conf

3. On the laptop, use Network Manager to give the ethernet port a static address of 10.0.0.1.

4. Install dnsmasq.  This is a DNS + DHCP server.

5. Configure dnsmasq.  Predictably, I can’t remember all the things I’ve changed in here.  I think I added “interface=eth0″, “dhcp-range=10.0.0.50,10.0.0.100″ and “dhcp-host=bitty,10.0.0.45,1h” to respectively answer dhcp on the wired ethernet port, do dhcp at all and to assign a particular address to the machine that calls itself “bitty”.

6. I then followed the iptables and other instructions from the “Gateway set up” section of the aforementioned wiki page.

7. Restarted dnsmasq.

10. Added “10.0.0.45 bitty” to /etc/hosts on the laptop

11. Run “dhclient usb0″ as root on the board (via the serial console).

You should now have network!  So you can run “apt-get update” and so on like a good citizen.

Looking at the above list of instructions, I don’t know why I bothered with DHCP and didn’t just configure everything statically on the board, especially as the above has the following problem: when you restart the board and run dhclient the board has a new MAC address,  but as the dhcp server thinks that 10.0.0.45 is still leased to the address the board had before the reboot it doesn’t hand out the desired.  The fix for this is “sudo rm /var/lib/misc/dnsmasq.leases” “sudo /etc/init.d/dnsmasq restart” but that seems like an utter hack.

That’s enough for one post, next will be a brief reminder (mostly for myself) on how to set up users from the command line and then I’ll talk about more me-specific things like setting up a caching proxy to make debootstrap go much faster.

Read more
Michael Hudson-Doyle

Installing Linaro for a Beagle xM

After I’d unpacked and booted my xM, I wanted to install a Linaro daily build on it.  This was actually fairly complicated for me because of a few bugs I ran into on the way, but as they’re all fixed now I’ll describe the way it’s meant to work :-)

Basically, the instructions on the wiki are what you want.  You can download the latest snapshot from http://snapshots.linaro.org/10.11-daily/linaro-headless/ (which is what I’d recommend at this point; I can state that the 20100915-1 build works for me) or you can navigate your way to a more official release from https://wiki.linaro.org/Releases (but don’t use the Linaro 10.11 Beta — it has a not very serious kernel bug that makes upgrades harder than needed on xMs).

Once you’ve downloaded the file (using dlimage or just boringly) and run linaro-media-create with a command line like

sudo linaro-media-create --dev beagle --rootfs ext3 --mmc /dev/sdb \
    --binary ~/Downloads/linaro-m-headless-tar-20100915-1.tar.gz

(make sure you get the –mmc bit right!), pop the card into your board, power it up and with the serial console connected run “screen /dev/ttyUSB0 115200″ again.  The Linaro image is slightly different to the one that comes with the board in that you get a root prompt directly on the serial console, no need to log in.

As an aside, when I want to boot on a different card, I usually type ‘poweroff’ on the serial console, pull the card out, pop the new one in and press the reset button.  I don’t know if this is the best process :-)  There is a kernel bug that prevents clean shutdown after the card has been on for a while, but it happens late enough in the shutdown process that I ignore it.

Next up, I’ll talk about how I set up my cards for networking and general user-level hackery.

Read more
Michael Hudson

As I said in my previous post, I recently received a Beagle Board xM to test some of my work stuff on and also just to get a bit more familiar with the world of ARM development.

When I got the board, I had no idea what to do with it.  None.  There didn’t seem to be a guide that I could find with google for people with my level of utter inexperience, so I thought I’d try to write one up.

First things first, you need some gadgets.  Well, to get started the only thing you really need is a USB->Serial adapter.  I don’t know much about these gadgets but was recommended this one and it works fine for me.  While you’re in the shop, you’ll probably want to buy a handful of microSD cards.  The faster and bigger the better probably, but 2 gigs or bigger is fine for playing around.  You’ll need some way of writing to these cards as well, of course.

The xM comes with a microSD card that contains “The Ångström Distribution”.  I don’t know what this is really, but it’s good to test that the board works, so pop the card in the slot in the board and plug the board in (I agonized for ages about this — I didn’t want to just plug it in in case a brief power contact would damage the board or something, but it seems to be the thing to do).  Then connect up the USB/serial adapter and run “screen /dev/ttyUSB0 115200″.  If the board has booted fully, you should get to a tty login.  Type ‘root’ as the user name and you’re in!  This default install is fairly plain — it’s a fairly minimal linux using busybox.  But if it boots, the board likely at least somewhat works.

Next up is installing a Linaro daily build on the board and getting networking working so that you can install all the fun software that makes up Ubuntu!

If anything in this post strikes you as wrong or unclear, please let me know in the comments.

Read more
Michael Hudson

My new(-ish) job

After three years of working on Launchpad, back in May I transferred over to working for Linaro.  On the face of it, this was a large move, moving from hacking on a programmer-supporting tool to making Linux work better on ARM processors, but I’m not working on the kernel or anything, in fact I’m still working on programmer-supporting tools, even hacking on Launchpad a bit from time to time.

Last week I received an ARM board of my very own to play with, a Beagle Board xM.  My first steps with this will be the subject of my next blog post…

Read more
Michael Hudson

Hello world!

Welcome to Canonical Voices. This is your first post. Edit or delete it, then start blogging!

Read more