Canonical Voices

What Michael Hudson talks about

Posts tagged with 'linaro'

Michael Hudson-Doyle

Using postgresql’s array_agg from Django

I wanted to use PostgreSQL’s array_agg function from Django to help bound the number of queries a complex page makes.

I managed to make it work, but it was a bit painful. I’d love to hear from anyone else who’s managed to make it work, especially if you didn’t have to hack as hard as me 🙂

First, let’s define some models for us to play with:

from django.db import models

class Location(models.Model):
    name = models.CharField(max_length=1024)

class Hurricane(models.Model):
    year = models.IntegerField()
    name = models.CharField(max_length=1024)
    location = models.ForeignKey(Location)

The problem I want to solve is: for a given Location, show the hurricanes that occurred for the last ten years in which there were hurricanes.

In this simplified case, this is probably quite easy to solve without being as fancy as I am about to be, but the thing that makes it non-trivial is the bit at the end: displaying hurricanes from the last ten years is easy (“WHERE year > 2002”) but there may be years in which there were no hurricanes. The trick I want to use is to GROUP BY year, use array_agg to collect the ids of the hurricanes for each year, and use LIMIT to ensure I only get 10 years’ worth. Then I can gather the Hurricane ids up in Python code and fetch them all in a single additional query. I’m aiming for something like this:

hurr=> SELECT year, array_agg(id) FROM hurr_hurricane
hurr-> WHERE hurr_hurricane.location_id = 1
hurr-> GROUP BY year ORDER BY year LIMIT 10;
 year | array_agg
------+-----------
 2004 | {2,1}
 2006 | {3}
 2007 | {4}
(3 rows)
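
To make the intent of that query concrete, here is roughly the same grouping done in plain Python. This is only an illustration of what GROUP BY plus array_agg plus LIMIT computes; the function name and the sample rows are made up for the example, and the whole point of the SQL version is to avoid pulling every row into Python like this.

```python
from collections import defaultdict

def ids_by_year(rows, limit=10):
    """Group (year, id) pairs by year, like GROUP BY year + array_agg(id).

    Returns at most `limit` (year, [ids]) pairs in ascending year order,
    mirroring ORDER BY year LIMIT 10.
    """
    grouped = defaultdict(list)
    for year, hurricane_id in rows:
        grouped[year].append(hurricane_id)
    return sorted(grouped.items())[:limit]

# Hypothetical (year, id) rows for one location, matching the output above.
rows = [(2004, 2), (2004, 1), (2006, 3), (2007, 4)]
print(ids_by_year(rows))
```

One difference worth knowing: the Python version keeps ids in input order, whereas PostgreSQL makes no promise about the order of elements inside each array unless the aggregate itself is given an ORDER BY.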

I’ve never quite gotten my head around Django’s support for aggregates, but it’s not hard to get Django to emit a query of the right shape:

>>> from django.db.models import Sum
>>> qs = Hurricane.objects.filter(
...     location=loc).values('year').annotate(Sum('id'))[:10]
>>> print qs.query
SELECT "hurr_hurricane"."year", SUM("hurr_hurricane"."id") AS "id__sum"
FROM "hurr_hurricane" WHERE "hurr_hurricane"."location_id" = 1
GROUP BY "hurr_hurricane"."year", "hurr_hurricane"."year"
ORDER BY "hurr_hurricane"."year" LIMIT 10
>>> pprint(list(qs))
[{'id__sum': 3, 'year': 2004},
 {'id__sum': 3, 'year': 2006},
 {'id__sum': 4, 'year': 2007}]

We don’t want to SUM though, we want to array_agg. I don’t really know to what extent it’s actually supported, but googling turns up enough clues on how to use custom aggregates with Django. Culting the appropriate cargo leads to:

from django.db.models.sql.aggregates import Aggregate as SQLAggregate
from django.db.models import Aggregate

class SQLArrayAgg(SQLAggregate):
    sql_function = 'array_agg'

class ArrayAgg(Aggregate):
    name = 'ArrayAgg'
    def add_to_query(self, query, alias, col, source, is_summary):
        klass = SQLArrayAgg
        aggregate = klass(
            col, source=source, is_summary=is_summary, **self.extra)
        query.aggregates[alias] = aggregate

and then:

>>> qs = Hurricane.objects.filter(
...     location=loc).values('year').annotate(ArrayAgg('id')).order_by('year')[:10]
>>> print qs.query
SELECT "hurr_hurricane"."year", array_agg("hurr_hurricane"."id") AS "id__arrayagg"
FROM "hurr_hurricane" WHERE "hurr_hurricane"."location_id" = 1
GROUP BY "hurr_hurricane"."year", "hurr_hurricane"."year"
ORDER BY "hurr_hurricane"."year" ASC LIMIT 10

Yay! Except:

>>> list(qs)

Huh? This sent me around the houses for ages and ages, but it turns out that there is an easy way of seeing the problem:

>>> for hurricane in qs:
...    print hurricane
Traceback (most recent call last):
File "/srv/lava/.cache/eggs/Django-1.4.1-py2.7.egg/django/db/models/sql/", line 316, in convert_values
    return connection.ops.convert_values(value, field)
File "/srv/lava/.cache/eggs/Django-1.4.1-py2.7.egg/django/db/backends/", line 843, in convert_values
    return int(value)
TypeError: int() argument must be a string or a number, not 'list'

(Something in executing list(qs) is swallowing the exception – I presume this is a bug, I’ll file it soon if it’s unreported.)

So let’s look at where the exception comes from (django/db/backends/, Django version 1.4.1 below):

def convert_values(self, value, field):
    """Coerce the value returned by the database backend into a consistent type that
    is compatible with the field type.
    """
    internal_type = field.get_internal_type()
    if internal_type == 'DecimalField':
        return value
    elif internal_type and internal_type.endswith('IntegerField') or internal_type == 'AutoField':
        return int(value)
    elif internal_type in ('DateField', 'DateTimeField', 'TimeField'):
        return value
    # No field, or the field isn't known to be a decimal or integer
    # Default to a float
    return float(value)
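
A stripped-down stand-in for that function makes the branches easy to poke at outside Django. This is a sketch, not Django's code: it takes the internal type name as a plain string instead of a field object, and the name convert_value is made up for the example.

```python
def convert_value(value, internal_type):
    """Simplified stand-in for the convert_values logic quoted above.

    Takes the field's internal type name directly instead of a field object.
    """
    if internal_type == 'DecimalField':
        return value
    elif (internal_type and internal_type.endswith('IntegerField')
          or internal_type == 'AutoField'):
        return int(value)
    elif internal_type in ('DateField', 'DateTimeField', 'TimeField'):
        return value
    return float(value)

array_value = [2, 1]  # what array_agg hands back for one year

# Passed through untouched when the field claims to be a DecimalField.
print(convert_value(array_value, 'DecimalField'))

# An integer-typed field sends the list through int(), which fails.
try:
    convert_value(array_value, 'AutoField')
except TypeError as exc:
    print('TypeError: %s' % exc)
```

Only the DecimalField and date/time branches hand the value back unchanged; every other path ends in int() or float(), neither of which accepts a list.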

Django appears to assume that all aggregates return numeric types. This is a bit annoying (another bug to file?), but there is a slimy hack we can pull: DecimalFields are assumed to be the correct type already and do not get converted. So here’s the final, working, array_agg support:

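from django.db.models import Aggregate, DecimalField
from django.db.models.sql.aggregates import Aggregate as SQLAggregate

class SQLArrayAgg(SQLAggregate):
    sql_function = 'array_agg'

class ArrayAgg(Aggregate):
    name = 'ArrayAgg'
    def add_to_query(self, query, alias, col, source, is_summary):
        klass = SQLArrayAgg
        aggregate = klass(
            col, source=source, is_summary=is_summary, **self.extra)
        # Pretend the result is a DecimalField so that convert_values
        # returns the list untouched instead of calling int() on it.
        aggregate.field = DecimalField() # vomit
        query.aggregates[alias] = aggregate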

And to prove it works:

>>> pprint(list(qs))
[{'id__arrayagg': [2, 1], 'year': 2004},
 {'id__arrayagg': [3], 'year': 2006},
 {'id__arrayagg': [4], 'year': 2007}]
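
With rows of that shape, the second step of the plan (gather the ids up in Python and fetch all the hurricanes in one more query) might look something like the sketch below. The helper names are invented for this example, and a plain dict stands in for the model instances so the snippet runs without a database; with Django, something like Hurricane.objects.in_bulk(ids) would supply the real mapping.

```python
def collect_ids(rows, key='id__arrayagg'):
    """Flatten the per-year id lists into one list for a single lookup."""
    return [pk for row in rows for pk in row[key]]

def hurricanes_by_year(rows, objects_by_id, key='id__arrayagg'):
    """Pair each year with its fetched objects, keeping the row order."""
    return [(row['year'], [objects_by_id[pk] for pk in row[key]])
            for row in rows]

rows = [{'id__arrayagg': [2, 1], 'year': 2004},
        {'id__arrayagg': [3], 'year': 2006},
        {'id__arrayagg': [4], 'year': 2007}]

ids = collect_ids(rows)
# With Django this would be one more query, e.g.
#     objects_by_id = Hurricane.objects.in_bulk(ids)
# Here a plain dict of made-up names stands in for the model instances.
objects_by_id = dict((pk, 'hurricane #%d' % pk) for pk in ids)
print(hurricanes_by_year(rows, objects_by_id))
```

That keeps the page at two queries no matter how many years or hurricanes are involved.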

Yay! Now: does it really have to be this hard?

Michael Hudson-Doyle

The Ubuntu developers and, because Linaro started out with a fork of Ubuntu’s processes, the Linaro developers track much of their work in blueprints on Launchpad and work items in those blueprints.  We’ve even built a fairly sophisticated web site that tracks work item completion in a manager-friendly way.

Editing these work items is a fairly tedious job, however.  The syntax is fairly simple but still easy to get wrong, and fiddling around in a textarea just to change one work item to DONE is not very friendly.  So, after a challenge laid down by James, I’ve built a Greasemonkey script that adds a button to blueprint pages which opens an editor that lets you easily change the status of work items and add new ones to the blueprint.

In action, it looks like this (it uses the LP JavaScript widgets, so it almost feels like a native part of Launchpad):

There are probably a few bugs, and certainly things it doesn’t do (off the top of my head: track the milestone a work item is allocated to, let you delete or reorder work items, or allow you to change the assignee of a work item with a nice person picker) but I think it will save us all a bit of time every day even in its current state.

The script is now part of the launchpad-gm-scripts project on Launchpad.  I don’t really know the best way of installing Greasemonkey scripts yet.  Grabbing the branch, opening it in Nautilus and dragging it into a Firefox window worked for me.

Chrome/Chromium has a similar extension, but the script hasn’t been tested there.  If you do try it, I’d love to know if it works 🙂  Please submit bug reports or merge proposals on LP if you find problems!

Michael Hudson-Doyle

Viewing lava results in android-build

It seems like it’s taken a long time to get all the pieces hooked up, but I’m very happy to report that finally you can see the results of testing an android build in LAVA directly in the build page!  If you go to a build page for a build that is automatically tested such as (after a little while for some ajax to happen), you should see a table listing the test runs we ran and the summary of the pass/fail counts:

It may not seem all that earth shattering, but there have been many bits and pieces that needed to be put together to get this to work:

  • We needed to build the scheduler, so that we could submit this job to run on the first Panda board to become available.
  • We needed to build the infrastructure to allow the job to be submitted with some measure of security from a remote host.
  • We needed to work on the build system scripts to submit the job without disclosing the authorization token.
  • We needed to add views to the scheduler and dashboard that present data in an ajax-friendly way.
  • We needed to work on the build system frontend to make use of these views and format the data.
  • And finally, we needed to test and deploy all of these changes.

So I think I’m justified in being happy to have this finally working in production 🙂  Of course, it’s just a start: we want to build similar facilities for the other working groups to use, if nothing else.

Michael Hudson-Doyle

A few weeks ago now, most of the Linaro engineers met at “Linaro Connect”, the new name for our get-together.  Linaro bootstrapped its processes by borrowing heavily from Ubuntu, including the “two planning meetings, two hacking meetings” pattern.  Over the last year though it’s become clear that this isn’t totally appropriate for Linaro, and while we’re sticking to the same number of meetings, 4 a year, each meeting now has the same status and will be a mix of planning and hacking.  Based on a sample size of 1, this seems to be a good idea – last week’s meeting was excellent.  Very intense, which is why I never got around to blogging during the event, but also very productive.

The validation team had a dedicated hacking room, and on Monday we set up a “mini-Lab” that we could run tests on.  This took a surprisingly (and pleasingly) short amount of time, although we weren’t as neat about the cabling as we are in the real lab:

The main awkwardness in this kind of setup, where you are connecting to the serial ports via USB rather than a console server, is that the device names of the USB serial dongles are not predictable, and so naming boards becomes a challenge.  Dave worked out a set of hacks to mostly make this work, although I know nothing about the details.

Now that a few weeks have passed I can’t really remember what we did next 🙂  There was a lot of hacking and a lot of talking.  These are some things I remember:

  • I spent some time talking to the Android developers about getting the results of the tests to display on the build page. Luckily there were no new surprises and I managed to come up with a plan for getting this to work (have the process that runs the tests and uploads the bundle to the dashboard print out the URL to the result bundle and have the lava scheduler read this and record the link).
  • We all talked to the kernel team about how to test their work on an automated basis.
  • I talked to Michael Hope about the toolchain builds that are currently done in his basement, although we mostly deferred that conversation until after the event itself.
  • There was a lot of talk about making the front page of the validation server show something more useful.
  • I implemented a prototype for replacing QATracker with something that could guide a user through manual tests and upload the results directly to the dashboard.
  • We talked to ARM about possibly using some of the LAVA components we have built for their internal testing.
  • There was talk about the practicalities of using the LAVA lab to measure the effect of power management changes.
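
The plan in the first item above (the test process prints the URL of the result bundle, and the scheduler reads it and records the link) could be sketched along these lines. Everything specific here is an assumption made for illustration: the marker text, the function name and the example URL are invented, and the real format used by the dispatcher and scheduler is not described in this post.

```python
import re

# Hypothetical marker line; the real format the test process prints is not
# described in this post.
RESULT_LINE = re.compile(r'^RESULT-BUNDLE-URL:\s*(\S+)\s*$')

def find_result_url(output_lines):
    """Return the first result-bundle URL found in the job output, or None."""
    for line in output_lines:
        match = RESULT_LINE.match(line)
        if match:
            return match.group(1)
    return None

sample_output = [
    'running tests...',
    'RESULT-BUNDLE-URL: http://dashboard.example.invalid/bundles/abc123/',
    'done',
]
print(find_result_url(sample_output))
```

The appeal of this kind of design is that the scheduler only has to watch the output it already captures, rather than needing a second channel back from the job.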

I’m sure there was lots of other stuff, but this should give some impression of how much went on!

Michael Hudson-Doyle

This is the first of a hopefully weekly series of posts describing the work my team is doing.  This means that this post is probably mostly background about the team’s goals, but in the coming weeks I intend to outline what we’ve done in the past week and plans for the next week.

We’re all about validation obviously – telling whether the code the other Linaro engineers are producing “works” in whatever sense that means.  It could be a simple compile or boot test for the kernel, testing whether the code produced by gcc is smaller or faster, whether a kernel scheduler change reduces power consumption for a certain workload, or many other things.

Beyond simple validation though, what we’re really about is automated validation.  We want to build and test the kernel on all supported boards every day.  We want to build and test proposed android changes in gerrit before they are landed, and the same for the gcc work.

We have built up a validation lab in Cambridge – the boards from the Linaro members we want to test on, but also Cyclades serial console servers, routers, and a few servers.  It looks a bit like this:

The thing that makes our task more complicated than “just install jenkins” is the hardware we run on, of course, and the fact that for many of our tests we need to boot a custom kernel on said hardware.  We’ve written a program (called “lava-dispatcher” or just “the dispatcher”) that knows how to install a custom hwpack and root filesystem by manipulating a board over a serial link, another (“lava-scheduler”) that handles incoming requests to run tests and runs the dispatcher as appropriate and yet another (“lava-dashboard”, aka “launch-control”) that displays the results from the tests.  We’ve also built up a number of infrastructure projects that help us run these main three, and command line clients for most things.  You can see all the code at and the running instance is at

So, what are we working on this week?  The main areas are improving the UI of the scheduler (currently it runs jobs, but is very opaque about what they are doing), improving the front page to make it clearer to the uninitiated what validation is happening, and improving the reliability of the dispatcher.  We’re also hard at work “joining the dots” so that the daily builds of Android that are already being produced can be tested daily, with the build output and test results all visible from the same page.

Next week I’ll be at Linaro Connect in the UK, but I’ll try to update here on what we get done this week and what our plans are for the Connect.

Michael Hudson-Doyle

Installing Linaro for a Beagle xM

After I’d unpacked and booted my xM, I wanted to install a Linaro daily build on it.  This was actually fairly complicated for me because of a few bugs I ran into on the way, but as they’re all fixed now I’ll describe the way it’s meant to work 🙂

Basically, the instructions on the wiki are what you want.  You can download the latest snapshot from (which is what I’d recommend at this point; I can state that the 20100915-1 build works for me) or you can navigate your way to a more official release from (but don’t use the Linaro 10.11 Beta — it has a not very serious kernel bug that makes upgrades harder than needed on xMs).

Once you’ve downloaded the file (using dlimage or just boringly) and run linaro-media-create with a command line like

sudo linaro-media-create --dev beagle --rootfs ext3 --mmc /dev/sdb \
    --binary ~/Downloads/linaro-m-headless-tar-20100915-1.tar.gz

(make sure you get the --mmc bit right!), pop the card into your board, power it up and with the serial console connected run “screen /dev/ttyUSB0 115200” again.  The Linaro image is slightly different to the one that comes with the board in that you get a root prompt directly on the serial console, no need to log in.
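
Since pointing --mmc at the wrong device would overwrite it, one possible sanity check before running the command is to ask sysfs whether the kernel considers the device removable. This is a sketch under assumptions: the helper name is made up, it relies on the Linux /sys/block/<device>/removable flag, and while USB card readers generally report 1 there, some built-in readers do not, so treat a False as "look more closely" rather than as proof of anything.

```python
import os

def is_removable(device, sysfs_root='/sys/block'):
    """Return True if sysfs marks the block device (e.g. 'sdb') removable.

    Returns False when the device is unknown or the flag is not '1'.
    """
    flag_path = os.path.join(sysfs_root, os.path.basename(device), 'removable')
    try:
        with open(flag_path) as flag_file:
            return flag_file.read().strip() == '1'
    except IOError:
        return False

print(is_removable('/dev/sdb'))
```

The sysfs_root parameter is only there so the check can be pointed at a fake directory tree when testing.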

As an aside, when I want to boot on a different card, I usually type ‘poweroff’ on the serial console, pull the card out, pop the new one in and press the reset button.  I don’t know if this is the best process 🙂  There is a kernel bug that prevents clean shutdown after the card has been on for a while, but it happens late enough in the shutdown process that I ignore it.

Next up, I’ll talk about how I set up my cards for networking and general user-level hackery.

Michael Hudson

As I said in my previous post, I recently received a Beagle Board xM to test some of my work stuff on and also just to get a bit more familiar with the world of ARM development.

When I got the board, I had no idea what to do with it.  None.  There didn’t seem to be a guide that I could find with google for people with my level of utter inexperience, so I thought I’d try to write one up.

First things first, you need some gadgets.  Well, to get started the only thing you really need is a USB->Serial adapter.  I don’t know much about these gadgets but was recommended this one and it works fine for me.  While you’re in the shop, you’ll probably want to buy a handful of microSD cards.  The faster and bigger the better probably, but 2 gigs or bigger is fine for playing around.  You’ll need some way of writing to these cards as well, of course.

The xM comes with a microSD card that contains “The Ångström Distribution”.  I don’t know what this is really, but it’s good to test that the board works, so pop the card in the slot in the board and plug the board in (I agonized for ages about this: I didn’t want to just plug it in in case a brief power contact would damage the board or something, but it seems to be the thing to do).  Then connect up the USB/serial adapter and run “screen /dev/ttyUSB0 115200”.  If the board has booted fully, you should get to a tty login.  Type ‘root’ as the user name and you’re in!  This default install is fairly plain: it’s a fairly minimal Linux using busybox.  But if it boots, the board likely at least somewhat works.

Next up is installing a Linaro daily build on the board and getting networking working so that you can install all the fun software that makes up Ubuntu!

If anything in this post strikes you as wrong or unclear, please let me know in the comments.

Michael Hudson

My new(-ish) job

After three years of working on Launchpad, back in May I transferred over to working for Linaro.  On the face of it, this was a large move, from hacking on a programmer-supporting tool to making Linux work better on ARM processors.  But I’m not working on the kernel or anything; in fact I’m still working on programmer-supporting tools, and even hack on Launchpad a bit from time to time.

Last week I received an ARM board of my very own to play with, a Beagle Board xM.  My first steps with this will be the subject of my next blog post…
