Canonical Voices

What jedimike's adventures in typing talks about

Posts tagged with 'voices'


Inside Canonical, we have projects going on and testers testing them all over the world. This means transferring daily builds of projects that can be gigabytes in size, to people who might not have a great deal of bandwidth.

We had been using rsync to do this, using a previous daily build as a seed so that rsync only transferred the data that had changed. However, that still meant transferring a lot of data, as rsync is not very good at calculating a delta between ISOs. Still, it did reduce the amount of data transferred by roughly 25-30%.

Zsync is a project that does it better. The deltas are significantly smaller for our use cases, and it works over HTTP, which means we don’t have to set up shell access for users.
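To make the idea of a seed concrete, here is a deliberately simplified sketch: split both files into fixed-size blocks, hash them, and only count the target blocks that the seed does not already contain. This is an illustration only, not how rsync or zsync are implemented (rsync, for instance, uses rolling checksums so it can match blocks at any offset), and the function names and the tiny block size are made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, purely for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return the set of hashes of every fixed-size block in data."""
    return {
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    }


def bytes_to_transfer(seed, target, block_size=BLOCK_SIZE):
    """Count the bytes of target whose blocks are not present in the seed."""
    known = block_hashes(seed, block_size)
    missing = 0
    for i in range(0, len(target), block_size):
        block = target[i:i + block_size]
        if hashlib.sha1(block).hexdigest() not in known:
            missing += len(block)
    return missing


seed = b"AAAABBBBCCCC"    # stands in for yesterday's build
target = b"AAAAXXXXCCCC"  # stands in for today's build
print(bytes_to_transfer(seed, target), "of", len(target), "bytes needed")
```

The better a tool is at finding blocks the seed already has, the less it needs to download, which is what the benchmark measures.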

We tested both rsync and zsync against two of our projects that get a lot of activity and have large ISOs to transfer to testers. Here are the results, showing how much data each tool transferred.

Target ISO size for Project X: 1976.3 MB

Age of seed ISO: 7 days
rsync transfer: 1486.7 MB
zsync transfer: 531 MB

Age of seed ISO: 2 days
rsync transfer: 1479.9 MB
zsync transfer: 375.9 MB

Target ISO size for Project Y: 1104 MB

Age of seed ISO: 7 days
rsync transfer: 758 MB
zsync transfer: 66.1 MB
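To put those figures on a common scale, the snippet below works out the share of each download that was avoided. The numbers are copied from the results above; the helper name percent_saved is just for this example.

```python
# (target size, rsync transfer, zsync transfer), all in MB, from the results above
benchmarks = {
    "Project X, 7-day-old seed": (1976.3, 1486.7, 531.0),
    "Project X, 2-day-old seed": (1976.3, 1479.9, 375.9),
    "Project Y, 7-day-old seed": (1104.0, 758.0, 66.1),
}


def percent_saved(target_mb, transferred_mb):
    """Percentage of the full download that did not need to be transferred."""
    return 100.0 * (1.0 - transferred_mb / target_mb)


for label, (target, rsync_mb, zsync_mb) in benchmarks.items():
    print(
        f"{label}: rsync saved {percent_saved(target, rsync_mb):.1f}%, "
        f"zsync saved {percent_saved(target, zsync_mb):.1f}%"
    )
```

Across these three runs rsync avoids roughly a quarter to a third of the download, while zsync avoids roughly three quarters or more.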

So for our use case, zsync seemed the obvious choice. There were a couple of barriers to using it though.

Our image archives are served over HTTPS and protected by OpenID authentication. The zsync client supports neither, as it has its own internal HTTP client. The project itself has not seen any activity since 2010, so the chances of a new version using libcurl getting released are pretty much zero.

There were a couple of projects attempting to update zsync, at various stages of completion, but they were either incomplete for our needs (i.e. missing authentication methods) or at the early stages of development.

So, we spun our own solution.

Zsync-curl is a fork of the zsync client that uses libcurl. To solve our OpenID problem, it allows you to set arbitrary cookies, so we have a script that authenticates against our OpenID provider and passes the authenticated session cookie when calling our libcurl-backed zsync binary.
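The cookie trick is easier to picture with a small example. Zsync fetches the pieces it is missing with HTTP range requests, so an authenticated download boils down to sending the session cookie along with each of those requests. The sketch below builds (but does not send) such a request using Python's standard library. It is not zsync-curl's code, which is C on top of libcurl, and the URL, cookie name and cookie value are placeholders invented for the example.

```python
import urllib.request


def build_range_request(url, session_cookie, first_byte, last_byte):
    """Build, without sending, a range request that carries a session cookie."""
    request = urllib.request.Request(url)
    # The cookie would come from the script that logs in to the OpenID provider.
    request.add_header("Cookie", session_cookie)
    # Ask only for the bytes that the seed file could not provide.
    request.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return request


request = build_range_request(
    "https://images.example.com/daily/project.iso",  # hypothetical URL
    "sessionid=PLACEHOLDER",  # hypothetical cookie name and value
    0,
    4095,
)
print(request.header_items())
```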

The zsync-curl packages install a new zsync_curl binary which sits alongside zsync and zsyncmake from the official zsync distribution.

All of our benchmarks show this will save a significant amount of data transfer, so we’re looking forward to getting it out to our testers around the world and seeing how much it saves in real world usage.


Django-group-access is an app for Django that controls row-level access to data based on group membership. This was perfect for a few projects we were developing in Canonical. So perfect, in fact, that we wrote django-group-access, and open sourced it.

It does a few nifty things. And these nifty things got us out of a tight spot where we had to redefine access control rules at short notice.

1) Declare access control rules, instead of having to code them in your views.

This means that if you want to control access to instances of MyModel, you just have to declare that MyModel is access controlled. MyModel will automatically pick up all the attributes needed to share it with groups, and have a user take ownership of it. It’s as simple as:

from django_group_access import register
from myapp.models import MyModel

# declare MyModel as access controlled
register(MyModel)

2) Optional middleware will automatically filter your querysets so your views do not have to be aware of access control.

In some access control apps (admittedly, ones that are more focussed on providing row-level permissions, rather than row-level access) you have to modify your code and models to be aware of the access control. For example, you might have to do something like this:

def my_view(request):
    records = MyModel.objects.accessible_by_user(request.user).all()

That could be a lot of work if you’re integrating it into an existing app, and prohibitive if you’re integrating into a third party app.

With the automatic filtering in django-group-access, it remains:

def my_view(request):
    records = MyModel.objects.all()

That would give you all the records that the currently logged in user has access to. Plus, at no extra charge, it includes the .unrestricted method to give you unfiltered access to a queryset, like this:

def my_view(request):
    # all the visible records for the current user
    records = MyModel.objects.all()
    # all the records that exist
    all_records = records.unrestricted()
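If it helps to see the idea without Django in the way, here is a toy model of the behaviour in plain Python: a collection that hides records not shared with the user's groups by default, plus an unrestricted() escape hatch. This is a conceptual sketch, not django-group-access's implementation. The class names are made up, and the user's groups are passed in explicitly here, whereas in the real app the middleware takes care of knowing who the current user is.

```python
class Record:
    def __init__(self, name, groups):
        self.name = name
        self.groups = set(groups)  # groups this record is shared with


class ToyQuerySet:
    """Toy stand-in for a queryset that filters by group unless told not to."""

    def __init__(self, records, user_groups, restricted=True):
        self._records = list(records)
        self._user_groups = set(user_groups)
        self._restricted = restricted

    def all(self):
        # Same records, same restriction setting.
        return ToyQuerySet(self._records, self._user_groups, self._restricted)

    def unrestricted(self):
        # Same records, with group filtering switched off.
        return ToyQuerySet(self._records, self._user_groups, restricted=False)

    def __iter__(self):
        for record in self._records:
            if not self._restricted or record.groups & self._user_groups:
                yield record


data = [Record("build notes", {"testers"}), Record("payroll", {"finance"})]
records = ToyQuerySet(data, user_groups={"testers"}).all()
visible_names = [r.name for r in records]
all_names = [r.name for r in records.unrestricted()]
print(visible_names, all_names)
```

The view code never mentions groups at all, which is the point: the filtering lives in one place and the views stay as they were.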

3) It allows you to create a hierarchy of models that control access.

Say you have a Room model and a House model. If you have access to a Room, you want access to the House too.

With django-group-access, that’s easy.

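from django.db import models
from django_group_access import register

class House(models.Model):
    address = models.TextField()

class Room(models.Model):
    description = models.TextField()
    house = models.ForeignKey(House)

# access to a House is worked out from access to its rooms
register(House, control_relation='room')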
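The rule being declared there can be sketched in plain Python: a house is visible if at least one of its rooms is shared with one of the user's groups. Again, this is only an illustration of the rule, not what django-group-access does internally, and the groups field and the visible_houses helper are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class House:
    address: str


@dataclass
class Room:
    description: str
    house: House
    groups: set = field(default_factory=set)  # groups this room is shared with


def visible_houses(rooms, user_groups):
    """Return the houses that have at least one room the user can access."""
    user_groups = set(user_groups)
    houses = []
    for room in rooms:
        if room.groups & user_groups and room.house not in houses:
            houses.append(room.house)
    return houses


main_street = House("1 Main Street")
side_street = House("2 Side Street")
rooms = [
    Room("kitchen", main_street, {"cooks"}),
    Room("pantry", main_street, {"cooks"}),
    Room("attic", side_street, {"staff"}),
]
print(visible_houses(rooms, {"cooks"}))
```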

Now, because django-group-access allows us to declare access rules instead of coding them into the models, and because it does automatic filtering, changing how the visibility of records is worked out becomes as easy as changing how your models are registered, and migrating your schema. (And, of course, changing the test data that your extensive unit and integration tests set up. That’ll teach you to use test data factories…)

Development on django-group-access is ongoing, with plans for django-access-tng, which will include row-level permissions and finer-grained access control that doesn’t need to share records with groups to grant access.

But pretend you didn’t read that last paragraph. Our current product is the best. The new one might never come. Buy now to avoid disappointment.
