Canonical Voices

What Robin's blog talks about

(Also published on Canonical's design team blog)

The weekend before last, I went to PyCon UK 2015.

I already wrote about the keynotes, which were more abstract. Here I'm going to talk about the other talks I saw, which were generally more technical or at least had more to do with Python.


The talks I saw covered a whole range of topics - from testing through documentation and ways to achieve simplicity to leadership. Below are some key take-aways.

The talks

Following are slightly more in-depth summaries of the talks I thought were interesting.


15:30: Leadership of Technical Teams - Owen Campbell

There were two key points I took away from this talk. The first was Owen's suggestion that leaders should take every opportunity to practice leading. Find opportunities in your personal life to lead teams of all sorts.

The second point was more complex. He suggested that all leaders exist on two spectra:

  • Amount of control: hands-off to dictatorial
  • Knowledge of the field: novice to expert

The less you know about a field the more hands-off you should be. And conversely, if you're the only one who knows what you're talking about, you should probably be more of a dictator.

He cautioned, though, that people tend to mis-estimate their ability, particularly when it comes to process (e.g. agile), where people think they know more than they do. No-one is really an expert on process.

He suggested that leading technical teams is particularly challenging because you slide up and down the knowledge scale on a minute-to-minute basis sometimes, so you have to learn to be authoritative one moment and then permissive the next, as appropriate.

17:00: Document all the things - Kristian Glass

Kristian spoke about the importance, and difficulty, of good documentation. Here are some particular points he made:

  • Document why a step is necessary, as well as what it is
  • Remember that error messages are documentation
  • Try pair documentation - novice sitting with expert
  • Checklists are great
  • Stop answering questions face-to-face. Always write it down instead.
  • Github pages are better than wikis (PRs, better tracking)

One of Kristian's main points was that writing documentation goes against the grain, because the person with the knowledge can't see why it's important, and the novice can't write the documentation.

He suggested pair documentation as a solution, which sounds like a good idea, but I also wondered whether a StackOverflow model might work: users submit questions, and the team treats them like bugs, staying on top of answering them. This answer base would then become the documentation.


11:00: Asking About Gender - the Whats, Whys and Hows - Claire Gowler

Claire spoke about how so many online forms expect people to be either simply "male" or "female", when the truth can be much more complicated.

My main takeaway from this was the basic point that forms very often ask for much more information than they need, and make too many assumptions about their users. When it comes to asking someone's name, try radically reducing the complexity by just having one text field called "name". Or better yet, don't even ask their name if you don't need it.
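
As a sketch of that advice, a radically reduced form might look like this (the markup is illustrative, not from the talk):

```html
<!-- One free-text field, making no assumptions about name structure -->
<form>
  <label for="name">Name</label>
  <input id="name" name="name" type="text" autocomplete="name">
</form>
```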

I think this feeds into the whole field of simplicity very nicely. A great many apps try to do much more than they need to, and ask for much more information than they need. Thinking about how little you know about your user can help you realise what you actually don't need to know about your user.

11:30: Finding more bugs with less work - David R. MacIver

David MacIver is the author of the Hypothesis testing library.

Hypothesis is a Python library for creating unit tests which are simpler to write and more powerful when run, finding edge cases in your code you wouldn’t have thought to look for. It is stable, powerful and easy to add to any existing test suite.

When we write tests by hand, we choose the input cases ourselves, and we often end up being really kind to our tests. E.g.:

def test_average():
  assert my_average([2, 4]) == 3

What Hypothesis does is help us test with a much wider and more challenging range of values. E.g.:

from functools import reduce

from hypothesis import given
from hypothesis.strategies import lists, floats

# allow_nan=False because NaN never compares equal to itself;
# min_size=1 because the empty list would divide by zero
@given(lists(floats(allow_nan=False), min_size=1))
def test_average(float_list):
  ave = reduce(lambda x, y: x + y, float_list) / len(float_list)
  assert my_average(float_list) == ave

There are many cases where Hypothesis won't be much use, but it's certainly good to have in your toolkit.


10:00: Simplicity Is A Feature - Cory Benfield

Cory presented simplicity as the opposite of complexity - that is, the fewer options something gives you, the simpler and more straightforward it is.

"Simplicity is about defaults"

To present as simple an interface as possible, the important thing is to have as many sensible defaults as possible, so the user has to make hardly any choices.

Cory was heavily involved in the Python Requests library, and presented it as an example of how to achieve apparent simplicity in a complex tool.

"Simple things should be simple, complex things should be possible"

He suggested thinking of an "onion model", where your application has layers, so everything is customisable at one of the layers, but the outermost layer is as simple as possible. He suggested that 3 layers is a good number:

  • Layer 1: Low-level - everything is customisable, even things that are just for weird edge-cases.
  • Layer 2: Features - a nicer, but still customisable interface for all the core features.
  • Layer 3: Simplicity - hardly any mandatory options, sensible defaults
    • People should always find this first
    • Support 80% of users 80% of the time
    • In the face of ambiguity do the right thing

He also mentioned that he likes README-driven development, which seems like an interesting approach.

11:00: How (not) to argue - a recipe for more productive tech conversations - Harry Percival

I think this one could be particularly useful for me.

Harry spoke about how many people (including him) have a very strong need to be right. Especially men. Especially those who went to boarding school. And software development tends to be full of these people.

Collaboration is particularly important in open source, and strongly disagreeing with people rarely leads to consensus; in fact, it's more likely to achieve the opposite. So it's important that we learn how to get along.

He suggests various strategies to try out, for getting along with people better:

  • Try simply giving in, do it someone else's way once in a while (hard to do graciously)
  • Socratic dialogue: Ask someone to explain their solution to you in simple terms
  • Dogfooding - try out your idea before arguing for its strength
  • Bide your time: Wait for the moment to see how it goes
  • Expose yourself to other cultures, where arguments are less acceptable

All of this comes down to stepping back, waiting and exercising humility. All of which are easier said than done, but all of which are very valuable if I could only manage it.

11:30: FIDO - The dog ate my password - Alex Willmer

After covering fairly common ground of how and why passwords suck, Alex introduced the FIDO alliance.

The FIDO alliance's goal is to standardise authentication methods and hopefully replace passwords. They have created two standards for device-based authentication to try to replace passwords:

  • UAF: First-factor passwordless biometric authentication
  • U2F: Second-factor device authentication

Browsers are just starting to support U2F, whereas support for UAF is farther off. Keep an eye out.

14:30: Data Visualisation with Python and Javascript - crafting a data-visualisation for the web - Kyran Dale

Kyran demoed using Scrapy and Pandas to retrieve the Nobel laureate data from Wikipedia, using Flask to serve it as a RESTful API, and then using D3 to create an interactive browser-based visualisation.

Read more

(Also posted on

I routinely have at least 20 tabs open in Chrome, 10 files open in Atom (my editor of choice) and I'm often running virtual machines as well. This means my poor little X1 Carbon often runs out of memory, at which point Ubuntu completely freezes up, preventing me from doing anything at all.

Just a few days ago I had written a long post which I lost completely when my system froze, because Atom doesn't yet recover documents after crashes.

If this sounds at all familiar to you, I now have a solution! (Although it didn't save me in this case because it needs to be enabled first - see below.)


The magic SysRq key can run a bunch of kernel-level commands. One of these commands is called oom_kill. OOM stands for "Out of memory", so oom_kill will kill the process taking up the most memory, to free some up. In most cases this should unfreeze Ubuntu.

You can run oom_kill from the keyboard with the following shortcut:

# Kill the process taking up the most memory
alt + SysRq + f

Except that this is disabled by default on Ubuntu:

Enabling SysRq functions

For security reasons, SysRq keyboard functions are disabled by default. To enable them, change the value in the file /etc/sysctl.d/10-magic-sysrq.conf to 1:

# /etc/sysctl.d/10-magic-sysrq.conf
kernel.sysrq = 1

And to enable the new config run:

sudo sysctl --system
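
Alternatively, if you'd rather not reload every sysctl setting, the value can (as far as I know) be written directly to /proc, which enables SysRq for the current boot only:

```shell
# Enable all SysRq functions until the next reboot
echo 1 | sudo tee /proc/sys/kernel/sysrq
```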

SysRq shortcut for the Thinkpad X1

Most laptops don't have a physical SysRq key. Instead they offer a keyboard combination to emulate the key. On my Thinkpad, this is fn + s. However, there's a quirk that the SysRq key is only "pressed" when you release.

So to run oom_kill on a Thinkpad, after enabling it, do the following:

  • Press and hold alt
  • To emulate SysRq, press fn and s keys together, then release them (keep holding alt)
  • Press f

This will kill the most expensive process (usually a browser tab, in my case), and free up some memory.

Now, if your computer ever freezes up, you can just do this, and hopefully fix it.

Read more

(Also published on Canonical's design team blog)

Last weekend I went to my first Pycon, my second conference in a fortnight.

The conference runs from Friday to Monday, with 3 days of talks followed by one day of "sprints", which is basically a hack day.

PyCon has a code of conduct to discourage any form of othering:

Happily, PyCon UK is a diverse community who maintain a reputation as a friendly, welcoming and dynamic group.

We trust that attendees will treat each other in a way that reflects the widely held view that diversity and friendliness are strengths of our community to be celebrated and fostered.

And for me, the conference lived up to this, with a very friendly feel and a lot of diversity among its attendees. The friendly and informal atmosphere was impressive for such a large event, with more than 450 people.

Unfortunately, the Monday sprint day was cut short by the discovery of an unexploded bomb.

Many keynotes, without much Python

There were a lot of "keynote" talks, with 2 on Friday, and one each on Saturday and Sunday. And interestingly none of them were really about Python, instead covering future technology, space travel and the psychology of power and impostor syndrome.

But of course there were plenty of Python talks throughout the rest of the day - you can read about them on my other post. And I think it was a good decision to have more abstract keynotes. It shows that the Python community really is more of a general community than just a special interest group.

Van Lindberg on data economics, Marx and the Internet of Things

In the opening keynote on Friday morning, the PSF chairman showed that total computing power is almost doubling every year, and that by 2020, the total processing power in portable devices will exceed that in PCs and servers.

He then used the fact that data can't travel faster than 11.8 inches per nanosecond to argue that we will see a fundamental shift in the economics of data processing.

The big-data models of today's tech giants will be challenged as it starts to be quicker and make more economic sense to process data at source, rather than transfer it to distant servers to be processed. Centralised servers will be relegated to mere aggregators of pre-processed data.

He likened this to Marx's call to seize the means of production - a movement which will empower users, as our portable Things start to hold the real information and choose who to share it with.

I really hope he's right, and that the centralised data companies are doomed to fail, to be replaced by the Internet of Autonomous Things, because the world of centralised data is not an equal world.

Does Python have a future on small processors? Isn't it too inefficient?

In a world where all the interesting software is running on light-weight portable devices, processing efficiency becomes important once again. Van used this to argue that efforts to run Python effectively on low-powered devices, like MicroPython, will be essential for Python as a language to survive.

Daniele Procida: All I really want is power

The second keynote came just after lunch on Friday. Daniele Procida, organiser of DjangoCon Europe, openly admitted that what he really wanted out of life was power. He put forward the somewhat controversial idea that power and usefulness are the same thing, and that ideas without power are useless.

He made the very good point that power only comes to those who ask for it, or fight for it. And that if we want power not to be abused, we really need to talk about it a whole lot more, even though it makes people uncomfortable (try asking someone their salary). We should acknowledge who has the power, and what power we have, and watch where the power goes.

He suggested that, while in politics or industry power is very much a rival good, in open source it is entirely non-rival. The way you grab power in the open source community is by doing good for the community, by helping out. And so by wielding power you are actually increasing power for those around you.

I don't agree with him on this final point. I think power can be and is hoarded and abused in the open source community as well. A lot of people use their power in the community to edge out others, or make others feel small, or to soak up influence through talks and presentations and then exert their will over the will of others. I am certainly somewhat guilty of this. Which is why we should definitely watch the power, especially our own power, to see what effect it's having.

The takeaway maxim from this for me is that we should always make every effort to share power, as opposed to jealously guarding it. It's not that sharing power in the open source community is inevitable or necessarily comes naturally, but at least in the open source community sharing power genuinely can help you gain respect, where I fear the same isn't so true of politics or industry.

Dr Simon Sheridan: Landing on a comet: From planning to reality

Simon Sheridan was an incredibly humble and unassuming man, given his towering achievements. He is a world-class space scientist who was part of the European Space Agency team that helped to land Rosetta's Philae probe on comet 67P.

Most of what he mentioned was basically covered in the news, but it was wonderful to hear it from his perspective.

Naomi Ceder: Confessions of a True Impostor

When, a short way into her Sunday morning keynote, Naomi Ceder asked the room:

How many of you would say that you have in some way or another suffered from imposter syndrome along with me?

Almost everybody put their hands up. This is why I think this was such an important talk.

She didn't talk about this per se, but contributing to the open source community is hard. No-one talks about it much, but I certainly feel there's a lot of pressure. Because of its very nature, your contributions will be open, to be seen by anyone, to be criticised by anyone. And let's face it, your contributions are never going to be perfect. And the rules of the game aren't written down anywhere, so the chance of being ridiculed seem pretty high. Open source may be a benevolent idea, but it's damned scary to take part in.

I believe this is why less than 2% of open source contributors are female, compared with more like 25-30% women in software development in general. And, as with impostor syndrome, the same trend is true of other marginalised groups. It's not surprising to me that people who are used to being criticised and discriminated against wouldn't subject themselves to that willingly.

And, as Naomi's question showed, it is not just marginalised people who feel this pressure, it's all of us. And it's a problem. As we know, confidence is no indicator of actual ability, meaning that many many talented people may be too scared to contribute to open source.

As Naomi pointed out, impostor syndrome is a socially created condition - when people are expected to do badly, they do badly. In fact I completely agree with her suggestion that the existing Wikipedia definition of impostor syndrome (at the time of writing) could be more sensitively phrased to define it as a "social condition" rather than a "psychological phenomenon", as well as avoiding singling out women.

While Naomi chose to focus her talk on how we personally can try to mitigate feelings of being an impostor, I think the really important message here is one for the community. It's not our fault that open source is scary, that's just the nature of openness. But we have to make it more welcoming. The success of the open source movement really does depend on it being diverse and accepting.

What I think is really interesting is that stereotype threat can be mitigated by reminding people of their values, of what's important to them. And this is what I hope will save open source. The more we express our principles and passion for open source, the more we express our values, the easier it is to counter negative feelings, to be welcoming, to stop feeling like impostors.

A great conference

Overall, the conference was exhausting, but I'm very grateful that I got to attend. It was inspiring and informative, and a great example of how to maintain a great community.

If you want you can now go and read about the other talks.

Read more

I was just having a discussion with my friends about If I Were You adding an extra pre-roll advertisement to their latest podcast, and it inspired me to write about my moral opinion of advertising in general.

Selling consumers

By choosing to add an advertisement to a magazine article, TV show or podcast, the content creator is choosing to sell a portion of their audience's attention. The audience has devoted their time to watch the actual content, but they are instead subjected to watching an advertisement for a random product.

Now you could argue that everyone who watches any media with ads knows that this is the deal. They are choosing to watch the show, knowing it is ad-supported, so they should be allowed to make that choice. Where's the harm?

My problem with it is the insidious effect that it has on that audience, and society at large. The advertising space is up for sale, often simply to the highest bidder. That means that whoever is willing to pay the most gets to subtly manipulate that audience. Are all those audience members aware that that's what they're signing up for? And even if they are, what about the wider effect on society?

Advertising contributes hugely to obesity - the most serious health problem facing western nations - as well as to eating disorders and other psychological problems.

Societal capture

While it is true (and a great thing) that we are all becoming wiser to the tricks of advertisers, adverts still carry a huge amount of power. We all know that campaign finance for US election races basically decides the outcome. If you can spend billions on your campaign adverts, you will almost certainly win.

While possibly not quite as harmful as campaign adverts, I believe the same theory applies to advertising at large. The biggest companies can afford to buy more of these random advertising slots than anyone else, and it has a huge effect on society. Is there anyone who hasn't heard of Coke or McDonalds? How many women don't feel a constant pressure to look slim and beautiful? And this advertising also helps the massive corporations keep their monopolies.

Society is genuinely shaped by the media, and the media is made up of a huge amount of advertisements. This means that the corporations with the most money get to shape society in a way that suits them. And that model for society is always based on bigger profits for those companies, not the interests of society.

If there were fewer media spots up for sale, I believe the whole of society would benefit immeasurably.

Advertising is a major culprit in runaway climate change

The biggest and most obvious problem is that advertising, beyond a shadow of a doubt, fuels consumerism and therefore over-consumption. And this consumerism is terribly bad for the climate - the number one danger facing humanity. We are at a point where developed nations are producing emissions at a catastrophic rate. And there's no one culprit - our societies are simply structured to be wasteful. We consume more food than we need, and buy a lot more than we consume. We all fly all over the planet all the time. We buy new clothes, and throw out old ones, far more often than we need to.

And all of this is because big corporations, who are solely interested in us continuing to consume in ever greater quantities, get to constantly manipulate everyone in society with their money through paid advertisements.

Financing without ads

The problem is, so many free services that we currently enjoy would simply not exist without ads. Most of the digital services we rely on are entirely ad-sponsored (Facebook, Google and Bing's myriad services, Twitter, Youtube). To be fair, Google have worked to make ads a bit less intrusive, and I do think that's a good thing, but it's not like the corporate influence on society seems to have reduced at all since 1998.

If advertising were somehow less profitable, or just too morally odious to justify, then these digital services would have to be based on considerably different profit models, and they may well not exist at all. The obvious alternate model is to simply charge directly for these services, but only a tiny fraction of the people who use these services today would have signed up to pay even a small amount for them. I can't pretend this isn't a difficult problem.

I would genuinely like to see more companies try different profit models. For example, Github provide a full free service for open-source work, but charge for privacy, Humble Bundle let you "pay what you like" for content, and Wikipedia are financed purely through donations.

I also believe that if more companies were more honest and open with their finances, the fans would be more happy to help out by paying donations or subscriptions.

Ethical advertising

Okay, let's be honest, advertising isn't going anywhere. But I still hope that we can try to limit the damage by requiring content creators to be more ethical with their advertising.

I think any advert on any website, TV show, magazine article or whatever should be considered an endorsement. Any criticisms leveled against the advert or the company that made the advert should also be applied to the organisation that chose to give the advertisement air-time. This does happen to some extent (e.g. the This World advert in the Guardian), but I think it should happen more. This would hopefully force organisations to take more ethical responsibility over who they sell advertising space to, which would do a world of good.

It would also be nice if content-creators were choosing adverts, rather than the media company that distributes the content - e.g. adverts in the breaks in the middle of TV shows should be chosen by the TV show authors. This would mean that the fans of the show would at least be watching adverts that the creator chose.

Installing Ad-Block

Some think it's unethical to install Ad-Block, as then you are potentially depriving the good content-creators of their revenue.

Given my ethical position on ads, I disagree with this. I think that one of the ways people can help to shape society for the better is to deliberately (and hopefully, vocally) reject things they find obnoxious. Therefore, the very existence of Ad-Block, and the number of people who have installed it, are a statement in opposition to ad-based financing models. And I hope that it might have some small effect in discouraging organisations from choosing to go that way.

Read more

Following are my long-form notes for a short presentation I gave to the team here at Canonical.

We are all aware that the Internet is truly today's information superhighway.

So much of the world's information today is written in HTML that it's almost synonymous with "information".

HTML is the basic component of the Internet. We all use the Internet. If you take away CSS and JavaScript, you're left with just a whole bunch of HTML.

Understanding the interplay between markup and the Internet is important for anyone who writes content for the Internet.

Simplicity and accessibility


We write JavaScript, CSS and back-end code for simplicity and clarity so that other developers - and probably only the developers in our team - can easily read and work on the code.

HTML is always the most public and central part of all our information, so it is the most important thing to make as simple and intuitive as possible. Our HTML might be downloaded, viewed or hacked around with by anyone. They don't need to be a developer by trade. Anyone who knows how to "view source" can read our markup. Anyone who knows how to click "save web page" can hack around with it.

Good writing

I'd like to suggest that anyone who writes professionally, in today's world, should have some understanding of how markup works.

People in more and more areas have to write markup sometimes. Anyone who writes blogs in Wordpress has probably had to edit the raw markup at some point. But also, anyone who writes in any medium that might be converted into markup at any point in the future should be aware of some of the ways it works.

I would therefore posit that using the correct tag to mark up your information is as important as choosing how to lay out your Word document (headings, bullet-points etc.).

If you're ever writing markup, go and familiarise yourself with the new elements in HTML5. And if you have something new to mark up (e.g. a pull-quote, a code-block or a graph), give it a Google and see what best practice is.


A tempting attitude to take to writing markup is to focus on the average user, or maybe at least users within the inter-quartile range. If you look at Google Analytics, you will see that almost all visits to our sites are from people with modern, HTML5- and ECMAScript 5-capable browsers. As long as things look good on that setup, it's not so important to cover the edge-cases.

I would say that there are likely many flaws in this analysis. One is that maybe, instead of hurting 1% of people by not worrying about the edge-cases, we're hurting 50% of the people 2% of the time - which, in terms of public opinion, is worse.

For example, if I try to load a website on the train (which I do more often than most, but many people do occasionally), there is a high likelihood that my connection will drop half-way through and I'll get a partially loaded page. At this point, since I will have downloaded the markup first, it is paramount that the markup looks sensible and contains all the relevant information.

Fortunately, there's a simple formula - if you understand the basic components for the web and write in them as simply and straightforwardly as you can, first, then most things will just work.

One of the beautiful things about the web is it's actually impossible to predict exactly how people are going to want to use it. But simplicity and directness are your friends.


The Internet is a collection of links. The real genius of HTML is its extremely light referencing system.

Referencing has been a core component of scientific work forever, but HTML and the Internet bring that scientific process into the commons.

Not only that, but the whole structure of the Internet depends on references. Good linking makes documents more understandable - it's easy to follow a link to find out more about a base concept you don't properly understand.

People follow links to discover new content, but more importantly, search engines use these links to find new content and to categorise it for searching. The quantity, specificity and wording of your links contribute to the strength of the Internet.

This is where an understanding of linking matters not just to people who write in HTML, but to anyone who writes content for the Internet.

When you're writing, especially if you're explaining a concept, if ever you use a term which you think could be described in more depth, find a link for it. People will thank you.

Rather than just adding the full link into the page's text (e.g. "see:"), or writing "click here", add the link to a relevant part of your sentence. This is important because search engines will use your link text to help describe what that link is about.

It's also helpful if your link text is not exactly the same as simply the title of the post you're linking to. This is because it's helpful for that page to be described in many different ways, organically, by people linking to it.
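
To make that concrete, here's a small hypothetical example (the URL and wording are invented):

```html
<!-- Weak: the link text says nothing about the target -->
<p>To learn about flexible layouts, <a href="/guides/flexbox">click here</a>.</p>

<!-- Better: the link text describes the target in your own words -->
<p>Layouts like this are much easier with
<a href="/guides/flexbox">CSS flexible box layout</a>.</p>
```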

IDs and anchors

Your readers will thank you for specific linking. If the topic you're trying to cover with your link is under a sub-heading half way down the document, see if you can find an anchor which will take them straight there.

On the development side, I believe that responsible HTML will contain IDs for this reason. Each heading, sub-heading or useful document section should ideally have an ID set on it, so people can link directly to that section if they need to.
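
A hypothetical sketch of what that looks like in the markup (IDs and URLs invented for illustration):

```html
<!-- In the target document: give each section heading an ID -->
<h2 id="installation">Installation</h2>

<!-- In a linking document: point straight at that section -->
<p>See the <a href="/docs/setup#installation">installation section</a>
for details.</p>
```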

Thank you

You're not going to do most of what I've said above, most of the time. But I think just keeping it in mind will make a difference. Learning how to write responsibly for the web is a creative and infinite journey. But every time you publish anything, and even better if you make an extra link or find a new more specific markup tag, you're strengthening the Internet. Thank you.

Read more

(Also posted on the Canonical blog)

On 10th September 2014, Canonical are joining in with Internet Slowdown day to support the fight for net neutrality.

Along with Reddit, Tumblr, Boing Boing, Kickstarter and many more sites, we will be sporting banners on our main sites.

Net neutrality

From Wikipedia:

Net neutrality is the principle that Internet service providers and governments should treat all data on the Internet equally, not discriminating or charging differentially by user, content, site, platform, application, type of attached equipment, and modes of communication.

Internet Slowdown day

#InternetSlowdown day is in protest against the FCC's plans to allow ISPs in America to offer "paid prioritization" of their traffic to certain companies.

If large companies were allowed to pay ISPs to prioritise their traffic, it would be much harder for competing companies to enter the market, effectively giving large corporations a greater monopoly.

I believe that internet service providers should conform to common carrier laws where the carrier is required to provide service to the general public without discrimination.

If you too support net neutrality, please consider signing the Battle for the net petition.

Read more

(This article was originally posted on

On release day we can get up to 8,000 requests a second from people trying to download the new release. In fact, last October (13.10) was the first release day in a long time that the site didn't crash under the load at some point during the day (huge credit to the infrastructure team). The site has been running on Drupal, but we've been gradually migrating it to a more bespoke Django-based system. In March we started work on migrating the download section in time for the release of Trusty Tahr. This was a prime opportunity to look for ways to reduce some of the load on the servers.

Choosing geolocated download mirrors is hard work for an application

When someone downloads Ubuntu (from a thank-you page), they are actually sent to one of the 300 or so mirror sites near them.

To pick a mirror for the user, the application has to:

  1. Decide from the client's IP address what country they're in
  2. Get the list of mirrors and find the ones that are in their country
  3. Randomly pick them a mirror, while sending more people to mirrors with higher bandwidth

This process is by far the most intensive operation on the whole site - not because these tasks are particularly complicated in themselves, but because they need to be done for each and every user, potentially 8,000 a second. Every other page on the site can be aggressively cached to prevent most requests from hitting the application itself.
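For illustration, the weighted random pick in step 3 can be sketched in a few lines of Python (the mirror URLs and bandwidth figures here are invented, not the real ubuntu.com data):

```python
import random

# Hypothetical mirror list for one country: (mirror URL, bandwidth in Mbit/s)
mirrors = [
    ("http://mirror-a.example.com/ubuntu/", 1000),
    ("http://mirror-b.example.com/ubuntu/", 100),
    ("http://mirror-c.example.com/ubuntu/", 10),
]

def pick_mirror(mirrors):
    """Randomly pick a mirror, weighted so that higher-bandwidth
    mirrors receive proportionally more downloads."""
    urls = [url for url, _ in mirrors]
    bandwidths = [bandwidth for _, bandwidth in mirrors]
    return random.choices(urls, weights=bandwidths, k=1)[0]
```

Individually this is cheap; it's only doing it thousands of times a second, after a geolocation lookup, that hurts.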

For the site to be able to handle this load, we'd need to load-balance requests across perhaps 40 VMs.

Can everything be done client-side?

Our first thought was to embed the entire mirror list in the thank-you page and use JavaScript in the users' browsers to select an appropriate mirror. This would drastically reduce the load on the application, because the download page would then be effectively static and cache-able like every other page.

The only way to reliably get the user's location client-side is with the geolocation API, which is only supported by 85% of users' browsers. Another slight issue is that the user has to give permission before they can be assigned a mirror, which would slightly hinder their experience.

This solution would inconvenience users just a bit too much. So we found a trade-off:

A mixed solution - Apache geolocation

mod_geoip2 for Apache can apply server rules based on a user's location and is much faster than doing geolocation at the application level. This means that we can use Apache to send users to a country-specific version of the download page (e.g. the British desktop thank-you page) by adding &country=GB to the end of the URL.

These country specific pages contain the list of mirrors for that country, and each one can now be cached, vastly reducing the load on the server. Client-side JavaScript randomly selects a mirror for the user, weighted by the bandwidth of each mirror, and kicks off their download, without the need for client-side geolocation support.

This solution was successfully implemented shortly before the release of Trusty Tahr.

Read more

Docker is a fantastic tool for running virtual images and managing lightweight Linux containers extremely quickly.

One thing this has been very useful for in my job at Canonical is quickly running older versions of Ubuntu - for example to test how to install specific packages on Precise when I'm running Trusty.

Installing Docker

The simplest way to install Docker on Ubuntu is using the automatic script:

curl -sSL | sudo sh

You may then want to authorise your user to run Docker directly (as opposed to using sudo) by adding yourself to the docker group:

sudo gpasswd -a [YOUR-USERNAME] docker

You need to log out and back in again before this will take effect.

Spinning up an old version of Ubuntu

With Docker installed, you should be able to run it as follows. The example below is for Ubuntu Precise, but you can replace "precise" with any available Ubuntu version:

mkdir share  # Shared folder with docker image - optional
docker run -v `pwd`/share:/share -i -t ubuntu:precise /bin/bash  # Run ubuntu, with a shared folder
root@cba49fae35ce:/#  # We're in!

The -v `pwd`/share:/share part mounts the local ./share/ folder at /share/ within the Docker instance, for easily sharing files with the host OS. Setting this up is optional, but might well be useful.

There are some important things to note:

  • This is a very stripped-down operating system. You are logged in as the root user, your home directory is the filesystem root (/), and very few packages are installed. Almost always, the first thing you'll want to run is apt-get update. You'll then almost certainly need to install a few packages before this instance will be of any use.
  • Every time you run the above command it will spin up a new instance of the Ubuntu image from scratch. If you log out, retrieving your current instance in that same state is complicated, so don't log out until you're done - or learn about managing Docker containers.
  • In some cases, Docker will be unable to resolve DNS correctly, meaning that apt-get update will fail. In this case, follow the guide to fix DNS.

Read more

Fix Docker's DNS

Docker is really useful for a great many things - including, but not limited to, quickly testing older versions of Ubuntu. If you've not used it before, why not try out the online demo?

Networking issues

Sometimes Docker is unable to use the host OS's DNS resolver, resulting in a DNS resolution error within your Docker container:

$ sudo docker run -i -t ubuntu /bin/bash  # Start a docker container
root@0cca56c41dfe:/# apt-get update  # Try to Update apt from within the container
Err precise Release.gpg
Temporary failure resolving ''  # DNS resolve failure
W: Some index files failed to download. They have been ignored, or old ones used instead.

How to fix it

We can fix this by explicitly telling Docker to use Google's DNS public server (

However, within some networks (for example, Canonical's London office) all public DNS will be blocked, so we should find and explicitly add the network's DNS server as a backup as well:

Get the address of your current DNS server

From the host OS, check the address of the DNS server you're using locally with nm-tool, e.g.:

$ nm-tool
  IPv4 Settings:
    Prefix:          21 (

    DNS:     # This is my DNS server address

Add your DNS server as a 2nd DNS server for Docker

Now open up the docker config file at /etc/default/docker, and update or replace the DOCKER_OPTS setting to add Google's DNS server first, but yours as a backup: --dns --dns=[YOUR-DNS-SERVER]. E.g.:

# /etc/default/docker
# ...
# Use DOCKER_OPTS to modify the daemon startup options.
DOCKER_OPTS="--dns --dns"
# Google's DNS first ^, and ours ^ second

Restart Docker

sudo service docker restart


Hopefully, all should now be well:

$ sudo docker run -i -t ubuntu /bin/bash  # Start a docker container
root@0cca56c41dfe:/# apt-get update  # Try to Update apt from within the container
Get:1 precise Release.gpg [198 B]  # DNS resolves properly

Read more

If you glance up to the address bar, you will see that this post is being served securely. I've done this because I believe strongly in the importance of internet privacy, and I support the Reset The Net campaign to encrypt the web.

I've done this completely for free. Here's how:

Get a free certificate

StartSSL isn't the nicest website in the world to use. However, they will give you a free certificate without too much hassle. Click "Sign up" and follow the instructions.

Get an OpenShift Bronze account

Sign up to a RedHat OpenShift Bronze account. Although this account is free to use, as long as you only use 1-3 gears, it does require you to provide card details.

Once you have an account, create a new application. On the application screen, open the list of domain aliases by clicking on the aliases link (might say "change"):

Application page - click on aliases

Edit your selected domain name and upload the certificate, chain file and private key. NB: Make sure you upload the chain file. If the chain file isn't uploaded initially it may not register later on.

Pushing your site

Now you can push any website to the created application and it should be securely hosted.

Given that you only get 1-3 gears for free, a static site is much more likely to handle high load. For instance, this site gets about 250 visitors a day and runs perfectly fine on the free resources from OpenShift.

Read more

Following are some guidelines about Agile philosophy that I wrote for my team back in September 2012.

I also wrote a popular StackExchange answer about Agile project planning which you might find useful if you're thinking about implementing Agile.

Agile software development is a philosophy for managing software projects and teams. It has similarities to lean manufacturing principles for "eliminating waste".

The philosophy centers around the agile manifesto:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

Of the various software development methodologies out there, Scrum and Extreme programming particularly try to follow agile software development principles.

Lean software development is also rapidly gaining support within the agile community.

Agile practices and principles

Without choosing to follow any one defined methodology for project management, here are some common practices that could be adopted by an agile team:

Read more

I wrote this set of programming principles for my team to follow back in 2012. I'm sure there are many like it, but this one is mine. May you find it useful.

Writing code

Try to write expressive code.

Beware code bloat - adhere to the YAGNI principle

Practicing Behaviour-Driven Development can help with both of these aims.

Do less: before writing a new piece of functionality, go and look for similar solutions that already exist and extend them.

Code architecture

Namespace your classes; code to an interface (an application of the Design by Contract principle); and make your interfaces (both programming interfaces and user interfaces) as simple as possible.
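As a sketch of what "coding to an interface" looks like in practice (Python, with invented names), calling code depends only on the abstract contract, so concrete implementations can be swapped freely:

```python
from abc import ABC, abstractmethod

class UserStore(ABC):
    """The contract that calling code depends on."""

    @abstractmethod
    def save(self, user):
        """Persist a user record."""

class MemoryUserStore(UserStore):
    """One interchangeable implementation of the contract."""

    def __init__(self):
        self.users = []

    def save(self, user):
        self.users.append(user)

def register(user, store):
    # Coded against the UserStore interface - any implementation will do
    store.save(user)
```

Swapping MemoryUserStore for, say, a database-backed store requires no change to register().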

Try to learn and comply with all 5 principles of SOLID (watch this great video).

Learn as many Design Patterns as you can to inform your coding, but beware of implementing them blindly. Developers can be over-zealous in their use of Design Patterns and may end up over-engineering a solution.

Some useful design patterns:


Try to learn an IDE with advanced features. These can really save you a lot of time:

  • Syntax highlighting
  • Auto-complete for function, class and method names
  • Auto-formatting
  • Code navigation help - e.g. jump to class declaration
  • Collapsing of code blocks
  • Overviews of code, e.g. a list of all methods within a class
  • Debugging tools like break points

Some suggestions:

Read more

Luminous beings are we

A diary entry from October 15th 2013

Today I very much wanted to work on my voice. Work out how to get my message across - feel like I was saying something genuine, something of significance.

I like the idea of sketches. Particularly sketches about systems and networks. How everyone is connected, and human society grows like an organism, each little autonomous cell influencing each other one. We are like a neural network.

And I wanted to illustrate how these autonomous nodes make up an ebbing and flowing tide, with each individual or group potentially changing the direction of the tide. We are all connected, we all influence each other, we all have power to change the flow of the tide, but we also are swept along by it. I find this vision inspiring but not intimidating. Any one of us can be the instigator of a change of direction, but we are under no pressure to be.

Hmm. Some academics probably study sentient fluids.... Like traffic. That would be an interesting topic.

People grow and develop in this way too. We rush or stagnate through deliberate or accidental events. We are none of us ultimately in control. I believe this absolves any one person of too much responsibility, but at the same time we are all responsible. I wish I could communicate this idea succinctly. I hope a vision like this can lead to people judging each other less. It's hard to explain how.

I think this is like a hacker's vision. There are endless possibilities for this organism. No-one knows where it will go. There is no defined end-goal. We are constantly discovering. Every individual life is a unique exploration. There can be no higher goal than to explore, finding solutions and perspectives that are unique, continuing the exploration.

This is hacking - life is hacking.

But somehow I feel like I'm letting down this purpose. I am not exploring as much as I could be. I'm somewhat stagnating. I'd like to be inspiring people, and communicating my thoughts and ideas honestly. I certainly feel like I have thoughts and ideas, unique perspectives, and my current job and my current lifestyle are not realising one tenth of them. How to solve this?

That'll do for now. Goodnight, diary.

Read more

Writing expressive code

As any coder gains experience, they inevitably learn more and more ways to solve the same problem.

The very first consideration is simplicity. We probably want to use as simple and direct a solution as possible - to avoid over-engineering. But the simplest solution is not necessarily the shortest solution.

After simplicity, the very next consideration should be expressiveness. You should always be thinking about how deeply a new developer is going to have to delve into your code to understand what's going on.

Code is poetry

Writing expressive code may help future coders to understand what's going on. It may even help you in the future. But it may also help you simply to understand the problem. Thinking carefully about how to define and encapsulate the components of your solution will often help you to understand the problem better, leading to a more logical solution.

"Self-documenting code"

"Self-documenting code" is about structuring your code and choosing your method and variable names so that your code will be largely self-describing. This is a great practice, and can make some comments redundant:

$user = new User(); // create a new user object
$user->loadFromSession($session); // update the user from the session
if ($user->isAuthenticated()) { ... } // if the user is authenticated...

However, as a recent discussion with a friend of mine highlighted to me, expressive code is not a replacement for comments - no code is entirely "self-documenting". Always write as expressively as you can, but also always document where it makes sense. Methods, functions and classes should always be summarised with a comment - as mentioned in the Python coding conventions.
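In Python, for instance, those summaries are conventionally docstrings (per PEP 257); here's the earlier example rewritten that way (the session format is invented for illustration):

```python
class User:
    """A site visitor, possibly authenticated."""

    def __init__(self):
        self.authenticated = False

    def load_from_session(self, session):
        """Update this user from the session data.

        The expressive name says *what* happens; the docstring is still
        needed to record details like what counts as a valid session.
        """
        self.authenticated = "user_id" in session

user = User()
user.load_from_session({"user_id": 10})
```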


It's worth thinking carefully about how you name your variables and methods.

Don't abbreviate

var uid = 10; // I am unlikely to know what uid stands for without context
var userIdentifier = 10; // Better

Be specific

Use as concrete and specific nouns as you can to describe methods and functions:

var event; // bad - generic
var newsLinkClickEvent; // good - specific


No-one likes to read a really long procedural program. It's very difficult to follow. It's much easier to read a shorter set of well-encapsulated method calls. If you need to delve deeper, simply look in the relevant method:

// Instead of showing you all the details of how we update the user,
// we encapsulate that in the updateDetails method,
// allowing you to quickly see the top-level processes
function saveUserDetails(userStore, userDetails) {
    var user = new User();
    user.updateDetails(userDetails); // sets a whole bunch of details on the user
    userStore.save(user); // converts user data into the correct format, then saves it in the user store
}

Do you need an else?

The use of many if .. else conditionals makes programs confusing. In many cases, the else part can be encapsulated in a separate method or function call, making the program easier to read:

// With the else
if (user.permissionGroup == 'administrator') {
    user.deleteArticle(article);
} else {
    page.showError("Sorry you don't have permission to delete this article");
}

// Without the else
if (!user.deleteArticle(article)) {
    page.showError("Sorry you don't have permission to delete this article");
}

In cases where a switch is used, or multiple if .. else if statements, you could consider using different types instead:

class User {
    function deleteArticle($article) {
        $success = false;

        if (
            $this->permissionGroup == 'administrator'
            || $this->permissionGroup == 'editor'
        ) {
            $success = $article->delete();
        }

        return $success;
    }
}
You can remove the need for this if, by making special types:

trait ArticleDeletion {
    function deleteArticle($article) {
        return $article->delete();
    }
}

class Editor extends User { use ArticleDeletion; }
class Administrator extends User { use ArticleDeletion; }

Notice that I've deliberately opted not to make Administrator inherit from Editor, but instead compose them separately. This keeps my structure more flat and flexible. This is an example of composition over inheritance.


While encapsulation is often a good thing for making programs easier to understand at a higher level, it's important to preserve the single responsibility principle by not encapsulating separate concerns together.

For example, one could write:

var user = new User();
user.UpdateFromForm(); // Imports user data from the page form

While this is both short and fairly clear, it suffers from two other problems:

  • The user has to delve further into the code to find basic information, like the name of the Database class, or which form the details are stored in
  • If we want to use a different instance of the Database, we have to edit the User class, which doesn't make a whole lot of sense.

In general you should always pass objects around, rather than instantiating them inside each other:

var user = new User();
var userData = Request.Form;
var database = new DatabaseManager();

user.updateDetails(userData);
database.save(user);

This is more lines, but it is nonetheless clearer what is actually happening, and it's more versatile.


Always try to format your code so that it is easily readable. Don't be afraid of white space, and use indentation sensibly to highlight the structure of your code.

Where there is an accepted code style guide, you should try to follow it. For example, PHP has the FIG standards.

However, I don't think it's worthwhile being overly anal about code standards (my thinking has evolved on this somewhat) because you'll never be able to get everybody to code exactly the same way. So if (like me) you're a coder who feels the need to reformat code whenever you see it to make it fit in with anal standards, you could probably do with training yourself out of that habit. As long as you can read it, leave it be.

Delete commented out code

If you're using a version control system (like Git) there really is no need to keep large blocks of commented-out or unused code. You should just delete it, to keep your codebase tidier. If you really need it again, you can just go and find it in the version control history.


There will always be a trade-off between expressiveness and succinctness.

Depth vs. encapsulation

It is desirable to keep as flat a structure as possible in your objects, so that programmers don't have to delve through parent class after parent class to find the relevant bit of code. But it is also important to keep code encapsulated in logical units.

Both of these goals are often achievable by favouring composition over inheritance, using dependency injection or traits / multiple inheritance.

Special syntax

In many languages there are often slightly obscure constructs that can nonetheless save time. With many of these there is a readability vs. simplicity trade-off.

Ternary operators and null coalescing

Both C# and PHP have null coalescing operators:

var userType = user.Type ?? defaultType; // C#
$userType = $user->Type ?: $defaultType; // PHP

And almost all languages support the ternary operator:

var userType = user.Type != null ? user.Type : defaultType;

Both of these constructs are much more succinct than a full if .. else construct, but they are less semantically clear, hence the trade-off. Personally, I think it's fine to use the ternary operator in simple conditionals like this, but if it gets any more complicated then you should always use a full if .. else statement.
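For comparison, Python gets the same effect with a conditional expression or the `or` shortcut (toy values, not taken from the examples above):

```python
user_type = None
default_type = "guest"

# Full conditional expression - the closest thing to null coalescing
chosen = user_type if user_type is not None else default_type

# The `or` shortcut is terser, but also falls back on '' and 0,
# so it's only safe when those should count as "missing" too
chosen_short = user_type or default_type
```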

Plugins / libraries

For example, in C#:

Fish brownFish = null;

foreach (var fish in fishes) {
    if (fish.colour == "brown") {
        brownFish = fish;
        break;
    }
}
Can be simplified with the Linq library:

using System.Linq;

var brownFish = fishes.First(fish => fish.colour == "brown");

The latter is clearly simpler, and hopefully not too difficult to understand, but it does require:

  1. Knowledge of the Linq library
  2. An understanding of how lambda expressions work

I think that in this case the Linq solution is so much simpler and quite expressive enough that it should definitely be preferred - and hopefully if another developer doesn't know about Linq, it will be quite easy for them to pick up, and will expand their knowledge.
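Python has an almost identical trade-off with generator expressions and next() (the fish data here is invented):

```python
fishes = [
    {"name": "Bob", "colour": "red"},
    {"name": "Alice", "colour": "brown"},
    {"name": "Eve", "colour": "blue"},
]

# Like Linq's First: stops at the first match - but it
# requires knowing generator expressions and next()
brown_fish = next(fish for fish in fishes if fish["colour"] == "brown")
```

As with Linq, the one-liner is preferable once you know the idiom.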

Single-use variables

While the following variable is pointless:

var arrayLength = myArray.length;

for (var arrayIterator = 0; arrayIterator < arrayLength; arrayIterator++) { ... }

There are some cases where variables can be used to add useful semantic meaning:

var slideshowContainer = jQuery('main>.show');


Read more

In the last couple of months I've had a number of discussions with people who were under the impression that encryption has been cracked by the NSA.

If you like, jump straight to what you can do about it.

The story

The story started in September, in the Guardian:

NSA and GCHQ unlock encryption used to protect emails, banking and medical records

(Guardian - Revealed: how US and UK spy agencies defeat internet privacy and security, James Ball, Julian Borger and Glenn Greenwald, 5th September 2013)

This came up again today, because Sir Tim Berners-Lee made a statement:

In an interview with the Guardian, he expressed particular outrage that GCHQ and the NSA had weakened online security by cracking much of the online encryption on which hundreds of millions of users rely to guard data privacy.

(Guardian - Tim Berners-Lee condemns spy agencies as heads face MPs, Ed Pilkington, 7th November 2013)

And something very similar to this was stated in the Radio 4 news program I was listening to this morning.

The worry

On the face of it this sounds like the NSA's geniuses have reverse-engineered some core cryptographic principles - e.g. worked out how to quickly deduce prime factors from a public key (read an explanation of RSA).
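To see why that would matter, here's a toy RSA round-trip in Python with textbook-sized primes - everything, including the private exponent, falls out once you know the factors of n:

```python
# Toy RSA with tiny primes (real keys use primes hundreds of digits long)
p, q = 61, 53            # the secret prime factors
n = p * q                # public modulus: 3233
e = 17                   # public exponent
phi = (p - 1) * (q - 1)  # computable only if you know p and q
d = pow(e, -1, phi)      # private exponent - the whole secret

message = 42
ciphertext = pow(message, e, n)          # encrypt with the public key
assert pow(ciphertext, d, n) == message  # decrypt with the private key
```

(`pow(e, -1, phi)` needs Python 3.8+.) If the NSA could factor n quickly, this is all it would take to read any RSA-encrypted traffic.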

This would be very serious. I was sceptical though, because I believe that if there were key vulnerabilities in public algorithms, the public would have found them long before the NSA. They don't have a monopoly on good mathematicians. This is, after all, why open-source code and public algorithms are inherently more secure.

The truth

Helpfully, Massachusetts Institute of Technology published an article 4 days later clarifying what the NSA had likely achieved:

New details of the NSA’s capabilities suggest encryption can still be trusted. But more effort is needed to fix problems with how it is used.

(NSA Leak Leaves Crypto-Math Intact but Highlights Known Workarounds, Tom Simonite, 9th September 2013)

This shows that (still as far as we know) the NSA have done nothing unprecedented. They have, however, gone to huge lengths to exploit every known vulnerability in security systems, regardless of legality. Mostly, these vulnerabilities are with the end-point systems, not the cryptography itself.

What the NSA and GCHQ have done

I've tried to list these in order of severity:

  • Intercepted huge amounts of encrypted and unencrypted internet traffic
  • Used network taps to get hold of Google and Yahoo's (and probably others') unencrypted private data as it's transferred between their servers
  • Acquired private-keys wherever they can, presumably through traditional hacking methods like brute-forcing passwords, social engineering, or inside contacts.
  • Built back doors into certain commercial encryption software products (most notably, Microsoft)
  • Used brute-force attacks to find weaker (1024-bit) RSA private keys
  • Used court orders to force companies to give up personal information

A word about RSA brute-forcing

We have known for a while that 1024-bit RSA keys could feasibly be brute-forced by anyone with enough resources - and many assumed that the U.S. security agencies would almost certainly be doing it. So for the more paranoid among us, this should be no surprise.

“RSA 1024 is entirely too weak to be used anywhere with any confidence in its security” says Tom Ritter

However, MIT also claim that these weaker keys are:

used by most websites that offer secure SSL connections

This surprises me, as I know that GoDaddy at least won't sell you a certificate for a key shorter than 2048-bit - and I would assume other certificate vendors would follow suit. But maybe this is fairly recent.

However, even if "most websites" use RSA-1024, it doesn't mean that the NSA is decrypting all of this encrypted traffic, because it still requires a huge amount of resources (and time) to do, and the sheer number of such keys being used will also be huge. This means the NSA can only be decrypting data from specifically targeted sites. They won't have decrypted all of it.

What you can do

Now that we know this is going on, it only means that we should be more stringent about the security best-practices that already existed:

  • Use only public, open-source, tried and tested programs and algorithms
  • Use 2048-bit or longer RSA keys
  • Configure secure servers to prefer "perfect forward secrecy" cyphers
  • Avoid the mainstream service providers (Google, Yahoo, Microsoft) where you can
  • Secure your end-points: disable your root login; use secure passwords; know who has access to your private keys

Read more

On Saturday night, there was a big fight outside one of our night-clubs here in Nottingham, in which 3 people were stabbed.

BBC publishing stupid opinions

The BBC wrote an article, including a quote from the nightclub owner:

This is not a localised problem, knife crime is becoming a huge national issue. Community sentences and conditional discharges do nothing to discourage criminals

and the pull-quote:

Tougher sentences needed

I don't understand why the BBC felt the need to give a platform to this particular schmuck. It is the responsibility of journalists, in my opinion, to stem the tide of sensationalism after events like this - after all, they should understand better than anyone the frequency with which stories like this occur.

The truth about knife crime

According to knife crime statistics from

The number of knife offences recorded (during the year to June 2012) was 9% lower than in the preceding year.

NHS data suggests there were 4,490 people admitted to English hospitals in 2011/12 due to assault by a sharp object. The lowest level since 2002/03.

Similarly, the Office for National Statistics has stats showing that the total number of knife-related offences in the year to March 2013 was 26,336, down from 31,147 the previous year.

So, knife crime is not "becoming" any kind of problem. It's an old problem, and it's improving. So shut up, Simon Raine.

Also, I don't believe "tougher custodial sentences" have ever been the best solution. I don't have time to find the evidence now, but I believe custodial sentences only harden criminals, and that rehabilitation is the way forward. And the police and the justice system are slowly realising this - which may be partly helping the knife crime stats. Don't let stupid opinions like these derail that effort.

Read more

If you want a tool to crawl through your site looking for 404 or 500 errors, there are online tools (e.g. the W3C's online link checker), browser plugins for Firefox and Chrome, or Windows programs like Xenu's Link Sleuth.

A unix link checker

Today I found linkchecker - available as a unix command-line program (although it also has a GUI or a web interface).

Install the command-line tool

You can install the command-line tool simply on Ubuntu:

sudo apt-get install linkchecker

Using linkchecker

Like any good command-line program, it has a manual page, but it can be a bit daunting to read, so I give some shortcuts below.

By default, linkchecker will give you a lot of warnings. It'll warn you about any links that result in 301s, as well as all 404s, timeouts, etc., and it gives you status updates every second or so.


linkchecker will not crawl a website that is disallowed by a robots.txt file, and there's no way to override that. The solution is to change the robots.txt file to allow linkchecker through:

User-Agent: *
Disallow: /
User-Agent: LinkChecker
Allow: /

Redirecting output

linkchecker seems to be expecting you to redirect its output to a file. If you do so, it will only put the actual warnings and errors in the file, and report status to the command-line:

$ linkchecker > siteerrors.log
35 URLs active,     0 URLs queued, 13873 URLs checked, runtime 1 hour, 51 minutes


If you're testing a development site, it's quite likely it will be fairly slow to respond and linkchecker may experience many timeouts, so you probably want to up that timeout time:

$ linkchecker --timeout=300 > siteerrors.log

Ignore warnings

I don't know about you, but the sites I work on have loads of errors. I want to find 404s and 50*s before I worry about redirect warnings.

$ linkchecker --timeout=300 --no-warnings > siteerrors.log

Output type

The default text output is fairly verbose. For easy readability, you probably want the logging to be in CSV format:

$ linkchecker --timeout=300 --no-warnings -ocsv > siteerrors.csv

Other options

If you find and fix all your basic 404 and 50* errors, you might then want to turn warnings back on (remove --no-warnings) and start using --check-html and --check-css.

Checking websites with OpenID (2014-04-17 update)

Today I had to use linkchecker to check a site which required authentication with Canonical's OpenID system. To do this, a StackOverflow answer helped me immensely.

I first accessed the site as normal with Chromium, opened the console window and dumped all the cookies that were set in that site:

> document.cookie
"__utmc="111111111"; pysid=1e53e0a04bf8e953c9156ea841e41157;"

I then saved these cookies in cookies.txt in a format that linkchecker will understand:

Set-cookie: __utmc="111111111"
Set-cookie: pysid="1e53e0a04bf8e953c9156ea841e41157"

And included it in my linkchecker command with --cookiefile:

linkchecker --cookiefile=cookies.txt --timeout=300 --no-warnings -ocsv > siteerrors.csv

Use it!

If you work on a website of any significant size, there are almost certainly dozens of broken links and other errors. Link checkers will crawl through the website checking each link for errors.

Link checking your website may seem obvious, but in my experience hardly any dev teams do it regularly.

You might well want to use linkchecker to do automated link checking! I haven't implemented this yet, but I'll try to let you know when I do.

Read more

SeeTheStats is a great free service for exposing your Google Analytics data (the only way to do Analytics) to the public.

Here is some information about my site:

How many people visit my site?

What country are they from?

What pages are they looking at?

What browsers are they using?

What operating systems are they using?

How big are their screens?

My SeeTheStats page

You can also see all these stats over at

Read more

With the advent of web fonts (e.g. from Google Fonts), thankfully web designers are no longer tied to a limited set of "web safe" fonts.

Fonts and performance

However, there is a potential performance hit with this. You will need to link your CSS files to the font files. The problem here isn't so much the size of the font file (they are typically under 100 KB); it's more that each new HTTP request a page makes affects performance.

Also, when loading web fonts externally you will sometimes see a flicker where the page loads initially with the default browser fonts, and then the new fonts are downloaded and applied afterwards. This flicker can look quite unprofessional.

Font formats and IE8

If you want to support Internet Explorer 8 or older, you unfortunately need to include your fonts in two formats: WOFF and EOT.

However, if you're willing to drop IE8 support (and reap the benefits), or to simply serve the browser default font to IE8, then you can provide your fonts in WOFF only, which is supported by all other relevant browsers.

Data URLs

So Data URLs, if you haven't heard of them, are a way of encoding binary data as a valid URL string. This means the data can be included directly inside HTML or CSS files. They are fantastically easy to create by simply dragging your binary file into the Data URL Creator.
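If you prefer the command line to drag-and-drop, you can generate the same thing with `base64`. This is a sketch: the `.woff` filename is hypothetical, and a stand-in file is created here just so the example is self-contained (with a real font you'd skip that line).

```shell
# Stand-in for a real font file, so this example runs anywhere.
printf 'dummy font bytes' > lato-light.woff

# -w0 disables line wrapping (GNU coreutils; plain `base64` on macOS).
encoded=$(base64 -w0 lato-light.woff)

# Wrap the encoded data as a CSS data URL source, ready to paste
# into an @font-face src declaration.
printf "url('data:application/x-font-woff;base64,%s') format('woff')\n" \
    "$encoded" > fonturl.txt

cat fonturl.txt
```

The output is a single `url(...) format('woff')` value you can drop straight into your CSS.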

Data URLs are likely to be a bit larger than the binary file would have been. In my experience they tend to be about 20% larger. So the larger the file you're dealing with the less practical it becomes to encode the file as a URL. However, for sub-100k web fonts this difference is not so important.

So using Data URLs, you can include your font directly in your CSS like so:

@font-face {
    font-family: 'Lato light';
    font-style: normal;
    font-weight: 300;
    src: local('Lato Light'), url('data:application/x-font-woff;base64,d09GRg...BQAAAAB') format('woff');
}

(For example, here's what I use for this very site)

This means your web pages only have to download one CSS file, rather than a CSS file plus a bunch of font files, which helps performance. Personally I think it's also neat not to have to create a special directory for font files. Keeping it all in one place (the CSS) just seems nice and tidy to me.

A word about caching

Whether the above suggestion is actually a good idea will depend on how often your CSS changes. Hopefully you'll be merging your CSS files into one file already to reduce HTTP requests. This of course means that whenever that merged CSS file changes, your users will have to download the whole file again to see your changes.

If your fonts were downloaded as separate files, they may well stay cached even when the CSS changes. If you include your fonts inside your CSS files as suggested above, however, every CSS change forces users to re-download a much larger file - embedding your fonts is likely to roughly double the size of your CSS.

This is a complex decision, but to give you some rough advice I'd say: if your CSS changes more than a couple of times a month, keep your fonts as separate files; if it changes less often (as it does with this site), it's probably worth including them inside the CSS as Data URLs.

If you have a different opinion on this, please let me know in the comments.

Read more

I am always thinking about good general rules for making the world a better place, but it's extremely difficult to succinctly communicate them to anyone.

This is the story of how my friends and I created and agreed on a statement of values.

The foundation

A couple of months ago, I was in an IRC chat room with some friends of mine (do people actually still use IRC? Tell me in the comments), and @0atman aired an idea for a charitable project. We all thought it was a good one, and a long discussion ensued about the best way to run the project.

We all felt that it should be run democratically to some extent - that is, largely owned by its members - but we were worried about the project being hijacked and becoming something that none of us wanted it to be.

A potential solution, we felt, was to first create a foundation with exclusive membership and a solid stated set of values. That way, the project could be started by the foundation, but not inherently attached to it, meaning that if the project took a different direction, the foundation would remain intact. This would allow us to either create a fork of the project, bringing it back in line with our values, or start a completely new one, while allowing the existing project to continue in its new direction with our blessing.

Thus was formed the Blackgate Foundation.

(Nothing has come of the project idea yet. I hope it may in the future.)

Arguments over values

Since we formed the foundation specifically to be a solid moral centre for our future projects, its values were paramount, so we started debating them in earnest.

Politically and morally we have a lot in common, but it was surprising how much we found to argue about. We disagreed about the necessity of punishment, whether there's ever a case for going to war, whether utilitarianism was a term we could or should associate ourselves with, whether we agreed with the values of humanism, and our opinions on religion.

We discussed it for days, on IRC and in comments and edits on a Google Document (I don't want to advertise Google particularly, but Google Documents really are an amazingly effective way to collaborate with people). It got kinda heated at times. But eventually we came out with a largely agreed upon statement of values, and I think our individual values all changed a little along the way.

The statement of values

I am proud of what we produced, and I had a lot of fun doing it. I think it sums up my values rather well. I think it's firm and clear without being offensive or inflammatory. I'd love to know what you think of it - please let me know in the comments.

It can be seen on the Blackgate Foundation website or in our GitHub repository, but I'm also reproducing it here in its current form (we may decide to change it in the future):

Statement of values

We, the members of the Blackgate Foundation, value:

Equality

  • Humanity should strive to treat and provide for all people equally regardless of appearance, sexuality, gender, beliefs, ability or actions.
  • All people should be equally represented and no person fundamentally deserves to be better off than any other.

Science & openness

  • The pursuit of knowledge is a human instinct and a universal force for good.
  • There is value in sceptical, evidence-based and objective reasoning in the pursuit of knowledge.
  • Knowledge should be made available to all of humanity. We should strive to build on existing work rather than doing work from scratch.
  • There is value in open processes and collective decision making - many eyes guard against injustices and inefficiencies.

Diversity & decentralisation

  • Diversity is important in all things. Many opinions and diverse practices prevent stagnation, create resilience through redundancy, and speed up evolution and learning.
  • Centres of control should be diverse and small and subservient and answerable to all over whom they hold influence. Any decisions by such centres should be evidence based and open to discussion.
  • The interests of humanity should always come before those of any individual or group; this applies particularly to corporate protectionism and nationalism.

Non-violence

  • Violence in all its forms is divisive and inflammatory and therefore always undesirable.
  • We renounce the glorification of violence and the use of violence to solve disputes.
  • It is in the interest of humanity to seek to understand and help those who act violently.

Evidence-based morality

  • Morality is not absolute. Moral guidelines should be formed through evidence-based reasoning.
  • There exist solid evidence-based arguments for the most universally accepted moral tenets.
  • "Bad" and "evil" are counter-productive concepts. Humanity should strive to avoid ultimately judging any person as either.

Sustainability

  • All human activity should continually strive to be sustainable. Notable examples are humanity's impact on the environment and on the global economy.

Try it!

Why don't you try writing down your morals and values in a similar form? Or do it with some friends? I really enjoyed it and couldn't recommend it more.

Read more