Canonical Voices

What Björn Tillenius talks about

In short, if you want to use web.go in Google App Engine’s Go runtime environment, check out my google-app-engine branch of web.go. Using that one you can start using web.go like this:

package webgoexample

import (
    "http"
    "log"
    "os"
    "web"
)

var server *web.Server

func init() {
    server = &web.Server{
        Config: web.Config,
        Logger: log.New(os.Stdout, "", log.Ldate|log.Ltime)}
    server.Get("/", func(ctx *web.Context) {
        ctx.Write([]byte("Hello from web.go!"))
    })

    // Send all requests to web.go.
    http.HandleFunc("/", handler)
}

func handler(writer http.ResponseWriter, request *http.Request) {
    server.ServeHTTP(writer, request)
}

BTW, if you simply put the web.go branch in your root dir, you have to remove the examples/ directory, otherwise App Engine won’t be able to compile your project.

The branch in question makes only minimal changes to web.go, but I haven’t proposed merging it into trunk yet, since it removes some functionality. First, I changed it not to set up the /debug/ paths, so that I could remove the use of http/pprof, which isn’t available on App Engine. After that I also had to remove the use of net.ResolveTCPAddr, which isn’t available on App Engine either. I basically replaced it with net.SplitHostPort, which I suspect is good enough. It doesn’t resolve host names or port names, but I’d be surprised if hr.RemoteAddr weren’t an IP address and a port number.

Read more

I often see code like this:

message["To"] = formataddr((name, email))

This looks like it should work, especially since the docstring of formataddr() says that it will return a string value suitable for a To or Cc header. However, while it works most of the time, it fails if name is a unicode string containing non-ASCII characters. It may look OK if you simply read message["To"], but as soon as you convert the message or header to a byte string, you will see the problem.

>>> from email.Message import Message
>>> from email.Utils import formataddr
>>> msg = Message()
>>> msg["To"] = formataddr((u"Björn", "bjorn@tillenius.me"))
>>> msg["To"]
u'Bj\xf6rn <bjorn@tillenius.me>'
>>> msg.as_string()
'To: =?utf-8?b?QmrDtnJuIDxiam9ybkB0aWxsZW5pdXMubWU+?=\n\n'

Most code that will use the To address in the example will fail, since there’s no visible e-mail address in there. The header should look like this, i.e. only the name itself should be encoded:

To: =?utf-8?b?QmrDtnJu?= <bjorn@tillenius.me>

I wish Python would handle this better. I usually end up writing a helper function like this for projects I work on:

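from email.Header import Header
from email.Utils import formataddr

def format_address(name, email):
    email = str(email)
    if not name:
        return email
    # Encode only the display name; the address itself stays readable.
    name = str(Header(name))
    return formataddr((name, email))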
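
On Python 3 the standard library covers most of this on its own: since Python 3.3, email.utils.formataddr() takes a charset argument and encodes only a non-ASCII display name. The following is a minimal sketch under that assumption, not a drop-in replacement for the helper above. The display_name() function is a made-up helper that is only there to show that the name can be decoded again while the address stays visible.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr


def format_address(name, email):
    """Build a To/Cc value in which only the display name is encoded."""
    if not name:
        return email
    # An ASCII name is passed through (quoted if needed); a non-ASCII one
    # becomes an RFC 2047 encoded word. The address part is left as it is.
    return formataddr((name, email), charset="utf-8")


def display_name(header_value):
    """Decode the display name again, e.g. for showing it to a user."""
    encoded_name, _address = parseaddr(header_value)
    return str(make_header(decode_header(encoded_name)))


if __name__ == "__main__":
    value = format_address("Björn", "bjorn@tillenius.me")
    print(value)                # encoded name followed by the plain address
    print(parseaddr(value)[1])  # the address can still be extracted
    print(display_name(value))  # the original name comes back
```

Code that looks for the e-mail address in the resulting header can find it with parseaddr(), since the address is never part of the encoded word.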

Read more

To start with, I think drive-by fixes are great. If you see that something is wrong when fixing something else, it can be a good idea to fix it right away, since otherwise you probably won’t do it.

However, even when doing drive-by fixes, I still think that each landed branch should focus on one thing only. As soon as you start to group unrelated things together, you make more work for others. It might be easier for you, but think about all the people that are going to look at your changes. Please, don’t be lazy! It doesn’t take much work to extract the drive-by fix into a separate branch, and most importantly to land it separately. If you do find that it’s too time-consuming to do this, let’s talk, and see what is taking time. There should be something we can do to make it easier.

There’s no such thing as a risk-free drive-by fix. There’s always the potential of something going wrong (even if the application is well-tested). When something goes wrong, someone needs to go back and look at what was done. Now, if you land the drive-by fix together with unrelated (or even related) changes, you basically hide it. By reducing your workload slightly, you create much more work for someone else.

For example, on Friday I saw that we had some problems with scripts in Launchpad. They were trying to write to a mail directory to which they didn’t have access. That was odd, since scripts have always talked to the SMTP servers directly, and didn’t use the queued mailer that needed write access to that directory. Looking through the recent commit logs didn’t reveal anything. Luckily enough, William Grant pointed out that r9205 of db-devel contained a change to sendmail.py, which probably was the cause of the problems. This turned out to be correct, but it was still pretty much impossible to see why that change was made. I decided that the best thing to do was to revert the change, but I wasn’t sure exactly what to revert. The diff of that revision is more than 4000 lines, and more than 70 files were changed. So how could I know which other files were changed to accommodate the change in sendmail.py? I tried looking at the commit logs, but that didn’t reveal much. The only thing I could do was to revert the change in sendmail.py and send it off to ec2, waiting three hours to see if anything broke.

So, I plead again, if you do drive-by fixes (and you should), please spend a few minutes extra, to extract the fix into a separate branch, and land it separately!

Is there maybe anything we can do to make this easier to do?

Read more

I’ve been working on a new release/merge workflow for Launchpad. I’ve written it from the developers’ point of view, but I’d love some comments from users of launchpad.net, so let me try to explain how you, as users, would be affected by this change.

The proposal is that we would decouple our feature development cycles from our release cycles. We would more or less get rid of our releases, and push features to our users when they are ready to be used. Every feature would first go to edge.launchpad.net, and when it’s considered good enough, it would get pushed to launchpad.net for everyone to use. Bug fixes would also go to edge.launchpad.net first, and be pushed to launchpad.net once they are confirmed to work. Sadly, Launchpad would still go down once a month for DB updates and other regular maintenance, just like before. The amount (and frequency) of downtime would stay the same.

There are users who are in our beta team and use edge.launchpad.net all the time, and those who want a more stable Launchpad, and use launchpad.net.

Users of launchpad.net

Those who aren’t in the beta team would get bug fixes sooner than with the current workflow. Instead of having to wait for the end of the release cycle, they would get a fix as soon as it has been confirmed to work on edge.launchpad.net. The same is true for features, kind of. These users would have to wait a bit longer than today, since today we push even unfinished features to launchpad.net users at the end of the release cycle. With the new workflow, these users would have to wait for the feature to be considered complete, but in return they should get a better experience when seeing the feature for the first time.

One potential source of problems is that even though fixes and features get tested on edge.launchpad.net before going to launchpad.net, each update could introduce some other issue. For example, fixing a layout bug on one page might make another page look different. With the current workflow this can happen only once a month; with the new workflow it could happen a few times every month. That said, even today we update launchpad.net multiple times every month to fix more serious issues.

Users of edge.launchpad.net

If you are in the beta team, and use edge.launchpad.net on a regular basis, it won’t be that different from how it works today. Just like today, you would be exposed to features that are under development. What would change is that we will try to do a better job of telling you which features are on edge.launchpad.net only. This way you will have a better chance of actually helping us test and use the features, and of telling us about any problems, so that we can fix them right away. This should make you more aware of new features that are being added to Launchpad, and give you a better opportunity to make them better.

One potential source of problems here is that developers will know that their work won’t end up on launchpad.net before they say it’s ready, so they might push rougher features to edge.launchpad.net. Thus it could be a rockier ride than today. But of course, our developers care a lot about their users, so they won’t land their work unless it’s really good! :-)

Conclusion

My hope is that this will provide a better and more stable experience for users of launchpad.net, and give users of edge.launchpad.net a better opportunity to help us make new features rock! But I’m interested to hear what you, the actual users, think about this.

Read more

I made the transition from the Bugs team lead to the Launchpad Technical Architect quite a while ago. While I spent the first period mainly on satisfying my coding desires, it’s time to define what I’m going to spend my time on as technical architect! My road map, which shows the high-level things that I’ll be focusing on, is available here:

I’ll also be writing blog posts (and sending mails to the launchpad-dev mailing list of course) regularly to keep people updated with my progress and findings. My blog is at http://tillenius.me/ and I tag all posts related to Launchpad with launchpad.

I’m currently working on decoupling our feature development cycles from our release cycles, which I do mainly because I think it’s important, not because it’s part of the technical architect’s responsibility. But in parallel with that, my next task is to set up a team that can help me do a good job. I’ll say more about the team in another post, but in short it will consist of members from each sub-team in Launchpad. It will act as a forum to discuss what needs my attention, and its members will also help me figure out solutions to problems and implement them.

One of the first major tasks will be to come up with a testing strategy. Currently when we write tests we don’t think that much about it. Everyone has their preferences, and we have a wide variety of testing styles, making it hard to find which code paths are exercised by which tests, and how good our test coverage is. This leads to us sometimes having bad test coverage, and sometimes having too much test coverage, i.e. we have redundant tests that make the test suite slower. Coming up with guidelines on how to write tests, which kind of tests to write, where to place them, etc., is the first step. But we also need to figure out how to make our test suite faster, what kind of documentation to provide, and so on.

In addition to the tasks on the roadmap, I also have a number of things I do on a regular basis. This includes reviewing database patches for API consistency, helping teams design features from a technical point of view, and keeping my eyes open for areas in the code that need refactoring and clean-up.

Read more

We’ve used Windmill in our Launchpad buildbots for a while now, and it’s actually worked out quite well. I was afraid that we would have a lot of fallout, since in the beginning Windmill was fragile and caused a lot of intermittent test failures. However, so far I’d say that we’ve had very few problems. There was one intermittent failure, but it was known from the beginning that it would fail eventually. Apart from that we’ve had only one major issue: something uses 100% CPU when our combined JavaScript file is bigger than 512 000 bytes. This stopped people from landing JavaScript additions for a while, and we still haven’t resolved this issue, apart from making the file smaller.

There are some things that would be nice to improve with regard to Windmill. The most important thing is to make sure that launchpad.js can be bigger than 512 000 bytes:

It would also be nice to make the test output nicer. At the moment Windmill logs quite a lot to stderr, making it look like the test failed, even though it didn’t. We don’t want Windmill to log anything really, unless it’s a critical failure:

I was going to say that we also have some problems related to logging in (because we look at a possibly stale page to decide whether the user is logged in), but it seems like Salgado already fixed it!

It would also be nice to investigate the problem where asserting a node directly after waiting for it sometimes fails. We had problems like that before; code was waiting for an element, and when using assertNode directly after the wait, the node still didn’t exist. I haven’t seen any test fail like that lately, so it might have been fixed somehow:

There are some other things I could think of that would be nice to have. I haven’t found any bugs filed for them, but I’ll list them here.

  • Don’t run the whole test suite under xvfb-run. It’d be better to start xvfb only for the Windmill tests.
  • Use xvfb by default for Windmill tests. When running the Windmill tests it’s quite annoying to have Firefox pop up now and then. It’d be better to run them headless by default.
  • Switches for making debugging easier. Currently we shut down Firefox after running the Windmill tests. It should be possible to have Firefox remain running after the test has finished running, so that you can manually poke around if you want to. If we use xvfb by default, we also need a switch for not using it.
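
The three wishes above could translate into a thin wrapper around the test runner, roughly like this. It is only a sketch: the flag names (--no-xvfb, --keep-browser), the layer name and the run-tests command are made up for illustration and are not existing Launchpad or Windmill options. xvfb-run is the only real tool referenced.

```python
import argparse


def parse_args(argv=None):
    """Parse the hypothetical debugging switches."""
    parser = argparse.ArgumentParser(description="Sketch of a test wrapper")
    parser.add_argument("--layer", default="unit",
                        help="test layer to run, e.g. 'windmill'")
    parser.add_argument("--no-xvfb", dest="headless", action="store_false",
                        help="show the browser instead of running headless")
    parser.add_argument("--keep-browser", action="store_true",
                        help="leave Firefox running after the tests finish")
    return parser.parse_args(argv)


def build_command(options, runner=("run-tests",)):
    """Return the argv to execute for the selected layer."""
    command = list(runner) + ["--layer", options.layer]
    # Only the Windmill layer needs a display, so only it gets xvfb-run.
    if options.layer == "windmill" and options.headless:
        command = ["xvfb-run"] + command
    return command


def should_stop_browser(options):
    """Firefox is shut down unless the developer asked to keep it."""
    return not options.keep_browser
```

With these assumed switches, a plain Windmill run would be headless and clean up after itself, while passing both --no-xvfb and --keep-browser would leave a visible Firefox around for manual poking.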

Read more