Canonical Voices

Posts tagged with 'project'

Gustavo Niemeyer

Back at the Ubuntu Platform Rally last week, I pestered some of the Bazaar team with questions about co-location of branches in the same directory with Bazaar. The great news is that this seems to be really coming in the next release, with first-class integration of the feature in the command set. Unfortunately, though, it’s not quite ready for prime time yet, or even for I’m-crazy-and-want-this-feature time.

Some background on why this feature turns out to be quite important right now may be interesting, since life with Bazaar in the past years hasn’t really brought that up as a blocker. The cause for the new interest lies in some recent changes in the toolset of the Go language. The new go tool not only makes building and interacting with Go packages a breeze, but it also solves a class of problems that existed before. For the go tool to work, though, it requires the consistent use of $GOPATH, and this means that the package has to live in a well-defined directory. The traditional way in which Bazaar keeps each branch in its own directory becomes a deal breaker then.

So, last week I had the chance to exchange some ideas with Jelmer Vernooij and Vincent Ladeuil (both Bazaar hackers) on these problems, and they introduced me to the approach of using lightweight checkouts to work around some of the limitations. Lightweight checkouts in Bazaar make the working tree resemble the old-style VCS tools a little bit, with the working tree being bound to another location that actually has the core content. The idea is great, and given how well lightweight checkouts work with Bazaar, building a full-fledged solution shouldn’t be a lot of work really.

After that conversation, I put together a trivial hack that makes bzr look like git from the outside by wrapping the command line, and did a lightning talk demo. This got a few more people interested in the concept, which was enough motivation for me to move the idea forward into a working implementation. Now I just needed the time to do it, but that wasn’t too hard to find either.

I happen to be part of the unlucky group that too often takes more than 24 hours to get back home from these events. This is not entirely bad, though: I also happen to be part of the lucky group that can code while flying and riding buses as a means to relieve the boredom (reading helps too). This time around, cobzr became the implementation of choice, and given ~10 hours of coding, we have a very neat and over-engineered wrapper for the bzr command.

The core of the implementation is the same as the original hack: wrap bzr and call it from outside to restructure the tree. That said, rather than relying on lazy, hackish line parsing, it actually parses bzr’s --help output for commands to build a base of supported options, and parses the command line exactly like Bazaar itself would, validating options as it goes and distinguishing flags with arguments from positional parameters. That enables the proxying to do much more interesting work on the intercepted arguments.
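To make the idea more concrete, here is a minimal sketch of that kind of parsing. It is not cobzr's actual code: the help line format, the parseHelp and splitArgs names, and the error messages are all assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHelp scans help text for option lines such as
//
//	-r ARG, --revision=ARG  Description.
//
// and records, for each option name, whether it takes an argument.
// The line format is an assumption, not bzr's exact output.
func parseHelp(help string) map[string]bool {
	takesArg := make(map[string]bool)
	for _, line := range strings.Split(help, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "-") {
			continue
		}
		// The option spec ends at the first run of two spaces.
		if i := strings.Index(line, "  "); i >= 0 {
			line = line[:i]
		}
		for _, spec := range strings.Split(line, ",") {
			spec = strings.TrimSpace(spec)
			name, hasArg := spec, false
			if i := strings.IndexAny(spec, "= "); i >= 0 {
				name, hasArg = spec[:i], true
			}
			takesArg[name] = hasArg
		}
	}
	return takesArg
}

// splitArgs validates options against takesArg and separates them
// (with their values) from positional parameters.
func splitArgs(takesArg map[string]bool, args []string) (map[string]string, []string, error) {
	opts := make(map[string]string)
	var pos []string
	for i := 0; i < len(args); i++ {
		arg := args[i]
		if !strings.HasPrefix(arg, "-") || arg == "-" {
			pos = append(pos, arg)
			continue
		}
		name, value, inline := arg, "", false
		if j := strings.Index(arg, "="); j >= 0 {
			name, value, inline = arg[:j], arg[j+1:], true
		}
		hasArg, known := takesArg[name]
		switch {
		case !known:
			return nil, nil, fmt.Errorf("unknown option: %s", name)
		case hasArg && !inline:
			if i+1 >= len(args) {
				return nil, nil, fmt.Errorf("option %s needs an argument", name)
			}
			i++
			value = args[i]
		case !hasArg && inline:
			return nil, nil, fmt.Errorf("option %s takes no argument", name)
		}
		opts[name] = value
	}
	return opts, pos, nil
}

func main() {
	help := "  -r ARG, --revision=ARG  Pick a revision.\n  -v, --verbose  Be verbose.\n"
	opts, pos, err := splitArgs(parseHelp(help), []string{"-r", "3", "-v", "lp:juju"})
	fmt.Println(opts, pos, err)
}
```

Knowing which flags consume the following word is what lets a wrapper tell "-r 3" apart from a positional parameter, which is the part a naive line-based hack tends to get wrong.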

Here is a quick session that shows a branch being created with the tool. It should look fairly familiar for someone used to git:


[~]% bzr branch lp:juju
Branched 443 revisions.

[~]% cd juju
[~/juju]% bzr branch
* master

[~/juju]% bzr checkout -b new-feature
Shared repository with trees (format: 2a)
Location:
shared repository: .bzr/cobzr
Branched 443 revisions.
Branched 443 revisions.
Tree is up to date at revision 443.
Switched to branch: /home/niemeyer/juju/.bzr/cobzr/new-feature/

[~/juju]% bzr branch other-feature
Branched 443 revisions.

[~/juju]% bzr branch
  master
* new-feature
  other-feature

Note that cobzr will not reorganize the tree layout before the multiple branch support is required.

Even though the wrapping is taking place and bzr’s --help output is parsed, there’s pretty much no noticeable overhead given the use of Go for the implementation and also that the processed output of --help is cached (I said it was overengineered).

As an example, the first is the real bzr, while the second is a link to cobzr:


[~/juju]% time /usr/bin/bzr status
/usr/bin/bzr status 0.24s user 0.03s system 88% cpu 0.304 total

[~/juju]% time bzr status
bzr status 0.19s user 0.08s system 88% cpu 0.307 total

This should be more than enough for surviving comfortably until bzr itself comes along with first class support for co-located branches in the next release.

In case you’re interested in using it or are just curious about the command set or other details, please check out the web page for the project:

Gustavo Niemeyer

A long time before I seriously got into using distributed version control systems (DVCS) such as Bazaar and Git for developing software, it was already well known to me how the mechanics of these systems worked, and why people benefited from them. That said, it wasn’t until I indeed started to use DVCS tools that I understood how much my daily workflow around code bases would be changed and improved.

This weekend, while flying home from MongoSV, I could experience that same feeling in relation to first class concurrency support in programming languages. Everybody knows how the feature may be used, but I have the feeling that until one actually experiences it in practice, it’s very hard to really understand how much the relationship with ordering while developing software may be improved.

I was having some fun working on improvements to Goetveld. This package allows Go programs to communicate with Rietveld servers to manipulate code review entries. The Rietveld API is a bit rough in a few places, and as a result some features of the package actually parse an HTML form to extract some data, before sending it back. You may have done something similar before while attempting to script a web site that wasn’t originally intended to be scripted.

The interesting fact here is that this is an intrinsically serial procedure: load a form, change it, and send it back, right? Well, not really. As one might intuitively expect, establishing an SSL session and its underlying TCP connection are not instantaneous operations.

To give an idea, here is part of a dump of an SSL connection being initiated (that is, no HTTP data was sent yet) to codereview.appspot.com, originated from my home location:

# tcpdump -ttttt -i wlan0 'host codereview.appspot.com and port 443'
(...)
00:00:00.000000 IP (...)
00:00:00.000063 IP (...)
00:00:00.000562 IP (...)
00:00:00.341627 IP (...)
00:00:00.357009 IP (...)
00:00:00.357118 IP (...)
00:00:00.360362 IP (...)
00:00:00.360550 IP (...)
00:00:00.366011 IP (...)
00:00:00.689446 IP (...)
00:00:00.727693 IP (...)

That’s more than half a second before the application layer was even touched. So, turns out that to save that roundtrip time, we can start both the form loading and the form sending requests at the same time. By the time the form loading ends, processing the data locally is extremely fast, and we can complete the sending side by just providing the request body.

At this point you may be thinking something like “Ugh, that’s too much trouble... why bother?”, and that highlights precisely the point I’d like to make: it is too much trouble because most people are used to languages that turn it into too much trouble, but the issue is not inherently complex. In fact, this is the entire implementation of this logic in Go:

func (r *Rietveld) UpdateIssue(issue *Issue) error {
        op := &opInfo{r: r, issue: issue}
        errs := make(chan error)
        ch := make(chan map[string]string, 1)
        go func() {
                errs <- r.do(&editLoadHandler{op: op, form: ch})
                close(ch)
        }()
        go func() {
                errs <- r.do(&editHandler{op: op, form: ch})
        }()
        return firstError(2, errs)
}

I'm not cheating. The procedure was being done serially before, with very similar logic. Previously it had to take the form variable itself from the first request and manually provide it to the next one. Now, instead of providing the form, it's providing a channel that will be used to send the form across. One might even argue that the channel makes the algorithm more natural, curiously.
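The firstError helper called in UpdateIssue is not shown in the post. As a rough sketch of what such a helper might look like, assuming it simply collects n results from the channel and reports the first failure (the body and the errSend value below are guesses for illustration, not Goetveld's code):

```go
package main

import (
	"errors"
	"fmt"
)

// errSend is a made-up error used only by this example.
var errSend = errors.New("form submission failed")

// firstError receives n results from errs and returns the first non-nil
// one. It always drains all n values so that no sender is left blocked
// on an unbuffered channel.
func firstError(n int, errs <-chan error) error {
	var first error
	for i := 0; i < n; i++ {
		if err := <-errs; err != nil && first == nil {
			first = err
		}
	}
	return first
}

func main() {
	errs := make(chan error)
	go func() { errs <- nil }()     // stands in for the form loading request
	go func() { errs <- errSend }() // stands in for the form sending request
	fmt.Println(firstError(2, errs)) // prints "form submission failed"
}
```

Draining every result, rather than returning on the first error, keeps both goroutines from leaking when the channel is unbuffered as in the function above.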

This is the kind of procedure that becomes fun and natural to write, after having first class concurrency at hand for some time. But, as in the case of DVCS, it takes a while to get used to the idea that concurrency and simplicity are not necessarily at opposing ends.

niemeyer

In the past week, I’ve finally stopped to fix something that I’ve been wishing for for years: inline code reviews in Launchpad. Well, I haven’t exactly managed to fix it in Launchpad, but the integration with Rietveld feels nice enough to be relatively painless.

The integration is done using the lbox tool, which was developed in Go using the lpad package for the communication with Launchpad, and a newly written rietveld package for communication with Rietveld.

If you want to join me in my happiness, here are the few steps to get that working for you as well.

First, install lbox from the Launchpad PPA. Since it’s written in Go, it has no dependencies.

$ sudo add-apt-repository ppa:gophers/go
$ sudo apt-get update
$ sudo apt-get install lbox

Now, as an example of using it, let’s suppose we want to perform a change in the lbox code itself. First, we take the branch out of Launchpad.

$ mkdir hacking
$ cd hacking
$ bzr branch lp:lbox
Branched 9 revision(s).

Then, let’s create a feature branch based on the original trunk, and perform a change.

$ bzr branch lbox my-nice-feature
Branched 9 revision(s).

$ cd my-nice-feature
$ echo "# Yo" >> Makefile
$ bzr commit -m "Yo-ified makefile"
Committing to: /home/user/hacking/my-nice-feature/
modified Makefile
Committed revision 10.

Ok, we’re ready for the magic step, which is actually pushing that branch and proposing the merge on the original branch on both Launchpad and Rietveld. It’s harder to explain than to do it:

$ lbox propose -cr
2011/11/17 23:29:49 Looking up branch information for "."...
2011/11/17 23:29:49 Looking up branch information for "hacking/lbox"...
2011/11/17 23:29:49 Found landing target: bzr+ssh://bazaar.../lbox/
(...)

This command will ask you for a few details interactively, like your authentication details in Launchpad and in Rietveld (your Google Account, details sent over SSL to Google itself; you may have to visit Rietveld first for that to work), and also the change description.

In case something fails, feel free to simply execute the command again, as many times as you want. The command is smart enough to figure out that a merge proposal and a change in Rietveld already exist, and will update the existing ones with the new details you provide, rather than duplicating work.

Once the command finishes, you can visit the URL for the merge proposal in Launchpad that was printed, and you should see something like this:

Note that the change description already includes a link onto the Rietveld issue at codereview.appspot.com. The issue on Rietveld will look something like this:

Observe how the issue has the same description as the merge proposal, but it links back onto the merge proposal. At the left-hand side, there’s also an interesting detail: the original merge proposal email has been added as the reviewer of this change. This means that any changes performed in Rietveld will be mailed back onto the merge proposal for its record.

In the center you can find the meat of the whole work: the actual change set that is being reviewed. Rietveld works with patch sets, so that you can not only see a given change, but you can also review the history of proposals that the proponent has made, and any inline comments performed in them.

Click on the side-by-side link next to Makefile to get an overview of the actual change, and to make comments on it just click on the desired line:

Your comments won’t be sent immediately. Once you’re done making comments and want to deliver the review, click on the “Publish+Mail Comments” link at the top-right, which will take you onto a page that enables complementing with any heading details if desired.

Since the merge proposal is registered as the reviewer of the issue in Rietveld, publishing the review will deliver a message back onto the merge proposal itself, including context links that enable anyone to be taken to the precise review point back in Rietveld:

Then, once you do make the suggested changes and want to publish a new version of the branch, simply repeat the original command: “lbox propose -cr”. This will push the new diff onto Rietveld and create a new patch set. You’ll also be given the chance to edit the previous description, and any changes there will take place both in the merge proposal and in the Rietveld issue.

lbox also has other useful command line options, such as -bug, -new-bug, to associate Launchpad bugs with the merge proposal and put them in progress, or -bp to associate a blueprint with the branch and bug (if provided) being handled.

This should turn your code reviews in Launchpad into significantly more pleasant tasks, and maybe even save some of your precious life time for more interesting activities.

Happy reviewing!

niemeyer

Certainly one of the reasons why many people are attracted to the Go language is its first-class concurrency aspects. Features like communication channels, lightweight processes (goroutines), and proper scheduling of these are not only native to the language but are integrated in a tasteful manner.

If you stay around listening to community conversations for a few days there’s a good chance you’ll hear someone proudly mentioning the tenet:

Do not communicate by sharing memory; instead, share memory by communicating.

There is a blog post on the topic, and also a code walk covering it.

That model is very sensible, and being able to approach problems this way makes a significant difference when designing algorithms, but that’s not exactly news. What I address in this post is an open aspect we have today in Go related to this design: the termination of background activity.

As an example, let’s build a purposefully simplistic goroutine that sends lines across a channel:

type LineReader struct {
        Ch chan string
        r  *bufio.Reader
}

func NewLineReader(r io.Reader) *LineReader {
        lr := &LineReader{
                Ch: make(chan string),
                r:  bufio.NewReader(r),
        }
        go lr.loop()
        return lr
}

The type has a channel where the client can consume lines from, and an internal buffer
used to produce the lines efficiently. Then, we have a function that creates an initialized
reader, fires the reading loop, and returns. Nothing surprising there.

Now, let’s look at the loop itself:

func (lr *LineReader) loop() {
        for {
                line, err := lr.r.ReadSlice('\n')
                if err != nil {
                        close(lr.Ch)
                        return
                }
                lr.Ch <- string(line)
        }
}

In the loop we'll grab a line from the buffer, close the channel in case of errors and stop, or otherwise send the line to the other side, perhaps blocking while the other side is busy with other activities. Should sound sane and familiar to Go developers.
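For completeness, here is a self-contained sketch of the type above together with a consumer. The LineReader code is the same as in the post; the lines helper and the sample input are additions made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

type LineReader struct {
	Ch chan string
	r  *bufio.Reader
}

func NewLineReader(r io.Reader) *LineReader {
	lr := &LineReader{
		Ch: make(chan string),
		r:  bufio.NewReader(r),
	}
	go lr.loop()
	return lr
}

func (lr *LineReader) loop() {
	for {
		line, err := lr.r.ReadSlice('\n')
		if err != nil {
			close(lr.Ch)
			return
		}
		lr.Ch <- string(line)
	}
}

// lines is a hypothetical helper: it reads input through a LineReader
// and collects everything sent on Ch until the loop closes the channel.
func lines(input string) []string {
	var out []string
	for line := range NewLineReader(strings.NewReader(input)).Ch {
		out = append(out, line)
	}
	return out
}

func main() {
	for _, line := range lines("one\ntwo\nthree\n") {
		fmt.Print(line) // each line still carries its trailing newline
	}
}
```

Note that a final line without a trailing newline is silently dropped here, since ReadSlice reports it together with io.EOF. That is one concrete instance of the error information being lost.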

There are two details related to the termination of this logic, though: first, the error information is being dropped, and second, there's no way to interrupt the procedure from outside in a clean way. The error might be easily logged, of course, but what if we wanted to store it in a database, or send it over the wire, or even handle it taking into account its nature? Stopping cleanly is also a valuable feature in many circumstances, like when one is driving the logic from a test runner.

I'm not claiming this is something difficult to do, by any means. What I'm saying is that there isn't today an idiom for handling these aspects in a simple and consistent way. Or maybe there wasn't. The tomb package for Go is an experiment I'm releasing today in an attempt to address this problem.

The model is simple: a Tomb tracks whether the goroutine is alive, dying, or dead, and the death reason.

To understand that model, let's see the concept being applied to the LineReader example. As a first step, creation is tweaked to introduce Tomb support:

type LineReader struct {
        Ch chan string
        r  *bufio.Reader
        t  tomb.Tomb
}

func NewLineReader(r io.Reader) *LineReader {
        lr := &LineReader{
                Ch: make(chan string),
                r:  bufio.NewReader(r),
        }
        go lr.loop()
        return lr
}

Looks very similar. Just a new field in the struct, and the function that creates it hasn't even been touched.

Next, the loop function is modified to support tracking of errors and interruptions:

func (lr *LineReader) loop() {
        defer lr.t.Done()
        for {
                line, err := lr.r.ReadSlice('\n')
                if err != nil {
                        close(lr.Ch)
                        lr.t.Kill(err)
                        return
                }
                select {
                case lr.Ch <- string(line):
                case <-lr.t.Dying():
                        close(lr.Ch)
                        return
                }
        }
}

Note a few interesting points here: first, Done is called to track the goroutine termination right before the loop function returns. Then, the previously loose error now goes into the Kill Tomb method, flagging the goroutine as dying. Finally, the channel send was tweaked so that it doesn't block in case the goroutine is dying for whatever reason.

A Tomb has both Dying and Dead channels returned by the respective methods, which are closed when the Tomb state changes accordingly. These channels enable explicit blocking until the state changes, and also to selectively unblock select statements in those cases, as done above.
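To illustrate how channels that are closed on state changes can drive this model, here is a stripped-down stand-in built only on the standard library. It is not the tomb package's implementation: the miniTomb type, its constructor, the worker function, and errStopped are all hypothetical, and details such as the constructor differ from the real API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errStopped is a made-up death reason used by this example.
var errStopped = errors.New("stopped by caller")

// miniTomb is a hypothetical, simplified stand-in for tomb.Tomb.
type miniTomb struct {
	mu     sync.Mutex
	dying  chan struct{}
	dead   chan struct{}
	reason error
}

func newMiniTomb() *miniTomb {
	return &miniTomb{dying: make(chan struct{}), dead: make(chan struct{})}
}

// Dying and Dead return channels that are closed on the state change.
func (t *miniTomb) Dying() <-chan struct{} { return t.dying }
func (t *miniTomb) Dead() <-chan struct{}  { return t.dead }

// Kill flags the goroutine as dying. Only the first non-nil reason is kept.
func (t *miniTomb) Kill(reason error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.reason == nil {
		t.reason = reason
	}
	select {
	case <-t.dying: // already dying, nothing to close
	default:
		close(t.dying)
	}
}

// Done must be called exactly once, when the goroutine terminates.
func (t *miniTomb) Done() {
	t.Kill(nil)
	close(t.dead)
}

// Wait blocks until the goroutine is dead and returns the death reason.
func (t *miniTomb) Wait() error {
	<-t.dead
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.reason
}

// worker sends increasing numbers until the tomb starts dying.
func worker(t *miniTomb, out chan<- int) {
	defer t.Done()
	for i := 0; ; i++ {
		select {
		case out <- i:
		case <-t.Dying():
			return
		}
	}
}

func main() {
	t := newMiniTomb()
	out := make(chan int)
	go worker(t, out)
	fmt.Println(<-out, <-out) // prints "0 1"
	t.Kill(errStopped)
	fmt.Println(t.Wait()) // prints "stopped by caller"
}
```

Because a closed channel is always ready to receive from, any number of select statements can observe the dying state without extra coordination.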

With the loop modified as above, a Stop method can trivially be introduced to request the clean termination of the goroutine synchronously from outside:

func (lr *LineReader) Stop() error {
        lr.t.Kill(nil)
        return lr.t.Wait()
}

In this case the Kill method will put the tomb in a dying state from outside the running goroutine, and Wait will block until the goroutine terminates itself and notifies via the Done method as seen before. This procedure behaves correctly even if the goroutine was already dead or in a dying state due to internal errors, because only the first call to Kill with an actual error is recorded as the cause for the goroutine death. The nil value provided to t.Kill is used as a reason when terminating cleanly without an actual error, and it causes Wait to return nil once the goroutine terminates, flagging a clean stop per common Go idioms.

This is pretty much all that there is to it. When I started developing in Go I wondered if coming up with a good convention for this sort of problem would require more support from the language, such as some kind of goroutine state tracking in a similar way to what Erlang does with its lightweight processes, but it turns out this is mostly a matter of organizing the workflow with existing building blocks.

The tomb package and its Tomb type are a tangible representation of a good convention for goroutine termination, with familiar method names inspired by existing idioms. If you want to make use of it, go get the package with:

$ go get launchpad.net/tomb

The API documentation with details is available at:

http://gopkgdoc.appspot.com/pkg/launchpad.net/tomb

Have fun!

UPDATE 1: there was a minor simplification in the API since this post was originally written, and the post was changed accordingly.

UPDATE 2: there was a second simplification in the API since this post was originally written, and the post was changed accordingly once again to serve as reference.

Gustavo Niemeyer

About 1 year after development started in Ensemble, today the stars finally aligned just the right way (review queue mostly empty, no other pressing needs, etc) for me to start writing the specification about the repository system we’ve been jointly planning for a long time. This is the system that the Ensemble client will communicate with for discovering which formulas are available, for publishing new formulas, for obtaining formula files for deployment, and so on.

We of course would have liked for this part of the project to have been specified and written a while ago, but unfortunately that wasn’t possible for several reasons. That said, there are also good sides to having an important piece flying around in minds and conversations for such a long time: sitting down to specify the system and describe the inner-working details has been a breeze. Even details such as the namespacing of formulas, which haven’t been entirely clear in my mind, were just streamed into the document as the ideas we’ve been evolving finally came together in written form.

One curious detail: this is the first long term project at Canonical that will be developed in Go, rather than Python or C/C++, which are the most used languages for projects within Canonical. Not only that, but we’ll also be using MongoDB for a change, rather than the traditional PostgreSQL, and will also use (you guessed) the mgo driver which I’ve been pushing entirely as a personal project for about 8 months now.

Naturally, with so many moving parts that are new to the company culture, this is still being seen as a closely watched experiment. Still, this makes me highly excited, because when I started developing mgo, the MongoDB driver for Go, my hopes that the Go, MongoDB, and mgo trio would eventually be used at Canonical were very low, precisely because they were all alien to the culture. We only got here after quite a lot of internal debate, experiments, and trust too.

All of that means these are happy times. An important feature in Ensemble being specified and written, very exciting tools, home-grown software being useful...

Awesomeness.

niemeyer

One more Go library oriented towards building distributed systems hot off the presses: govclock. This one offers full vector clock support for the Go language. Vector clocks allow recording and analyzing the inherent partial ordering of events in a distributed system in a comfortable way.

The following features are offered by govclock, in addition to basic event tracking:

  • Compact serialization and deserialization
  • Flexible truncation (min/max entries, min/max update time)
  • Unit-independent update times
  • Traditional merging
  • Fast and memory efficient

If you’d like to know more about vector clocks, the Basho guys did a great job in the following pair of blog posts:

The following sample program demonstrates some sequential and concurrent events, dumping and loading, as well as merging of clocks. For more details, please look at the web page. The project is available under a BSD license.


package main

import (
    "launchpad.net/govclock"
    "fmt"
)

func main() {
    vc1 := govclock.New()
    vc1.Update([]byte("A"), 1)

    vc2 := vc1.Copy()
    vc2.Update([]byte("B"), 0)

    fmt.Println(vc2.Compare(vc1, govclock.Ancestor))   // => true
    fmt.Println(vc1.Compare(vc2, govclock.Descendant)) // => true

    vc1.Update([]byte("C"), 5)


    fmt.Println(vc1.Compare(vc2, govclock.Descendant)) // => false
    fmt.Println(vc1.Compare(vc2, govclock.Concurrent)) // => true

    vc2.Merge(vc1)

    fmt.Println(vc1.Compare(vc2, govclock.Descendant)) // => true

    data := vc2.Bytes()
    fmt.Printf("%#v\n", string(data))
    // => "\x01\x01\x01\x01A\x01\x01\x01B\x01\x00\x01C"

    vc3, err := govclock.FromBytes(data)
    if err != nil { panic(err) }

    fmt.Println(vc3.Compare(vc2, govclock.Equal))      // => true
}
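For readers new to the concept, the partial ordering itself can be sketched in a few lines with a plain map. This is only an illustration of the idea and not how govclock is implemented: the Clock type, the Order constants, and the Compare and Merge functions below are hypothetical, and their argument conventions differ from govclock's Compare method.

```go
package main

import "fmt"

// Clock maps a node identifier to the number of events seen from it.
type Clock map[string]uint64

// Order describes how one clock relates to another.
type Order int

const (
	Equal Order = iota
	Ancestor
	Descendant
	Concurrent
)

// Compare reports how a relates to b: a is a Descendant of b when it has
// seen everything b has and more, an Ancestor in the opposite case, and
// Concurrent when each side has seen events the other has not.
func Compare(a, b Clock) Order {
	aAhead, bAhead := false, false
	for id, n := range a {
		if n > b[id] {
			aAhead = true
		}
	}
	for id, n := range b {
		if n > a[id] {
			bAhead = true
		}
	}
	switch {
	case aAhead && bAhead:
		return Concurrent
	case aAhead:
		return Descendant
	case bAhead:
		return Ancestor
	}
	return Equal
}

// Merge returns a clock holding the per-node maximum of a and b.
func Merge(a, b Clock) Clock {
	out := Clock{}
	for id, n := range a {
		out[id] = n
	}
	for id, n := range b {
		if n > out[id] {
			out[id] = n
		}
	}
	return out
}

func main() {
	a := Clock{"A": 1}
	b := Clock{"A": 1, "B": 1}
	fmt.Println(Compare(b, a) == Descendant) // true

	a["C"] = 1
	fmt.Println(Compare(a, b) == Concurrent) // true

	m := Merge(a, b)
	fmt.Println(Compare(m, a) == Descendant) // true
}
```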

Gustavo Niemeyer

ZooKeeper is a clever generic coordination server for distributed systems, and is one of the core pieces of software which facilitate the development of Ensemble (a project for automagic IaaS deployments which we push at Canonical), so it was a natural choice to experiment with.

Gozk is a complete binding for ZooKeeper which explores the native features of Go to facilitate the interaction with a ZooKeeper server. To avoid reimplementing the well tested bits of the protocol in an unstable way, Gozk is built on top of the standard C ZooKeeper library.

The experience of integrating ZooKeeper with Go was certainly valuable in itself, and worked as a nice way to learn the details of integrating the Go language with a C library. If you’re interested in learning a bit about Go, ZooKeeper, or other details related to the creation of bindings and asynchronous programming, please fasten the seatbelt now.

Basics of C wrapping in Go

Creating the binding was in itself a pretty interesting experiment already. I have worked on the creation of quite a few bindings and language bridges before, and must say I was pleasantly surprised with the experience of creating the Go binding. With Cgo, the name given to the “foreign function interface” mechanism for C integration, one basically declares a special import statement which causes a pre-processor to look at the comment preceding it. Something similar to this:

// #include <zookeeper.h>
import "C"

The comment doesn’t have to be restricted to a single line, or even to #include statements. The C code contained in the comment will be transparently inserted into a helper C file which is compiled and linked with the final object file, and the given snippet will also be parsed and inclusions processed. On the Go side, that “C” import is simulated as if it were a normal Go package so that the C functions, types, and values are all directly accessible.

As an example, a C function with this prototype:

int zoo_wexists(zhandle_t *zh, const char *path, watcher_fn watcher,
                void *context, struct Stat *stat);

In Go may be used as:

cstat := C.struct_Stat{}
rc, cerr := C.zoo_wexists(zk.handle, cpath, nil, nil, &cstat)

When the C function is used in a context where two result values are requested, as done above, Cgo will save the well known errno variable after the function has finished executing and will return it wrapped into an os.Errno value.

Also, note how the C struct is defined in a way that can be passed straight to the C function. Interestingly, the allocation of the memory backing the structure is going to be performed and tracked by the Go runtime, and will be garbage collected appropriately once no more references exist within the Go runtime. This fact has to be kept in mind since the application will crash if a value allocated normally within Go is saved with a foreign C function and maintained after all the Go references are gone. The alternative in these cases is to call the usual C functions to get hold of memory for the involved values. That memory won’t be touched by the garbage collector, and, of course, must be explicitly freed when no longer necessary. Here is a simple example showing explicit allocation:

cbuffer := (*C.char)(C.malloc(bufferSize))
defer C.free(unsafe.Pointer(cbuffer))

Note the use of the defer statement above. Even when dealing with foreign functionality, it comes in handy. The above call will ensure that the buffer is deallocated right before the current function returns, for instance, so it’s a nice way to ensure no leaks happen, even if in the future the function suddenly gets a new exit point which didn’t consider the allocation of resources.

In terms of typing, Go is more strict than C, and Cgo-based logic will also ensure that the types returned and passed into the foreign C functions are correctly typed, in the same way done for the native types. Note above, for instance, how the call to the free() function has to explicitly convert the value into an unsafe.Pointer, even though in C no casting would be necessary to pass a pointer into a void * parameter.

The unsafe.Pointer is in fact a very special type within Go. Using it, one can convert any pointer type into any other pointer type in an unsafe way (thus the package name), and also back and forth into a uintptr value with the address of the memory referenced by the pointer. For every other type conversion, Go will ensure at compilation time that doing the conversion at runtime is a safe operation.
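As a small pure-Go illustration of those conversions, the sketch below reinterprets a *uint32 as a pointer to its four bytes and also obtains the numeric address of the value. The fillBytes and addressOf names are made up for this example, and nothing here is taken from Gozk.

```go
package main

import (
	"fmt"
	"unsafe"
)

// fillBytes views the memory behind v as an array of four bytes and sets
// every one of them to b. The compiler cannot check this conversion,
// which is exactly why it has to go through unsafe.Pointer.
func fillBytes(v *uint32, b byte) {
	bytes := (*[4]byte)(unsafe.Pointer(v))
	for i := range bytes {
		bytes[i] = b
	}
}

// addressOf returns the memory address of v as a plain integer.
func addressOf(v *uint32) uintptr {
	return uintptr(unsafe.Pointer(v))
}

func main() {
	var v uint32
	fillBytes(&v, 0xFF)
	fmt.Printf("%#x\n", v) // prints 0xffffffff
	fmt.Println(addressOf(&v) != 0)
}
```

Filling all four bytes with the same value keeps the result independent of the machine's byte order.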

With all of these resources, including the ability to use common Go syntax and functionality even when dealing with foreign types, values, and function calls, the integration task turns out to be quite a pleasant experience. That said, some of the things may still require some good thinking to get right, as we’ll see shortly.

Watch callbacks and channels

One of the most interesting (and slightly tricky) aspects of mapping the ZooKeeper concepts into Go was the “watch” functionality. ZooKeeper allows one to attach a “watch” to a node so that the server will report back when changes happen to the given node. In the C library, this functionality is exposed via a callback function which is executed once the monitored node aspect is modified.

It would certainly be possible to offer this functionality in Go using a similar mechanism, but Go channels provide a number of advantages for that kind of asynchronous notification: waiting for multiple events via the select statement, synchronous blocking until the event happens, testing if the event is already available, etc.

The tricky bit, though, isn’t the use of channels. That part is quite simple. The tricky detail is that the C callback function execution happens in a C thread started by the ZooKeeper library, and happens asynchronously, while the Go application is doing its business elsewhere. Right now, there’s no straightforward way to transfer the execution of this asynchronous C function back into the Go land. The solution for this problem was found with some help from the folks at the golang-nuts mailing list, and luckily it’s not that hard to support or understand. That said, this is a good opportunity to get some coffee or your preferred focus-enhancing drink.

The solution works like this: when the ZooKeeper C library gets a watch notification, it executes a C callback function which is inside a Gozk helper file. Rather than transferring control to Go right away, this C function simply appends data about the event onto a queue, and signals a pthread condition variable to notify that an event is available. Then, on the Go side, once the first ZooKeeper connection is initialized, a new goroutine is fired and loops waiting for events to be available. The interesting detail about this loop is that it blocks within a foreign C function waiting for an event to be available, through the signaling of the shared pthread condition variable. On the Go side, this is what the call looks like, just to give a more practical feeling:

// This will block until there's a watch available.
data := C.wait_for_watch()

Then, on the C side, here is the function definition:

watch_data *wait_for_watch() {
    watch_data *data = NULL;
    pthread_mutex_lock(&watch_mutex);
    while (first_watch == NULL)
        pthread_cond_wait(&watch_available, &watch_mutex);
    data = first_watch;
    first_watch = first_watch->next;
    pthread_mutex_unlock(&watch_mutex);
    return data;
}

As you can see, not really a big deal. When that kind of blocking occurs inside a foreign C function, the Go runtime will correctly continue the execution of other goroutines within other operating system threads.
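The same queue-and-signal shape can be sketched entirely in Go, with sync.Cond standing in for the pthread condition variable. This is only an analogue for illustration: the watchQueue type, its methods, and the use of plain strings as events are assumptions, and the real Gozk code does this work on the C side.

```go
package main

import (
	"fmt"
	"sync"
)

// watchQueue is a hypothetical Go analogue of the C-side event queue.
type watchQueue struct {
	mu        sync.Mutex
	available *sync.Cond
	events    []string
}

func newWatchQueue() *watchQueue {
	q := &watchQueue{}
	q.available = sync.NewCond(&q.mu)
	return q
}

// push plays the role of the C callback: it appends the event to the
// queue and signals that something is available.
func (q *watchQueue) push(event string) {
	q.mu.Lock()
	q.events = append(q.events, event)
	q.mu.Unlock()
	q.available.Signal()
}

// wait plays the role of wait_for_watch: it blocks until the queue has
// an event and returns the oldest one. The condition is rechecked in a
// loop, as the sync.Cond documentation recommends.
func (q *watchQueue) wait() string {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.events) == 0 {
		q.available.Wait()
	}
	event := q.events[0]
	q.events = q.events[1:]
	return event
}

// dispatch is the background loop that turns queued events into
// channel sends, which is what client code ends up seeing.
func (q *watchQueue) dispatch(out chan<- string) {
	for {
		out <- q.wait()
	}
}

func main() {
	q := newWatchQueue()
	watch := make(chan string)
	go q.dispatch(watch)
	q.push("node created")
	fmt.Println(<-watch) // prints "node created"
}
```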

The result of this mechanism is a nice-to-use interface based on channels, which may be explored in different ways depending on the application needs. Here is a simple example blocking on the event synchronously, for instance:

stat, watch, err := zk.ExistsW("/some/path")
if stat == nil && err == nil {
    event := <-watch
    // Use event ...
}
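Since the watch is just a channel, it also composes with select. As a rough sketch of one of those "different ways", the snippet below waits for an event but gives up after a timeout. The Event type and the waitEvent helper are assumptions made up for this example and are not part of the Gozk API; the channel is filled by hand so the example runs without a ZooKeeper server.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical stand-in for the value delivered on a watch channel.
type Event struct {
	Path string
}

// waitEvent returns the next event from watch, or false if none arrives
// before the timeout expires.
func waitEvent(watch <-chan Event, timeout time.Duration) (Event, bool) {
	select {
	case event := <-watch:
		return event, true
	case <-time.After(timeout):
		return Event{}, false
	}
}

func main() {
	// Simulate a watch that has already fired once.
	watch := make(chan Event, 1)
	watch <- Event{Path: "/some/path"}

	if event, ok := waitEvent(watch, time.Second); ok {
		fmt.Println("watch fired for", event.Path)
	}
	if _, ok := waitEvent(watch, 10*time.Millisecond); !ok {
		fmt.Println("no event before the timeout")
	}
}
```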

Concluding

Those were some of the interesting aspects of implementing the ZooKeeper binding. I would like to speak about some additional details, but this post is rather long already, so I'll keep that for a future opportunity. The code is available under the LGPL, so if you're curious about some other aspect, or would like to use ZooKeeper with Go, please move on and check it out!

Gustavo Niemeyer

It’s time to release my “side project” which has been evolving over the last several months: Gocheck. I’ve been watching Go for some time, and have been getting more and more interested in the language. My first attempt to write something interesting in it made it obvious that there would be benefit in having a richer testing platform than what is available in the standard library. That said, I do understand why the standard one is slim: it’s pretty minimalist, because it’s used by itself to test the rest of the platform. With Gocheck, though, I don’t have that requirement. I’m able to trust that the standard library works well, and focus on having features which will make me more productive while writing tests, including features such as:

  • Better error reporting
  • Richer test helpers: assertions which interrupt the test immediately, deep multi-type comparisons, string matching, etc
  • Suite-based grouping of tests
  • Fixtures: per suite and/or per test set up and tear down
  • Management of temporary directories
  • Panic-catching logic, with proper error reporting
  • Proper counting of successes, failures, panics, missed tests, skips, etc
  • Support for expected failures
  • Fully tested (yes, it manages to test itself reliably!)

That last point was actually quite fun to get right. It’s the first time I wrote a testing framework from the ground up, and of course I wanted to have it fully tested by itself, but I didn’t want to simply use a foreign testing framework to test it. So what it does is basically to have a “bootstrapping” phase, which ensures that the very basic parts of the library work, without relying on pretty much any internal functionality (e.g. it verifies the number of executed functions, and works with low-level panics). Then, once the lower layers were trusted, tests for higher-level functionality were introduced by building on the trusted bits.
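To give a feeling of what panic catching and outcome counting involve at the lowest level, here is a small standalone sketch using nothing but the standard library. It is not how Gocheck is implemented, and the result type and the runOne and runAll functions are hypothetical names invented for the example.

```go
package main

import "fmt"

// result tallies outcomes, in the spirit of the counting described above.
type result struct {
	passed, failed, panicked int
}

// runOne executes a single test function. A test reports failure by
// returning false; a panic is recovered and counted separately.
func runOne(test func() bool, res *result) {
	defer func() {
		if r := recover(); r != nil {
			res.panicked++
		}
	}()
	if test() {
		res.passed++
	} else {
		res.failed++
	}
}

// runAll runs every test, continuing past failures and panics.
func runAll(tests []func() bool) result {
	var res result
	for _, test := range tests {
		runOne(test, &res)
	}
	return res
}

func main() {
	res := runAll([]func() bool{
		func() bool { return true },
		func() bool { return false },
		func() bool { panic("boom") },
	})
	fmt.Printf("passed=%d failed=%d panicked=%d\n", res.passed, res.failed, res.panicked)
}
```

Counting executed functions like this is the kind of check that can be verified without trusting any higher-level helper.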

Gocheck has actually been mostly ready for some time now, but I’ve been polishing edges with some real world usage before releasing it. Since both the real world usage and Gocheck itself are side projects, you can imagine that took a bit of time. Today, though, I’ve managed to fix the last few things which were bothering me, so it’s up for world consumption.

I hope you enjoy it, and make some good use of it so that we can all have more reliable software. ;-)

Gustavo Niemeyer

After a few years in development, version 1.0 of Mocker is now available! Check out the changes since 0.10.1, the supported features, or go straight to the download page.

Gustavo Niemeyer

A bit of history

I don’t know exactly why, but I’ve always enjoyed IRC bots. Perhaps it’s the fact that they emulate a person in an easy-to-program way, or maybe it’s about having a flexible and shared “command line” tool, or maybe it’s just the fact that they help people perceive things in an asynchronous way without much effort. Probably a bit of everything, actually.

My bot programming started with pybot many years ago, when I was still working at Conectiva. Despite having many interesting features, this bot eventually fell into an abandonware state, since Canonical already had pretty much equivalent features available when I joined, and I had other interests which got in the way. The code was a bit messy as well... it was a time when I wasn’t very used to testing software properly (a friend has a great excuse for that kind of messy software: “I was young, and needed the money!”).

Then, a couple of years ago, while working on the Landscape project, there was an opportunity to make some information more visible to the team. Coincidentally, it was also a time when I wanted to get some practice with the concepts of Erlang, so I decided to write a bot from scratch with some nice support for plugins, just to get a feeling of how the promised stability of Erlang actually held up for real. This bot is called mup (Mup Pet, more formally), and its code is available publicly through Launchpad.

This was a nice experiment indeed, and I did learn quite a bit about the ins and outs of Erlang with it. Somewhat unexpected, though, was the fact that the bot grew a few extra features which multiple teams in Canonical started to appreciate. This was of course very nice, but it also made it more obvious that the egocentric reason for having a bot written in Erlang would now hurt, because most of Canonical’s own coding is done in Python, and that’s what internal tools should generally be written in for everyone to contribute and help maintain the code.

That’s where the desire to migrate mup into a Python-based brain again came from, and having a new feature to write was the perfect motivator for this.

LDAP and two-way SMSing over IRC

Canonical is a very distributed company. Employees are distributed over dozens of countries, literally. Not only that, but most people also work from their homes, rather than in an office. Many different countries also means many different timezones, and working from home with people from different timezones means flexible timing. All of that means communication gets… well.. interesting.

How do we reach someone who should be in an online meeting and is not? Or someone who is traveling to get to a sprint? Or how can someone who has no network connectivity reach an IRC channel to talk to the team? There are probably several answers to these questions, but one of them is of course SMS. It’s not exactly cheap if we consider the cost of the data being transferred, but pretty much everyone has a mobile phone which can do SMS, and the model is not that far away from IRC, which is the main communication system used by the company.

So, the itch was itching. Let’s scratch it!

Getting the mobile phone of employees was already a solved problem for mup, because it had a plugin which could interact with the LDAP directory, allowing people to do something like this:

<joe> mup: poke gustavo
<mup> joe: niemeyer is Gustavo Niemeyer <…@canonical.com> <time:…> <mobile:…>

This just had to be migrated from Erlang into a Python-based brain for the reasons stated above. This time, though, there was no reason to write something from scratch. I could even have used pybot itself, but there was also supybot, an IRC bot which started around the same time I wrote the first version of pybot, and unlike the latter, supybot’s author was much more diligent in evolving it. There is quite a comprehensive list of plugins for supybot nowadays, and it includes means for testing plugins and so on. The choice of using it was straightforward, and getting “poke” support ported into a plugin wasn’t hard at all.

So, on to SMSing. Canonical already had a contract with an SMS gateway company which we established to test-drive some ideas on Landscape. With the mobile phone numbers coming out of the LDAP directory in hand and an SMS contract established, all that was needed was a plugin for the bot to talk to the SMS gateway. That “conversation” with the SMS gateway allows not only sending messages, but also receiving SMS messages which were sent to a specific number.

In practice, this means that people who are connected to IRC can very easily deliver an SMS to someone using their nicks. Something like this:

<joe> @sms niemeyer Where are you? We’re waiting!

And this would show up in the mobile screen as:

joe> Where are you? We’re waiting!

In addition to this, people who have no connectivity can also contact individuals and channels on IRC, with mup working as a middle man. The message would show up on IRC in a similar way to:

<mup> [SMS] <niemeyer> Sorry, the flight was delayed. Will be there in 5.

The communication from the bot to the gateway happens via plain HTTPS. The communication back is a bit more complex, though. There is a small proxy service deployed in Google App Engine to receive messages from the SMS gateway. This was done to avoid losing messages when the bot itself is taken down for maintenance. The SMS gateway doesn’t handle this case very well, so it’s better to have something which will be up most of the time buffering messages.
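The buffering idea itself is simple enough to sketch in a few lines. The snippet below is only an illustration under loud assumptions: the real proxy runs on Google App Engine and its code isn't shown here, the buffer type with its Store and Drain methods is hypothetical, and an in-memory slice stands in for whatever persistent storage a real deployment would need.

```go
package main

import (
	"fmt"
	"sync"
)

// buffer is a hypothetical in-memory stand-in for the proxy's message store.
type buffer struct {
	mu       sync.Mutex
	messages []string
}

// Store is what the gateway-facing side would call for each incoming SMS.
func (b *buffer) Store(msg string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.messages = append(b.messages, msg)
}

// Drain hands all pending messages to the bot, oldest first, and
// leaves the buffer empty.
func (b *buffer) Drain() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	pending := b.messages
	b.messages = nil
	return pending
}

func main() {
	var b buffer
	// Messages arrive while the bot is down for maintenance...
	b.Store("Sorry, the flight was delayed.")
	b.Store("Will be there in 5.")
	// ...and are collected once it is back.
	for _, msg := range b.Drain() {
		fmt.Println("[SMS]", msg)
	}
	fmt.Println("pending after drain:", len(b.Drain()))
}
```

The point of the design is that the gateway always finds something listening, while the bot is free to come and go.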

A picture is worth 210 words, so here is a simple diagram explaining how things got linked together:

This is now up for experimentation, and so far it’s working nicely. I’m hoping that in the next few weeks we’ll manage to port the rest of mup into the supybot-based brain.

Gustavo Niemeyer

Released editmoin 1.15

Version 1.15 of editmoin is now available.

The following changes were made:

  • Moin used to work with numerical IDs for identification, and editmoin was still based on this model. This release adds support for direct authentication as available in current Moin releases. This was inspired by Reimar Bauer.
  • The new file ~/.moin_users is now parsed to obtain usernames, supporting the feature above. Shortcuts are also supported in this file.
  • Added support for textcha question handling.

Gustavo Niemeyer

In a hurry?

Go check it out!

The context

A while ago I found out about Sikuli, a very interesting project which allows people to script actions in GUIs based on screenshot excerpts. The idea is that you basically take images representing portions of your screen, like a button, or a label, or an icon, and then create a script which can detect a position in the screen which resembles one of these images, and perform actions on them, such as clicking, or hovering.

I had never imagined something like this, and the idea got me really excited about the possibilities. Imagine, for instance, what can be done in terms of testing. Testing of GUIs is unfortunately not yet a trivial task nowadays. We do have frameworks which are based on accessibility hooks, for instance, but these sometimes can’t be used because the hook is missing, or is even far off in terms of the context being tested (imagine testing that a browser can open a specific flash site successfully, for instance).

So, Sikuli opened my eyes to the possibility of using image matching technology in a GUI automation context, and I really wanted to play with it. In the days following the discovery, I fiddled a bit, communicated with the author, and even submitted some changes to make it work well in Ubuntu.

Then, the idea cooled down in my head, and I moved on with life. Well… until two weeks ago.

Right before heading to the Ubuntu Developer Summit for the next Ubuntu release, the desire to automate GUIs appeared again in the context of the widely scoped Ubuntu-level testing suite. Then, over the first few days last week, I was able to catch up with quite a few people who were interested in the concept of automating GUIs, with different purposes (testing, design approval, etc), which of course was all I needed to actually push that old desire forward.

Trying to get Sikuli to work, though, was quite painful. Even though I had sent patches upstream before, it looks like the build process isn’t working in Ubuntu again for other reasons (it’s not a polished build process, honestly), and even if I managed to make it work and contributed that to the upstream, in the end the path to integrate the Java-based tool in the Python-based testing framework which Ubuntu uses (Mago) wasn’t entirely straightforward either.

Reinventing the wheel

So, the itch was in place, and there was a reason to let the NIH syndrome take over a bit. Plus, image processing is something I’d like to get a foot in anyway, so it felt like a good chance to have a closer look and at the same time contribute a small bit to potential quality improvements of Ubuntu.

That’s when Xpresser was born. Xpresser is a clean room implementation of the concepts explored by Sikuli, in the form of a Python library which can be used standalone, or embedded into other programs and testing frameworks such as Mago.

The project is sponsored by Canonical, and licensed under the LGPL.

Internally, it makes use of opencv for the image matching, pyatspi for the event generation (mouse clicks, etc), gtk for screen capturing and testing (of itself), and numpy for matrix operations. Clearly, the NIH syndrome wasn’t entirely active. :-) As a side note, I hadn’t played with numpy and gtk for some time, and I’m always amazed by the quality of these modules.

Contribute code and ideas

Concluding this post, which is already longer than I expected, the basics of Xpresser are in place, so go ahead and play with it! That said, there is quite a bit of low-hanging fruit to pick before it becomes a really compelling GUI-driving library, so if you have any interest in the concept, I invite you to play with the code and submit contributions too. If you want ideas of what else could be done, let’s have a chat.

Gustavo Niemeyer

Some interesting changes have been happening in my professional life, so I wanted to share it here to update friends and also for me to keep track of things over time (at some point I will be older and will certainly laugh at what I called “interesting changes” in the ol’days). Given the goal, I apologize but this may come across as more egocentric than usual, so please feel free to jump over to your next blog post at any time.

It’s been little more than four years since I left Conectiva / Mandriva and joined Canonical, in August of 2005. Shortly after I joined, I had the luck of spending a few months working on the different projects which the company was pushing at the time, including Launchpad, then Bazaar, then a little bit on some projects which didn’t end up seeing much light. It was a great experience by itself, since all of these projects were abundant in talent. Following that, in the beginning of 2006, counting on the trust of people who knew more than I did, I was requested/allowed to lead the development of a brand new project the company wanted to attempt. After a few months of research I had the chance to sit next to Chris Armstrong and Jamu Kakar to bootstrap the development of what is now known as the Landscape distributed systems management project.

Fast forward three and a half years, to mid 2009, and Landscape has become a massive project with hundreds of thousands of very well tested lines, spawning not only a client branch, but also external child projects such as the Storm Object Relational Mapper, in use also by Launchpad and Ubuntu One. On the commercial side of things it looks like Landscape’s life is just starting, with its hosted and standalone versions getting more and more attention from enterprise customers. And the three guys who started the project didn’t do it alone, for sure. The toy project of early 2006 has grown to become a well structured team, with added talent spanning areas such as development, business and QA.

While I wasn’t watching, though, something happened. Facing that great action, my attention was slowly being spread thinly among management, architecture, development, testing, code reviews, meetings, and other tasks, sometimes in areas not entirely related, but very interesting of course. The net result of increased attention sprawl isn’t actually good, though. If it persists, even when the several small tasks may be individually significant, the achievement just doesn’t feel significant given the invested effort as a whole. At least not for someone that truly enjoys being a software architect, and loves to feel that the effort invested in the growth of a significant working software is really helping people out in the same magnitude of that investment. In simpler words, it felt like my position within the team just wasn’t helping the team out the same way it did before, and thus it was time for a change.

Last July an external factor helped to catapult that change. Eucalyptus needed a feature to be released with Ubuntu 9.10, due in October, to greatly simplify the installation of some standard machine images: an Image Store. It felt like a very tight schedule, even more so considering that I hadn’t been doing Java for a while, and Eucalyptus uses some sexy (and useful) new technology called the Google Web Toolkit, something I had to get acquainted with. Two months looked like a risky bet overall, but it also felt like a great opportunity to strongly refocus on a task that needed someone’s attention urgently. Again I was blessed with trust I’m thankful for, and by now I’m relieved to look back and perceive that it went alright, certainly thanks to the help of other people like Sidnei da Silva and Mathias Gug. Meanwhile, on the Landscape side, my responsibilities were distributed within the team so that I could be fully engaged on the problem.

Moving this forward a little bit we reach the current date. Right now the Landscape project has a new organizational structure, and it actually feels like it’s moving along quite well. Besides the internal changes, a major organizational change also took place around Landscape over that period, and the planned restructuring led me to my current role. In practice, I’m now engaged in the research of a new concept which I’m hoping to publish openly quite soon, if everything goes well. It’s challenging, it’s exciting, and most importantly, it allows me to focus strongly on something which has great potential (I will stop teasing you now). In addition to this, I’ll definitely be spending some of that time on the progress of Landscape and the Image Store, but mostly from an architectural point of view, since both of these projects will have bright hands taking care of them more closely.

Sit by the fireside if you’re interested in the upcoming chapters of that story. ;-)

Gustavo Niemeyer

Geocaching on the Easter Island

This post is not about what you think it is, unfortunately. I actually do hope to go to the Easter Island at some point, but this post is about a short story which involves geohash.org, Groundspeak (from geocaching.com), and very very poor minded behavior.

The context

So, before anything else, it’s important to understand what geohash.org is. As announced when the service was launched (also as a post on Groundspeak’s own forum), geohash.org offers short URLs which encode a latitude/longitude pair, so that referencing them in emails, forums, and websites is more convenient, and that’s pretty much it.

When people go to geohash.org, they can enter geographic coordinates that they want to encode, and they get back a nice little map with the location, some links to useful services, and most importantly the actual Geohash they can use to link to the location, so as an example they could be redirected to the URL http://geohash.org/6gkzwgjf3.
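For the curious, the encoding behind those short URLs is easy to sketch. The function below follows the standard Geohash scheme: each bit halves either the longitude or the latitude range, alternating between the two and starting with longitude, and every five bits become one character of a 32-symbol alphabet. This is an illustrative sketch and not the code running at geohash.org; the Encode name and its length parameter are choices made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// alphabet is the Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
const alphabet = "0123456789bcdefghjkmnpqrstuvwxyz"

// Encode returns the Geohash with the given number of characters for a
// latitude/longitude pair expressed in decimal degrees.
func Encode(lat, lon float64, length int) string {
	latRange := [2]float64{-90, 90}
	lonRange := [2]float64{-180, 180}

	var hash strings.Builder
	bits, value := 0, 0
	evenBit := true // even bits refine longitude, odd bits refine latitude
	for hash.Len() < length {
		rng, coord := &latRange, lat
		if evenBit {
			rng, coord = &lonRange, lon
		}
		// Bisect the current range and keep the half containing the coordinate.
		mid := (rng[0] + rng[1]) / 2
		value <<= 1
		if coord >= mid {
			value |= 1
			rng[0] = mid
		} else {
			rng[1] = mid
		}
		evenBit = !evenBit
		bits++
		if bits == 5 {
			hash.WriteByte(alphabet[value])
			bits, value = 0, 0
		}
	}
	return hash.String()
}

func main() {
	fmt.Println(Encode(57.64911, 10.40744, 5)) // prints "u4pru"
}
```

Each extra character narrows the cell further, which is why nearby places tend to share a prefix and why a hash can be truncated when less precision is enough.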

Of course, it’s pretty boring to be copy & pasting coordinates around, so shortly after the service launched, the support for geocoding addresses was also announced, which means people could type a human oriented address and get back the Geohash page for it. Phew.. much more practical.

The problem

All was going well, until a couple of months ago, when a user reported that the geocoding of addresses wasn’t working anymore. After some investigation, it turned out that geohash.org was indeed going over the free daily quota allowed by the geocoding provider used. But, that didn’t quite fit with the overall usage reports for the system, so I went on to investigate what was up in the logs.

The cause

Something was wrong indeed. The system was getting thousands of queries a day from some application, and not only that, but the queries were entirely unrelated to Geohashes. The application was purely interested in the geocoding of addresses which the site supported for the benefit of Geohash users. Alright, that wasn’t something nice to do, but I took it lightly since the interface implemented could perhaps give the impression that the site was a traditional geocoding system. So, to fix the situation, the non-Geohash API was removed at this point, and requests for the old API then started to get an error saying something like 403 Forbidden: For geocoding without geohashes, please look elsewhere.

Unfortunately, that wasn’t the end of the issue. Last week I went on to see the logs, and the damn application was back, and this time it was using Geohashes, so I became curious about who was doing that. Could I be mistakenly screwing up some real user of Geohashes? So, based on the logs, I went on to search for who could possibly be using the system in such a way. It wasn’t too long until I found out that, to my surprise, it was Groundspeak’s iPhone application. Groundspeak’s paid iPhone application, to be more precise, because the address searching feature is only available for paying users.

Looking at the release notes for the application, there was no doubt. Version 2.3.1, sent to Apple on September 10th, shortly after the old API was blocked, “fixes the Search by Address/Postal Code feature”, says the maintainer, and there’s even a thread discussing the breakage where the maintainer mentions:

The geocoding service we’ve been using just turned their service off. That’s why things are failing; it was relying on an external service for this feature. We’re fixing the issue on our end and using a service that shouldn’t fail as easily. Unfortunately we’ll have to do an update to the store to get this feature out to the users. This will take some time, but in version 2.4 this will work.

Wait, ok, so let’s see this again. First, they were indeed not using Geohashes at all, and instead using geohash.org purely as a geocoding service. Then, when the API they used is disabled with hints that the Geohash service is not a pure geocoding service, they work around this by decoding the Geohash retrieved and grabbing the coordinates so that they can still use it as a pure geocoding service. At the same time, they tell their users that they changed to “a service that shouldn’t fail as easily”. At no point did they contact someone at geohash.org to see what was going on (shouldn’t be necessary, really, but assuming immaculate innocence, sending an email would be pretty cool).

Redirecting users to the Easter Island

So, yeah, sorry, but I didn’t see many reasons to sustain the situation. Not only because it looks like unfriendly behavior overall, but also because, in using an unrelated free service to sustain their paid application, they were killing the free geocoding feature of geohash.org with thousands of geocoding requests a day, which ate into the daily quota the service itself has.

So, what to do? I could just disable the service again, or maybe contact the maintainers and ask them to please stop using the service in such a way, after all there are dozens of real geocoding services out there! But… hmmm… I figured a friendly poke could be nice at this point, before actually bringing up that whole situation.

And that’s what happened: rather than blocking their client, the service was modified so that all of their geocoding requests translated into the geographic coordinates of the Easter Island.

Of course, users quickly noticed it and started reporting the problem again.

The answer from Groundspeak

After users started complaining loudly, Bryan Roth, who signs as co-founder of Groundspeak, finally contacted me for the first time asking if there was a way to keep the service alive. Unfortunately, I really couldn’t, and I provided the whole explanation to Bryan, even mentioning that I actually use Google as the upstream geocoding provider and that I would be breaking the terms of service doing this, but offered to redirect their requests to their own servers if necessary.

Their answer to this? Pretty bad I must say. I got nothing via email, but they posted this in the forum:

But seriously, this bug actually has nothing to do with our app and everything to do with the external service we’ve been using to convert an address into GPS coordinates. For the next app update, we’re completely dropping that provider since they’ve now failed us twice. We’ll be using only Google from that point on, so hopefully their data will be more accurate.

I can barely believe what I read. They blame the upstream service, as if they were using a first class geocoding provider somewhere rather than sucking resources from a site they felt cool to link their paid application to, take my suggestion of using Google for geocoding, and lie about the data becoming more accurate (it obviously can’t be, since it was already Google that was being used).

I mentioned something about this in the forum itself, but I was moderated out immediately of course.

Way to go Groundspeak.

UPDATE

After some back and forth with Bryan and Josh, the last post got edited away to avoid the misleading details, and Bryan clarified the case in the forum. Then, we actually settled on my proposal of redirecting the iPhone Geocaching.com application requests to Groundspeak’s own servers so that users of previous versions of the application wouldn’t miss the feature while they work on the new release.

If such communication had taken place way back when the feature was being planned, or when it was “fixed” the first time, the whole situation would never have happened.

No matter what, I’m glad it ended up being sorted towards a more friendly solution.

Gustavo Niemeyer

Wiki + Spreadsheet

The underlying concept is very simple: spreadsheets are a way to organize text, numbers and formulas into what might be seen as a natively numeric environment: a matrix. So what would happen if we loosened some of the bolts of the numeric-oriented organization, and tried to reuse the same concepts in a more formatting-oriented environment which is naturally collaborative: a wiki?

While I do encourage you to answer this with some fantastic new online service (please provide me with an account and the best e-book reader device available once you’re rich) I had a try at answering this question myself a while ago by writing the Calc macro for Moin.

Basically, the Calc macro allows extracting values found in a wiki page into lists (think columns or rows), and applying formulas and further formatting as wanted.

I believe there’s a lot of potential in the basic concept, and the prototype, even though functional and useful, surely has a lot of room to evolve, so I’ve published the project in Launchpad to make contributions easier. I actually apologize for not publishing it earlier. There was hope that more features would be implemented before releasing, but now it’s clear that it won’t get many improvements from me anytime soon. If you do decide to improve it, please try to prepare patches which are mostly ready for integration, including full testing, since I can’t dedicate much time to it myself in the foreseeable future.
