Canonical Voices

Posts tagged with 'architecture'

niemeyer

The qml package is right now one of the best choices for creating graphic applications under the Go language. Part of the reason why this is true comes from the convenience of QML, a high-level domain-specific language that allows describing visual components, events, animations, and content in general in a succinct and pleasing way. The integration of such a language with Go means having both a good mechanism for describing visual content, and a good platform for doing general development under, which can range from simple data manipulation to involved OpenGL content rendering.

On the practical side, one of the implications of using such a language partnership is that every Go qml application will have some sort of resource content to deal with, carrying the QML logic. Such content may be loaded either from files on disk, or from strings in memory. Loading from a file means the content may be organized in multiple files that directly reference each other without changing the Go application, and may be updated and tested without rebuilding. Loading from a string in memory means the content needs to be self-contained, but results in a standalone binary (linking aside – still depends on Qt libraries).

There’s a well known trick to have both benefits at once, though, and the basic solution has already been made available in multiple packages: have the content on disk, and use an external tool to pack it up into a Go file that is built into the binary when the content is updated. Unfortunately, this trick alone is not enough with the qml package, because the QML engine needs to know what resources are available and where so that the right thing happens when it sees a directory being imported or an image path being referenced.

To solve that problem, the qml package has been enhanced with functionality that leverages the existing Qt resource system to pack content into the binary itself. Rather than using the upstream C++ and XML-based resource compiler, though, a new resource packer was implemented inside the qml package and made available both under a friendly Go API, and as a tool that follows common Go idioms.

The help text for the genqrc tool describes it in detail:

Usage: genqrc [options] <subdir1> [<subdir2> ...]

The genqrc tool packs all resource files under the provided subdirectories into
a single qrc.go file that may be built into the generated binary. Bundled files
may then be loaded by Go or QML code under the URL "qrc:///some/path", where
"some/path" matches the original path for the resource file locally.

Starting with Go 1.4, this tool may be conveniently run by the "go generate"
subcommand by adding a line similar to the following one to any existent .go
file in the project (assuming the subdirectories ./code/ and ./images/ exist):

    //go:generate genqrc code images

Then, just run "go generate" to update the qrc.go file.

During development, the generated qrc.go can repack the filesystem content at
runtime to avoid the process of regenerating the qrc.go file and rebuilding the
application to test every minor change made. Runtime repacking is enabled by
setting the QRC_REPACK environment variable to 1:

    export QRC_REPACK=1

This does not update the static content in the qrc.go file, though, so after
the changes are performed, genqrc must be run again to update the content that
will ship with built binaries.

The tool may be installed via go get as usual:

go get gopkg.in/qml.v1/cmd/genqrc

and once the qrc.go file has been generated, the main qml file may be
loaded with logic equivalent to:

component, err := engine.LoadFile("qrc:///path/to/file.qml")

The loaded file can in turn reference any other content that was bundled
into the Go binary.

For a better picture, this example demonstrates the use of the tool.

Read more
niemeyer

There were a few long standing issues in the yaml.v1 package which were being postponed so that they could be done at once in a single incompatible change, and the time has come: yaml.v2 is now available.

Besides these incompatible changes, other compatible fixes and improvements were performed in that push, and those were also applied to the existing yaml.v1 package so that dependent applications benefit immediately and without modifications.

The subtopics below outline exactly what changed, and how to adapt existent code when necessary.

Type errors

With yaml.v1, decoding a YAML value that is not compatible with the Go value being unmarshaled into will silently drop the offending value without notice. In many cases continuing with degraded behavior by ignoring the problem is intended, but this was the one and only option.

In yaml.v2, these problems will cause a *yaml.TypeError to be returned, containing helpful information about what happened. For example:

yaml: unmarshal errors:
  line 3: cannot unmarshal !!str `foo` into int

On such errors the decoding process still continues until the end of the YAML document, so ignoring the TypeError will produce logic equivalent to the old yaml.v1 behavior.

New Marshaler and Unmarshaler interfaces

The way that yaml.v1 allowed custom types to implement marshaling and unmarshaling of YAML content was slightly confusing and troublesome. For example, considering a CustomType with a keys field:

type CustomType struct {
        keys map[string]int
}

and supposing the goal is to unmarshal this YAML map into it:

custom:
    a: 1
    b: 2
    c: 3

With yaml.v1, one would need to implement logic similar to the following for that:

func (v *CustomType) SetYAML(tag string, value interface{}) bool {
        if tag == "!!map" {
                m := value.(map[interface{}]interface{})
                // ... iterate/validate/convert key/value pairs 
        }
        return goodValue
}

This is too much trouble when the package can easily do those conversions internally already. To fix that, in yaml.v2 the Getter and Setter interfaces are both gone and were replaced by the Marshaler and Unmarshaler interfaces.

Using the new mechanism, the example above would be implemented as follows:

func (v *CustomType) UnmarshalYAML(unmarshal func(interface{}) error) error {
        return unmarshal(&v.keys)
}

Custom-ordered maps

By default both yaml.v1 and yaml.v2 will marshal keys in a stable order which is increasing within the same type and arbitrarily defined across types. So marshaling is already performed in a sensible order, but it cannot be customized in yaml.v1, and there’s also no way to tell which order the map was originally in, as some applications require.

To fix that, there is a new pair of types that support preserving the order of map keys both when marshaling and unmarshaling: MapSlice and MapItem.

Such an ordered map literal would look like:

m := yaml.MapSlice{{"c", 3}, {"b", 2}, {"a", 1}}

The MapSlice type may be used for variables going in and out of the yaml package, or in struct fields, map values, or anywhere else sensible.

Binary values

Strings in YAML must be valid UTF-8 or UTF-16 (with a byte order mark for the latter), and for binary data the specification defines a standard !!binary tag which represents the raw data encoded (encrypted?) as base64. This is now supported both in yaml.v1 and yaml.v2, transparently. That is, any string value that is not valid UTF-8 will be base64-encoded and appropriately tagged so that it roundtrips as the same string. Short strings are inlined, while long ones are automatically broken into several lines and represented in a proper style. For example:

one: !!binary gICA
two: !!binary |
  gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgI
  CAgICAgICAgICAgICAgICAgICAgICAgICAgICA

Multi-line strings

Any string that contains new-line characters (‘\n’) will now be encoded using the literal style. For example, a value that would before be encoded as:

key: "line 1\nline 2\nline 3\n"

is now encoded by both yaml.v1 and yaml.v2 as:

key: |
  line 1
  line 2
  line 3

Other improvements

Besides these major changes, some assorted minor improvements were also performed:

  • Better handling of top-level “null”s (issue #35)
  • Marshal base 60 floats quoted for YAML 1.1 compatibility (issue #34)
  • Better error on invalid map keys (issue #25)
  • Allow non-ASCII characters in plain strings (issue #11).
  • Do not catch unrelated panics by mistake (commit a6dc653f)

For obtaining the yaml.v1 improvements:

go get -u gopkg.in/yaml.v1

For updating to yaml.v2, adapt the code as necessary given the points above, replace the import path, and run:

go get -u gopkg.in/yaml.v2

Read more
niemeyer

As detailed in the preliminary release of qml.v1 for Go a couple of weeks ago, my next task was to finish the improvements in its OpenGL API. Good progress has happened since then, and the new API is mostly done and available for experimentation. At the same time, there’s still work to do on polishing edges and on documenting the extensive API. This blog post aims to present the improvements made, their internal design, and also to invite help for finishing the pending details.

Before diving into the new, let’s first have a quick look at how a Go application using OpenGL might look like with qml.v0. This is an excerpt from the old painting example:

import (
        "gopkg.in/qml.v0"
        "gopkg.in/qml.v0/gl"
)

func (r *GoRect) Paint(p *qml.Painter) {
        width := gl.Float(r.Int("width"))
        height := gl.Float(r.Int("height"))
        gl.Enable(gl.BLEND)
        // ...
}

The example imports both the qml and the gl packages, and then defines a Paint method that makes use of the GL types, functions, and constants from the gl package. It looks quite reasonable, but there are a few relevant shortcomings.

One major issue in the current API is that it offers no means to tell even at a basic level what version of the OpenGL API is being coded against, because the available functions and constants are the complete set extracted from the gl.h header. For example, OpenGL 2.0 has GL_ALPHA and GL_ALPHA4/8/12/16 constants, but OpenGL ES 2.0 has only GL_ALPHA. This simplistic choice was a good start, but comes with a number of undesired side effects:

  • Many trivial errors that should be compile errors fail at runtime instead
  • When the code does work, the developer is not sure about which API version it is targeting
  • Symbols for unsupported API versions may not be available for linking, even if unused

That last point also provides a hint of another general issue: portability. Every system has particularities for how to load the proper OpenGL API entry points. For example, which libraries should be linked with, where they are in the local system, which entry points they support, etc.

So this is the stage for the improvements that are happening. Before detailing the solution, let’s have a look at the new painting example in qml.v1, that makes use of the improved API:

import (
        "gopkg.in/qml.v1"
        "gopkg.in/qml.v1/gl/2.0"
)

func (r *GoRect) Paint(p *qml.Painter) {
        gl := GL.API(p)
        width := float32(r.Int("width"))
        height := float32(r.Int("height"))
        gl.Enable(GL.BLEND)
        // ...
}

With the new API, rather than importing a generic gl package, a version-specific gl/2.0 package is imported under the name GL. That choice of package name allows preserving familiar OpenGL terms for both the functions and the constants (gl.Enable and GL.BLEND, for example). Inside the new Paint method, the gl value obtained from GL.API holds only the functions that are defined for the specific OpenGL API version imported, and the constants in the GL package are also constrained to those available in the given version. Any improper references become build time errors.

To support all the various OpenGL versions and profiles, there are 23 independent packages right now. These packages are of course not being hand-built. Instead, they are generated all at once by a tool that gathers information from various sources. The process can be tersely described as:

  1. A ragel-based parser processes Qt’s qopenglfunctions_*.h header files to collect version-specific functions
  2. The Khronos OpenGL Registry XML is parsed to collect version-specific constants
  3. A number of tweaks defined in the tool’s code is applied to the state
  4. Packages are generated by feeding the state to text templates

Version-specific functions might also be extracted from the Khronos Registry, but there’s a good reason to use information from the Qt headers instead: Qt already solved the portability issue. It works in several platforms, and if somebody is using QML successfully, it means Qt is already using that system’s OpenGL capabilities. So rather than designing a new mechanism to solve the same problem, the qml package now leverages Qt for resolving all the GL function entry points and the linking against available libraries.

Going back to the example, it also demonstrates another improvement that comes with the new API: plain types that do not carry further meaning such as gl.Float and gl.Int were replaced by their native counterparts, float32 and int32. Richer types such as Enum were preserved, and as suggested by David Crawshaw some new types were also introduced to represent entities such as programs, shaders, and buffers. The custom types are all available under the common gl/glbase package that all version-specific packages make use of.

So this is all working and available for experimentation right now. What is left to do is almost exclusively improving the list of function tweaks with two goals in mind, which will be highlighted below as those are areas where help would be appreciated, mainly due to the footprint of the API.

Documentation importing

There are a few hundred functions to document, but a large number of these are variations of the same function. The previous approach was to simply link to the upstream documentation, but it would be much better to have polished documentation attached to the functions themselves. This is the new documentation for MultMatrixd, for example. For now the documentation is being imported manually, but the final process will likely consist of some automation and some manual polishing.

Function polishing

The standard C OpenGL API can often be translated automatically (see BindBuffer or BlendColor), but in other cases the function prototype has to be tweaked to become friendly to Go. The translation tool already has good support for defining most of these tweaks independently from the rest of the machinery. For example, the following logic changes the ShaderSource function from its standard from into something convenient in Go:

name: "ShaderSource",
params: paramTweaks{
        "glstring": {rename: "source", retype: "...string"},
        "length":   {remove: true},
        "count":    {remove: true},
},
before: `
        count := len(source)
        length := make([]int32, count)
        glstring := make([]unsafe.Pointer, count)
        for i, src := range source {
                length[i] = int32(len(src))
                if len(src) > 0 {
                        glstring[i] = *(*unsafe.Pointer)(unsafe.Pointer(&src))
                } else {
                        glstring[i] = unsafe.Pointer(uintptr(0))
                }
        }
`,

Other cases may be much simpler. The MultMatrixd tweak, for instance, simply ensures that the parameter has the proper length, and injects the documentation:

name: "MultMatrixd",
before: `
        if len(m) != 16 {
                panic("parameter m must have length 16 for the 4x4 matrix")
        }
`,
doc: `
        multiplies the current matrix with the provided matrix.
        ...
`,

and as an even simpler example, CreateProgram is tweaked so that it returns a glbase.Program instead of the default uint32.

name:   "CreateProgram",
result: "glbase.Program",

That kind of polishing is where contributions would be most appreciated right now. One valid way of doing this is picking a range of functions and importing and polishing their documentation manually, and while doing that keeping an eye on required tweaks that should be performed on the function based on its documentation and prototype.

If you’d like to help somehow, or just ask questions or report your experience with the new API, please join us in the project mailing list.

Read more
niemeyer

As part of the on going work on Ubuntu Touch phones, I was invited to contribute a Go package to interface with ubuntuoneauth, a C++ and Qt library that authenticates against Ubuntu One using the system account made available by the phone owner. The details of that library and its use case are not interesting for most people right now, but the work to interface with it is a good example to talk about because, besides the result (uoneauth) being an external and independent Go package that extends the qml package, ubuntuoneauth is not a QML library, but rather a plain Qt library. Some of the callbacks use even traditional C++ types that do not inherit from QObject and have no Qt metadata, so offering that functionality from Go nicely takes a bit more work.

What follows are some of the highlights of that integration logic, to serve as a reference for similar extensions in the future. Note that if your interest is in creating QML applications with Go, none of this is necessary and the documentation is a better place to start.

As an initial step, the following examples demonstrate how the original C++ library and the Go package being designed are meant to be used. The first snippet contains the relevant logic taken out of the examples/signing-main.cpp file, tweaked for clarity:

int main(int argc, char *argv[]) {
    (...)
    UbuntuOne::SigningExample *example = new UbuntuOne::SigningExample(&a);
    QTimer::singleShot(0, example, SLOT(doExample()));
    (...)
}

SigningExample::SigningExample(QObject *parent) : QObject(parent) {
    QObject::connect(&service, SIGNAL(credentialsFound(const Token&)),
                     this, SLOT(handleCredentialsFound(Token)));
    QObject::connect(&service, SIGNAL(credentialsNotFound()),
                     this, SLOT(handleCredentialsNotFound()));
    (...)
}

void SigningExample::doExample() {
    service.getCredentials();
}

void SigningExample::handleCredentialsFound(Token token) {
    QString authHeader = token.signUrl(url, QStringLiteral("GET"));
    (...)
}
 
void SigningExample::handleCredentialsNotFound() {
    qDebug() << "No credentials were found.";
}

The example hooks into various signals in the service, one for each possible outcome, and then calls the service’s getCredentials method to initiate the process. If successful, the credentialsFound signal is emitted with a Token value that is able to sign URLs, returning an HTTP header that can authenticate a request.

That same process is more straightforward when using the library from Go:

service := uoneauth.NewService(engine)
token, err := service.Token()
if err != nil {
        return err
}
signature := token.HeaderSignature("GET", url)

Again, this gets a service, a token from it, and signs a URL, in a “forward” way.

So the goal is turning the initial C++ workflow into this simpler Go API. A good next step is looking into how the NewService function is implemented:

func NewService(engine *qml.Engine) *Service {
        s := &Service{reply: make(chan reply, 1)}

        qml.RunMain(func() {
                s.obj = *qml.CommonOf(C.newSSOService(), engine)
        })

        runtime.SetFinalizer(s, (*Service).finalize)

        s.obj.On("credentialsFound", s.credentialsFound)
        s.obj.On("credentialsNotFound", s.credentialsNotFound)
        s.obj.On("twoFactorAuthRequired", s.twoFactorAuthRequired)
        s.obj.On("requestFailed", s.requestFailed)
        return s
}

NewService creates the service instance, and then asks the qml package to run some logic in the main Qt thread via RunMain. This is necessary because a number of operations in Qt, including the creation of objects, are associated with the currently running thread. Using RunMain in this case ensures that the creation of the C++ object performed by newSSOService happens in the main Qt thread (the “GUI thread”).

Then, the address of the C++ UbuntuOne::SSOService type is handed to CommonOf to obtain a Common value that implements all the common logic supported by C++ types that inherit from QObject. This is an unsafe operation as there’s no way for CommonOf to guarantee that the provided address indeed points to a C++ value with a type that inherits from QObject, so the call site must necessarily import the unsafe package to provide the unsafe.Pointer parameter. That’s not a problem in this context, though, since such extension packages are necessarily dealing with unsafe logic in either case.

The obtained Common value is then assigned to the service’s obj field. In most cases, that value is instead assigned to an anonymous Common field, as done in qml.Window for example. Doing so means qml.Window values implement the qml.Object interface, and may be manipulated as a generic object. For the new Service type, though, the fact that this is a generic object won’t be disclosed for the moment, and instead a simpler API will be offered.

Following the function logic further, a finalizer is then registered to ensure the C++ value gets deallocated if the developer forgets to Close the service explicitly. When doing that, it’s important to ensure the Close method drops the finalizer when called, not only to facilitate the garbage collection of the object, but also to avoid deallocating the same value twice.

The next four lines in the function should be straightforward: they register methods of the service to be called when the relevant signals are emitted. Here is the implementation of two of these methods:

func (s *Service) credentialsFound(token *Token) {
        s.sendReply(reply{token: token})
}

func (s *Service) credentialsNotFound() {
        s.sendReply(reply{err: ErrNoCreds})
}

func (s *Service) sendReply(r reply) {
        select {
        case s.reply <- r:
        default:
                panic("internal error: multiple results received")
        }
}

Handling the signals consists of just sending the reply over the channel to whoever initiated the request. The select statement in sendReply just ensures that the invariant of having a reply per request is not broken without being noticed, as that would require a slightly different design.

There’s one more point worth observing in this logic: the token value received as a parameter in credentialsFound was already converted into the local Token type. In most cases, this is unnecessary as the parameter is directly useful as a qml.Object or as another native type (int, etc), but in this case UbuntuOne::Token is a plain C++ type that does not inherit from QObject, so the default signal parameter that would arrive in the Go method has only a type name and the value address.

Instead of taking the plain value, it is turned into a more useful one by registering a converter with the qml package:

func convertToken(engine *qml.Engine, obj qml.Object) interface{} {
        // Copy as the one held by obj is passed by reference.
        addr := unsafe.Pointer(obj.Property("plainAddr").(uintptr))
        token := &Token{C.tokenCopy(addr)}
        runtime.SetFinalizer(token, (*Token).finalize)
        return token
}

func init() {
        qml.RegisterConverter("Token", convertToken)
}

Given that setup, the Service.Token method may simply call the getCredentials method from the underlying UbuntuOne::SSOService method to fire a request, and block waiting for a reply:

func (s *Service) Token() (*Token, error) {
        s.mu.Lock()
        qml.RunMain(func() {
                C.ssoServiceGetCredentials(unsafe.Pointer(s.obj.Addr()))
        })
        reply := <-s.reply
        s.mu.Unlock()
        return reply.token, reply.err
}

The lock ensures that a second request won’t take place before the first one is done, forcing the correct sequencing of replies. Given the current logic, this isn’t strictly necessary since all requests are equivalent, but this will remain correct even if other methods from SSOService are added to this interface.

The returned Token value may then be used to sign URLs by simply calling the respective underlying method:

func (t *Token) HeaderSignature(method, url string) string {
        cmethod, curl := C.CString(method), C.CString(url)
        cheader := C.tokenSignURL(t.addr, cmethod, curl, 0)
        defer freeCStrings(cmethod, curl, cheader)
        return C.GoString(cheader)
}

No care about using qml.RunMain has to be taken in this case, because UbuntuOne::Token is a simple C++ type that does not interact with the Qt machinery.

This completes the journey of creating a package that provides access to the ubuntuoneauth library from Go code. In many cases it’s a better idea to simply rewrite the logic in Go, but there will be situations similar to this library, where either rewriting would take more time than reasonable, or perhaps delegating the maintenance and stabilization of the underlying logic to a different team is the best thing to do. In those cases, an approach such as the one exposed here can easily solve the problem.

Read more
niemeyer

As originally shared on Google+, and as a follow up of the previous post covering OpenGL on Go QML, a new screencast was published to demonstrate the latest features introduced around OpenGL support in Go QML:

Refrences:

Read more
niemeyer

As recently announced, my latest endeavor at Canonical is to enable graphical client-side development with the Go language via a new qml Go package that integrates the language with Qt’s QML framework.

The QML framework solves the problem of designing graphic applications via a language that offers a convenient mix of declarative and procedural features. As a very simple example, if the following QML content is loaded by itself, it will draw a square centralized inside a window:

import QtQuick 2.0

Rectangle {
        width: 640
        height: 280
        color: "black"

        Rectangle {
                width: 250; height: 250
                anchors.centerIn: parent
                color: "red"

                MouseArea {
                        anchors.fill: parent
                        onClicked: console.log("Stop poking me!")
                }
        }
}

As expected, it would look similar to:

Centralized Rectangle

The red rectangle will remain centered as the window is resized, and will print a message every time it is clicked. The clicked signal was hooked into logic implemented in JavaScript in this case, but it might as well have been hooked into custom Go logic with the same ease using the qml package. With very minor changes, it could also be something much more interesting than a red rectangle, such as a modern web browser:

import QtQuick 2.0
import QtWebKit 3.0

Rectangle {
        width: 640
        height: 280
        color: "black"

        WebView {
                width: 250; height: 250
                anchors.centerIn: parent
                url: "http://golang.org"
        }
}

The result would be:

Web Page Rectangle

It’s worth noting that this wasn’t just a screenshot of a web page, but rather a tiny fully functional web browser accessing a web page, easily embedded into the QML application to satisfy whatever browsey needs the author might have.

Being able to leverage all these existing building blocks so comfortably is what makes a graphics platform attractive to develop in, and is what inspired the on going effort to have that platform working under the Go language. As we can observe on some of the work published, that side of things seems to be going well.

Now, the next level up is to enable people to create such building blocks without leaving the Go language. The initial step towards that is already committed to the code repository. It enables Go types to be created and seamlessly integrated into the QML language. For example, consider this simple Go type:

type GoType struct {
        Text string
}

func (v *GoType) OnTextChanged() {
        fmt.Println("Text changed...")
}

If this type is made available to QML content via the qml.RegisterTypes function, it may then be used as any other native type. This would work as a QML file, for instance:

import GoExtensions 1.0

GoType {
        text: "Have you signed up for GopherCon yet?"
}

There’s a relevant detail, though: this type has no visible content by itself. This means that displaying something must be done via interactions with other QML items, or via external systems (dbus, for example).

Solving that problem has been one of my objectives in the last month, and although it’s not yet publicly available, the work is in a good-enough state that I feel comfortable talking about it.

As described, the main goal is enabling these custom types to paint. This will be achieved by exposing a Go package that offers an OpenGL API, and defining methods that the custom types have to implement for being able to render at appropriate times. Although the details aren’t finalized, the current draft can run familiar OpenGL code similar to the following:

func (v *GoType) Paint(p *qml.Painter) {
        obj := p.Object()
        width := gl.Float(obj.Int("width"))
        height := gl.Float(obj.Int("height"))

        gl.Enable(gl.BLEND)
        gl.BlendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
        gl.Color4f(1.0, 1.0, 1.0, 0.8)
        gl.Begin(gl.QUADS)
        gl.Vertex2f(0, 0)
        gl.Vertex2f(width, 0)
        gl.Vertex2f(width, height)
        gl.Vertex2f(0, height)
        gl.End()

        gl.LineWidth(2.5)
        gl.Color4f(0.0, 0.0, 0.0, 1.0)
        gl.Begin(gl.LINES)
        gl.Vertex2f(0, 0)
        gl.Vertex2f(width, height)
        gl.Vertex2f(width, 0)
        gl.Vertex2f(0, height)
        gl.End()
}

One interesting point to realize is that, again, this isn’t just rendering into a lonely OpenGL context. The rendered content goes into a framebuffer that lives within the overall QML scene graph, and has the same integration capabilities as every other QML item.

In the following animated image, for example, the white square was generated with the above GL code, and was rendered next to other native items with this QML source:

Go + OpenGL Example

Note the smooth blending and proper overlapping established in the scene graph based on the ordering of elements in the QML source. The animation was also driven by QML without special handling of the content rendered from Go.

Another relevant point in terms of the integration with Go is that the GL code demonstrated above is considered old-school these days. The modern version uses fewer calls, making better use of the graphics card memory by transferring, tracking, and handling data such as vertexes in contiguous arrays. This reduces the impact of the cross-context calls that depart and rejoin the Go runtime, and can unlock interesting use cases that the overhead might otherwise prevent.

In terms of availability of these features, we’re about to enter a holiday season, so they should be polished enough for a first public review at some point in January. Looking forward to it.

UPDATE (2014-01-27): These features are now publicly available. The painting example demonstrates the exact scenario pictured above.

UPDATE (2014-02-18): New screencast demonstrates some of the OpenGL features of Go QML.

Read more
niemeyer

About a year ago I ordered a pack of 10 atmega328p processors from China to play with. They took a while to get here, and it took even longer for me to get back to them, but a few days ago the motivation to start doing something finally appeared.

I’ve never actually played with AVRs before, and felt a bit like I was jumping a step in my electronics enthusiast progress by not diving into its architecture a bit more deeply. Also, despite the obvious advantages of ARM-based chips these days, the platform is still interesting in some perspectives, such as its widespread availability, low price in small quantities, and the ability to plug them in a breadboard and do things without pretty much any circuitry.

To get acquainted with the architecture and to depart from things I work on more frequently, the project is so far taking the shape of an assembly library of functionality relevant for developing small projects, built mainly around binutils for the AVR. I did end up cheating a bit and compiling the assembly code via avr-gcc, just to get the __do_copy_data initialization routine injected, so that I don’t have to pull up the .data section from program memory into RAM manually.

I started running the test programs with the chip itself, with the help of a Pirate Bus, to see if the whole setup was sound. Once it worked a few times, I moved on to use the simulavr simulator to make the process of running and debugging more comfortable. In addition to being able to attach gdb, and trace execution, one of the nice features of simulavr is being able to map a port from the emulated CPU and get bytes written into it sent to an arbitrary file in the outer world. That means we can easily implement a trivial println-like function in assembly:

.set    STDOUT, 0x20

loop:   ld  r17, Z+
        cpi r17, 0
        breq done

        sts STDOUT, r17
        rjmp loop

done:   ldi r17, '\n
        sts STDOUT, r17

Printing strings is only helpful if we do have strings, though, and with such a skeleton system there are no interesting ones yet. What we do have are registers, lots of them (32 in total). A good candidate for the next function would then be an itoa-like function that would put the proper bytes in memory for printing.

So, after going down that road for a bit longer, the lack of a proper way to run tests on the created code was an evident show stopper. There’s no way the created code will be sane without being able to exercise it, and write tests that can be rerun at will. Fortunately, it’s easy enough to apply traditional testing practices to such an environment, given the simulator features mentioned.

To drive those tests, a small tool named avrtest was written in Go. It takes an avrtest.list file that looks like this:

devices: atmega328

[div8u]

main:
        ldi     r24, 128 ; dividend
        ldi     r22, 10  ; divisor
        call    div8u
        prnt8u  r24      ; result
        prnt8u  r22      ; divisor
        prnt8u  r20      ; remainder

want:
        12
        10
        8

[itoa8u]

cycle-limit: 400

main:
        ldi     r24, 128
        call    itoa8u
        prntz

want:
        128

and runs it, showing the typical test runner output:

% ./avrtest
div8u   ok  (784 cycles)
itoa8u  ok  (356 cycles)

or the typical failure, when appropriate:

div8u   failed: unexpected output
        want:
                13
                10
                8
        
        got:
                12
                10
                8

If the failure feels a bit cryptic, all of the intermediary files are kept under the ./_avrtest directory, including a detailed trace file. Here is a snippet of such a trace:

div8u.elf 0x0194: itoa8u      LDI R30, 0x0a 
div8u.elf 0x0196: itoa8u+0x1  LDI R31, 0x01 
div8u.elf 0x0198: itoa8u+0x2  PUSH R17 SP=0x8f6 0x1 
div8u.elf 0x0198: itoa8u+0x2  CPU-waitstate
div8u.elf 0x019a: itoa8u+0x3  LDI R17, 0x30 
div8u.elf 0x019c: itoa8u+0x4  LDI R22, 0x0a 
div8u.elf 0x019e: itoa8u_loop CALL 0x178 SP=0x8f5 0xd1 SP=0x8f4 0x0

Besides that, we should be able to attach gdb to any given test by running the command avrtest gdb <name>. That’s not yet there, but should be pretty soon, after the next cryptic breakage. :-)

That tooling is not organized for a proper release, but I’ll certainly push it up to a public repository as soon as I get a chance to clean up the sandbox.

Read more
niemeyer

As part of one of the projects we’ve been pushing at Canonical, I spent a few days researching about the possibility of extending a compiled Go application with a tiny language that would allow expressing simple procedural logic in a controlled environment. Although we’re not yet sure of the direction we’ll take, the result of this short experiment is being released as the twik language for open fiddling.

The implementation is straightforward, with under 400 lines for the parser and evaluator, and under 350 lines in the default functions provided for the language skeleton: var, func, do, if, and, or, etc.

It also comes with an interactive interpreter to play with. You can install it with:

$ go get launchpad.net/twik/cmd/twik

This is a short sample session:

> (var x 1)
> x
1
> (set x 2)
> x
2
> (set x (func (n) (+ n 1)))
> x
#func
> (x 1)
2
> (func inc (n) (+ n 1))
#func
> (inc 42)
43

Another one demonstrating the lexical scoping:

> (var add
.      (do
.          (var n 0)
.          (func (m) (set n (+ n m)) n)
.      )
. )
> (add 5)
5
> (add -1)
4
> n
twik source:1:1: undefined symbol: n

New functionality may be plugged in by providing Go functions. For example, here is a simple printf function:

func printf(args []interface{}) (interface{}, error) {
        if len(args) > 0 {
                if format, ok := args[0].(string); ok {
                        _, err := fmt.Printf(format, args[1:]...)
                        return nil, err
                }
        }
        return nil, fmt.Errorf("printf takes a format string")
}

func main() {
        ...
        err = scope.Create("printf", printf)
        ...
}

It can now greet the world:

$ cat test.twik

(func hello (name)
      (printf "Hello %s!\n" name)
)

(hello "world")

$ time ./twik test.twik
Hello world!
./twik test.twik  0.00s user 0.00s system 74% cpu 0.005 total

Read more
niemeyer

Note: This is a candidate version of the specification. This note will be removed once v1 is closed, and any changes will be described at the end. Please get in touch if you’re implementing it.

Contents


Introduction

This specification defines strepr, a stable representation that enables computing hashes and cryptographic signatures out of a defined set of composite values that is commonly found across a number of languages and applications.

Although the defined representation is a serialization format, it isn’t meant to be used as a traditional one. It may not be seen entirely in memory at once, or written to disk, or sent across the network. Its role is specifically in aiding the generation of hashes and signatures for values that are serialized via other means (JSON, BSON, YAML, HTTP headers or query parameters, configuration files, etc).

The format is designed with the following principles in mind:

Understandable — The representation must be easy to understand to increase the chances of it being implemented correctly.

Portable — The defined logic works properly when the data is being transferred across different platforms and implementations, independently from the choice of protocol and serialization implementation.

Unambiguous — As a natural requirement for producing stable hashes, there is a single way to process any supported value being held in the native form of the host language.

Meaning-oriented — The stable representation holds the meaning of the data being transferred, not its type. For example, the number 7 must be represented in the same way whether it’s being held in a float64 or in an uint16.


Supported values

The following values are supported:

  • nil: the nil/null/none singleton
  • bool: the true and false singletons
  • string: raw sequence of bytes
  • integers: positive, zero, and negative integer numbers
  • floats: IEEE754 binary floating point numbers
  • list: sequence of values
  • map: associative value→value pairs


Representation

nil = 'z'

The nil/null/none singleton is represented by the single byte 'z' (0x7a).

bool = 't' / 'f'

The true and false singletons are represented by the bytes 't' (0x74) and 'f' (0x66), respectively.

unsigned integer = 'p' <value>

Positive and zero integers are represented by the byte 'p' (0x70) followed by the variable-length encoding of the number.

For example, the number 131 is always represented as {0x70, 0x81, 0x03}, independently from the type that holds it in the host language.

negative integer = 'n' <absolute value>

Negative integers are represented by the byte 'n' (0x6e) followed by the variable-length encoding of the absolute value of the number.

For example, the number -131 is always represented as {0x6e, 0x81, 0x03}, independently from the type that holds it in the host language.

string = 's' <num bytes> <bytes>

Strings are represented by the byte 's' (0x73) followed by the variable-length encoding of the number of bytes in the string, followed by the specified number of raw bytes. If the string holds a list of Unicode code points, the raw bytes must contain their UTF-8 encoding.

For example, the string hi is represented as {0x73, 0x02, 'h', 'i'}

Due to the complexity involved in Unicode normalization, it is not required for the implementation of this specification. Consequently, Unicode strings that if normalized would be equal may have different stable representations.

binary float = 'd' <binary64>

32-bit or 64-bit IEEE754 binary floating point numbers that are not holding integers are represented by the byte 'd' (0x64) followed by the big-endian 64-bit IEEE754 binary floating point encoding of the number.

There are two exceptions to that rule:

1. If the floating point value is holding a NaN, it must necessarily be encoded by the following sequence of bytes: {0x64, 0x7f, 0xf8, 0x00 0x00, 0x00, 0x00, 0x00, 0x00}. This ensures all NaN values have a single representation.

2. If the floating point value is holding an integer number it must instead be encoded as an unsigned or negative integer, as appropriate. Floating point values that hold integer numbers are defined as those where floor(v) == v && abs(v) != ∞.

For example, the value 1.1 is represented as {0x64, 0x3f, 0xf1, 0x99, 0x99, 0x99, 0x99, 0x99, 0x9a}, but the value 1.0 is represented as {0x70, 0x01}, and -0.0 is represented as {0x70, 0x00}.

This distinction means all supported numbers have a single representation, independently from the data type used by the host language and serialization format.

list = 'l' <num items> [<item> ...]

Lists of values are represented by the byte 'l' (0x6c), followed by the variable-length encoding of the number of pairs in the list, followed by the stable representation of each item in the list in the original order.

For example, the value [131, -131] is represented as {0x6c, 0x70, 0x81, 0x03, 0x6e, 0x81, 0x03, 0x65}

map = 'm' <num pairs> [<item key> <item value>  ...]

Associative maps of values are represented by the byte 'm' (0x6d) followed by the variable-length encoding of the number of pairs in the map, followed by an ordered sequence of the stable representation of each key and value in the map. The pairs must be sorted so that the stable representation of the keys is in ascending lexicographical order. A map must not have multiple keys with the same representation.

For example, the map {"a": 4, 5: "b"} is always represented as {0x6d, 0x02, 0x70, 0x05, 0x73, 0x01, 'b', 0x73, 0x01, 'a', 0x70, 0x04}.


Variable-length encoding

Integers are variable-length encoded so that they can be represented in short space and with unbounded size. In an encoded number, the last byte holds the 7 least significant bits of the unsigned value, and zero as the eight bit. If there are remaining non-zero bits, the previous byte holds the next 7 bits, and the eight bit is set on to flag the continuation to the next byte. The process continues until there are non-zero bits remaining. The most significant bits end up in the first byte of the encoded value, which must necessarily not be 0x80.

For example, the number 128 is variable-length encoded as {0x81, 0x00}.


Reference implementation

A reference implementation is available, including a test suite which should be considered when implementing the specification.


Changes

draft1 → draft2

  • Enforce the use of UTF-8 for Unicode strings and explain why normalization is being left out.
  • Enforce a single NaN representation for floats.
  • Explain that map key uniqueness refers to the representation.
  • Don’t claim the specification is easy to implement; floats require attention.
  • Mention reference implementation.

Read more
niemeyer

The very first time the concepts behind the juju project were presented, by then still under the prototype name of Ubuntu Pipes, was about four years ago, in July of 2009. It was a short meeting with Mark Shuttleworth, Simon Wardley, and myself, when Canonical still had an office on a tall building by the Thames. That was just the seed of a long road of meetings and presentations that eventually led to the codification of these ideas into what today is a major component of the Ubuntu strategy on servers.

Despite having covered the core concepts many times in those meetings and presentations, it recently occurred to me that they were never properly written down in any reasonable form. This is an omission that I’ll attempt to fix with this post while still holding the proper context in mind and while things haven’t changed too much.

It’s worth noting that I’ve stepped aside as the project technical lead in January, which makes more likely for some of these ideas to take a turn, but they are still of historical value, and true for the time being.

Contents

This post is long enough to deserve an index, but these sections do build up concepts incrementally, so for a full understanding sequential reading is best:


Classical deployments

In a simplistic sense, deploying an application means configuring and running a set of processes in one or more machines to compose an integrated system. This procedure includes not only configuring the processes for particular needs, but also appropriately interconnecting the processes that compose the system.

The following figure depicts a simple example of such a scenario, with two frontend machines that had the Wordpress software configured on them to serve the same content out of a single backend machine running the MySQL database.

Deploying even that simple environment already requires the administrator to deal with a variety of tasks, such as setting up physical or virtual machines, provisioning the operating system, installing the applications and the necessary dependencies, configuring web servers, configuring the database, configuring the communication across the processes including addresses and credentials, firewall rules, and so on. Then, once the system is up, the deployed system must be managed throughout its whole lifecycle, with upgrades, configuration changes, new services integrated, and more.

The lack of a good mechanism to turn all of these tasks into high-level operations that are convenient, repeatable, and extensible, is what motivated the development of juju. The next sections provide an overview of how these problems are solved.


Preparing a blank slate

Before diving into the way in which juju environments are organized, a few words must be said about what a juju environment is in the first place.

All resources managed by juju are said to be within a juju environment, and such an environment may be prepared by juju itself as long as the administrator has access to one of the supported infrastructure providers (AWS, OpenStack, MAAS, etc).

In practice, creating an environment is done by running juju’s bootstrap command:

$ juju bootstrap

This will start a machine in the configured infrastructure provider and prepare the machine for running the juju state server to control the whole environment. Once the machine and the state server are up, they’ll wait for future instructions that are provided via follow up commands or alternative user interfaces.


Service topologies

The high-level perspective that juju takes about an environment and its lifecycle is similar to the perspective that a person has about them. For instance, although the classical deployment example provided above is simple, the mental model that describes it is even simpler, and consists of just a couple of communicating services:

That’s pretty much the model that an administrator using juju has to input into the system for that deployment to be realized. This may be achieved with the following commands:

$ juju deploy cs:precise/wordpress
$ juju deploy cs:precise/mysql
$ juju add-relation wordpress mysql

These commands will communicate with the previously bootstrapped environment, and will input into the system the desired model. The commands themselves don’t actually change the current state of the deployed software, but rather inform the juju infrastructure of the state that the environment should be in. After the commands take place, the juju state server will act to transform the current state of the deployment into the desired one.

In the example described, for instance, juju starts by deploying two new machines that are able to run the service units responsible for Wordpress and MySQL, and configures the machines to run agents that manipulate the system as needed to realize the requested model. An intermediate stage of that process might conceptually be represented as:

topology-step-1

The service units are then provided with the information necessary to configure and start the real software that is responsible for the requested workload (Wordpress and MySQL themselves, in this example), and are also provided with a mechanism that enables service units that were related together to easily exchange data such as addresses, credentials, and so on.

At this point, the service units are able to realize the requested model:

topology-step-2

This is close to the original scenario described, except that there’s a single frontend machine running Wordpress. The next section details how to add that second frontend machine.


Scaling services horizontally

The next step to match the original scenario described is to add a second service unit that can run Wordpress, and that can be achieved by the single command:

$ juju add-unit wordpress

No further commands or information are necessary, because the juju state server understands what the model of the deployment is. That model includes both the configuration of the involved services and the fact that units of the wordpress service should talk to units of the mysql service.

This final step makes the deployed system look equivalent to the original scenario depicted:

topology-step-3

Although that is equivalent to the classic deployment first described, as hinted by these examples an environment managed by juju isn’t static. Services may be added, removed, reconfigured, upgraded, expanded, contracted, and related together, and these actions may take place at any time during the lifetime of an environment.

The way that the service reacts to such changes isn’t enforced by the juju infrastructure. Instead, juju delegates service-specific decisions to the charm that implements the service behavior, as described in the following section.


Charms

A juju-managed environment wouldn't be nearly as interesting if all it could do was constrained by preconceived ideas that the juju developers had about what services should be supported and how they should interact among themselves and with the world.

Instead, the activities within a service deployed by juju are all orchestrated by a juju charm, which is generally named after the main software it exposes. A charm is defined by its metadata, one or more executable hooks that are called after certain events take place, and optionally some custom content.

The charm metadata contains basic declarative information, such as the name and description of the charm, relationships the charm may participate in, and configuration options that the charm is able to handle.

The charm hooks are executable files with well-defined names that may be written in any language. These hooks are run non-concurrently to inform the charm that something happened, and they give a chance for the charm to react to such events in arbitrary ways. There are hooks to inform that the service is supposed to be first installed, or started, or configured, or for when a relation was joined, departed, and so on.

This means that in the previous example the service units depicted are in fact reporting relevant events to the hooks that live within the wordpress charm, and those hooks are the ones responsible for bringing the Wordpress software and any other dependencies up.

wordpress-service-unit

The interface offered by juju to the charm implementation is the same, independently from which infrastructure provider is being used. As long as the charm author takes some care, one can create entire service stacks that can be moved around among different infrastructure providers.


Relations

In the examples above, the concept of service relationships was introduced naturally, because it’s indeed a common and critical aspect of any system that depends on more than a single process. Interestingly, despite it being such a foundational idea, most management systems in fact pay little attention to how the interconnections are modeled.

With juju, it’s fair to say that service relations were part of the system since inception, and have driven the whole mindset around it.

Relations in juju have three main properties: an interface, a kind, and a name.

The relation interface is simply a unique name that represents the protocol that is conventionally followed by the service units to exchange information via their respective hooks. As long as the name is the same, the charms are assumed to have been written in a compatible way, and thus the relation is allowed to be established via the user interface. Relations with different interfaces cannot be established.

The relation kind informs whether a service unit that deploys the given charm will act as a provider, a requirer, or a peer in the relation. Providers and requirers are complementary, in the sense that a service that provides an interface can only have that specific relation established with a service that requires the same interface, and vice-versa. Peer relations are automatically established internally across the units of the service that declares the relation, and enable easily clustering together these units to setup masters and slaves, rings, or any other structural organization that the underlying software supports.

The relation name uniquely identifies the given relation within the charm, and allows a single charm (and service and service units that use it) to have multiple relations with the same interface but different purposes. That identifier is then used in hook names relative to the given relation, user interfaces, and so on.

For example, the two communicating services described in examples might hold relations defined as:

wordpress-mysql-relation-details

When that service model is realized, juju will eventually inform all service units of the wordpress service that a relation was established with the respective service units of the mysql service. That event is communicated via hooks being called on both units, in a way resembling the following representation:

wordpress-mysql-relation-workflow

As depicted above, such an exchange might take the following form:

  1. The administrator establishes a relation between the wordpress service and the mysql service, which causes the service units of these services (wordpress/1 and mysql/0 in the example) to relate.
  2. Both service units concurrently call the relation-joined hook for the respective relation. Note that the hook is named after the local relation name for each unit. Given the conventions established for the mysql interface, the requirer side of the relation does nothing, and the provider informs the credentials and database name that should be used.
  3. The requirer side of the relation is informed that relation settings have changed via the relation-changed hook. This hook implementation may pick up the provided settings and configure the software to talk to the remote side.
  4. The Wordpress software itself is run, and establishes the required TCP connection to the configured database.

In that workflow, neither side knows for sure what service is being related to. It would be feasible (and probably welcome) to have the mysql service replaced by a mariadb service that provided a compatible mysql interface, and the wordpress charm wouldn’t have to be changed to communicate with it.

Also, although this example and many real world scenarios will have relations reflecting TCP connections, this may not always be the case. It’s reasonable to have relations conveying any kind of metadata across the related services.


Configuration

Service configuration follows the same model of metadata plus executable hooks that was described above for relations. A charm can declare what configuration settings it expects in its metadata, and how to react to setting changes in an executable hook named config-changed. Then, once a valid setting is changed for a service, all of the respective service units will have that hook called to reflect the new configuration.

Changing a service setting via the command line may be as simple as:

$ juju set wordpress title="My Blog"

This will communicate with the juju state server, record the new configuration, and consequently incite the service units to realize the new configuration as described. For clarity, this process may be represented as:

config-changed


Taking from here

This conceptual overview hopefully provides some insight into the original thinking that went into designing the juju project. For more in-depth information on any of the topics covered here, the following resources are good starting points:

Read more
niemeyer

Since relatively early in the public life of the Go language, I’ve been involved in pushing forward packages that might be used in Ubuntu, including making the compiler suite itself happier in such packaged environments. In due time, these packages were moved over to an automatic build system, so that people wouldn’t have to rely on my good will to have up-to-date packages, nor would I have to be regularly spending time maintaining those packages. Or so was the theory.

It’s well known that the real world is not so plain, though, and issues became much more regular than hoped. Some of the issues were caused by changes in the build conventions of Go, others self-inflicted due to my limited knowledge of the extensive conventions around packaging, or bugs in indirect dependencies of the process, and more recently the sub-optimal scheduling algorithm used by the build farm has driven the builds to a halt.

So, the question is how to get out of this rabbit hole, but still give people a convenient way to use Go in Ubuntu.

Enter godeb, an experiment that dynamically translates the upstream builds of Go into deb packages. In practice, it’s a simple standalone Go program that can parse the build list, fetch the requested version, and in memory translate the contents into a correct binary deb package.

Since you cannot build a Go application without a Go compiler first, there’s an x86 32-bit binary and an x86 64-bit binary of godeb available for download. After the compiler is installed, godeb may be fetched and rebuilt locally by running go get launchpad.net/godeb.

Once the godeb binary is available, it’s easy to get up-to-date packages:

$ ./godeb install
processing https://go.googlecode.com/files/go1.1.1.linux-amd64.tar.gz
package go_1.1.1-godeb1_amd64.deb ready
Selecting previously unselected package go.
(Reading database ... 488515 files and (...) installed.)
Unpacking go (from go_1.1.1-godeb1_amd64.deb) ...
Setting up go (1.1.1-godeb1) ...

It figures what the most recent build available is, downloads, translates, and installs it, asking for a password via sudo if necessary. Running godeb install again will fetch the latest version (or the requested one) and replace the currently installed package. Package installs default to the same architecture of godeb itself, and may be changed by setting the GOARCH environment variable to 386 or amd64, borrowing from a Go convention.

New releases of Go are immediately available, and so are the old ones:

$ ./godeb list
1.2
1.2rc5
1.2rc4
1.2rc3
1.2rc2
1.2rc1
1.1.2
1.1.1
1.1
(...)

$ ./godeb -h
Usage: godeb <command> [<options> ...]

Available commands:

    list
    install [<version>]
    download [<version>]
    remove

For the time being, I’m holding up maintenance of the Go PPA in Launchpad in favor of this system. Of course, you can still install the golang-* packages on Ubuntu 12.10 and 13.04 from the official repositories as usual.

Read more
niemeyer

A few years ago, when I started pondering about the possibility of porting juju to the Go language, one of the first pieces of the puzzle that were put in place was goyaml: a Go package to parse and serialize a yaml document. This was just an experiment and, as a sane route to get started, a Go layer that does all the language-specific handling was written on top of the libyaml C scanner, parser, and serializer library.

This was a good initial plan, but for a number of reasons the end goal was always to have a pure Go implementation. Having a C layer in a Go program slows down builds significantly due to the time taken to build the C code, makes compiling in other platforms and cross-compiling harder, has certain runtime penalties, and also forces the application to drop the memory safety guarantees offered by Go.

For these reasons, over the last couple of weeks I took a few hours a day to port the C backend to Go. The total time, considering full time work days, would be equivalent to about a week worth of work.

The work started on the scanner and parser side of the library. This took most of the time, not only because it encompassed more than half of the code base, but also because the shared logic had to be ported too, and there was a need to understand which patterns were used in the old code and how they would be converted across in a reasonable way.

The whole scanner and parser plus header files, or around 5000 code lines of C, were ported over in a single shot without intermediate runs. To steer the process in a sane direction, gofmt was called often to reformat the converted code, and then the project was compiled every once in a while to make sure that the pieces were hanging together properly enough.

It’s worth highlighting how useful gofmt was in that process. The C code was converted in the most convenient way to type it, and then gofmt would quickly put it all together in a familiar form for analysis. Not rarely, it would also point out trivial syntactic issues. A double win.

After the scanner and parser were finally converted completely, the pre-existing Go unmarshaling logic was shifted to the new pure implementation, and the reading side of the test suite could run as-is. Naturally, though, it didn’t work out of the box.

To quickly pick up the errors in the new implementation, the C logic and the Go port were put side-by-side to run the same tests, and tracing was introduced in strategic points of the scanner and parser. With that, it was easy to spot where they diverged and pinpoint the human errors.

It took about two hours to get the full suite to run successfully, with a handful of bugs uncovered. Out of curiosity, the issues were:

  • An improperly dropped parenthesis affected the precedence of an expression
  • A slice was being iterated with copying semantics where a reference was necessary
  • A pointer arithmetic conversion missed the base where there was base+offset addressing
  • An inner scoped variable improperly shadowed the outer scope

The same process of porting and test-fixing was then repeated on the the serializing side of the project, in a much shorter time frame for the reasons cited.

The resulting code isn’t yet idiomatic Go. There are several signs in it that it was ported over from C: the name conventions, the use of custom solutions for buffering and reader/writer abstractions, the excessive copying of data due to the need of tracking data ownership so the simple deallocating destructors don’t double-free, etc. It’s also been deoptimized, due to changes such as the removal of macros and in many cases its inlining, and the direct expansion of large unions which causes some core objects to grow significantly.

At this point, though, it’s easy to gradually move the code base towards the common idiom in small increments and as time permits, and cleaning up those artifacts that were left behind.

This code will be made public over the next few days via a new goyaml release. Meanwhile, some quick facts about the process and outcome follows.

Lines of code

According to cloc, there was a total of 7070 lines of C code in .c and .h files. Of those, 6727 were ported, and 342 were 12 functions that were left unconverted as being unnecessary right now. Those 6727 lines of C became 5039 lines of Go code in a mostly one-to-one dumb translation.

That difference comes mainly from garbage collection, lack of forward declarations, standard helpers such as append, range-based for loops, first class slice type with length and capacity, internal OOM handling, and so on.

Future work code can easily increase the difference further by replacing some of the logic ported with more sensible options available in Go, such as standard abstractions for readers and writers, buffered writing support as availalbe in the standard library, etc.

Code clarity and safety

In the specific context of the work done, which is of a scanner, parser and serializer, the slice abstraction is responsible for noticeable clarity gains in the code, when compared to the equivalent logic based on pointer arithmetic. It also gives a much more comforting guarantee of correctness of the written code due to bound-checking.

Performance

While curious, this shouldn’t be taken as a performance comparison between the two languages, as it is comparing a fine tuned C implementation with something that is worse than a direct one-to-one port: not only it hasn’t seen any time at all on preventing waste, but the original logic was deoptimized due to changes such as the removal of inlining macros and the expansion of large unions. There are many obvious changes to be done for improving performance.

With that out of the way, in a simple decoding benchmark the C-backed decoder runs on about 37% of the time taken by the out-of-the-box deoptimized Go port.

Output size

The previous goyaml.a Go package file had 1463kb. The new one has 1016kb. This difference includes glue code generated for the integration.

Considering only the .c and .h files involved in the port, the C object code generated with the standard flags used by the go build tool (-g -O2) sums up to 789kb. The equivalent Go code with the standard settings compiles to 664kb. The 12 functions not ported are also part of that difference, so the difference is pretty much negligible.

Build time

Building the 8 .c files alone takes 3.6 seconds with the standard flags used by the go build tool (-g -O2). After the port, building the entire Go project with the standard settings takes 0.3 seconds.

Mechanical changes

Many of the mechanical changes were done using regular expressions. Excluding the trivial ones, about a dozen regular expressions were used to swap variable and type names, drop parenthesis, place brackets in the right locations, convert function declarations, and so on.

Read more
niemeyer

Last week I was part of a rant with a couple of coworkers around the fact Go handles errors for expected scenarios by returning an error value instead of using exceptions or a similar mechanism. This is a rather controversial topic because people have grown used to having errors out of their way via exceptions, and Go brings back an improved version of a well known pattern previously adopted by a number of languages — including C — where errors are communicated via return values. This means that errors are in the programmer’s face and have to be dealt with all the time. In addition, the controversy extends towards the fact that, in languages with exceptions, every unadorned error comes with a full traceback of what happened and where, which in some cases is convenient.

All this convenience has a cost, though, which is rather simple to summarize:

Exceptions teach developers to not care about errors.

A sad corollary is that this is relevant even if you are a brilliant developer, as you’ll be affected by the world around you being lenient towards error handling. The problem will show up in the libraries that you import, in the applications that are sitting in your desktop, and in the servers that back your data as well.

Raymond Chen described the issue back in 2004 as:

Writing correct code in the exception-throwing model is in a sense harder than in an error-code model, since anything can fail, and you have to be ready for it. In an error-code model, it’s obvious when you have to check for errors: When you get an error code. In an exception model, you just have to know that errors can occur anywhere.

In other words, in an error-code model, it is obvious when somebody failed to handle an error: They didn’t check the error code. But in an exception-throwing model, it is not obvious from looking at the code whether somebody handled the error, since the error is not explicit.
(…)
When you’re writing code, do you think about what the consequences of an exception would be if it were raised by each line of code? You have to do this if you intend to write correct code.

That’s exactly right. Every line that may raise an exception holds a hidden “else” branch for the error scenario that is very easy to forget about. Even if it sounds like a pointless repetitive task to be entering that error handling code, the exercise of writing it down forces developers to keep the alternative scenario in mind, and pretty often it doesn’t end up empty.

It isn’t the first time I write about that, and given the controversy that surrounds these claims, I generally try to find one or two examples that bring the issue home. So here is the best example I could find today, within the pty module of Python’s 3.3 standard library:

def spawn(argv, master_read=_read, stdin_read=_read):
    """Create a spawned process."""
    if type(argv) == type(''):
        argv = (argv,)
    pid, master_fd = fork()
    if pid == CHILD:
        os.execlp(argv[0], *argv)
    (...)

Every time someone calls this logic with an improper executable in argv there will be a new Python process lying around, uncollected, and unknown to the application, because execlp will fail, and the process just forked will be disregarded. It doesn’t matter if a client of that module catches that exception or not. It’s too late. The local duty wasn’t done. Of course, the bug is trivial to fix by adding a try/except within the spawn function itself. The problem, though, is that this logic looked fine for everybody that ever looked at that function since 1994 when Guido van Rossum first committed it!

Here is another interesting one:

$ make clean
Sorry, command-not-found has crashed! Please file a bug report at:

https://bugs.launchpad.net/command-not-found/+filebug

Please include the following information with the report:

command-not-found version: 0.3
Python version: 3.2.3 final 0
Distributor ID: Ubuntu
Description:    Ubuntu 13.04
Release:        13.04
Codename:       raring
Exception information:

unsupported locale setting
Traceback (most recent call last):
  File "/.../CommandNotFound/util.py", line 24, in crash_guard
    callback()
  File "/usr/lib/command-not-found", line 69, in main
    enable_i18n()
  File "/usr/lib/command-not-found", line 40, in enable_i18n
    locale.setlocale(locale.LC_ALL, '')
  File "/usr/lib/python3.2/locale.py", line 541, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting

That’s a pretty harsh crash for the lack of locale data in a system-level application that is, ironically, supposed to tell users what packages to install when commands are missing. Note that at the top of the stack there’s a reference to crash_guard. This function has the intent of catching all exceptions right at the edge of the call stack, and displaying a detailed system specification and traceback to aid in fixing the problem.

Such “parachute catching” is a fairly common pattern in exception-oriented programming and tends to give developers the false sense of having good error handling within the application. Rather than actually guarding the application, though, it’s just a useful way to crash. The proper thing to have done in the case above would be to print a warning, if at all, and then let the program run as usual. This would have been achieved by simply wrapping that one line as in:

try:
    locale.setlocale(locale.LC_ALL, '')
except Exception as e:
    print("Cannot change locale:", e)

Clearly, it was easy to handle that one. The problem, again, is that it was very natural to not do it in the first place. In fact, it’s more than natural: it actually feels good to not be looking at the error path. It’s less code, more linear, and what’s left is the most desired outcome.

The consequence, unfortunately, is that we’re immersing ourselves in a world of brittle software and pretty whales. Although more verbose, the error result style builds the correct mindset: does that function or method have a possible error outcome? How is it being handled? Is that system-interacting function not returning an error? What is being done with the problem that, of course, can happen?

A surprising number of crashes and plain misbehavior is a result of such unconscious negligence.

Read more
niemeyer

This weekend the proper environment settled out for sorting a pet peeve that shows up every once in a while when coding: writing logic that interacts with other applications in the system via their stdin and stdout streams is often more involved than it should be, which seems pretty ironic when sitting in front of a Unix-like system.

Rather than going over the trouble of setting up pipes and hooking them up in a custom way, often applications end up just delegating the job to /bin/sh, which is not ideal for a number of reasons: argument formatting isn’t straightforward, injecting custom application-defined logic is hard, which means even simple tasks that might be easily achieved by the language end up shelling out to further external applications, and so on.

In an attempt to address that, I’ve spent some time working on an experimental Go package that is being released today: pipe.

I hope you like it as well, and please drop me a note if you find any issues.

Read more
niemeyer

There are a number of common misconceptions in software development surrounding the idea of concurrency. This has been coming for decades, and some of the issues have just been reinforced one more time in an otherwise interesting post in LinkedIn’s engineering blog that recommends their development framework.

Such issues may be observed throughout the post, but can be elucidated via this short paragraph:

As we saw with the Scala and JavaScript examples above, for very simple cases, the Evented (asynchronous) code is generally more complicated than Threaded (synchronous) code. However, in most real-world scenarios, you’ll have to make several I/O calls, and to make them fast, you’ll need to do them in parallel.

At a glance, this may look like a sane proposition. There’s agreement that an asynchronous API or framework is one that does not block the flow of execution when faced with a task that has a long or non-predictable deadline, and this coding style is harder for human beings to get right. For example, if you see code such as:

data = read(filename)

There’s less brain work to process and build on it than so called asynchronous logic such as:

read(filename, callback)

It’s also true that there are important interfaces that follow the asynchronous style to prevent resource waste. Some of these exist in the kernel I/O API.

So what’s the issue, then?

There are a few. The first one is the statement that to make I/O scale you have to do it in parallel. That’s clearly not true. Scalable I/O requires your program to not waste an irresponsible amount of memory and CPU per operation. This may be achieved with simple concurrent techniques, and concurrency is not parallelism.

This drives to the next point, which is the strong association between synchronous programming and threads. You can have synchronous programming, and its simplified mental model, without operating system threads. This can be done by having a compiler and runtime that is mindful about performance and resource consumption, building on the efficient interfaces to implement its abstractions.

These ideas have also been covered in this paper from 2003, including benchmark results that debunk the performance myth. What seems most interesting about this paper is that it theorizes such a compiler and runtime that would allow “overcom[ing] limitations in current threads packages and improv[ing] safety, programmer productivity, and performance”, by using techniques such as dynamic stack growth, stack moving, cheaper synchronization, and compile-time data race detection.

That exact mix, including all of the properties described in the paper, are available today in the Go language. You can have synchronous programming, concurrency, parallelism, and performance. We live in the future.

Read more
niemeyer

Lately I’ve been considering the amount of waste we produce during software development, and how to increase the amount of recycled content. I’m not talking about actual trash, though, but rather about software development artifacts.

Over the years, we’ve learned about and put in practice several means for improving the quality and success rate of projects we create or contribute to. We have practices such as sprints to get people together with high communication bandwidth; we have code reviews for sharing knowledge and improving project quality; we’ve got technical leadership roles to mentor developers and guide the progress of projects; we’ve created kanban boards and burndown charts to help people visualize what they’re going through; and so on.

While all of that seems to have helped tremendously, there’s a sad fact about where we stand: the artifacts of most of these processes are local to their context, and very sensitive to time. That burndown chart is meaningless after it’s burned, and a kanban has no relevant history. Our technical leads indeed guide their teams, but their wisdom stays with the few people that had the chance to interact with them, and subjectively so. That brilliant code review from our best developers has a very limited audience, and rarely carries any meaning just days after it has been accomplished.

That last one is specially interesting. The process of reviewing code is an intense task, very expensive, and that takes a significant portion of the life of an active developer, and even then very little is carried forward as the outcome of that process. We have no effective means or even culture of sharing the generated wisdom to other teams. In fact, we rarely share these details even within the team itself. Why was that line changed like this? Why an interface like that is a bad idea? Who will instruct the new guy next week, and where did we record a bit of the wisdom of the brilliant guy that has left the company recently?

Unfortunately there’s probably no easy solution for this problem. At this point, I mainly recognize that most of the efforts I’ve lead to improve software development for the past several years had a very limited scope. The software itself became immediately better as a result of my efforts, its design became more sensible, and hopefully I contributed a bit to the growth of people around me, but at a company or even community-wide scope, all of these code reviews, sprints, and IRC conversations are buried for very rare revives.

I want to start doing something about this, though. There must be a way to shape these conversations in a more reusable format; in a way that knowledge and agreement can be more proactively preserved and scattered. Perhaps it’s more about how than it is about what. Perhaps we just need to write more posts like this, and cover more topics related to daily development findings. Not sure. I’ll be thinking…

Read more
niemeyer

I’m glad to announce experimental support for multi-document transactions in the mgo driver that integrates MongoDB with the Go language. The support is done via a driver extension, so it works with any MongoDB release supported by the driver (>= 1.8).

Features

Here is a quick highlight list to get your brain ticking before the details:

  • Supports sharding
  • Operations may span multiple collections
  • Handles changes, inserts and removes
  • Supports pre-conditions
  • Self-healing
  • No additional locks or leases
  • Works with existing data

Let’s see what these actually mean and how the goodness is done.


The problem being addressed

The typical example is a bank transaction: imagine you have two documents representing accounts for different people, and you want to transfer 100 bucks from Aram to Ben. Despite the apparent simplicity in that description, there are a number of edge cases that turn it into a non-trivial change.

Imagine an agent processing the change following these steps:

  1. Is Ben’s account valid?
  2. Take 100 bucks out of Aram’s account if its balance is above 100
  3. Insert 100 bucks into Ben’s account

Note that this description already assumes the availability of some single-document atomic operations as supported by MongoDB. Even then, how many race conditions and crash-related problems can you count? Here are some spoilers that hint at the problem complexity:

  • What if Ben cancels his account after (1)?
  • What if the agent crashes after (2)?

How it works

Thanks to the availability of single-document atomic operations, it is be possible to craft a sequence of changes that manipulate documents in a way that supports multi-document transactional behavior. This works as long as the clients agree to use the same conventions.

This isn’t exactly news, though, and there’s even documentation describing how one can explore these ideas. The challenge is in crafting a generic mechanism that not only does the basics but goes beyond by supporting inserts and removes, being workload agnostic, behaving correctly on crashes (!), and yet remaining pleasant to use. That’s the territory being explored.

The implemented semantics offers an isolation level that allows non-repeatable reads to occur (a partially committed transaction is visible), but the changes are guaranteed to only be visible in the order specified in the transaction, and once any change is done the transaction is guaranteed to be applied completely without intervening changes in the affected documents (no dirty reads). Among other things, this means one can use any existing mechanism at read time.

When writing documents that are affected by the transaction mechanism, one must necessarily use the API of the new mgo/txn package, which ended up surprisingly thin and straightforward. In other words for emphasis: if you modify fields that are affected by the transaction mechanism both with and without mgo/txn, it will misbehave arbitrarily. Fields that are read or written by mgo/txn must only be changed using mgo/txn.

Using the example described above, the bank account transfer might be done as:

runner := txn.NewRunner(tcollection)
ops := []txn.Op{{
        C:      "accounts", 
        Id:     "aram",
        Assert: M{"balance": M{"$gte": 100}},
        Update: M{"$inc": M{"balance": -100}},
}, {
        C:      "accounts",
        Id:     "ben",
        Assert: M{"valid": true},
        Update: M{"$inc": M{"balance": 100}},
}}
id := bson.NewObjectId() // Optional
err := runner.Run(ops, id, nil)

The assert and update values are usual MongoDB querying and updating documents. The tcollection is a MongoDB collection that is used to atomically insert the transaction details into the database. As long as that document makes it into the database, the transaction is guaranteed to be eventually entirely applied or entirely aborted. The exact moment when this happens is defined by whether there are other transactions in progress and whether a communication problem occurs and when it occurs, as described below.

Concurrency and crash-proofness

Perhaps the most interesting piece of the puzzle when coming up with a nice transaction mechanism is defining what happens when an agent misbehaves, even more in a world where there are multiple distributed transaction runners. If there are locks, someone must unlock when a runner crashes, and must know the difference between running slowly and crashing. If there are leases, the lease boundary becomes an issue. In both cases, the speed of the overall system would become bounded by the speed of the slowest runner.

Instead of falling onto those issues, the implemented mechanism observes the transactions being attempted on the affected documents, orders them in a globally agreed way, and pushes all of their operations concurrently.

To illustrate the behavior, imagine again the described scenario of bank transferences:

In this diagram there are two transactions being attempted, T1 and T2. The first is a transference from Aram to Ben, and the second is a transference from Ben to Carl. If a runner starts executing T2 while T1 is still being applied by a different runner, the first runner will pick T1 up and complete it before starting to work on T2 which is its real goal. This works even if the original runner of T1 died while it was in progress. In reality, there’s little difference between the original runner of T1 and another runner that observes T1 on its way.

There’s a chance that T1′s runner died too soon, though, and it hasn’t had a chance to even start the transaction by tagging Ben’s account document as participating in it. In that case, T2 will be pushed forward by its own runner independently, since there’s nothing on its way. T1 isn’t lost, though, and it may be resumed at any point by calling the runner’s Resume or ResumeAll methods.

The whole logic is implemented without introducing any new globally shared point of coordination. It works if documents are in different collections, different shards, and it works even if the transaction collection itself is sharded across multiple backends for scalability purposes.

The testing approach

While a lot of thinking was put onto the way the mechanism works, this is of course non-trivial and bug-inviting logic. In an attempt to nail down bugs early on, a testing environment was put in place to simulate multiple runners in a conflicting workload. To make matters more realistic, this simulation happens in a harsh scenario with faults and artificial slowdowns being randomly injected into the system. At the end, the result is evaluated to see if the changes performed respected the invariants established.

While hundreds of thousands of transactions have been successfully run in this fashion, the package should still be considered experimental at this point, and its API is still prone to change.

There’s one race

There’s one known race that’s worth mentioning, and it was consciously left there for the moment as a tradeoff. The race shows itself when inserting a new document, at the point in time when the decision has been made that the insert was genuinely good. At this exact moment, if that runner is frozen for long enough that would allow for a different runner to insert the document and remove it again, and then the original runner is unfrozen without any errors or timeouts, it will naturally go on and insert the new document.

There are multiple solutions for this problem, but they present their own disadvantages. One solution would be to manipulate the document instead of removing it, but that would leave the collection with ghost content that has to be cared for, and that’s an unwanted side effect. A second solution would be to use the internal applyOps machinery that MongoDB uses in its sharding implementation, but that would mean that collections affected by transactions couldn’t be sharded, which is another unwanted side effect (please vote for SERVER-1439 so we can use it).

Have fun!

I hope the package serves you well, and if you would like to talk further about it, please join the mgo-users mailing list and drop a message.

Read more
niemeyer

?Rob Pike just wrote an article/talk that is the best background on the origins of Go yet.

It surprises me how much his considerations match my world view pre-Go, and in a sense give me a fulfilling explanation about why I got hooked into the language. I still recall sitting in a hotel years ago with Jamu Kakar while we went through the upcoming C++0x standard (now C++11) and got perplexed about how someone could think that having details such as rvalue references and move constructors into the language specification was something reasonable.

Rob also expressed again the initial surprise that developers using languages such as Python and Ruby were more often the ones willing to migrate towards Go, rather than ones using C++, with some reasonable explanations about why that is so. While I agree with his considerations, I see Python going through the same kind of issue that caused C++ to be what it is today.

Consider this excerpt from PEP 0380 as evidence:

If yielding of values is the only concern, this can be performed without much difficulty using a loop such as

for v in g:
    yield v

However, if the subgenerator is to interact properly with the caller in the case of calls to send(), throw() and close(), things become considerably more difficult. As will be seen later, the necessary code is very complicated, and it is tricky to handle all the corner cases correctly.

A new syntax will be proposed to address this issue. In the simplest use cases, it will be equivalent to the above for-loop, but it will also handle the full range of generator behaviour, and allow generator code to be refactored in a simple and straightforward way.

This description has the same DNA that creates the C++ problem Rob talks about. Don’t get me wrong, I’m sure yield from will make a lot of people very happy, and that’s exactly the tricky part. It’s easy and satisfying to please a selection of users, but often that leads to isolated solutions that create new cognitive load and new corner cases that in turn lead to new requirements.

The history of generators in Python is specially telling:

  • PEP 0234 [30-Jan-2001] – Iterators – Accepted
  • PEP 0255 [18-May-2001] – Simple Generators – Accepted
  • PEP 0288 [21-Mar-2002] – Generators Attributes and Exceptions – Withdrawn
  • PEP 0289 [30-Jan-2002] – Generator Expressions – Accepted
  • PEP 0325 [25-Aug-2003] – Resource-Release Support for Generators – Rejected
  • PEP 0342 [10-May-2005] – Coroutines via Enhanced Generators – Accepted
  • PEP 0380 [13-Feb-2009] – Syntax for Delegating to a Subgenerator – Accepted

You see the rabbit hole getting deeper? I’ll clarify it further by rephrasing the previous quote from PEP 0380:

If [feature from PEP 0255] is the only concern, this can be performed without much difficulty using a loop [...] However, if the subgenerator is to interact properly with [changes from PEP 0342] things become considerably more difficult. [So we need feature from PEP 0380.]

Yet, while the language grows handling self-inflicted micro-problems, the real issue is still not solved. All of these features are simplistic forms of concurrency and communication, that don’t satisfy the developers, causing community fragmentation.

This happened to C++, to Python, and to many other languages. Go seems slightly special in that regard in the sense that its core development team has an outstanding respect for simplicity, yet dares to solve the difficult problems at their root, while keeping these solutions orthogonal so that they support each other. Less is more, and is not always straightforward.

Read more
Gustavo Niemeyer

A long time before I seriously got into using distributed version control systems (DVCS) such as Bazaar and Git for developing software, it was already well known to me how the mechanics of these systems worked, and why people benefited from them. That said, it wasn’t until I indeed started to use DVCS tools that I understood how much my daily workflow around code bases would be changed and improved.

This weekend, while flying home from MongoSV, I could experience that same feeling in relation to first class concurrency support in programming languages. Everybody knows how the feature may be used, but I have the feeling that until one actually experiences it in practice, it’s very hard to really understand how much the relationship with ordering while developing software may be improved.

I was having some fun working on improvements to Goetveld. This package allows Go programs to communicate with Rietveld servers to manipulate code review entries. The Rietveld API is a bit rough in a few places, and as a result some features of the package actually parse an HTML form to extract some data, before sending it back. You may have done something similar before while attempting to script a web site that wasn’t originally intended to be.

The interesting fact here is that this is an intrinsically serial procedure: load a form, change it, and send it back, right? Well, not really. As one might intuitively expect, establishing an SSL session and its underlying TCP connection are not instantaneous operations.

To give an idea, here is part of a dump of an SSL connection being initiated (that is, no HTTP data was sent yet) to codereview.appspot.com, originated from my home location:

# tcpdump -ttttt -i wlan0 'host codereview.appspot.com and port 443'
(...)
00:00:00.000000 IP (...)
00:00:00.000063 IP (...)
00:00:00.000562 IP (...)
00:00:00.341627 IP (...)
00:00:00.357009 IP (...)
00:00:00.357118 IP (...)
00:00:00.360362 IP (...)
00:00:00.360550 IP (...)
00:00:00.366011 IP (...)
00:00:00.689446 IP (...)
00:00:00.727693 IP (...)

That’s more than half a second before the application layer was even touched. So, turns out that to save that roundtrip time, we can start both the form loading and the form sending requests at the same time. By the time the form loading ends, processing the data locally is extremely fast, and we can complete the sending side by just providing the request body.

At this time you may be thinking something like “Ugh, that’s too much trouble.. why bother?”, and that highlights precisely the point I’d like to make: it is too much trouble because most people are used to languages that turn it into too much trouble, but the issue is not inherently complex. In fact, this is the entire implementation of this logic in Go:

func (r *Rietveld) UpdateIssue(issue *Issue) error {
        op := &opInfo{r: r, issue: issue}
        errs := make(chan error)
        ch := make(chan map[string]string, 1)
        go func() {
                errs <- r.do(&editLoadHandler{op: op, form: ch})
                close(ch)
        }()
        go func() {
                errs <- r.do(&editHandler{op: op, form: ch})
        }()
        return firstError(2, errs)
}

I'm not cheating. The procedure was being done serially before, with very similar logic. Previously it had to take the form variable itself from the first request and manually provide it to the next one. Now, instead of providing the form, it's providing a channel that will be used to send the form across. One might even argue that the channel makes the algorithm more natural, curiously.

This is the kind of procedure that becomes fun and natural to write, after having first class concurrency at hand for some time. But, as in the case of DVCS, it takes a while to get used to the idea that concurrency and simplicity are not necessarily at opposing ends.

Read more
niemeyer

Certainly one of the reasons why many people are attracted to the Go language is its first-class concurrency aspects. Features like communication channels, lightweight processes (goroutines), and proper scheduling of these are not only native to the language but are integrated in a tasteful manner.

If you stay around listening to community conversations for a few days there’s a good chance you’ll hear someone proudly mentioning the tenet:

Do not communicate by sharing memory; instead, share memory by communicating.

There is a blog post on the topic, and also a code walk covering it.

That model is very sensible, and being able to approach problems this way makes a significant difference when designing algorithms, but that’s not exactly news. What I address in this post is an open aspect we have today in Go related to this design: the termination of background activity.

As an example, let’s build a purposefully simplistic goroutine that sends lines across a channel:

type LineReader struct {
        Ch chan string
        r  *bufio.Reader
}

func NewLineReader(r io.Reader) *LineReader {
        lr := &LineReader{
                Ch: make(chan string),
                r:  bufio.NewReader(r),
        }
        go lr.loop()
        return lr
}

The type has a channel where the client can consume lines from, and an internal buffer
used to produce the lines efficiently. Then, we have a function that creates an initialized
reader, fires the reading loop, and returns. Nothing surprising there.

Now, let’s look at the loop itself:

func (lr *LineReader) loop() {
        for {
                line, err := lr.r.ReadSlice('n')
                if err != nil {
                        close(lr.Ch)
                        return
                }
                lr.Ch <- string(line)
        }
}

In the loop we'll grab a line from the buffer, close the channel in case of errors and stop, or otherwise send the line to the other side, perhaps blocking while the other side is busy with other activities. Should sound sane and familiar to Go developers.

There are two details related to the termination of this logic, though: first, the error information is being dropped, and then there's no way to interrupt the procedure from outside in a clean way. The error might be easily logged, of course, but what if we wanted to store it in a database, or send it over the wire, or even handle it taking in account its nature? Stopping cleanly is also a valuable feature in many circumstances, like when one is driving the logic from a test runner.

I'm not claiming this is something difficult to do, by any means. What I'm saying is that there isn't today an idiom for handling these aspects in a simple and consistent way. Or maybe there wasn't. The tomb package for Go is an experiment I'm releasing today in an attempt to address this problem.

The model is simple: a Tomb tracks whether the goroutine is alive, dying, or dead, and the death reason.

To understand that model, let's see the concept being applied to the LineReader example. As a first step, creation is tweaked to introduce Tomb support:

type LineReader struct {
        Ch chan string
        r  *bufio.Reader
        t  tomb.Tomb
}

func NewLineReader(r io.Reader) *LineReader {
        lr := &LineReader{
                Ch: make(chan string),
                r:  bufio.NewReader(r),
        }
        go lr.loop()
        return lr
}

Looks very similar. Just a new field in the struct, and the function that creates it hasn't even been touched.

Next, the loop function is modified to support tracking of errors and interruptions:

func (lr *LineReader) loop() {
        defer lr.t.Done()
        for {
                line, err := lr.r.ReadSlice('n')
                if err != nil {
                        close(lr.Ch)
                        lr.t.Kill(err)
                        return
                }
                select {
                case lr.Ch <- string(line):
                case <-lr.t.Dying():
                        close(lr.Ch)
                        return
                }
        }
}

Note a few interesting points here: first, Done is called to track the goroutine termination right before the loop function returns. Then, the previously loose error now goes into the Kill Tomb method, flagging the goroutine as dying. Finally, the channel send was tweaked so that it doesn't block in case the goroutine is dying for whatever reason.

A Tomb has both Dying and Dead channels returned by the respective methods, which are closed when the Tomb state changes accordingly. These channels enable explicit blocking until the state changes, and also to selectively unblock select statements in those cases, as done above.

With the loop modified as above, a Stop method can trivially be introduced to request the clean termination of the goroutine synchronously from outside:

func (lr *LineReader) Stop() error {
        lr.t.Kill(nil)
        return lr.t.Wait()
}

In this case the Kill method will put the tomb in a dying state from outside the running goroutine, and Wait will block until the goroutine terminates itself and notifies via the Done method as seen before. This procedure behaves correctly even if the goroutine was already dead or in a dying state due to internal errors, because only the first call to Kill with an actual error is recorded as the cause for the goroutine death. The nil value provided to t.Kill is used as a reason when terminating cleanly without an actual error, and it causes Wait to return nil once the goroutine terminates, flagging a clean stop per common Go idioms.

This is pretty much all that there is to it. When I started developing in Go I wondered if coming up with a good convention for this sort of problem would require more support from the language, such as some kind of goroutine state tracking in a similar way to what Erlang does with its lightweight processes, but it turns out this is mostly a matter of organizing the workflow with existing building blocks.

The tomb package and its Tomb type are a tangible representation of a good convention for goroutine termination, with familiar method names inspired in existing idioms. If you want to make use of it, go get the package with:

$ go get launchpad.net/tomb

The API documentation with details is available at:

http://gopkgdoc.appspot.com/pkg/launchpad.net/tomb

Have fun!

UPDATE 1: there was a minor simplification in the API since this post was originally written, and the post was changed accordingly.

UPDATE 2: there was a second simplification in the API since this post was originally written, and the post was changed accordingly once again to serve as reference.

Read more