Canonical Voices

Posts tagged with 'architecture'

Gustavo Niemeyer

One more Go library oriented towards building distributed systems hot off the presses: govclock. This one offers full vector clock support for the Go language. Vector clocks allow recording and analyzing the inherent partial ordering of events in a distributed system in a comfortable way.

The following features are offered by govclock, in addition to basic event tracking:

  • Compact serialization and deserialization
  • Flexible truncation (min/max entries, min/max update time)
  • Unit-independent update times
  • Traditional merging
  • Fast and memory efficient

If you’d like to know more about vector clocks, the Basho guys did a great job covering them in a pair of blog posts.

The following sample program demonstrates some sequential and concurrent events, dumping and loading, as well as merging of clocks. For more details, please look at the web page. The project is available under a BSD license.


package main

import (
    "launchpad.net/govclock"
    "fmt"
)

func main() {
    vc1 := govclock.New()
    vc1.Update([]byte("A"), 1)

    vc2 := vc1.Copy()
    vc2.Update([]byte("B"), 0)

    fmt.Println(vc2.Compare(vc1, govclock.Ancestor))   // => true
    fmt.Println(vc1.Compare(vc2, govclock.Descendant)) // => true

    vc1.Update([]byte("C"), 5)


    fmt.Println(vc1.Compare(vc2, govclock.Descendant)) // => false
    fmt.Println(vc1.Compare(vc2, govclock.Concurrent)) // => true

    vc2.Merge(vc1)

    fmt.Println(vc1.Compare(vc2, govclock.Descendant)) // => true

    data := vc2.Bytes()
    fmt.Printf("%#v\n", string(data))
    // => "\x01\x01\x01\x01A\x01\x01\x01B\x01\x00\x01C"

    vc3, err := govclock.FromBytes(data)
    if err != nil { panic(err.String()) }

    fmt.Println(vc3.Compare(vc2, govclock.Equal))      // => true
}

Gustavo Niemeyer

ZooKeeper is a clever generic coordination server for distributed systems, and is one of the core pieces of software facilitating the development of Ensemble (a project for automagic IaaS deployments which we are pushing at Canonical), so it was a natural choice to experiment with.

Gozk is a complete binding for ZooKeeper which exploits the native features of Go to ease the interaction with a ZooKeeper server. To avoid reimplementing the well-tested bits of the protocol in an unstable way, Gozk is built on top of the standard C ZooKeeper library.

The experience of integrating ZooKeeper with Go was certainly valuable in itself, and worked as a nice way to learn the details of integrating the Go language with a C library. If you’re interested in learning a bit about Go, ZooKeeper, or other details related to the creation of bindings and asynchronous programming, please fasten the seatbelt now.

Basics of C wrapping in Go

Creating the binding in itself was a pretty interesting experiment already. I have worked on the creation of quite a few bindings and language bridges before, and must say I was pleasantly surprised with the experience of creating the Go binding. With Cgo, the name given to the "foreign function interface" mechanism for C integration, one basically declares a special import statement which causes a pre-processor to look at the comment preceding it. Something similar to this:

// #include <zookeeper.h>
import "C"

The comment doesn’t have to be restricted to a single line, or even to #include statements. The C code contained in the comment will be transparently inserted into a helper C file which is compiled and linked with the final object file, and the given snippet will also be parsed and its inclusions processed. On the Go side, that "C" import is simulated as if it were a normal Go package, so that the C functions, types, and values are all directly accessible.
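
To make the mechanism concrete, here is a minimal, self-contained sketch, deliberately unrelated to gozk and using only the standard C library:

package main

/*
#include <stdlib.h>
#include <string.h>
*/
import "C"

import (
    "fmt"
    "unsafe"
)

func main() {
    // C.CString copies a Go string into C-allocated memory, which the Go
    // garbage collector will not touch, so it must be freed explicitly.
    cs := C.CString("hello, C")
    defer C.free(unsafe.Pointer(cs))

    // C functions, types, and values show up as members of the pseudo-package "C".
    fmt.Println(C.strlen(cs)) // => 8
}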

As an example, a C function with this prototype:

int zoo_wexists(zhandle_t *zh, const char *path, watcher_fn watcher,
                void *context, struct Stat *stat);

may be used in Go as:

cstat := C.struct_Stat{}
rc, cerr := C.zoo_wexists(zk.handle, cpath, nil, nil, &cstat)

When the C function is used in a context where two result values are requested, as done above, Cgo will save the well known errno variable after the function has finished executing and will return it wrapped into an os.Errno value.

Also, note how the C struct is defined in a way that it can be passed straight to the C function. Interestingly, the allocation of the memory backing the structure is performed and tracked by the Go runtime, and it will be garbage collected appropriately once no more references exist within the Go runtime. This fact has to be kept in mind, since the application will crash if a value allocated normally within Go is stored by a foreign C function and used after all the Go references to it are gone. The alternative in these cases is to call the usual C functions to get hold of memory for the involved values. That memory won’t be touched by the garbage collector, and, of course, must be explicitly freed when no longer necessary. Here is a simple example showing explicit allocation:

cbuffer := (*C.char)(C.malloc(bufferSize))
defer C.free(unsafe.Pointer(cbuffer))

Note the use of the defer statement above. Even when dealing with foreign functionality, it comes in handy. The above call will ensure that the buffer is deallocated right before the current function returns, for instance, so it’s a nice way to ensure no leaks happen, even if in the future the function suddenly gets a new exit point which didn’t consider the allocation of resources.

In terms of typing, Go is more strict than C, and Cgo-based logic will also ensure that the types returned and passed into the foreign C functions are correctly typed, in the same way done for the native types. Note above, for instance, how the call to the free() function has to explicitly convert the value into an unsafe.Pointer, even though in C no casting would be necessary to pass a pointer into a void * parameter.

The unsafe.Pointer is in fact a very special type within Go. Using it, one can convert any pointer type into any other pointer type in an unsafe way (thus the package name), and also back and forth into a uintptr value with the address of the memory referenced by the pointer. For every other type conversion, Go will ensure at compilation time that doing the conversion at runtime is a safe operation.
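
Here is a brief illustration of those conversions (the types are arbitrary, and the byte-level result naturally depends on the platform):

package main

import (
    "fmt"
    "unsafe"
)

func main() {
    value := int64(42)

    // Any pointer converts to unsafe.Pointer, and from there to any other
    // pointer type; the compiler simply trusts us on this one.
    p := unsafe.Pointer(&value)
    raw := (*[8]byte)(p)
    fmt.Println(raw[0]) // => 42 on a little-endian machine

    // unsafe.Pointer also converts back and forth to uintptr, exposing the
    // raw address of the referenced memory.
    fmt.Printf("%#x\n", uintptr(p))
}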

With all of these resources, including the ability to use common Go syntax and functionality even when dealing with foreign types, values, and function calls, the integration task turns out to be quite a pleasant experience. That said, some of the things may still require some good thinking to get right, as we’ll see shortly.

Watch callbacks and channels

One of the most interesting (and slightly tricky) aspects of mapping the ZooKeeper concepts into Go was the “watch” functionality. ZooKeeper allows one to attach a “watch” to a node so that the server will report back when changes happen to the given node. In the C library, this functionality is exposed via a callback function which is executed once the monitored node aspect is modified.

It would certainly be possible to offer this functionality in Go using a similar mechanism, but Go channels provide a number of advantages for that kind of asynchronous notification: waiting for multiple events via the select statement, synchronous blocking until the event happens, testing if the event is already available, etc.

The tricky bit, though, isn’t the use of channels. That part is quite simple. The tricky detail is that the C callback function execution happens in a C thread started by the ZooKeeper library, and happens asynchronously, while the Go application is doing its business elsewhere. Right now, there’s no straightforward way to transfer the execution of this asynchronous C function back into the Go land. The solution for this problem was found with some help from the folks at the golang-nuts mailing list, and luckily it’s not that hard to support or understand. That said, this is a good opportunity to get some coffee or your preferred focus-enhancing drink.

The solution works like this: when the ZooKeeper C library gets a watch notification, it executes a C callback function which is inside a Gozk helper file. Rather than transferring control to Go right away, this C function simply appends data about the event onto a queue, and signals a pthread condition variable to notify that an event is available. Then, on the Go side, once the first ZooKeeper connection is initialized, a new goroutine is fired and loops waiting for events to become available. The interesting detail about this loop is that it blocks within a foreign C function waiting for an event to be available, through the signaling of the shared pthread condition variable. On the Go side, this is how the call looks, just to give a more practical feeling:

// This will block until there's a watch available.
data := C.wait_for_watch()

Then, on the C side, here is the function definition:

watch_data *wait_for_watch() {
    watch_data *data = NULL;
    pthread_mutex_lock(&watch_mutex);
    if (first_watch == NULL)
        pthread_cond_wait(&watch_available, &watch_mutex);
    data = first_watch;
    first_watch = first_watch->next;
    pthread_mutex_unlock(&watch_mutex);
    return data;
}

As you can see, not really a big deal. When that kind of blocking occurs inside a foreign C function, the Go runtime will correctly continue the execution of other goroutines within other operating system threads.

The result of this mechanism is a nice-to-use interface based on channels, which may be explored in different ways depending on the application’s needs. Here is a simple example blocking on the event synchronously, for instance:

stat, watch, err := zk.ExistsW("/some/path")
if stat == nil && err == nil {
    event := <-watch
    // Use event ...
}
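
The same watch channel also combines nicely with the select statement for the other cases mentioned earlier, such as giving up after a timeout or testing whether an event is already available. A hedged sketch reusing the watch obtained above (the timeout policy is illustrative, not something imposed by gozk):

select {
case event := <-watch:
    // Use event ...
case <-time.After(30e9): // half a minute, in nanoseconds
    // Nothing happened in time; a default case here instead
    // would make the check non-blocking.
}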

Concluding

Those were some of the interesting aspects of implementing the ZooKeeper binding. I would like to speak about some additional details, but this post is rather long already, so I'll keep that for a future opportunity. The code is available under the LGPL, so if you're curious about some other aspect, or would like to use ZooKeeper with Go, please move on and check it out!

Gustavo Niemeyer

Continuing the sequence of experiments I’ve been running with the Go language, I’ve just made available a tiny but useful new package: gommap. As one would imagine, this new package provides access to low-level memory mapping for files and devices, and it allowed exploring a few new edges of the language implementation. Note that, strictly speaking, some of the details ahead are really more about the implementation than the language itself.

There were basically two main routes to follow when implementing support for memory mapping in Go. The first one is usually the way higher-level languages handle it. In Python, for instance, this is the way one may use a memory mapped file:

>>> import mmap
>>> file = open("/etc/passwd")
>>> mm = mmap.mmap(file.fileno(), 0, prot=mmap.PROT_READ)
>>> mm[0:4]
'root'

The way this was done has an advantage and a disadvantage which are perhaps not entirely obvious at first sight. The advantage is that the memory mapped area is truly hidden behind that interface, so any improper attempt to access a region which was already unmapped, for instance, may be blocked within the application with a nice error message which explains the issue. The disadvantage, though, is that this interface usually comes with the restriction that normal libraries can only use the memory region by copying data out of it. In the above example, for instance, the “root” string isn’t backed by the original mapped memory anymore, but is rather a copy of its contents (see PEP 3118 for a way to improve this aspect a bit with Python).

The other path, which can be done with Go, is to back a normal native array type with the allocated memory. This means that normal libraries don’t need to copy data out of the mapped memory, or to use a special memory saving interface, to deal with the memory mapped region. As a simple example, this would get the first line in the given file:

mmap, err := gommap.Map(file.Fd(), gommap.PROT_READ, gommap.MAP_PRIVATE)
if err == nil {
    end := bytes.Index(mmap, []byte{'\n'})
    firstLine := mmap[:end]
}

In the procedure above, mmap is defined as an alias to a native []byte slice, so even though the standard bytes package was used, at no point was the data from the memory mapped region copied out, nor were any auxiliary buffers allocated, so this is a very fast operation. To give an idea about this, let’s pretend for a moment that we want to increase a simple 8 bit counter in a file. This might be done with something as simple as:

mmap[13] += 1

This line of code would be compiled into something similar to the following assembly (amd64):

MOVQ    mmap+-32(SP),BX
CMPL    8(BX),$13
JHI     ,68
CALL    ,runtime.panicindex+0(SB)
MOVQ    (BX),BX
INCB    ,13(BX)

As you can see, this is just doing some fast index checking before incrementing the value directly in memory. Given that one of the important reasons why memory mapped files are used is to speed up access to disk files (sometimes large disk files), this advantage in performance is actually meaningful in this context.

Unfortunately, though, doing things this way also has an important disadvantage, at least right now. There’s no way at the moment to track references to the underlying memory, which was allocated by means not known to the Go runtime. This means that unmapping this memory is not a safe operation. The munmap system call will simply take the references away from the process, and any further attempt to touch those areas will crash the application.

To give you an idea about the background “magic” which is going on to achieve this support in Go, here is an interesting excerpt from the underlying mmap syscall as of this writing:

addr, _, errno := syscall.Syscall6(syscall.SYS_MMAP, (...))
(...)
dh := (*reflect.SliceHeader)(unsafe.Pointer(&mmap))
dh.Data = addr
dh.Len = int(length)
dh.Cap = dh.Len

As you can see, this is taking apart the memory backing the slice value into its constituent structure, and altering it to point to the mapped memory, including information about the mapped length, so that the bounds checking observed in the assembly above will work correctly.

In case the garbage collector is at some point extended to track references to these foreign regions, it would be possible to implement some kind of UnmapOnGC() method which would only unmap the memory once the last reference is gone. For now, though, the advantages of being able to reference memory mapped regions directly, at least to me, surpass the danger of having improper slices of the given region being used after unmapping. Also, I expect that usage of this kind of functionality will generally be encapsulated within higher level libraries, so it shouldn’t be too hard to keep the constraint in mind while using it this way.

For those reasons, gommap was implemented with the latter approach. In case you need memory mapping support for Go, just move ahead and goinstall launchpad.net/gommap.

UPDATE (2010-12-02): The interface was updated so that mmap itself is an array, rather than mmap.Data, and this post was changed to reflect this.

Gustavo Niemeyer

A while ago Martin Pool made a very interesting post on the design of interfaces, inspired by a talk given by Rusty Russell in 2003.

Besides the interesting scale of interface quality explained there, this is a very insightful comment, often overlooked:


Once the code gets too big for one person, it’s all about damage control. Interfaces make damage control possible… except when the interfaces themselves are the problem.

Designing a system as a group of people requires splitting tasks up among team members, and/or the community, and perhaps even separate teams. Interfaces are the touch points of that splitting, and are what represents the functionality offered by the module/library/file/command/service/whatever. Too often, people spend a long time working on the implementation details, thinking really deeply about how to obtain the desired behavior, and forget to define clearly what the interface to that behavior is.

Having good interfaces is a key aspect of software development, and getting them right offers a number of important benefits:

Good encapsulation

Having good encapsulation is pretty much a synonym for having good interfaces. Too often, though, people focus on the encapsulation of the small pieces (the functions, the classes, etc), and forget about the encapsulation of the larger blocks (the libraries, modules, packages, commands, etc).

Also, in my experience trying to encourage good architectures, I have found that stating “We need good encapsulation!” gives developers no tangible line of action. It reminds me of a parent telling the child “You should be responsible!”. Sure, encapsulation and responsibility both sound great, but.. what does that really mean?

When inviting developers to think about the interfaces of the system parts they are responsible for, encapsulation becomes a natural outcome. It’s clear that there must be a line drawn between that part of the system and the rest, and the shape of this line must be considered while (or even better, before) the behavior is implemented.

Given well designed interfaces, the additional requirement of only using other parts of the system through their public interfaces seals the achievement of good encapsulation. Ideally, this barrier would be a natural property of the language used to develop the system (see the interface quality scale in Martin’s post). In other cases, this must be achieved through conventions, agreements, and good documentation.

Improved scope and communication

By inviting developers to think about the interfaces of the parts they are responsible for, one is basically encouraging the consideration of the interaction between those pieces and the rest of the system. This process gives an interesting perspective, both in terms of the external expectations (what do I need to offer other people?), as well as the internal goals (what do I need to implement for satisfying what other people need?).

Besides helping people figure out the scope and goal of the piece being developed, this will also give a nice structure to some of the communication which must inevitably happen to correctly integrate the separate parts of the system being developed.

Improved testing and experimentation

If an interface is well designed and defined, and encapsulates part of the functionality of the system well, it significantly improves the testing and experimentation related to that part of the system. Again, this has an effect both internally and externally to the interface.

Internally in the sense that there’s a clear boundary between the part in development and the rest of the system, and thus it should be easier to verify that the bits which compose it are working according to plan without dragging in the whole system, and also to verify that the interface itself is behaving as intended (and hopefully as documented).

Externally in the sense that, given that there’s agreement regarding what is the public interface to the part being considered, one may easily provide a test double (a fake, or dummy, or mock) to simulate that part of the system. This is well known to be useful in a number of ways:

  • Dependent work may be run in parallel by different people
  • Real implementation backing the given interface may be postponed, until the idea is proven useful, and the interface feels suitable
  • External systems which would be hard to run locally may be simulated so that tests run fast and cheap, even without network connections
  • Faults may be injected in the system via the test doubles to verify behavior in hostile conditions

and so on.
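
To make the test double point a bit more tangible, here is a minimal sketch in Go (all names are made up for the example): code written against a small interface may be exercised either with the real implementation or with a fake one which records calls and injects faults on demand.

// The part of the system under development depends only on this
// interface, not on the real gateway behind it.
type MailSender interface {
    Send(to, message string) error
}

// Production implementation, likely slow and network-bound.
type SMTPSender struct {
    addr string
}

func (s *SMTPSender) Send(to, message string) error {
    // Talk to the real server ...
    return nil
}

// Test double: records what was sent, and may fail on request.
type FakeSender struct {
    Sent []string
    Fail error
}

func (f *FakeSender) Send(to, message string) error {
    if f.Fail != nil {
        return f.Fail
    }
    f.Sent = append(f.Sent, to+": "+message)
    return nil
}

// Anything written against MailSender may be tested with a FakeSender,
// cheaply, offline, and under hostile conditions when Fail is set.
func NotifyTeam(sender MailSender, message string) error {
    return sender.Send("team@example.com", message)
}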

Quality isolation

This point is also my understanding of what Rusty refers to as damage control in his talk. This property is very useful when designing a system, but even then it’s often missed when discussing interfaces and encapsulation.

If there’s a well defined interface to a piece of functionality in the system, and that interface was carefully considered to cover the needs of the system, the implementation of that interface may not start as the most beautiful, or most scalable, or even most reliable piece of software. As any developer responsible for a successful startup will happily point out, a half-baked implementation is often good enough to get things going, prove the concept, and extend the project runway.

Good interfaces play an important role in this kind of situation. They are, in this sense, a way to be better prepared for success (or, for failure, depending on the perspective). If the interface implementation suddenly becomes an issue for whatever reason, the implementation itself may be replaced by something which better suits the current reality, while preserving the interaction with the rest of the system.

Of course, it’s still very hard to predict future system behavior when facing a completely different reality. Changing the scale requirements for the system a few orders of magnitude, for instance, may easily break existing assumptions, and interfaces designed around these assumptions. Still, even if good interfaces won’t be enough to avoid modifications in the architecture and integration points in many cases, they will certainly help framing the conversations which will take place when this happens and new interfaces must be developed.

Conclusion

When developing non-trivial software products, there’s no other way but to split the problem solving into several layers and components. Looking at the points where these layers and components touch each other is a very useful and natural way to organize conversations and to structure the work which must take place to push the product forward.

It’s quite revealing to look at the points above, and note that it’s not simply the existence of interfaces themselves which presents the advantages described, but the process which they encourage around them. Software architecture is essentially about people.

Gustavo Niemeyer

It’s time to release my “side project” which has been evolving over the last several months: Gocheck. I’ve been watching Go for some time, and have been getting more and more interested in the language. My first attempt to write something interesting in it made it obvious that there would be benefit in having a richer testing platform than what is available in the standard library. That said, I do understand why the standard one is slim: it’s pretty minimalist, because it’s used by itself to test the rest of the platform. With Gocheck, though, I don’t have that requirement. I’m able to trust that the standard library works well, and focus on having features which will make me more productive while writing tests, including features such as:

  • Better error reporting
  • Richer test helpers: assertions which interrupt the test immediately, deep multi-type comparisons, string matching, etc
  • Suite-based grouping of tests
  • Fixtures: per suite and/or per test set up and tear down
  • Management of temporary directories
  • Panic-catching logic, with proper error reporting
  • Proper counting of successes, failures, panics, missed tests, skips, etc
  • Support for expected failures
  • Fully tested (yes, it manages to test itself reliably!)

That last point was actually quite fun to get right. It’s the first time I wrote a testing framework from the ground up, and of course I wanted to have it fully tested by itself, but I didn’t want to simply use a foreign testing framework to test it. So what it does is basically to have a “bootstrapping” phase, which ensures that the very basic parts of the library work, without relying on pretty much any internal functionality (e.g. it verifies the number of executed functions, and works with low-level panics). Then, once the lower layers are trusted, tests for higher functionality were introduced by building on the trusted bits.
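
For a taste of what the features above look like in practice, here is a rough sketch of a test file using Gocheck (assuming the launchpad.net/gocheck import path in use at the time of this writing):

package example_test

import (
    "testing"

    . "launchpad.net/gocheck"
)

// Hook Gocheck's suites into the standard testing runner.
func Test(t *testing.T) { TestingT(t) }

type MySuite struct {
    dir string
}

var _ = Suite(&MySuite{})

// Per-test fixture: runs before each test method in the suite.
func (s *MySuite) SetUpTest(c *C) {
    s.dir = c.MkDir() // temporary directory, removed automatically
}

func (s *MySuite) TestArithmetic(c *C) {
    c.Assert(21*2, Equals, 42)
    c.Check("gopher", Matches, "go.*")
}

In the sketch above, c.Assert interrupts the test immediately when the check fails, while c.Check records the failure and carries on.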

Gocheck has actually been mostly ready for some time now, but I’ve been polishing edges with some real world usage before releasing it. Since both the real world usage and Gocheck itself are side projects, you can imagine that took a bit of time. Today, though, I’ve managed to fix the last few things which were bothering me, so it’s up for world consumption.

I hope you enjoy it, and make some good use of it so that we can all have more reliable software. ;-)

Gustavo Niemeyer

A bit of history

I don’t know exactly why, but I’ve always enjoyed IRC bots. Perhaps it’s the fact that it emulates a person in an easy-to-program way, or maybe it’s about having a flexible and shared “command line” tool, or maybe it’s just the fact that it helps people perceive things in an asynchronous way without much effort. Probably a bit of everything, actually.

My bot programming started with pybot many years ago, when I was still working at Conectiva. Besides having many interesting features, this bot eventually got into an abandonware state, since Canonical already had pretty much equivalent features available when I joined, and I had other interests which got in the way. The code was a bit messy as well.. it was a time when I wasn’t very used to testing software properly (a friend has a great excuse for that kind of messy software: “I was young, and needed the money!”).

Then, a couple of years ago, while working on the Landscape project, there was an opportunity to make some information more visible to the team. Coincidentally, it was also a time when I wanted to get some practice with the concepts of Erlang, so I decided to write a bot from scratch with some nice support for plugins, just to get a feeling of how the promised stability of Erlang actually took place for real. This bot is called mup (Mup Pet, more formally), and its code is available publicly through Launchpad.

This was a nice experiment indeed, and I did learn quite a bit about the ins and outs of Erlang with it. Somewhat unexpected, though, was the fact that the bot grew a few extra features which multiple teams in Canonical started to appreciate. This was of course very nice, but it also made it more obvious that the egocentric reason for having a bot written in Erlang would now hurt, because most of Canonical’s own coding is done in Python, and that’s what internal tools should generally be written in for everyone to contribute and help maintain the code.

That’s where the desire of migrating mup into a Python-based brain again came from, and having a new feature to write was the perfect motivator for this.

LDAP and two-way SMSing over IRC

Canonical is a very distributed company. Employees are distributed over dozens of countries, literally. Not only that, but most people also work from their homes, rather than in an office. Many different countries also means many different timezones, and working from home with people from different timezones means flexible timing. All of that means communication gets… well.. interesting.

How do we reach someone that should be in an online meeting and is not? Or someone that is traveling to get to a sprint? Or how can someone that has no network connectivity reach an IRC channel to talk to the team? There are probably several answers to this question, but one of them is of course SMS. It’s not exactly cheap if we consider the cost of the data being transferred, but pretty much everyone has a mobile phone which can do SMS, and the model is not that far away from IRC, which is the main communication system used by the company.

So, the itch was itching. Let’s scratch it!

Getting the mobile phone of employees was already a solved problem for mup, because it had a plugin which could interact with the LDAP directory, allowing people to do something like this:

<joe> mup: poke gustavo
<mup> joe: niemeyer is Gustavo Niemeyer <…@canonical.com> <time:…> <mobile:…>

This just had to be migrated from Erlang into a Python-based brain for the reasons stated above. This time, though, there was no reason to write something from scratch. I could even have used pybot itself, but there was also supybot, an IRC bot which started around the same time I wrote the first version of pybot, and unlike the latter, supybot’s author was much more diligent in evolving it. There is quite a comprehensive list of plugins for supybot nowadays, and it includes means for testing plugins and so on. The choice of using it was straightforward, and getting “poke” support ported into a plugin wasn’t hard at all.

So, on to SMSing. Canonical already had a contract with an SMS gateway company, established to test-drive some ideas on Landscape. With the mobile phone numbers coming out of the LDAP directory in hand and an SMS contract established, all that was needed was a plugin for the bot to talk to the SMS gateway. That “conversation” with the SMS gateway allows not only sending messages, but also receiving SMS messages which were sent to a specific number.

In practice, this means that people who are connected to IRC can very easily deliver an SMS to someone using their nicks. Something like this:

<joe> @sms niemeyer Where are you? We’re waiting!

And this would show up in the mobile screen as:

joe> Where are you? We’re waiting!

In addition to this, people who have no connectivity can also contact individuals and channels on IRC, with mup working as a middleman. The message would show up on IRC in a similar way to:

<mup> [SMS] <niemeyer> Sorry, the flight was delayed. Will be there in 5.

The communication from the bot to the gateway happens via plain HTTPS. The communication back is a bit more complex, though. There is a small proxy service deployed in Google App Engine to receive messages from the SMS gateway. This was done to avoid losing messages when the bot itself is taken down for maintenance. The SMS gateway doesn’t handle this case very well, so it’s better to have something which will be up most of the time buffering messages.

A picture is worth 210 words, so here is a simple diagram explaining how things got linked together:

This is now up for experimentation, and so far it’s working nicely. I’m hoping that in the next few weeks we’ll manage to port the rest of mup into the supybot-based brain.

Gustavo Niemeyer

I was just rambling randomly yesterday, on the usual microblogging platforms, about how result checking seems to be ignored or done badly. The precise wording was:

It’s really amazing how little attention error handling receives in most software development. Even *tutorials* often ignore it.

It indeed does amaze me. It sometimes feels like we write code for theoretical perfect worlds.. “If the processor executes exactly in this order, and the weather is calm, this program will work.”. There are countless examples of bad assumptions.. someday I will come up with some statistics of the form “Every N seconds someone forgets to check the result of write().”.
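
Since write() was picked on above, here is the kind of checking being talked about, sketched in Go (file, data, and filename stand in for whatever the surrounding program defines):

n, err := file.Write(data)
if err != nil {
    // Full disks, closed descriptors and broken pipes do happen.
    log.Fatalf("writing %s: wrote %d of %d bytes: %v", filename, n, len(data), err)
}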

If you are a teacher, or a developer that enjoys writing snippets of code to teach people, please join me in the quest of building a better future. Do not tell us that you’re “avoiding result checking for terseness”, because that’s exactly what people will do (terseness is good, right?). On the contrary, take this chance to make us feel bad about avoiding result checking. You might do this by putting a comment like “If you don’t do this, you’re a bad programmer.” right next to the logic which is handling the result, and you might take this chance to teach people how proper result handling is done.

Of course, there’s another forgotten art related to result checking. It sits on the other side of the fence. If you are a library author, do think through how you plan to make us check conditions which happen inside your library, and try to imagine how to make our lives easier. If we suck at handling results when there are obvious ways to handle them, you can imagine what happens when you structure your result logic badly.

Here is a clear example of what not to do, coming straight from Python’s standard library, in the imaplib module:

    def login(self, user, password):
        typ, dat = self._simple_command('LOGIN', user, self._quote(password))
        if typ != 'OK':
            raise self.error(dat[-1])
        self.state = 'AUTH'
        return typ, dat

You see the problem there? How do you handle errors from this library? Should we catch the exception, or should we verify the result code? “Both!” is the right answer, unfortunately, because the author decided to do us a little favor and check the error condition himself in some arbitrary cases and raise the error, while letting it go through and end up in the result code in a selection of other arbitrary cases.
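
For contrast, here is a hedged sketch of the same operation exposing a single, consistent error path, in Go for the sake of an example (Client, simpleCommand, and quote are made up for the illustration):

// Login authenticates against the server. Every failure mode, whether a
// network problem or a non-OK status, comes back through the single err
// result, so callers have exactly one thing to check.
func (c *Client) Login(user, password string) (data []string, err error) {
    typ, dat, err := c.simpleCommand("LOGIN", user, c.quote(password))
    if err != nil {
        return nil, err
    }
    if typ != "OK" {
        return nil, fmt.Errorf("login failed: %s", dat[len(dat)-1])
    }
    c.state = "AUTH"
    return dat, nil
}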

I may provide some additional advice on result handling in the future, but for now I’ll conclude with the following suggestion: please check the results from your actions, and help others to check theirs. That’s a good life-encompassing recommendation, actually.

Gustavo Niemeyer

In the last post, we’ve seen some security issues which exist in the Android password manager gbaSafe version 1.1.0a, by analyzing the security description provided on its web site. As described there, even though the system depends on a “master key” which might be secure, the security of the system is seriously compromised by the encouragement of very weak keys (a few digits only) in what is named an “unlock key”, used to encrypt the master key itself. All of that in an application which claims to strongly protect people’s data from unwanted eyes.

In this post, we will play a bit with the Linux-based Android OS to actually explore these security deficiencies, demonstrating that such issues are very real, and that the claim that it is hard to unveil the data is unfounded. Since the most serious weakness lies in the key itself, we’ll run a simple brute force attack to try to find arbitrary unlock keys.

This procedure is actually mentioned by the author of gbaSafe himself in the web page, except he overestimates the work involved in producing such a mechanism:

Theoretically, somebody could write a program that tries to decrypt the master key by trying all possible values of the short key (with 4 digits there are only 10000 possibilities), but this would still be much work, as more information about the crypting algorithm is needed (e.g. salt bytes, iteration count).

So let’s get started.

As a first step, we’ll need the Android SDK with a working emulator (I’ve used API level 5, revision 1), and a copy of the application itself. I got a trial version of the application from AndAppStore.com.

The downloaded application is bundled within an .apk file, which is really a .zip file that may be opened up normally with any tool which understands this file format.

Once that’s done, we get access to all the information needed to run the application, including icons, interface layouts, and most importantly in this case, the bytecode which targets the Dalvik VM. This bytecode is the end result of a sequence of translations which happen when the program’s Java source code is compiled, so that’s what we’ll have to fiddle with to figure details of the application we want to investigate.

The bytecode is located inside the classes.dex file, and as expected it’s not easy to read in its native format. Luckily, though, a smart guy has already written a couple of tools, smali and baksmali, which allow people to decompile and recompile that bytecode format to/from something that is much easier to understand.

After downloading these tools, the following command should decompile the file:

$ java -jar baksmali.jar --output classes classes.dex

We now have a classes/ directory full of .smali files.

Before going any further, let’s ponder for a moment about what we want to do. A brute force attack is when we attempt many possible keys in sequence, and given the context already presented, what we’re after is attempting different “unlock keys”. With that in mind, we’ll introduce a very small modification in the application so that it will attempt unlock keys automatically, rather than reporting an error when the key entered in the unlock dialog is invalid.

After some quick research, it looks like the onClick() method within the DlgUnlock.smali file is a pretty good candidate. This is supposedly called when the button in the unlock dialog is clicked, so something interesting about the password being correct or not must happen there.

Before doing anything there, I’ve increased the number of registers in the function to 12, to get some additional registers to play with, and then initialized a register with the value zero, to serve as a monotonically increasing number (our keys!):

.method public onClick(Landroid/view/View;)V
- .registers 9
+ .registers 12
.parameter "view"
+ const/16 v9, 0x0

Then, we have to instruct the program to use these keys rather than whatever is typed in the dialog box. Some lines down, we’ll see a call to the checkUnlockKey() method, which is certainly what we’re after. Let’s do this now:

+ :mykey
+ invoke-static {v9}, Ljava/lang/String;->valueOf(I)Ljava/lang/String;
+ move-result-object v2
invoke-static {v2}, Lcom/gbizapps/safeA/Crypt;->checkUnlockKey(Ljava/lang/String;)I

Now, what if this key is wrong? We don’t want the master key to be removed as mentioned in the software description. We want to simply attempt the next key. With some analysis, we see that in case of errors, the next couple of lines below the above code will instruct the VM to jump to an error branch. Rather than following up with the normal error logic, we’ll increment the key, and jump back to the above code:

:cond_6c
+ add-int/lit8 v9, v9, 0x1
+ goto :mykey

Now we just have to rebundle this and put it into the emulator. I won’t go over it in too much detail here, since there’s plenty of information available online, but the steps to do that are:

  1. Recreate a modified classes.dex with smali
  2. Recreate a modified .apk file by just zipping the modified content
  3. Sign and zipalign the new .apk file
  4. Install it

And that’s it, seriously! This would be enough to break the software’s security if the application were working correctly.

Interestingly, though, the software wasn’t working correctly with this change. Instead, it was Force Closing on certain keys. To test it out, use the master key "master key", and the unlock key "999999", and then once you close and open the application again, try to unlock it with the key "1175". Instead of showing an error message, it will break badly.

Now, for the proof of concept to work, I actually had to fix the bug, which felt a bit funny to do given the context.

Looking at the traceback through adb logcat, I found out that there was a null being dereferenced in the file Crypt.smali, so I fixed the problem by injecting some error checking at this position and jumping the flow into an existing error branch:

+ if-eqz v3, :cond_5a
const-string v4, "ucpmhkexov85MDKhdfdfFGQPYxywq7209fcrqhghjkuiopy"

With this in place came the biggest surprise of the experiment. The keys which were crashing the application were special, in the sense that they actually decode the master key successfully! That’s right: whatever the algorithm is doing, that six-digit "999999" encrypts the master key in such a way that attempting the "1175" key works, so even big keys are rendered extremely weak with the logic used to encrypt the master key.

At this point, I added some trivial logic to display the key found with a Toast, just to ensure the whole thing was working correctly:

Toast displaying unlock key found

Note that the key generation implemented above is a bit simplistic, in the sense that it doesn’t attempt keys with leading zeros, but this would be trivial to implement, and my intention here isn’t to actually break any keys for real, but just to show how the promised security in this application is not to be trusted at all. The logic above alone is already enough for a brute force attack against the application, and has broken all the keys I’ve tried in mere seconds, on a slow emulator.

As a conclusion, if you want to put your data in a secure place, rather than picking an application which promises security because the salt is hidden somewhere or because it’s too much work to figure out its logic, pick an open source application whose logic is publicly verifiable and has already had many eyes over it. Chances are that doing something like what was described in this post won’t be so trivial. Then, choose your keys wisely! The most secure application won’t be enough if you pick a bad key.

Gustavo Niemeyer

For some time now I’ve been wanting to research more deeply into the internals of Android. Until now, though, this was just a sentiment. Then, a couple of weeks ago I finally managed to replace my iPhone with an Android phone, and that was the final motivator for me to actually get into learning more about the inner workings of the Linux-based OS.

Now, I just had to pick an actual task to dig into. The Dalvik VM is certainly one of the most innovative and advertised technical details about the OS, so something around it would be a nice start.. some kind of bytecode fiddling perhaps, but what? Luckily, even without trying too hard, I eventually stumbled upon an interesting case to research.

The “victim” of this research is the application gbaSafe version 1.1.0a, which claims to protect user passwords using unbreakable algorithms (how’s that for a hint of a Snake oil case?).

Before we get into some hacking, let’s look at some words on the software’s security by the author himself, and then render some analysis on the conceptual issues in it:

The confidential data can only be decrypted if the master key is known. You should choose a long key (at least 16 characters) with mixed case and unreadable text. Of course you cannot enter this key each time you want to access the confidential data, so it is stored in the user settings encrypted with a shorter key (4 to 6 digits) and normally you only have to enter this unlock key. Theoretically it is possible to try all possible values (brute force attack), but then you must use another program, since gbaSafe deletes the encrypted master key from the user settings when you enter the unlock key wrong three times repeatedly, and then you must enter the master key. If you wrote a program to decrypt the master key, you would have to know the algorithm used, the salt bytes and iteration count (used to augment the short unlock key), which are very hard to extract from the binary program module gbaSafe.

If you have some security background, I’m sure that by now you’re already counting the issues on this single paragraph.

The most obvious issue is the fact that there’s a “strong key” and a “weak key”, and the strong key is encrypted with the weak one. This is a very common cryptography sin, as my friend and coworker Andreas Hasenack (a security researcher himself) would say. A security system is only as secure as its weakest spot. It obviously makes little difference for an attacker whether he has to attempt decrypting the master key or the actual data, since decrypting the master key will give access to the data.

Then, it mentions en passant that the software enforces the use of digits for the weak key. This ensures that the weak key is really weak! Four digits is basically ten thousand attempts, which is absolutely nothing for today’s hardware. This number would move up to about 15 million by simply allowing upper and lowercase letters as well (which isn’t great either, but a few orders of magnitude never hurt in this scenario).
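
A quick back-of-the-envelope check of those numbers, using the key length and alphabets from the description above:

package main

import "fmt"

func main() {
    const digits = 10          // 0-9
    const alnum = 10 + 26 + 26 // digits plus upper and lowercase letters

    // Number of distinct 4-character keys for each alphabet.
    fmt.Println(pow(digits, 4)) // => 10000
    fmt.Println(pow(alnum, 4))  // => 14776336, roughly 15 million
}

func pow(base, exp int) int {
    result := 1
    for i := 0; i < exp; i++ {
        result *= base
    }
    return result
}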

It follows up encouraging people to think that it’s actually hard to figure out the algorithm and other implementation details. Considering that there’s absolutely nothing preventing people from getting their hands on the implementation itself, this is in fact asserting that the security mechanism is based on the ignorance of the attacker. Counting on the ignorance of people is bad at all times, and in a security context it’s a major error.

There’s a final security issue in this description which is a bit more subtle, but further analysis of the logic used leaves no doubt. In cryptography, the salt is supposed to increase the work needed in a brute force attack by strengthening the number of bits of the actual passphrase, in the case where the salt is actually unavailable, or at least to prevent a single large word dictionary from being used to attack several encryptions or hashes at once, in the case where the salt is known but variable. In the latter case, it helps because encrypting a single key with two different salts must be done twice, rather than once, so it increases the computational task when attacking multiple items. A salt which is known and does not change across all processed items is worth pretty close to nothing.

So, indeed, considering the many security issues here, this isn’t something I’d store my passwords or credit card numbers on, and I suggest you don’t do it either.

In my next post on this topic I’ll actually implement a trivial brute force attack to prove that these issues are very real, and that, actually, it’s not even hard to break into a security system like this.

The application author has been contacted about this blog post, since he’ll likely want to fix some of these issues.

Gustavo Niemeyer

Some interesting changes have been happening in my professional life, so I wanted to share them here to update friends and also for me to keep track of things over time (at some point I will be older and will certainly laugh at what I called “interesting changes” in the ol’ days). Given the goal, I apologize but this may come across as more egocentric than usual, so please feel free to jump over to your next blog post at any time.

It’s been little more than four years since I left Conectiva / Mandriva and joined Canonical, in August of 2005. Shortly after I joined, I had the luck of spending a few months working on the different projects which the company was pushing at the time, including Launchpad, then Bazaar, then a little bit on some projects which didn’t end up seeing much light. It was a great experience by itself, since all of these projects were abundant in talent. Following that, in the beginning of 2006, counting on the trust of people who knew more than I did, I was requested/allowed to lead the development of a brand new project the company wanted to attempt. After a few months of research I had the chance to sit next to Chris Armstrong and Jamu Kakar to bootstrap the development of what is now known as the Landscape distributed systems management project.

Fast forward three and a half years, to mid 2009, and Landscape became a massive project with hundreds of thousands of very well tested lines, spawning not only a client branch, but also external child projects such as the Storm Object Relational Mapper, in use also by Launchpad and Ubuntu One. On the commercial side of things it looks like Landscape’s life is just starting, with its hosted and standalone versions getting more and more attention from enterprise customers. And the three guys who started the project didn’t do it alone, for sure. The toy project of early 2006 has grown to become a well structured team, with added talent spread across areas such as development, business and QA.

While I wasn’t watching, though, something happened. Facing that great action, my attention was slowly being spread thinly among management, architecture, development, testing, code reviews, meetings, and other tasks, sometimes in areas not entirely related, but very interesting of course. The net result of increased attention sprawl isn’t actually good, though. If it persists, even when the several small tasks may be individually significant, the achievement just doesn’t feel significant given the invested effort as a whole. At least not for someone that truly enjoys being a software architect, and loves to feel that the effort invested in the growth of a significant working software is really helping people out in the same magnitude of that investment. In simpler words, it felt like my position within the team just wasn’t helping the team out the same way it did before, and thus it was time for a change.

Last July an external factor helped to catapult that change. Eucalyptus needed a feature to be released with Ubuntu 9.10, due in October, to greatly simplify the installation of some standard machine images.. an Image Store. It felt like a very tight schedule, even more considering that I hadn’t been doing Java for a while, and Eucalyptus uses some sexy (and useful) new technology called the Google Web Toolkit, something I had to get acquainted with. Two months looked like a tight schedule, and a risky bet overall, but it also felt like a great opportunity to strongly refocus on a task that needed someone’s attention urgently. Again I was blessed with trust I’m thankful for, and by now I’m relieved to look back and perceive that it went alright, certainly thanks to the help of other people like Sidnei da Silva and Mathias Gug. Meanwhile, on the Landscape side, my responsibilities were distributed within the team so that I could be fully engaged on the problem.

Moving this forward a little bit we reach the current date. Right now the Landscape project has a new organizational structure, and it actually feels like it’s moving along quite well. Besides the internal changes, a major organizational change also took place around Landscape over that period, and the planned restructuring led me to my current role. In practice, I’m now engaging in the research of a new concept which I’m hoping to publish openly quite soon, if everything goes well. It’s challenging, it’s exciting, and most importantly, it allows me to focus strongly on something which has great potential (I will stop teasing you now). In addition to this, I’ll definitely be spending some of that time on the progress of Landscape and the Image Store, but mostly from an architectural point of view, since both of these projects will have bright hands taking care of them more closely.

Sit by the fireside if you’re interested in the upcoming chapters of that story. ;-)

Gustavo Niemeyer

More than 40 years ago, a guy named Douglas Parkhill described the concept of utility computing. He described it as containing features such as:

  • Essentially simultaneous use of the system by many remote users.
  • Concurrent running of different multiple programs.
  • Availability of at least the same range of facilities and capabilities at the remote stations as the user would expect if he were the sole operator of a private computer.
  • A system of charging based upon a flat service charge and a variable charge based on usage.
  • Capacity for indefinite growth, so that as the customer load increases, the system can be expanded without limit by various means.

Fast forward 40 years, and we now call pretty much this same concept Cloud Computing, and everyone is very excited about the possibilities that exist within this new world. Different companies are pushing this idea in different ways. One of the pioneers in that area is of course Amazon, which managed to create a quite good public cloud offering through their Amazon Web Services product.

This kind of publicly consumable infrastructure is very interesting, because it allows people to do exactly what Douglas Parkhill described 40 years ago, so individuals and organizations can rent computing resources with minimum initial investment, and pay for as much as they need, no more no less.

This is all good, but one of the details is that not every organization can afford to send data or computations to a public cloud like Amazon’s AWS. There are many potential reasons for this, from legal regulations to volume cost. Out of these issues the term Private Cloud was coined. It basically represents exactly the same ideas that Douglas Parkhill described, but rather than using third party infrastructure, some organizations opt to use the same kind of technology, such as the Eucalyptus project deployed in a private infrastructure, so that the teams within the organization can still benefit from the mentioned features.

So we have the Public Cloud and the Private Cloud. Now, what would a Virtual Private Cloud be?

Well, it turns out that this is just a marketing term, purposefully coined to blur the line between a Private and a Public cloud.

The term was used in the announcement Amazon has made yesterday:

Amazon VPC enables enterprises to connect their existing infrastructure to a set of isolated AWS compute resources via a Virtual Private Network (VPN) connection, (…)

So, what is interesting about this is that this is actually not a Private Cloud, because the resources on the other side of the VPN are actually public infrastructure, and as such it doesn’t solve any of the problems which private clouds were created to solve in the first place.

Not only that, but it creates the false impression that organizations would have their own isolated resources. What isolated resources? A physical computer? Storage? Network? Of course, isolating these is not economically viable if you are charging 10 cents an hour per computer instance:

Each month, you pay for VPN Connection-hours and the amount of data transferred via the VPN connections. VPCs, subnets, VPN gateways, customer gateways, and data transferred between subnets within the same VPC are free. Charges for other AWS services, including Amazon EC2, are billed separately at published standard rates.

That doesn’t quite fit together, does it?

To complete the plot, Werner Vogels runs to his blog and screams out loud “Private Cloud is not the Cloud”, while announcing the Virtual Private Cloud which is actually a VPN to his Public Cloud, with infrastructure shared with the world.

Sure. What can I say? Well, maybe that Virtual Private Cloud is not the Private Cloud.

Gustavo Niemeyer

Alright, so I appreciate the idea of RESTful Web Services, but I’ve got a small dilemma I’d appreciate some opinions on.

In the RESTful Web Services book, by Leonard Richardson and Sam Ruby, there’s emphasis on making the programmable web look like the human web, by following an architecture oriented to having addressable resources rather than one oriented to remote procedure calls. Throughout the book, RPC (or REST-RPC, when mixed with some RESTful characteristics) is clearly downplayed. In some cases, though, it’s unclear to me what the extent of this advice is. Humans and computers are of course very different in the nature of the tasks they perform, and how well they perform them. To illustrate the point clearly, let me propose a short example.

Let’s imagine the following scenario: we are building a web site with information on a large set of modern books. In this system, we want to follow RESTful principles strictly: each book is addressable at http://example.com/book/<id>, and we can get a list of book URIs by accessing http://example.com/book/list?filter=<words>.

Now, we want to allow people to easily become aware of the newest edition of a given book. To do that, we again follow RESTful characteristics and add a, let’s say, new-editions field to the data which composes a book resource. This field contains a list of URIs of books which are more recent editions of the given book. So far so good. Looks like a nice design.

Now, we want to implement a feature which allows people to access the list of all recent editions of books in their home library, given that they know the URIs for the books because a client program stored the URIs locally in their machines. How would we go about implementing this? We certainly wouldn’t want to do 200 queries to learn about updates for 200 books which a given person has, since that’s unnecessarily heavy on the client computer, on the network, and on the server. It’s also hard to encode the resource scope (as defined in the book) in the URI, since the amount of data to define the scope (the 200 books in our case) can be arbitrarily large. This actually feels like a perfect fit for an RPC: “Hey, server, here are 200 URIs in my envelope.. let me know what are the updated books and their URIs.” I can imagine some workarounds for this, like saving a temporary list of books with PUT, and then doing the query on that temporary list’s URI, but this feels like a considerably more complex design just for the sake of purity.

When I read examples of RESTful interfaces, I usually see examples of how a Google Search API can be RESTful, for instance. Of course, Google Search is actually meant to be operated by humans, with a simple search string. But computers, unlike humans, can thankfully handle a large volume of data for us, and let us know about the interesting details only. It feels a bit like once the volume of data and the complexity of operations on that data go up, the ability for someone to do a proper RESTful design goes down, and an RPC-style interface becomes an interesting option again.

I would be happy to learn about a nice RESTful approach to solve this kind of problem, though.
