Canonical Voices

kevin gunn

Wanted to quickly share some work-in-progress. I’d previously posted about running unconfined snaps on unity8 , this post is about a small but important incremental step in the progress. Since the mir interface is now part of snapd – you can begin confining apps as snaps targeting Unity8. What this means is, those developers who’ve been delivering and maintaining clicks for Unity8 will have a method to test out their snap on the Unity8 desktop. This will help those developers eventually migrate completely to snaps quickly and easily while keeping their confinement.

So if you want to give it a shot….

Here’s the pull request for adding mir to implicit interfaces , which will make the mir interface an available slot on the system. You can use the mod and rebuild snapd using Zyga’s tools  (If you’ve never modified snapd, recommend reading the README  on github  )

And one other hot tip that wasn’t obvious when I started tinkering with this…when rebuilding/running snapd, just use the one-shot method like “$ ./refresh-bits snap snapd setup run-snapd” this will build snap, snapd, stop the running installed snapd and run the newly built snapd all in one go.

Once you’ve got snapd rebuilt and running, you just need to rebuild ubuntu-clock-app from snappy playpen. Again, it’s only a one-line change to add “mir” as a plug to the ubuntu-clock-app’s yaml. Here’s a branch with the change.

Anyhow, I hope you give it a shot and let us know about any struggles you might encounter. You can find me in either #snappy or #unity8 on freenode.

Read more

The release of Ubuntu 16.10 Yakkety Yak in the coming months will bring about the public release of Unity 8 as a pre-installed desktop session (though not as the default session). It’s been a long time coming, and there’s a lot of new features which will break older applications. Canonical has unveiled snappy as the preferred packaging system for Unity 8, but what about all those old deb packages?

There have been a few other good posts about X applications on Unity 8 including this one on dogfooding, this one on Ubuntu Touch, and this one on how it works under the covers. This blog post is explicitly about Unity 8 on desktop using the Libertine CLI, though can be applied to most devices running Ubuntu Touch.

Disclaimer: I work for Canonical on one of the teams making all of this fancy stuff work.

A (Very) Brief Explanation

The toolchain we’ll be relying on is called libertine, and it’s essentially a wrapper around unprivileged LXC and chroot-based containers. We prefer to use LXC containers on newer OSes, but we must continue supporting chroot containers on many devices due to kernel limitations.

What You’ll Need

For desktop Unity 8, you’ll need the packages for libertine, libertine-tools, and lxc to get started. This will install a CLI and GUI for maintaining Libertine containers and applications.

If you’re running Wily or newer, you can just run the following in your terminal:

1
$ sudo apt install libertine

Otherwise, you’ll need to add the stable overlay PPA first:

1
2
3
$ sudo add-apt-repository ppa:ci-train-ppa-service/stable-phone-overlay
$ sudo apt-get update
$ sudo apt-get install libertine

The GUI

At this point, if you’re on desktop you can open up the GUI which will guide you through creating a new container and installing applications. Search the Dash (or Apps scope) for libertine and, given that we haven’t pushed a buggy version recently, you’ll be presented with a Qt application for maintaining containers. I highly recommend using the GUI, because then you are guaranteed not to be following out-of-date console commands.

…But maybe you prefer the terminal. Or maybe you’re secretly SSH’d into the target machine or Ubuntu Touch device and need to use the terminal. If so…

The CLI

The CLI we’ll be using is libertine-container-manager. It has a manpage, a --help option, and autocomplete to help you out in a jam.

The first thing you’ll want to do is create a container. There are a lot of options, but to create an optimal container for your current machine you only need to specify the id and name parameters:

1
$ libertine-container-manager create --id desktopapps --name "Desktop Applications"

A couple of things to note here: Your id must be unique and conform to the simple click name regex - this is what will identify your container on a system level. The name should be human-readable so you can easily identify what might be inside your container. If you don’t specify a name, your id will be used. The CLI will likely ask you for a password to use in the container in case you ever need it. You can leave this blank if you’re not concerned with that kind of thing.

At this point, a bunch of things should be happening in your terminal. This will pull a container image for your current distro and install all the requirements to get started maintaining and running X apps. This could take anywhere from a few minutes to the next hour depending on your network and disk speeds. Once you’re done, you can use the list subcommand to list all installed containers (note you probably just have one at this point). If you ever want to delete your container, you can run libertine-container-manager destroy -i desktopapps.

Once that’s finished, we can start installing apps. To find apps available, you can use the search-cache subcommand:

1
$ libertine-container-manager search-cache --id desktopapps --search-string "office"

This will return a few strings from the apt-cache of the container with id “desktopapps” that match “office”. Now, if you want to install “libreoffice”:

1
$ libertine-container-manager install-package --id desktopapps --package libreoffice

This will install the full libreoffice suite. Nice! Similarly, you can use the remove-package subcommand to remove applications. Don’t remember what apps you’ve installed? Use the list-apps command:

1
$ libertine-container-manager list-apps --id desktopapps

Maybe you’re an avid Steam for Linux gamer and want to try to get some games working. Since Steam still only comes in a 32-bit binary, you’ll need to enable the multiarch repos, and then you can just install Steam like any other app:

1
2
3
$ libertine-container-manager configure --id desktopapps --multiarch enable
...
$ libertine-container-manager install-package --id desktopapps --package steam

Steam will ask you to agree to their user agreement from the command line, which you should be able to do easily. If you need to use the readline frontend for dpkg, you can append --readline to the install-package command to enable it.

There are many other commands to explore to maintain your container, but for now I’ll let you check the manpage or open the GUI to explore further.

Running Apps

Now that you’ve installed some apps, you probably want to run them. You can install the Libertine Scope, which will allow you to peruse your installed apps in a Unity 8 session. You can either install it from the App Store on a device (search for “Desktop Apps Scope”) or through apt on desktop with:

1
$ sudo apt install libertine-scope

In a Unity 8 session, you can now find the scope and click on apps to run them. Note that there are many apps which still don’t work, such as those requiring a terminal or sudo; consider these a work in progress.

The Future

I’ve been toiling away the past few weeks getting a scope ready which can be used explicitly to install/remove X apps in Unity 8, like the current Ubuntu Software Center (or app store on Touch devices). This scope should be available anywhere the libertine scope is available, meaning that it will alleviate a lot of the pain associated with installing/removing apps for a large chunk of users. Using the Libertine GUI or Libertine CLI will still allow for much more customization, but those tools are largely designed with power users in mind.

Are you able to get libertine working on your system? Can you launch X applications to your heart’s content? Let me know in the comments!

Read more
abeato

Occasionally I find myself processing input data which arrives as a stream, like data from files or from a socket, but that has a known structure that can be modeled with C types. For instance, let’s say we are receiving from a socket a parcel that consists on a header of one byte, and a payload that is an integer. A naive way to handle this is the following (simplified for readability) code snippet:

int main(void)
{
    int fd;
    char *buff;
    struct sockaddr_in addr;
    int vint;
    char vchar;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    buff = malloc(BUFF_SIZE);
    /* Init socket address */
    ...
    connect(fd, (struct sockaddr *) &addr, sizeof(addr));

    read(fd, buff, BUFF_SIZE);

    vchar = buff[0];
    vint  = *(int *) &buff[1];
    /* Do something with extracted data, free resources */
    ...
    return 0;
}

Here we get the raw data with a read() call, we read the first byte, then we read an integer by taking a pointer to the second read byte and casting it to a pointer to an integer. (for this example we are assuming that the integer inserted in the stream has the same size and endianness as the CPU ones).

There is a big issue with this: the cast to int *, which is undefined behavior according to the C standard 1. And it is because things can go wrong in at least two ways, first due to pointer aliasing rules, second due to type alignment.

Strict pointer aliasing tells the compiler that it can assume that pointers to different types point to different places in memory. This allows some optimizations, like reordering. Therefore, we could be in trouble if, say, we take &buff[1] into a char * and use it to write a byte in that location, as reordering could hit us. So just do not do that. Let’s also hope that we have a compiler that is not completely insane and does not move our reading by int pointer before the read() system call. We could also disable strict aliasing if we are using GCC with option -fno-strict-aliasing, which by the way is something that the Linux kernel does. At any rate, this is a complex subject and I will not dig into it this time.

We will concentrate in this article on how to solve the other problem, that is, how to access safely types that are not stored in memory in their natural alignment.

The C Standard-Compliant Solution

Before moving further, keep in mind that it is always possible to be strictly compliant with the standard and access safely memory without breaking language rules or using compiler or machine specific tricks. In the example, we could retrieve vint by doing

    vint  =   buff[1] + (buff[2] << 8)
            + (buff[3] << 16) + (buff[4] << 24);

(supposing stored data is little endian).

The issue here is performance: we are implicitly transforming four bytes to integers, then we have to bit-shift three of them, and finally we have to add them up. Note however that this is what we want if data and CPU have different endianness.

Doing Unaligned Memory Accesses

In all machine architectures there is a natural alignment for the different data types. This alignment is usually the size of the types, for instance in 32 bits architectures the alignment for integers is 4, for doubles it is 8, etc. If instances of these types are not stored in memory positions that are multiple of their alignment, we are talking about unaligned access. If we try to access unaligned data either of these can happen:

  • The hardware let’s us access it – but always at a performance penalty.
  • An exception is triggered by the CPU. This type of exception is called bus error 2.

We might be willing to accept the performance penalty 3, which is mitigated by CPU caches and not that noticeable in certain architectures like x86-64 , but we certainly do not want our program to crash. How possible is this? To be honest it is not something I have seen that often. Therefore, as a first analysis step, I checked how easy it was to get bus errors. To do so, I created the following C++ program, access1.cpp (I could not resist to use templates here to reduce the code size):

#include <iostream>
#include <typeinfo>
#include <cstring>

using namespace std;

template <typename T>
void print_unaligned(char *ptr)
{
    T *val = reinterpret_cast<T *>(ptr);

    cout << "Type is \"" << typeid(T).name()
         << "\" with size " << sizeof(T) << endl;
    cout << val << " *val: " << *val << endl;
}

int main(void)
{
    char *mem = new char[128];

    memset(mem, 0, 128);

    print_unaligned<int>(mem);
    print_unaligned<int>(mem + 1);
    print_unaligned<long long>(mem);
    print_unaligned<long long>(mem + 1);
    print_unaligned<long double>(mem);
    print_unaligned<long double>(mem + 1);

    delete[] mem;
    return 0;
}

The program allocates memory using new char[], which as malloc() in C is guaranteed to allocate memory with the same alignment as the strictest fundamental type. After zeroing the memory, we access mem and mem + 1 by casting to different pointer types, knowing that the second address is odd, and therefore unaligned except for char * access.

I compiled the file with g++ on my laptop, ran it, and got

$ g++ access1.cpp -o access1
$ file access1
access1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=09d0fb19340a10941eef4c3dc4d6eb29383e717d, not stripped
$ ./access1
Type is "i" with size 4
0x16c3c20 *val: 0
Type is "i" with size 4
0x16c3c21 *val: 0
Type is "x" with size 8
0x16c3c20 *val: 0
Type is "x" with size 8
0x16c3c21 *val: 0
Type is "e" with size 16
0x16c3c20 *val: 0
Type is "e" with size 16
0x16c3c21 *val: 0

No error for x86-64. This was expected as Intel architecture is known to support unaligned access by hardware, at a performance penalty (which is apparently quite small these days, see 4).

The second try was with an ARM CPU, compiling for arm-32:

$ g++ access1.cpp -o access1
$ file access1
access1: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=8c3c3e7d77fddd5f95d18dbffe37d67edc716a1c, not stripped
$ ./access1
Type is "i" with size 4
0x47b008 *val: 0
Type is "i" with size 4
0x47b009 *val: 0
Type is "x" with size 8
0x47b008 *val: 0
Type is "x" with size 8
Bus error (core dumped)

Now we get what we were searching for, a legitimate bus error, in this case when accessing a long long from an unaligned address. Commenting out the offending line and letting the program run further showed the error also when accessing a long double from mem + 1.

Fixing Unaligned Memory Accesses

After proving that this could be a real problem, at least for some architectures, I tried to find a solution that would let me do unaligned memory accesses in the most generic way. I could not find anything safe that was strictly following the C standard. However, all C/C++ compilers have ways to define packed structures, and that came to the rescue.

Packed structures are intended to minimize the padding that is introduced by alignment needed by the structure members. They are used when minimizing storage is a big concern. But what is interesting for us is that its members can be unaligned due to the packing, so dereferencing them must take that into account. Therefore, if we are accessing a type in a CPU that does not support unaligned access for that type the compiler must synthesize code that handles this transparently from the point of view of the C program.

To test that this worked as expected, I wrote access2.cpp, which uses GCC attribute __packed__ to define a packed structure:

#include <iostream>
#include <typeinfo>
#include <cstring>

using namespace std;

template <typename T>
struct __attribute__((__packed__)) struct_safe
{
    T val;
};

template <typename T>
void print_unaligned(char *ptr)
{
    struct_safe<T> *safe = reinterpret_cast<struct_safe<T> *>(ptr);

    cout << "Type is \"" << typeid(T).name()
         << "\" with size " << sizeof(T) << endl;
    cout << safe << " safe->val: " << safe->val << endl;
}

int main(void)
{
    char *mem = new char[128];

    memset(mem, 0, 128);

    print_unaligned<int>(mem);
    print_unaligned<int>(mem + 1);
    print_unaligned<long long>(mem);
    print_unaligned<long long>(mem + 1);
    print_unaligned<long double>(mem);
    print_unaligned<long double>(mem + 1);

    delete[] mem;
    return 0;
}

In this case, instead of directly casting to the type, I cast to a pointer to the packed struct and access the type through it.

Compiling and running for x86-64 got the expected result: no error, all worked as before. Then I compiled and ran it in an ARM device:

$ g++ access2.cpp -o access2
$ file access2
access2: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=9a1ee8c2fcd97393a4b53fe12563676d9f2327a3, not stripped
$ ./access2
Type is "i" with size 4
0x391008 safe->val: 0
Type is "i" with size 4
0x391009 safe->val: 0
Type is "x" with size 8
0x391008 safe->val: 0
Type is "x" with size 8
0x391009 safe->val: 0
Type is "e" with size 8
0x391008 safe->val: 0
Type is "e" with size 8
0x391009 safe->val: 0

No bus errors anymore! It worked as expected. To gain some understanding of what was happening behind the curtains, I disassembled the generated ARM binaries. For both access1 and access2, the same instruction was being used when I was getting a value after casting to int: LDR, which unsurprisingly loads a 32-bit word into a register. But for the long long, I found that access1 was using LDRD, which loads double words (8 bytes) from memory, while access2 was using two LDR instructions instead.

This all made a lot of sense, as ARM states that LDR supports access to unaligned data, while LDRD does not 5. Indeed the later is faster, but has this restriction. It was also good to check that there was no penalty for using the packed structure for integers: GCC does a good job to discriminate when the CPU really needs to handle differently unaligned accesses.

GCC cast-align Warning

GCC has a warning that can help to identify points in the code when we might be accessing unaligned data, which is activated with -Wcast-align. It is not part of the warnings that are activated by options -Wall or -Wextra, so we will have to add it explicitly if we want it. The warning is only triggered when compiling for architectures that do not support unaligned access for all types, so you will not see it if compiling only for x86.

When triggered, you will see something like

file.c:28:23: warning: cast increases required alignment of target type [-Wcast-align]
   int *my_int_ptr = (int *) &buf[i];
                     ^

Conclusion

The moral of this post is that you need to be very careful when casting pointers to a type different to the original one 6. When you need to do that, think about alignment issues first, and also think on your target architectures. There are programs that we want to run on more than one CPU type and too many times we only test in our reference.

Unfortunately the C standard does not give us a standard way of doing efficient access to unaligned data, but most if not all compilers seem to provide ways to do this. If we are using GCC, __attribute__((__packed__)) can help us when we might be doing unaligned accesses. The ARM compiler has a __packed attribute for pointers 7, and I am sure other compilers provide similar machinery. I also recommend to activate -Wcast-align if using GCC, which makes easier to spot alignment issues.

Finally, a word of caution: in most cases you should not do this type of casts. Some times you can define structures and read directly data onto them, some times you can use unions. Bear in mind always the strict pointer aliasing rules, which can hit back. To summarize, think twice before using the sort of trick showed in the post, and use them only when really needed.

Read more
facundo

Terminator moralinesco


AVISO: el siguiente texto puede tener spoilers sobre las pelis de Terminator. Yo te avisé.

No voy a entender nunca como en una película permiten violencia, violencia, violencia, violencia, violencia, y no un desnudo.

Y no estoy hablando de sexo, estoy hablando de cuerpos desnudos. Y ni siquiera estoy hablando de una película para niños (que lo podríamos charlar), sino para adultos (toda esa violencia es inaceptable para niños).

El caso puntual es Terminator: Genisys.

Hay varias situaciones, varias, donde el desnudo sería natural. Me refiero a que no es como se ve mucho en esas pelis clase B donde el asesino en vez de matar a la chica durmiendo la mata en la ducha y aprovechan para mostrarla en tetas.

No. En el caso de Terminator, el aparato que usan para moverse en el tiempo sólo lleva cosas cubiertas de material biológico (sí, la base científica de esto es discutible, pero ese es otro tema). Entonces, los humanos (todo biológicos) y robots (estructura artificial pero todos cubiertos de piel "viva") tienen que viajar desnudos.

Por dicha razón, sucede que en la Terminator original sí se ve algo: por ejemplo el culo del terminator...

El terminator de 1984, desde atrás

Pero vemos que en la nueva, que reproduce lo que pasó en la primera (con los cambios necesarios acorde a la vuelta de tuerca del guión), en la misma escena, ¡no muestra nada!

El terminator nuevo, desde atrás

También, en la original Kyle y Sarah se enamoran y tienen sexo (escenas muy ochentosas, pero con desnudos frontales de ambos). En la última los dos personajes se mantienen coqueteando pero sin concretar, prolijamente castos.

Los Kyle y Sarah originales

En la última hay una chica con papel protagónico que viaja en el tiempo. Acá tenemos una situación donde un desnudo sería totalmente natural, y la cámara está "forzadamente alta".

Cuidando la altura de la cámara

Incluso, cuando termina ese viaje en el tiempo, al "llegar al futuro", aparecen en una autopista y casi los pasan por arriba, pero sí: el foco está en taparse, no vaya a ser cosa.

Apareciendo en la autopista

Como vuelta de tuerca, ¿se podría argumentar que a Emilia Clarke le cueste o no quiera aparecer desnuda? No creo, muestra mucho más como Daenerys Targaryen en Game of Thrones.

También podemos pensar que a una película le resulte más complicado distribuirse y hacerse disponible en muchos paises o lugares en función de si muestran una teta o no. Puede ser. Ahí volvemos a que "asusta" o "indigna" más una teta o un culo que gente matándose entre ellos. Y ni hablar de un pene o una vulva... la escena siguiente es de la peli nueva, pero en la vieja es igual...

El terminator, desde adelante

En fin, no entiendo.

Read more
Alan Griffiths

MirAL: Mir is not all about Unity8

Introducing The Mir Abstraction Layer

The principle Open Source project I’ve been working on for the last few years is Mir. Mir is a library for writing Linux display servers and shells that are independent of the underlying graphics stack. It fits into a similar role as an X server or Weston (a Wayland server) but was initially motivated by Canonical’s vision of “convergent” computing.

The Mir project has had some success in meeting Canonical’s immediate needs – it is running in the Ubuntu Touch phones and tablets, and as an experimental option for running the Unity8 shell on the desktop. But because of the concentration of effort on delivering the features needed for this internal use it hasn’t really addressed the needs of potential users outside of Canonical.

Mir provides two APIs for users: the “client” API is for applications that run on Mir and that is largely used by toolkits. There is support for Mir in the GTK and Qt toolkits, and in SDL. This works pretty well and the Mir client API has remained backwards compatible for a couple of years and can do so for the foreseeable future.

The problem is that the server-side ABI compatibility is broken by almost every release of Mir. This isn’t a big problem for Canonical, as the API is fairly stable and both Mir and Unity8 are in rapid development: rebuilding Unity8 every time Mir is released is a small overhead. But for independent developers the lack of a stable ABI is problematic as they cannot easily synchronize their releases to updates of Mir.

My answer to this is to provide a stable “abstraction layer” written over the top of the current Mir server API that will provide a stable ABI. There are a number of other goals that can be addressed at the same time:

  • The API can be considerably narrowed as a lot of things can be customized that are of no interest to shell development;
  • A more declarative design style can be followed than the implementation focused approach that the Mir server API follows; and,
  • Common facilities can be provided that don’t belong in the Mir libraries.

At the time of writing the Mir Abstraction Layer (miral) is a proof-of-concept, but work is in progress to package an initial version that can support the functionality needed by some examples, and by qtmir (the Qt support used by Unity8).

Building and using MirAL

These instructions assume that you’re using Ubuntu 16.04LTS or later, I’ve not tested earlier Ubuntu versions or other distributions.

You’ll need a few development and utility packages installed, along with the Mir development packages:

$ sudo apt-get install cmake g++ make bzr python-pil uuid-dev libglib2.0-dev
$ sudo apt-get install libmirserver-dev libmirclient-dev mirtest-dev
$ sudo apt-get install mir-graphics-drivers-desktop libgles2-mesa-dev

(If you’re working on a phone or tablet use mir-graphics-drivers-android in place of mir-graphics-drivers-desktop.)

With these installed you can checkout and build miral:

$ bzr branch lp:miral
$ mkdir miral/build
$ cd  miral/build
$ cmake ..
$ make

This creates libmiral.so in the lib directory and an example shell (miral-shell) in the bin directory. This can be run directly:

 $ bin/miral-shell

With the default options this runs in a window on X (which is convenient for development).

The miral-shell example is simple, don’t expect to see a sophisticated launcher by default. You can start mir apps from the command-line. For example:

 $ bin/miral-run gnome-terminal

That’s right, a lot of standard GTK+ applications will “just work” (the GDK toolkit has a Mir backend). Any that assume the existence of an X11 and bypass the toolkit by making X11 protocol calls will have problems though.

To exit from miral-shell press Ctrl-Alt-BkSp.

To run independently of X11 you need to grant access to the graphics hardware (by running as root) and specify a VT to run in. For example:

$ sudo bin/miral-shell --vt 4 --arw-file --file $XDG_RUNTIME_DIR/mir_socket

For convenient testing there’s a “testrun” script that wraps this command to start the server (as root) and then launches the gnome-terminal (as the current user):

$ ../scripts/testrun

Running applications on MirAL

If you have a terminal session running in the MirAL desktop (as described above) you can start programs from it. GTK, Qt and SDL applications will “just work” provided that they don’t bypass the toolkit and attempt to make X11 protocol calls that are not available.

$ gedit
$ 7kaa

From outside the MirAL session the “miral-run” script sets a few environment variables to configure the Mir support in the various toolkits. (There’s some special treatment for gnome-terminal as starting that can conflict with the desktop default.)


$ bin/miral-run gnome-calculator
$ bin/miral-run 7kaa

There are also some examples of native Mir client applications in the mir-demos package. These are typically basic graphics demos:

$ sudo apt-get install mir-demos
$ mir_demo_client_egltriangle

Support for X11 applications

If you want to run X11 applications that do not have native Mir support in the toolkit they use then the answer is Xmir: an X11 server that runs on Mir. First you need Xmir installed:

$ sudo apt install xmir

Then you can use the testrun script to start miral-shell with Xmir:

$ ../scripts/testrun -Xmir

This starts an X11 server on DISPLAY=:1. This is set in the terminal the script starts so that applications launched from it will automatically connect to miral through Xmir.

Running Qt applications

To run Qt applications under Mir you may need to install qtubuntu-desktop:

$ sudo apt-get install qtubuntu-desktop

Read more
Rae Shambrook

Recently I have been working on the visual design for RCS (which stands for rich communications service) group chat. While working on the “Group Info” screen, we found ourselves wondering what the best way to display an online/offline status. Some of us thought text would be more explicit but others thought  it adds more noise to the screen. We decided that we needed some real data in order to make the best decision.

Usually our user testing is done by a designated Researcher but usually their plates are full and projects bigger, so I decided to make my first foray into user testing. I got some tips from designers who had more experience with user testing on our cloud team; Maria  Vrachni, Carla Berkers and Luca Paulina.

I then set about finding my user testing group. I chose 5 people to start with because you can uncover up to 80% of usability issues from speaking to 5 people. I tried to recruit a range of people to test with and they were:

  1. Billy: software engineer, very tech savvy and tech enthusiast.
  2. Magda: Our former PM and very familiar with our product and designs.
  3. Stefanie: Our Office Manager who knows our products but not so familiar with our designs.
  4. Rodney: Our IS Associate who is tech savvy but not familiar with our design work
  5. Ben: A copyeditor who has no background in tech or design and a light phone user.

The tool I decided to use was Invision. It has a lot of good features and I already had some experience creating lightweight prototypes with it. I made four minimal prototypes where the group info screen had a mixture of dots vs text to represent online status and variations on placement.  I then put these on my phone so my test subjects could interact with it and feel like they were looking at a full fledged app and have the same expectations.

group_chat_testing

During testing, I made sure not to ask my subjects any leading questions. I only asked them very broad questions like “Do you see everything you expect to on this page?” “Is anything unclear?” etc. When testing, it’s important not to lead the test subjects so they can be as objective as possible. Keeping this in mind, it was interesting to to see what the testers noticed and brought up on their own and what patterns arise.

My findings were as follows:

Online status: Text or Green Dot

Unanimously they all preferred online status to be depicted with colour and 4 out of 5 preferred the green dot rather than text because of its scannability.

Online status placement:

This one was close but having the green dot next to the avatar had the edge, again because of its scannability. One tester preferred the dot next to the arrow and another didn’t have a preference on placement.

Pending status:

What was also interesting is that three out of the four thought “pending” had the wrong placement. They felt it should have the same placement as online and offline status.

Overall, it was very interesting to collect real data to support our work and looking forward to the next time which will hopefully be bigger in scope.

group_chat_final

The finished design

Read more
kevin gunn

Mir interface landed!

So the Mir interface has landed in snapd. This means that the mir-server snap can be downloaded and taken into use in a fully confined mode, along with matching mir-client snap. I’ve been keeping updated instructions here on the best way to operate and develop your mir-client snap. The one caveat is there’s still a snappy bug (launchpad bug# 1577897)  with snap auto-connections.  This means that you have to manually make the plug-slot connection of the mir-server and mir-client, and then restart the mir-client service. I’ve just uploaded a new mir-server snap which is on Mir version 0.23.5, it should be reflected in the store soon.

At any rate, I’d love to hear back from folks trying this out. In the coming weeks I’m hoping to spend some time on dragonboard and getting this working there.

Read more
Jouni Helminen

We have been looking at ways of making the Terminal app more pleasing, in terms of the user experience, as well as the visuals.

I would like to share the work so far, invite users of the app to comment on the new designs, and share ideas on what other new features would be desirable.

On the visual side, we have brought the app in line with our Suru visual language. We have also adopted the very nice Solarized palette as the default palette – though this will of course be completely customisable by the user.

On the functionality side we are proposing a number of improvements:

-Keyboard shortcuts
-Ability to completely customise touch/keyboard shortcuts
-Ability to split the screen horizontally/vertically (similar to Terminator)
-Ability to easily customise the palette colours, and window transparency (on desktop)
-Unlimited history/scrollback
-Adding a “find” action for searching the history

email-desktop

Tabs and split screen

On larger screens tabs will be visually persistent. In addition it’s desirable to be able split a panel horizontally and vertically, and use keyboard shortcuts or focusing with a mouse/touch to move between the focused panel.

On mobile, the tabs will be accessed through the bottom edge, as on the browser app.

terminal-blog-phone

Quick mobile access to shortcuts and commands

We are discussing the option of having modifier (Ctrl, Alt etc) keys working together with the on-screen keyboard on touch – which would be a very welcome addition. While this is possible to do in theory with our on-screen keyboard, it’s something that won’t land in the immediate near future. In the interim modifier key combinations will still be accessible on touch via the shortcuts at the bottom of the screen. We also want to make these shortcuts ordered by recency, and have the ability to add your own custom key shortcuts and commands.

We are also discussing with the on-screen keyboard devs about adding an app specific auto-correct dictionary – in this case terminal commands – that together with a swipe keyboard should make a much nicer mobile terminal user experience.

email-desktop

More themability

We would like the user to be able to define their own custom themes more easily, either via in-app settings with colour picker and theme import, or by editing a JSON configuration file. We would also like to be able to choose the window transparency (in windowed mode), as some users want a see-through terminal.

We need your help!

These visuals are work in progress – we would love to hear what kind of features you would like to see in your favourite terminal app!

Also, as Terminal app is a fully community developed project, we are looking for one or two experienced Qt/QML developers with time to contribute to lead the implementation of these designs. Please reach out to alan.pope@canonical.com or jouni.helminen@canonical.com to discuss details!

EDIT: To clarify – these proposed visuals are improvements for the community developed terminal app currently available for the phone and tablet. We hope to improve it, but it is still not as mature as older terminal apps. You should still be able to run your current favourite terminal (like gnome-terminal, Terminator etc) in Unity8.

Read more
liam zheng

第三届硬件自由日于6月23日在北京清华科技园 ∙ 科技大厦国际会议中心举行。Ubuntu应邀参加了以开源硬件为主题的第三届硬件自由日活动。作为Ubuntu Core合作伙伴的LeMaker、威控睿博现场展出了Ubuntu Core与96开发板、LeMaker Hikey在IoT应用场景。

第三届硬件自由日于6月23日在北京清华科技园 ∙ 科技大厦国际会议中心举行。Ubuntu应邀参加了以开源硬件为主题的第三届硬件自由日活动。作为Ubuntu Core合作伙伴的LeMaker、威控睿博现场展出了Ubuntu Core与96开发板、LeMaker Hikey在IoT应用场景。

威控睿博演示:

Ubuntu Core可以适用于很多场景,在硬件自由日活动上威控睿博利用装有Ubuntu Core的96开发板,控制多台3D打印机,从而解决单人同时控制、操作多台3D打印的问题。

LeMaker应用演示:

Lemaker是集开发板与解方案提供商的公司,Lemaker版的HiKey开发板可运行Ubuntu Core,现场还展出了另外一款LeMaker Guitar开发板,同样可运行Ubuntu Core。现场演示内容为:基于HiKey的硬件平台联动RosBot机器人,用Ubuntu Core加ROS的软件搭建的解决方案,可用于室内SLAM、机器人教学。

活动现场有超过100位的观众,在另外一个演讲会议厅有超过40人的行业参观者在场听取了Ubuntu Core技术以及应用的演说。

Read more
Dustin Kirkland


I hope you'll enjoy a shiny new 6-part blog series I recently published at Linux.com.
  1. The first article is a bit of back story, perhaps a behind-the-scenes look at the motivations, timelines, and some of the work performed between Microsoft and Canonical to bring Ubuntu to Windows.
  2. The second article is an updated getting-started guide, with screenshots, showing a Windows 10 user exactly how to enable and run Ubuntu on Windows.
  3. The third article walks through a dozen or so examples of the most essential command line utilities a Windows user, new to Ubuntu (and Bash), should absolutely learn.
  4. The fourth article shows how to write and execute your first script, "Howdy, Windows!", in 6 different dynamic scripting languages (Bash, Python, Perl, Ruby, PHP, and NodeJS).
  5. The fifth article demonstrates how to write, compile, and execute your first program in 7 different compiled programming languages (C, C++, Fortran, Golang).
  6. The sixth and final article conducts some performance benchmarks of the CPU, Memory, Disk, and Network, in both native Ubuntu on a physical machine, and Ubuntu on Windows running on the same system.
I really enjoyed writing these.  Hopefully you'll try some of the examples, and share your experiences using Ubuntu native utilities on a Windows desktop.  You can find the source code of the programming examples in Github and Launchpad:
Cheers,
Dustin

Read more
Steph Wilson

Back in June we hosted a competition that asked developers to use the AdaptivePageLayout component from the UI toolkit to create an app that converges across devices. We had some very impressive entires that used the component in different ways; choosing a winner was hard. However, after testing all the apps the design team chose a winner: the Timer App.

How does the AdaptivepageLayout work?

The AdaptivePageLayout component eliminates guesswork for developers when adapting from one form factor to another. It works by tracking an infinite number of virtual columns that may be displayed on a screen at once. For example, an app will automatically switch between a 1-panel and 2-panel layout when the user changes the size of the window or surface, by dragging the app from the main stage to the side stage.

You can read more about convergence and how the adaptive page layout works in the App design guides.

Timer app

The Timer app impressed the design team the most with its slick transitions, well thought-out design and ease of use. It used the AdaptivepageLayout well when it translated to different screens.

Design feedback

  • Design: well-considered touches with design, animation and various cool themes.
  • Usability: a favourite is the ability to drag seconds / minutes / hours directly on the clock.
  • Convergence: adjusts beautifully to different screen sizes.

Screen Shot 2016-08-09 at 09.10.32

Screen Shot 2016-08-09 at 09.14.14

Timer app

Other entries

Thank you to all who participated in making their apps look even more slick. Here are the other entries:

2nd: AIDA64 App

  • Design: clean, readable with clear content
  • Usability: pretty flawless
  • Convergence: Adaptive Page Layout suits this type of application well and is used well

3rd: Movie Time

  • Design: functional with good management of all the content
  • Usability: live results for search works smoothly as well as trailer links
  • Convergence: the gridview of poster art lends itself well to various screen sizes

4th: Ubuntu Hangups

  • Design: clean and follows guidelines well
  • Usability: easy to message / chat, with user-friendly functionality
  • Convergence: easy to use particularly on Phone and Tablet

5th: uBeginner

  • Design: basic and clean
  • Usability: information is well-presented
  • Convergence: uses the Adaptive Page Layout well

Try it yourself!

To get involved in building apps, click here.

Read more
Anthony Dillon

Web team hack day

Last week the developers in the web team swapped the office for the lobby of the hotel across the street. The day was geared up to allow us to leave our daily tasks in the office and think of ideas that we would like to work on.

The morning started with coffee and sitting in sofas brainstorming ideas. We collected a list of ideas from each person would like to work on. The ideas ranged from IRC bots to a performance audit of a few of our sites.

Choosing ideas

We wrote all the ideas on post it notes and lay them out on the table. Then each of us chose the idea we were most interested in working on by putting our hand on it. I worked out an almost perfect split of two people per idea, so we broke up into our teams and got to work.

Here are the things we worked on during this “hack day”.

IRC bots

These are bots that can listen to an action and report it to our IRC channel. For example, the creator of a pull request wouldn’t have to paste a link to their PR into our channel to be picked up for review.

This task was picked up by Karl and Will, who started by setting up a Hubot on Heroku. They attached a webhook to all projects under the ubuntudesign to listen for pull requests and report this in the web team channel.

This bot can be used for many other things like reporting deployments, CI failures, etc. We also discussed a method of subscribing to the notifications you are interested in, instead of the whole team being notified about everything.

Asset manager search improvements

Our asset manager and server acts as an internal asset storage. By storing an asset here we get a link to it which can be used by any website. As there are many assets stored in the asset manager we usually need to search existing assets to see if one already exists before making a new one.

Graham and myself picked this task and started by working out how to setup both the manager and server locally.

Previously the search would return results that contained either of a two-word query, but now the result has to contain all search terms to return.

We added our Vanilla framework to the front end, as it is obviously good to use our framework for all internal and external projects.

We have also implemented filtering results by file type, which makes it easy to go through what can sometimes be dozens of search results.

GitHub CMS

GitHub CMS is a nicer and more restricted interface that the marketing team can use to edit the GitHub repository containing page content for example, www.ubuntu.com.

Rich and Robin picked this task and began work on it straight away, by discussing the best approach and list of possible features.

Even though Robin was also helping out with. setting up the asset manager and server locally for Graham and me, he still managed to investigate the best Python framework to use and selected one. Rich on the other hand went ahead with the front end and developed a bunch of page templates using the new MAAS GUI Vanilla theme.

Commit linting

Commit linting is a service that gives a committer to a project a nice step by step wizard to build a high quality commit message.

Barry picked up this task and got the service up and running, and working, but hit a blocker at the point of choosing between different methods of committing. For instance, Tower would bypass this step and also we do not necessarily want to dictate to contributors which way they should commit code. This is something we will leave as an investigation for the time being.

Conclusion

Our hack day went well, in fact better than I imagined it would. We all had fun, got to work on things we found important but struggle to get prioritised in our day to day. It gave the developers a feeling of achievement and the buzz of landing and releasing something at the end of the day.

We will be attempting to do a hack day once every month or so, so watch this space!

Read more
Andrea Bernabei

Refreshed scrollbars!

You may have noticed that the scrollbars available on Ubuntu Touch and the Unity8 environment have recently received a huge overhaul in both visual appearance and user experience. More specifically, we redesigned the Scrollbar component (which is already provided in the Ubuntu UI Toolkit) and added a new ScrollView component that builds on top of it and caters for convergence.

Technical note to app developers: the Scrollbar component is still available for compatibility purposes, but we recommend transitioning to the new and convergent ScrollView.

How did we do it?

The process was as follows:

  • Interaction design was specified
  • Applied visual style to the component
  • Prototyped the component
  • Iterated over design and performed user testing
  • Sent for review to the SDK team and integrated into the UI toolkit.

We started by researching the field, exploring the possibilities and thoroughly analyzing the history of the scrollbar component. The output of this step was an interaction specification. The visual design team picked that up and applied our visual style to the component, ensuring a consistent visual language across all the elements provided by the UI Toolkit.

Once we had a first draft of the interaction and visual specs, we created a prototype of the component.
We then iterated over the design choices and refined the prototype. While iterating we also took into account the results of the user testing research that we conducted in the meanwhile. The testers found the new scrollbars easy to use and visually appealing. The only pain point highlighted by the research were the stepper buttons, that were deemed a bit small. We refined them by creating a new crispier graphic asset and by tweaking their size as well as the visual feedback they provided, especially when being hovered with a mouse.

Once we were happy with the result, we submitted our work to be reviewed by the SDK team. The SDK team are the final gatekeeper and decide whether the implementation of a component is ready to be merged to the UI Toolkit or not. The review process can be lengthy, but it is of great help in ensuring a higher code quality, less bugs and clearer documentation. Once the SDK team gave the green light, the component was merged to the next release of the UI Toolkit.

What did we change?

A critical requirement of the new solution was to be “convergence-ready”. Convergence means implementing a UI that scales not just across form factors, but also across different input devices. This is particularly important in the case of scrolling, as it must be responsive to all input devices.

The new ScrollView can be interacted with using the touchscreen of your phone or tablet, but thanks to convergence, now your tablet can turn into a full fledged computing device with a hardware keyboard and a pointer device, such as a mouse. The component must be up to the task and adapt to the capabilities provided by the input device which is currently in use.

At any time, the new scrollbar can be in one of the 3 following modes:

  • Indicator mode, where the scrollbar is a non-interactive visual aid;
  • Thumb mode, that allows quick scrolling using a touch screen;
  • Steppers mode, optimized for pointer devices interactions.

Let’s go through the modes in more detail.

Indicator mode

Whenever the user scrolls content without directly interacting with the scrollbar, i.e. he or she performs a flick or uses the mouse wheel or keyboard keys, the scrollbar gently fades in as an overlay on top of the content. In this mode the scrollbar is not interactive, and just acts as a visual aid to provide information about the position of the content. The indicator gently fades out following a short timeout after the surface stops scrolling.

Please note: we will be replacing these images with GIFs soon.

scrollview-touch

 

Thumb mode

Imagine you want to send a picture to a friend of yours, but the file is somewhere down the very lengthy grid of pictures. Let’s also suppose you’re using a smartphone or tablet and you have no mouse or keyboard connected to it. Wouldn’t it be handy to have a way to quickly scroll a long distance without having to repeatedly flick the list? We designed Thumb mode to address that use case.

When the content on screen reaches a length of 10 or more pages, the thin indicator provided by the indicator mode grows thicker into an interactive thumb. That marks the transition to the Thumb mode. While the scrollbar is in Thumb mode you can drag the thumb using touchscreen to quickly scroll the content.

The component still fades out when the user stops interacting with the surface and the surface stops scrolling, in order to leave as much real estate to the application content as possible.

Stepper mode

scrollview-pointer

When the user is interacting with the UI using a pointer device, they expect a different experience than a touchscreen. Pointer devices allow for much more precise interactions and also provide different ways of interacting with the UI components, such as hovering. A good convergent component exploits those additional capabilities to provide the best user experience for the input device which is currently in use.

When the user hovers over the area occupied by the (initially hidden) scrollbar, the bar reveals itself, this time in what we call Stepper mode.

This mode is optimized for pointer device interactions, although it can be interacted with using touchscreen as well. More generally, for a component to be defined convergent the user must be able to start interacting with it using one input device (in this case, a mouse) and switch to another (e.g. touch screen) whenever they wish to. That transition must be effortless and non disruptive.

When in Stepper mode, the scrollbar has a thick and interactive thumb. This is similar to the Thumb mode we presented in the previous section. However, Stepper mode also provides a semi-transparent background and the two clickable stepper buttons desktop users are already accustomed to. The steppers buttons can be clicked to scroll a short distance. Holding a stepper button pressed will scroll multiple times.

The areas above and below the thumb are also interactive. You can click/tap or press-and-hold to scrolling by one or more pages.

Once the user moves the pointer away from the scrollbar area and the surface stops scrolling, the component elegantly fades out, just like in the other modes.

Visual convergence

We put a lot of efforts into making the transitions between the different modes as smooth and visually pleasing as possible. The alignment of the sub components (the thumb, its background, the stepper buttons), their sizes, their colours have been carefully chosen to achieve that goal.

When the bar grows from Indicator to Thumb modes and vice versa, it does so by anchoring one side and expanding only the opposite one. This minimizes unexpected movements and produces a simple yet crisp animated transition. Those same principles also apply to the transitions from thumb to stepper modes and indicator to stepper and vice versa. We wanted to create transitions that would look elegant but not distrustful.

The new scrollbar also provides visual aids to indicate when a pointer device is hovering over any of the sub components. Both the stepper buttons and the thumb react to hovering by adjusting their colours.

Scroller variations

Interaction handling convergence

A lot of effort has gone into tweaking the interactions to provide an effortless interaction model. Here’s a summary of how we handle touch screen and pointer devices:

  • Thumb mode features a thicker interactive thumb to allow quick scrolling using touch screen;
  • Press-and-holding the steppers buttons provides an effortless way to perform multiple short scrolls;
  • Press-and-holding the areas above and below the thumb makes provides easy multiple page scrolling;
  • Mouse hovering is exploited to reveal or hide the scrollbar;
  • Visual feedback on press/tap;
  • Visual feedback on pointer device hover.

It is a lot of small (and sometimes trivial!) details that make up for a great user experience.

Some of you might be wondering: “what about keyboard input?”

I’m glad you asked! That is an important feature to realize full convergence. The ScrollView components handles that transparently for you. Scrolling content using the keyboard is just as easy as scrolling using the touchscreen or any pointer device:

  • Arrow keys trigger a short scroll;
  • PageUp/PageDown trigger a page scroll;
  • Home/End keys trigger scrolling to the top/bottom of the content, respectively
  • Holding a key down triggers multiple scrolls;

What did we achieve?

The new scrollbars fully implement our vision of convergence. The user can interact with any of the input device he has available and switch from one to the other at any time. The interactions feel snappy and we think the component looks great too!

We can’t wait for you to try it and let us know your opinion!

What does the future hold?

The focus so far has been on getting the right visual appearance and user experience.

However, in order to have a complete solution, we also need to make sure adding such a feature as scrollbars to the applications does not come with a big performance drawback. Ideally, all the scrollable surfaces (images, text fields, etc) should include a scrollbar and that means it’s very important to provide a component that is not just easy to use and visually appealing but also extremely performant.

There are two main aspects where the performance of this component comes into play: the performance of interactions, so that they happen immediately and without unexpected delays. I believe we’re in a very good shape there. The second is: the time it takes to create a scrollbar when an application needs one; this affects application startup time and the time it takes to load a new view that holds scrollable content.

A few changes have already been implemented, which has resulted in a speed-up of about 25%. These changes should be released with OTA13.

If you have ideas or want to provide any feedback, here are the contact details of the people who worked on this project.

IRC: #ubuntu-touch channel on FreeNode server

Alternatively, start a thread on the ubuntu-phone mailing list.

Read more
Justin McPherson

Introducing React Native Ubuntu

In the Webapps team at Canonical, we are always looking to make sure that web and near-web technologies are available to developers. We want to make everyone's life easier, enable the use of tools that are familiar to web developers and provide an easy path to using them on the Ubuntu platform.

We have support for web applications and creating and packaging Cordova applications, both of these enable any web framework to be used in creating great application experiences on the Ubuntu platform.

One popular web framework that can be used in these environments is React.js; React.js is a UI framework with a declarative programming model and strong component system, which focuses primarily on the composition of the UI, so you can use what you like elsewhere.

While these environments are great, sometimes you need just that bit more performance, or to be able to work with native UI components directly, but working in a less familiar environment might not be a good use of time. If you are familiar with React.js, it's easy to move into full native development with all your existing knowledge and tools by developing with React Native. React Native is the sister to React.js, you can use the same style and code to create an application that works directly with native components with native levels of performance, but with the ease of and rapid development you would expect.

We are happy to announce that along with our HTML5 application support, it is now possible to develop React Native applications on the Ubuntu platform. You can port existing iOS or Android React Native applications, or you can start a new application leveraging your web-dev skills.

You can find the source code for React Native Ubuntu here,

To get started, follow the instructions in README-ubuntu.md and create your first application.

The Ubuntu support includes the ability to generate packages. Managed by the React Native CLI, building a snap is as easy as 'react-native package-ubuntu --snap'. It's also possible to build a click package for Ubuntu devices; meaning React Native Ubuntu apps are store ready from the start.

Over the next little while there will be blogs posts on everything you need to know about developing a React Native Application for the Ubuntu platform; creating the app, the development process, packaging and releasing to the store. There will also be some information on how to develop new reusable modules, that can add extra functionality to the runtime and be distributed as Node Package Manager (npm) modules.

Go and experiment, and see what you can create.

Read more

This is the third in a series of blog posts detailing my experience acclimating to a fully remote work experience. You may also enjoy my original posts detailing my first week and my first month.

Has it really been 4 months since I started riding the raging river of remote work? After my first month, I felt pretty good about my daily schedule and work/life balance. Since the last post, I’ve met many of my co-workers IRL, learned to find my own work, and figured out how to shake things up when I get in a rut.

But First, Did I Handle My Action Items?

During my first month, I struggled to figure out what to work on after finishing a task. At this point, it’s extremely rare that I don’t have a dozen things in the queue like a pile of papers in an In Box about to topple over. I am happily busy all the time, either pulling things off the top of the stack, plucking things from the middle, or coming up with some new feature that I need to add.

I feel a lot more comfortable on chat. Our IRC channels are a bit daunting at first, especially when there’s a lot of action going on. I’ve learned some good ways to interject or reach out to the people that I need to talk to.

Oh, and I still haven’t gotten a new floormat. Yes, this one’s still broken. For some reason, it just doesn’t annoy me as much as it used to. It’s almost endearing: like a three-legged puppy that I roll my chair across and stand on top of for 8 hours a day.

Meeting IRL

Why would you guys actually meet IRL? - Elliot, Mr Robot

I know a bunch of my coworkers by IRC nicks, and I see a few of their faces in a Google Hangout for our daily standup. This has been sufficient for me, but we did something truly magical in June. We met IRL.

The team I’m on (~10 people) and our sibling team (~10 people) met in Montréal, Québec, Canada, for about a week, and it was unlike any meeting-of-the-minds I’ve ever been to. The engineers were largely from the US and Europe. We all stayed in the hotel downtown and used a small conference room to hang out and work all day, every day. We woke up and ate breakfast together, met in the conference room at 9, had snacks and coffee, ate lunch together, stopped working precisely at 6, and met back in the lobby a few minutes later to go out on the town until 11 or midnight. It’s a truly intense social experience, especially for a group of people who spend most of their days only interacting with other humans through IRC, especially for a group of people who only meet twice a year or so.

This coming together allowed us to hash out a lot of our plans for the coming months, but I believe the true victory in this type of event is the camaraderie it creates. It’s nice to be able to put faces to nicks, think about the inflection in a person’s voice coming out in the way they type, and know exactly who I need to ping on IRC to accomplish certain tasks. It’s fun to hang out with people from all over the world, and it’s fun to go drinking with your coworkers, a thing I had temporarily forgotten.

I’ll note that we also happened to be in Montréal during the 23rd annual Mondial de la Biere, a huge international beer festival that lasted several days. I’ll also note that it was fun to try to speak little bits of French in Montréal, and I’m really looking forward to wherever the next sprint abroad may take us (most likely Europe in October).

Finding Work

With a decentralized company, the issue of finding things to work on for new employees can be tough. Do you give them something small and easy and possibly belittling? Do you give them something massive that will leave them scratching their heads for weeks and completely out of touch with the rest of the company? How can you find a good middle ground here?

I’d say I was “eased in” with smaller tasks into my current project. After the first few tasks were completed, it was often unclear to me where I should go next. There was always a list of bugs - was I the right person to work on them? Was there any other impending feature or unlisted bug that I should start looking into instead? These are hard questions for someone who hasn’t been around for very long.

Over time, I gained more and more responsibilities. I needed a bug fixed pronto in another project, so I did it myself and submitted an MP. Oh, you understand this codebase? You can be a maintainer now! Oh, I think I remember from your resume that you have some golang experience. We need this fixed ASAP, using SD cards is completely broken without it!

It’s all a slippery slope to having a bottomless bucket of super-interesting things to choose from to do work, perform code reviews, and answer community questions. Yet somehow it’s all happened in a way that is not overwhelming, but freeing. When I get stuck on a problem or need a short break from my current project, there’s plenty of other work to go around. Which leads me to…

Shaking Things Up

Routine can be helpful, but it can also be terrible. For the parts of May and June that I wasn’t traveling, I was largely waking up at the same time, making the same lunch, listening to the same music every day, petting the same cats, and picking up the same kinds of tasks from the backlog. None of this was bad per se, but I found myself getting a tad sluggish. It would be harder to work on menial tasks, and easier to come up with elaborate solutions to simple problems. So what did I do?

I started varying my sleep schedule. Some days I get out of bed at 7, other days I get out of bed at 8:45. Because I work from home, I don’t have to worry about fighting traffic or skipping breakfast.

I started varying my lunches. I was making some fun rissotto recipes, but since it’s summer I’ve been mixing up a bunch of different vegetables into a menagerie of salads: sauteed green tomatoes, onions, and zucchini; cukes and onions; pineapple, succotash, and eggs.

As I’ve gained more responsibilities for various projects, I’ve been able to branch into different kinds of tasks. After finishing up a big rework of one codebase, I can start jumping into bugs somewhere else. I can dig deep somewhere, and pivot when I’m ready to go back. There’s always something interesting to work on, and I can see the way different tasks help us towards our end goal. Not to mention I can always do bug hunts.

Remote Life 4-eva

It’s not just working remotely that makes this possible - it’s the people, the culture, and the fun and interesting products we’re creating. Maybe I’ll start blogging more about those things in the future. I don’t have much in this area that I’m looking to improve on, so this could be the last post in the series. Maybe I’ll do one at the 1-year mark to note any new tricks I’ve learned. Until then, keep looking our for other non-diary-entry blog posts.

Looking for advice on working remotely? Not sure if you’d like it? Do you have a strong disagreement with me as a person or my lifestyle choices? Hit me up and we can chat!

Read more
David Callé

The latest version of snapd, the service powering snaps, has just landed in Ubuntu 16.04, here are some of the highlights of this release.

New commands: buy, find private, disable, revert

A lot of new commands are available, allowing you, for example, to downgrade, disable and buy snaps:

  • When logged into a store, snap find --private lets you see snaps that have been shared with you privately.
  • The new buy command presents you a choice of payment backends for non-free snaps.
  • snap disable allows you to disable specific snaps. A disabled snap won't be updated or launched anymore. It can be enabled with the snap enable command.
  • snap revert allows you to revert a snap to its previous installed version.
  • The refresh command now works with snaps installed in devmode.

Snap try and broken states handling

When using the snap try command to mount a folder containing a snap tree as an installed snap, you can end up with a broken snap if you happen to delete the folder without removing the snap first.

This "broken" state is now acknowledged as a potential snap state and handled gracefully by the system. The broken tag now appears next to the snap in the snap list output and you can remove it with snap remove.

Interfaces changes

  • getsockopt has been allowed for connected x11 plugs.
  • /usr/bin/locale access is now part of the default confinement policy.
  • A new hardware-observe interface that gives snaps read access to hardware information from the system. See the implementation for details.

Snapcraft 2.13

Snapcraft has also seen a new release (2.13) that brings:

  • Enhanced Ubuntu Store integration with the introduction of snapcraft push (which deprecates upload) and snapcraft release. These are very important pieces to the Continuous Integration aspect of snapcraft, you will have more to read on this front very soon!
  • A new plainbox plugin which allows parts containing a Plainbox test collection.
  • Many improvements on sanitizing cloud parts declarations.

Java plugins

There has also been a strong focus on improving Java plugins with, for example:

  • Improvements to the ant and maven plugins (support for targets).
  • Introduction of a gradle plugin

To learn how to use these plugins, the easiest way is to run snapcraft help ant, snapcraft help maven and snapcraft help gradle.

Usage examples can be found in the Playpen repository and guidance in the snapcraft documentation.

Read more
Steph Wilson

Last week we released phase 1 of the new App Design Guides, which included Get started and Building blocks. Now we have just released phase 2: Patterns. This includes handy guidance on gestures, navigation and layout possibilities to provide a great user experience in your app.

Navigation: user journeys

Find guidance for utilizing components for effective and natural user journeys within your UI.

750w_Navigation_UserJourney (2)

Layouts: using Grid Units

Use the Grid Unit System to help visualise how much space you have in order to create a consistent and proportionate UI.

750w_Layouts_GridUnitSystem

More to come…

More sections will be added to patterns in the future, such as search, accessibility and communication.

Up next is phase 3: System integration; which includes the number of a touchpoints your app can plug into inside the Ubuntu operating system shell, such as the launcher, notifications and indicators.

If you want to help us improve these guides, join our mailing list. We’d love to hear from you!

Read more
facundo

Redes sociales


Desde que existen sociedades, desde que existen comunidades, desde que existe acción colectiva, existen redes sociales.

Hoy existen plataformas que median esas relaciones sociales, y ahí lo que existen son empresas haciendo lucro y haciendo negocio con las redes sociales.

Entonces nosotros preferimos separar aquellas plataformas que son empresas con fines de lucro, llámese por ejemplo Facebook, de las redes sociales que nos permiten construir colectivamente, construir en comunidad.

Hay que recuperar las redes sociales para la ciudadanía, el concepto de red social. Red social no es Facebook, red social somos nosotros hablando, somos nosotros organizándonos, usemos los medios que usemos.

Por Bea Busaniche, miembro de la Fundación Via Libre, en el capítulo "Ciberactivismo" del programa "En el medio digital" para el Canal Encuentro.

Read more
deviceguy

Sensors are an important part of IoT. Phones, robots and drones all have a slurry of sensors. Sensor chips are everywhere, doing all kinds of jobs to help and entertain us. Modern games and game consoles can thank sensors for some wonderfully active games.

Since I became involved with sensors and wrote QtSensorGestures as part of the QtSensors team at Nokia, sensors have only gotten cheaper and more prolific.

I used Ubuntu Server, snappy, a raspberry pi 3, and the senseHAT sensor board to create a senseHAT sensors snap. Of course, this currently only runs in devmode on raspberry pi3 (and pi2 as well) .

To future proof this, I wanted to get sensor data all the way up to QtSensors, for future QML access.

I now work at Canonical. Snappy is new and still in heavy development so I did run into a few issues. First up was QFactoryLoader which finds and loads plugins, was not looking in the correct spot. For some reason, it uses $SNAP/usr/bin as it's QT_PLUGIN_PATH. I got around this for now by using a wrapper script and setting QT_PLUGIN_PATH to $SNAP/usr/lib/arm-linux-gnueabihf/qt5/plugins

Second issue was that QSensorManager could not see it's configuration file in /etc/xdg/QtProject which is not accessible to a snap. So I used the wrapper script to set up  XDG_CONFIG_DIRS as $SNAP/etc/xdg

[NOTE] I just discovered there is a part named "qt5conf" that can be used to setup Qt's env vars by using the included command qt5-launch  to run your snap's commands.

Since there is no libhybris in Ubuntu Core, I had to decide what QtSensor backend to use. I could have used sensorfw, or maybe iio-sensor-proxy but RTIMULib already worked for senseHAT. It was easier to write a QtSensors plugin that used RTIMULib, as opposed to adding it into sensorfw. iio-sensor-proxy is more for laptop like machines and lacks many sensors.
RTIMULib uses a configuration file that needs to be in a writable area, to hold additional device specific calibration data. Luckily, one of it's functions takes a directory path to look in. Since I was creating the plugin, I made it use a new variable SENSEHAT_CONFIG_DIR so I could then set that up in the wrapper script.

This also runs in confinement without devmode, but involves a simple sensors snapd interface.
One of the issues I can already see with this is that there are a myriad ways of accessing the sensors. Different kernel interfaces - iio,  sysfs, evdev, different middleware - android SensorManager/hybris, libhardware/hybris, sensorfw and others either I cannot speak of or do not know about.

Once the snap goes through a review, it will live here https://code.launchpad.net/~snappy-hwe-team/snappy-hwe-snaps/+git/sensehat, but for now, there is working code is at my sensehat repo.

Next up to snapify, the Matrix Creator sensor array! Perhaps I can use my sensorfw snap or iio-sensor-proxy snap for that.

Read more
abeato

So there I was. I did have to use a proprietary library, for which I had no sources and no real hope of support from the creators. I built my program against it, I ran it, and I got a segmentation fault. An exception that seemed to happen inside that insidious library, which was of course stripped of all debugging information. I scratched my head, changed my code, checked traces, tried valgrind, strace, and other debugging tools, but found no obvious error. Finally, I assumed that I had to dig deeper and do some serious debugging of the library’s assembly code with gdb. The rest of the post is dedicated to the steps I followed to find out what was happening inside the wily proprietary library that we will call libProprietary. Prerequisites for this article are some knowledge of gdb and ARM architecture.

Some background on the task I was doing: I am a Canonical employee that works as developer for Ubuntu for Phones. In most, if not all, phones, the BSP code is not 100% open and we have to use proprietary libraries built for Android. Therefore, these libraries use bionic, Android’s libc implementation. As we want to call them inside binaries compiled with glibc, we resort to libhybris, an ingenious library that is able to load and call libraries compiled against bionic while the rest of the process uses glibc. This will turn out to be critical in this debugging. Note also that we are debugging ARM 32-bits binaries here.

The Debugging Session

To start, I made sure I had installed glibc and other libraries symbols and started to debug by using gdb in the usual way:

$ gdb myprogram
GNU gdb (Ubuntu 7.9-1ubuntu1) 7.9
...
Starting program: myprogram
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
[New Thread 0xf49de460 (LWP 7101)]
[New Thread 0xf31de460 (LWP 7104)]
[New Thread 0xf39de460 (LWP 7103)]
[New Thread 0xf41de460 (LWP 7102)]
[New Thread 0xf51de460 (LWP 7100)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xf49de460 (LWP 7101)]
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0xf520bd06 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info proc mappings
process 7097
Mapped address spaces:

	Start Addr   End Addr       Size     Offset objfile
	   0x10000    0x17000     0x7000        0x0 /usr/bin/myprogram
	...
	0xf41e0000 0xf49df000   0x7ff000        0x0 [stack:7101]
	...
	0xf51f6000 0xf5221000    0x2b000        0x0 /android/system/lib/libProprietary.so
	0xf5221000 0xf5222000     0x1000        0x0 
	0xf5222000 0xf5224000     0x2000    0x2b000 /android/system/lib/libProprietary.so
	0xf5224000 0xf5225000     0x1000    0x2d000 /android/system/lib/libProprietary.so
	...
(gdb)

We can see here that we get the promised crash. I execute a couple of gdb commands after that to see the backtrace and part of the process address space that will be of interest in the following discussion. The backtrace shows that a segment violation happened when the CPU tried to execute instructions in address zero, and we can see by checking the process mappings that the previous frame lives inside the text segment of libProprietary.so. There is no backtrace beyond that point, but that should come as no surprise as there is no DWARF information in libProprietary, and also noting that usage of frame pointer is optimized away quite commonly these days.

After this I tried to get a bit more information on the CPU state when the crash happened:

(gdb) info reg
r0             0x0	0
r1             0x0	0
r2             0x0	0
r3             0x9	9
r4             0x0	0
r5             0x0	0
r6             0x0	0
r7             0x0	0
r8             0x0	0
r9             0x0	0
r10            0x0	0
r11            0x0	0
r12            0xffffffff	4294967295
sp             0xf49dde70	0xf49dde70
lr             0xf520bd07	-182403833
pc             0x0	0x0
cpsr           0x60000010	1610612752
(gdb) disassemble 0xf520bd02,+10
Dump of assembler code from 0xf520bd02 to 0xf520bd0c:
   0xf520bd02:	b	0xf49c9cd6
   0xf520bd06:	movwpl	pc, #18628	; 0x48c4	<UNPREDICTABLE>
   0xf520bd0a:	andlt	r4, r11, r8, lsr #12
End of assembler dump.
(gdb) 

Hmm, we are starting to see weird things here. First, in 0xf520bd02 (which probably has been executed little before the crash) we get an unconditional branch to some point in the thread stack (see mappings in previous figure). Second, the instruction in 0xf520bd06 (which should be executed after returning from the procedure that provokes the crash) would load into the pc (program counter) an address that is not mapped: we saw that the first mapped address is 0x10000 in the previous figure. The movw instruction has also a “pl” suffix that makes the instruction execute only when the operand is positive or zero… which is obviously unnecessary as 0x48c4 is encoded in the instruction.

I resorted to doing objdump -d libProprietary.so to disassemble the library and compare with gdb output. objdump shows, in that part of the file (subtracting the library load address gives us the offset inside the file: 0xf520bd02-0xf51f6000=0x15d02):

   15d02:	f7f3 eade 	blx	92c0 <__android_log_print@plt>;
   15d06:	f8c4 5304 	str.w	r5, [r4, #772]	; 0x304
   15d0a:	4628      	mov	r0, r5
   15d0c:	b00b      	add	sp, #44	; 0x2c
   15d0e:	e8bd 8ff0 	ldmia.w	sp!, {r4, r5, r6, r7, r8, r9, sl, fp, pc}

which is completely different from what gdb shows! What is happening here? Taking a look at addresses for both code chunks, we see that instructions are always 4 bytes in gdb output, while they are 2 or 4 in objdump‘s. Well, you have guessed, don’t you? We are seeing “normal” ARM instructions in gdb, while objdump is decoding THUMB-2 instructions. Certainly objdump seems to be right here as the output is more sensible: we have a call to an executable part of the process space in 0x15d02 (it is resolved to a known function, __android_log_print), and the following instructions seems like a normal function epilogue in ARM: a return value is stored in r0, the sp (stack pointer) is incremented (we are freeing space in the stack), and we restore registers.

If we get back to the register values, we see that cpsr (current program status register [1]) does not have the T bit set, so gdb thinks we are using ARM instructions. We can change this by doing

(gdb) set $cpsr=0x60000030
(gdb) disass 0xf520bd02,+15
Dump of assembler code from 0xf520bd02 to 0xf520bd11:
   0xf520bd02:	blx	0xf51ff2c0
   0xf520bd06:	str.w	r5, [r4, #772]	; 0x304
   0xf520bd0a:	mov	r0, r5
   0xf520bd0c:	add	sp, #44	; 0x2c
   0xf520bd0e:	ldmia.w	sp!, {r4, r5, r6, r7, r8, r9, r10, r11, pc}
End of assembler dump.

Ok, much better now [2]. The thumb bit in cpsr is determined by last bx/blx call: if the address is odd, the procedure to which we are calling contains THUMB instructions, otherwise they are ARM (a good reference for these instructions is [3]). In this case, after an exception the CPU moves to arm mode, and gdb is unable to know which is the right mode when disassembling. We can search for hints on which parts of the code are arm/thumb by looking at the values in registers used by bx/blx, or by looking at the lr (link register): we can see above that the value after the crash was 0xf520bd07, which is odd and indicates that 0xf520bd06 contains a thumb instruction. However, for some reason gdb is not able to take advantage of this information.

Of course this problem does not happen if we have debugging information: in that case we have special symbols that let gdb know if the section where the code is contains thumb instructions or not [4]. As those are not found, gdb uses the cpsr value. Here objdump seems to have better heuristics though.

After solving this issue with instruction decoding, I started to debug __android_log_print to check what was happening there, as it looked like the crash was happening in that call. I spent quite a lot of time there, but found nothing. All looked fine, and I started to despair. Until I inserted a breakpoint in address 0xf520bd06, right after the call to __android_log_print, run the program… and it stopped at that address, no crash happened. I started to execute the program instruction by instruction after that:

(gdb) b *0xf520bd06
(gdb) run
...
Breakpoint 1, 0xf520bd06 in ?? ()
(gdb) si
0xf520bd0a in ?? ()
(gdb) si
0xf520bd0c in ?? ()
(gdb) si
0xf520bd0e in ?? ()
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x0

Something was apparently wrong with instruction ldmia, which restores registers, including the pc, from the stack. I took a look at the stack in that moment (taking into account that ldmia had already modified the sp after restoring 9 registers == 36 bytes):

(gdb) x/16xw $sp-36
0xf49dde4c:	0x00000000	0x00000000	0x00000000	0x00000000
0xf49dde5c:	0x00000000	0x00000000	0x00000000	0x00000000
0xf49dde6c:	0x00000000	0x00000000	0x00000000	0x00000000
0xf49dde7c:	0x00000000	0x00000000	0x00000000	0x00000000

All zeros! At this point it is clear that this is the real point where the crash is happening, as we are loading 0 into the pc. This looked clearly like a stack corruption issue.

But, before moving forward, why are we getting a wrong backtrace from gdb? Well, gdb is seeing a corrupted stack, so it is not able to unwind it. It would not be able to unwind it even if having full debug information. The only hint it has is the lr. This register contains the return address after execution of a bl/blx instruction [3]. If the called procedure is non-leaf, it is saved in the prologue, and restored in the epilogue, because it gets overwritten when branching to other procedures. In this case, it is restored on the pc and sometimes it is also saved back in the lr, depending on whether we have arm-thumb interworking built in the procedure or not [5]. It is not overwritten if we have a leaf procedure (as there are no procedure calls inside these).

As gdb has no additional information, it uses the lr to build the backtrace, assuming we are in a leaf procedure. However this is not true and the backtrace turns out to be wrong. Nonetheless, this information was not completely useless: lr was pointing to the instruction right after the last bl/blx instruction that was executed, which was not that far away from the real point where the program was crashing. This happened because fortunately __android_log_print has interworking code and restores the lr, otherwise the value of lr could have been from a point much far away from the point where the real crash happens. Believe or not, but it could have been even worse!

Having now a clear idea of where and why the crash was happening, things accelerated. The procedure where the crash happened, as disassembled by objdump, was (I include here only the more relevant parts of the code)

00015b1c <ProprietaryProcedure@@Base>:
   15b1c:	e92d 4ff0 	stmdb	sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
   15b20:	b08b      	sub	sp, #44	; 0x2c
   15b22:	497c      	ldr	r1, [pc, #496]	; (15d14 <ProprietaryProcedure@@Base+0x1f8>)
   15b24:	2500      	movs	r5, #0
   15b26:	9500      	str	r5, [sp, #0]
   15b28:	4604      	mov	r4, r0
   15b2a:	4479      	add	r1, pc
   15b2c:	462b      	mov	r3, r5
   15b2e:	f8df 81e8 	ldr.w	r8, [pc, #488]	; 15d18 <ProprietaryProcedure@@Base+0x1fc>
   15b32:	462a      	mov	r2, r5
   15b34:	f8df 91e4 	ldr.w	r9, [pc, #484]	; 15d1c <ProprietaryProcedure@@Base+0x200>
   15b38:	ae06      	add	r6, sp, #24
   15b3a:	f8df a1e4 	ldr.w	sl, [pc, #484]	; 15d20 <ProprietaryProcedure@@Base+0x204>
   15b3e:	200f      	movs	r0, #15
   15b40:	f8df b1e0 	ldr.w	fp, [pc, #480]	; 15d24 <ProprietaryProcedure@@Base+0x208>
   15b44:	f7f3 ef76 	blx	9a34 <prctl@plt>
   15b48:	44f8      	add	r8, pc
   15b4a:	4629      	mov	r1, r5
   15b4c:	44f9      	add	r9, pc
   15b4e:	2210      	movs	r2, #16
   15b50:	44fa      	add	sl, pc
   15b52:	4630      	mov	r0, r6
   15b54:	44fb      	add	fp, pc
   15b56:	f7f3 ea40 	blx	8fd8 <memset@plt>
   15b5a:	a807      	add	r0, sp, #28
   15b5c:	f7f3 ef70 	blx	9a40 <sigemptyset@plt>
   15b60:	4b71      	ldr	r3, [pc, #452]	; (15d28 <ProprietaryProcedure@@Base+0x20c>)
   15b62:	462a      	mov	r2, r5
   15b64:	9508      	str	r5, [sp, #32]
   15b66:	4631      	mov	r1, r6
   15b68:	447b      	add	r3, pc
   15b6a:	681b      	ldr	r3, [r3, #0]
   15b6c:	200a      	movs	r0, #10
   15b6e:	9306      	str	r3, [sp, #24]
   15b70:	f7f3 ef6c 	blx	9a4c <sigaction@plt>
   ...
   15d02:	f7f3 eade 	blx	92c0 <__android_log_print@plt>
   15d06:	f8c4 5304 	str.w	r5, [r4, #772]	; 0x304
   15d0a:	4628      	mov	r0, r5
   15d0c:	b00b      	add	sp, #44	; 0x2c
   15d0e:	e8bd 8ff0 	ldmia.w	sp!, {r4, r5, r6, r7, r8, r9, sl, fp, pc}

The addresses where this code is loaded can be easily computed by adding 0xf51f6000 to the file offsets shown in the first column. We see that a few calls to different external functions [6] are performed by ProprietaryProcedure, which is itself an exported symbol.

I restarted the debug session, added a breakpoint at the start of ProprietaryProcedure, right after stmdb saves the state, and checked the stack values:

(gdb) b *0xf520bb20
Breakpoint 1 at 0xf520bb20
(gdb) cont
...
Breakpoint 1, 0xf520bb20 in ?? ()
(gdb) p $sp
$1 = (void *) 0xf49dde4c
(gdb) x/16xw $sp
0xf49dde4c:	0xf49de460	0x0007df00	0x00000000	0xf49dde70
0xf49dde5c:	0xf49de694	0x00000000	0xf77e9000	0x00000000
0xf49dde6c:	0xf75b4491	0x00000000	0xf49de460	0x00000000
0xf49dde7c:	0x00000000	0xfd5b4eba	0xfe9dd4a3	0xf49de460

We can see that the stack contains something, including a return address that looks valid (0xf75b4491). Note also that the procedure must never touch this part of the stack, as it belongs to the caller of ProprietaryProcedure.

Now it is a simply a matter of bisecting the code between the beginning and the end of ProprietaryProcedure to find out where we are clobbering the stack. I will save you of developing here this tedious process. Instead, I will just show, that, in the end, it turned out that the call to sigemptyset() is the culprit [7]:

(gdb) b *0xf520bb5c
Breakpoint 1 at 0xf520bb5c
(gdb) b *0xf520bb60
Breakpoint 2 at 0xf520bb60
(gdb) run
Breakpoint 1, 0xf520bb5c in ?? ()
(gdb) x/16xw 0xf49dde4c
0xf49dde4c:	0xf49de460	0x0007df00	0x00000000	0xf49dde70
0xf49dde5c:	0xf49de694	0x00000000	0xf77e9000	0x00000000
0xf49dde6c:	0xf75b4491	0x00000000	0xf49de460	0x00000000
0xf49dde7c:	0x00000000	0xfd5b4eba	0xfe9dd4a3	0xf49de460
(gdb) cont
Continuing.
Breakpoint 2, 0xf520bb60 in ?? ()
(gdb) x/16xw 0xf49dde4c
0xf49dde4c:	0x00000000	0x00000000	0x00000000	0x00000000
0xf49dde5c:	0x00000000	0x00000000	0x00000000	0x00000000
0xf49dde6c:	0x00000000	0x00000000	0x00000000	0x00000000
0xf49dde7c:	0x00000000	0x00000000	0x00000000	0x00000000

Note here that I am printing the part of the stack not reserved by the function (0xf49dde4c is the value of the sp before execution of the line at offset 0x15b20, see the code).

What is going wrong here? Now, remember that at the beginning of the article I mentioned that we were using libhybris. libProprietary assumes a bionic environment, and the libc functions it calls are from bionic’s libc. However, libhybris has hooks for some bionic functions: for them bionic is not called, instead the hook is invoked. libhybris does this to avoid conflicts between bionic and glibc: for instance having two allocators fighting for process address space is a recipe for disaster, so malloc() and related functions are hooked and the hooks call in the end the glibc implementation. Signals related functions were hooked too, including sigemptyset(), and in this case the hook simply called glibc implementation.

I looked at glibc and bionic implementations, in both cases sigemptyset() is a very simple utility function that clears with memset() a sigset_t variable. All pointed to different definitions of sigset_t depending on the library. Definition turned out to be a bit messy when looking at the code as it depended on build time definitions, so I resorted to gdb to print the type. For a executable compiled for glibc, I saw

(gdb) ptype sigset_t
type = struct {
    unsigned long __val[32];
}

and for one using bionic

(gdb) ptype sigset_t
type = unsigned long

This finally confirms where the bug is, and explains it: we are overwriting the stack because libProprietary reserves in the stack memory for bionic’s sigset_t, while we are using glibc’s sigemptyset(), which uses a different definition for it. As this definition is much bigger, the stack gets overwritten after the call to memset(). And we get the crash later when trying to restore registers when the function returns.

After knowing this, the solution was simple: I removed the libhybris hooks for signal functions, recompiled it, and… all worked just fine, no crashes anymore!

However, this is not the final solution: as signals are shared resources, it makes sense to hook them in libhybris. But, to do it properly, the hooks have to translate types between bionic in glibc, thing that we were not doing (we were simply calling glibc implementation). That, however, is “just work”.

Of course I wondered why the heck a library that is kind of generic needs to mess around with signals, but hey, that is not my fault ;-).

Conclusions

I can say I learned several things while debugging this:

  1. Not having the sources is terrible for debugging (well, I already knew this). Unfortunately not open sourcing the code is still a standard practice in part of the industry.
  2. The most interesting technical bit here is IMHO that we need to be very cautious with the backtrace that debuggers shows after a crash. If you start to see things that do not make sense it is possible that registers or stack have been messed up and the real crash happens elsewhere. Bear in mind that the very first thing to do when a program crashes is to make sure that we know the exact point where that happens.
  3. We have to be careful in ARM when disassembling, because if there is no debug information we could be seeing the wrong instruction set. We can check evenness of addresses used by bx/blx and of the lr to make sure we are in the right mode.
  4. Some times taking a look at assembly code can help us when debugging, even when we have the sources. Note that if I had had the C sources I would have seen the crash happening right when returning from a function, and it might not have been that immediate to find out that the stack was messed up. The assembly clearly pointed to an overwritten stack.
  5. Finally, I personally learned some bits of ARM architecture that I did not know, which was great.

Well, this is it. I hope you enjoyed the (lengthy, I know) article. Thanks for your reading!

[1] http://www.heyrick.co.uk/armwiki/The_Status_register
[2] We can get the same result by executing in gdb set arm fallback-mode thumb, but changing the register seemed more pedagogical here.
[3] http://infocenter.arm.com/help/topic/com.arm.doc.dui0068b/DUI0068.pdf
[4] http://reverseengineering.stackexchange.com/questions/6080/how-to-detect-thumb-mode-in-arm-disassembly
[5] http://www.mcternan.me.uk/ArmStackUnwinding/
[6] In fact the calls are to the PLT section, which is inside the library. The PLT calls in turn, by using addresses in the GOT data section, either directly the function or the dynamic loader, as we are doing lazy loading. See https://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html, for instance.
[7] I had to use two breakpoints between consecutive instructions because the “ni” gdb command was not working well here.

Read more